How to convert CHM files under Linux

Post author By Darren
Post date September 24, 2006

CHM files, known as Microsoft Compressed HTML Help files, are a common format for eBooks and online documentation. They are basically a collection of HTML files stored in a compressed archive with the added benefit of an index.

Under Linux, you can view a CHM file with the xchm viewer. But sometimes that’s not enough. Suppose you want to edit, republish, or convert the CHM file into another format such as the Plucker eBook format for viewing on your Palm. To do so, you first need to extract the original HTML files from the CHM archive.

This can be done with CHMLIB (the CHM library) and its included helper application, extract_chmLib.

In Debian or Ubuntu:

$ sudo apt-get install libchm-bin
$ extract_chmLib book.chm outdir

where book.chm is the path to your CHM file and outdir is a new directory that will be created to contain the HTML extracted from the CHM file.
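If you have several CHM files to unpack, the same command can be wrapped in a small loop. This is only a sketch: the function name extract_all is made up for illustration, and it assumes each file name ends in .chm and that extract_chmLib is on your PATH.

```shell
#!/bin/sh
# extract_all is a hypothetical helper; it only wraps extract_chmLib.
# Each book.chm is extracted into a new directory named "book".
extract_all() {
    for chm in "$@"; do
        out=${chm%.chm}              # strip the .chm suffix: book.chm -> book
        if [ -e "$out" ]; then
            # Never write into something that already exists
            echo "skipping $chm: $out already exists" >&2
            continue
        fi
        extract_chmLib "$chm" "$out"
    done
}
```

Call it as extract_all *.chm from the directory holding your files. A name without the .chm suffix is skipped, because the output path would be the file itself.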

On other Linux distributions, you can install it from source. First download the chmlib source archive from the above website. I couldn’t get the extract_chmLib utility to compile under the latest version, 0.38, so I used version 0.35 instead.

$ tar xzf chmlib-0.35.tgz
$ cd chmlib-0.35/
$ ./configure
$ make
$ sudo make install
$ make examples

After running “make examples”, you will have an executable extract_chmLib in your current directory. Here is an example of running the command with no arguments and the output it produces:

$ ./extract_chmLib
usage: ./extract_chmLib <chmfile> <outdir>

After running the utility to extract the HTML files from your CHM file, the extracted files will appear in <outdir>. There won’t be an “index.html” file, unfortunately. So you’ll have to inspect the filenames and/or their contents to find the appropriate main page or Table of Contents.
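One way to narrow the search, as a rough sketch: CHM archives usually carry a table-of-contents file ending in .hhc and often a keyword index ending in .hhk, and many books name their start page something like index.htm or default.htm. The helper name find_entry_pages and the list of page names below are my own assumptions, not part of chmlib, and the function relies on the -iname option of GNU find.

```shell
#!/bin/sh
# find_entry_pages is a hypothetical helper, not part of chmlib.
# It prints files in the extracted directory that are likely starting points.
find_entry_pages() {
    dir=${1:-.}
    # .hhc = table of contents, .hhk = keyword index (HTML Help sitemap files)
    find "$dir" -type f \( -iname '*.hhc' -o -iname '*.hhk' \)
    # HTML pages whose names commonly mark a start page (an assumed list)
    find "$dir" -type f \( -iname 'index.htm*' -o -iname 'default.htm*' \
        -o -iname 'toc.htm*' -o -iname 'contents.htm*' -o -iname 'cover.htm*' \)
}
```

Run it as find_entry_pages outdir. The .hhc file is a plain-text, HTML-like file, so opening it in a text editor shows the page titles and the files they point to.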

Now the HTML is yours to enjoy!

Resources

I got help in writing this article from here and here.

32 replies on “How to convert CHM files under Linux”

Thanks, it really helped me get out of a rut 🙂

Thanks!

I found that the CHM extractor often breaks links. It extracts for a Windows filesystem, where lower and upper case are the same. On Linux this breaks, of course. Hence this little Perl script, which fixes it by adding symlinks. I used it to fix the links in the Perl Best Practices book:

#!/usr/bin/perl

use strict;
use warnings;

# Work inside the directory given as the first argument
chdir $ARGV[0] or die "$!";
my @html_files = glob("*.html");

foreach (@html_files) {
    my $new_name = $_;
    $new_name =~ s/(.+)\.html/\U$1\E.html/;   # put the file name in capitals
    $new_name =~ s/(PERLBP)(.+)/\L$1\E$2/;    # put the "perlbp" prefix back in lower case
    symlink $_, $new_name;                    # link the new name to the original file
}
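The same idea can be sketched more generally in shell for the opposite case, where the links inside the HTML are written in lower case but the extracted files are not. The function name link_lowercase is hypothetical, and the sketch only handles the file name itself, not the case of directory names.

```shell
#!/bin/sh
# link_lowercase is a hypothetical helper: for every extracted file whose
# name contains upper-case letters, add a lower-case symlink next to it.
link_lowercase() {
    dir=${1:-.}
    find "$dir" -type f | while IFS= read -r path; do
        base=$(basename "$path")
        lower=$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
        target=$(dirname "$path")/$lower
        # Only link when the name differs and nothing exists there yet
        if [ "$base" != "$lower" ] && [ ! -e "$target" ]; then
            ln -s "$base" "$target"
        fi
    done
}
```

Run it as link_lowercase outdir. Existing files are never overwritten, so it is safe to run more than once.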

Thanks a lot for the tip.

Now I can convert chm2html on the terminal, and then read it with lynx!

I’ve always said that mad people are wiser than the so-called “sane people”.

Thank you a lot for this article!

Hey, thanks everyone for your comments. This chm conversion problem always bugged me, and Google wasn’t helping me too much. But when I finally found the right resources to help me, I decided to write it up so others could find out how to do this too. I’m glad I could help.

I tried to install chmlib-0.38, but when I couldn’t get it to work I went for 0.35. Everything went OK until I tried to run extract_chmLib.
It gives the following error:
-------------------------------------------------------------------------------
./extract_chmLib: error while loading shared libraries: libchm.so.0: cannot open shared object file: No such file or directory
-------------------------------------------------------------------------------
Please help me if possible.

Thanks in advance

Is there anything in Linux that will do the reverse, i.e. convert a bunch of HTML files to CHM?

There is an easier method than the one specified here.
Open kchmviewer, or install it first (apt-get install kchmviewer).
Choose File->Extract chm content …, select the folder you want it extracted to, and you are done.

great tool….

And then you can use htmldoc (install from Synaptic) to convert it to pdf.
Really useful to create a pdf e-book

I got 0.39 to work on FC5 by passing the --enable-examples option to ./configure.
So, run it like this:
# ./configure --enable-examples
# make
# make install

No need to run “make examples” as ./configure --enable-examples takes care of that.

make builds whatever ./configure set up. If you just run ./configure, you get plain, vanilla chmLib. If you pass the examples option to ./configure, make also builds the example programs, and when make install is run, extract_chmLib is placed in your /usr/local/bin/ folder.

I’ve tried both methods mentioned here… CHMLIB (CHM library) and KChmViewer and both work equally well. (As you would expect since KChmViewer uses chmLib to extract the CHM files)

The only caveat I’d mention is that KChmViewer creates files that seem to be about 3 times larger than using chmLib alone. KChmViewer is a lot easier for those who might be afraid of the command line, but there is a price to pay. 🙂

Thank you for taking the time to write this article.

I searched for a way to extract CHM files under Linux for a long time… so thank you very much!

Thank you so much for this information!

One of the most useful articles I’ve ever read!

Many thanks for this advice. I’ve been looking for a quick & easy way to do this for quite a while.

Cool tool! Thanks to all who made it! OpenSource forever!

Thank You!

I never noticed this tool sitting right there the whole time. I was still using chm_http and then mirroring the files with every wget and pavuk switch you could possibly imagine, which caused all kinds of trouble. Imagine going to all that trouble from the first day you learn to appreciate CHM documents, only to find out after SEVERAL YEARS that there was a better tool sitting right next to you. Thanks for the tip.

That’s nice, but I need to extract the menu. Without it, it’s very hard to navigate the extracted HTML files.

Anyone have an idea?

Thank you very much for this one!
See you!

(Do not forget `life is too short for reboot` ;))

if you need the “left side navigation index” provided by the CHM file and don’t want to mess with fixing broken links, try using archmage. it’s a different tool to use to turn CHM files back into html files.

http://archmage.sourceforge.net/

in Ubuntu all I had to do was “sudo apt-get install archmage”

to use it: archmage my.chm outdir

it created an “arch_contents.html” that reproduced the “left side navigation” in the CHM file