
Tuesday, March 18, 2008

How to create a PDF Table of Contents in HTML with pdftk and pdftoc?

First, download and install pdftk.
Pdftk can report on PDF data, including bookmarks, and pdftoc converts that plain-text report into HTML. Visit http://www.pdfhacks.com/pdftoc/ and download pdftoc-1.0.zip. Unzip it and move pdftoc.exe to a directory on your PATH, such as C:\Windows\system32\. On other platforms, build pdftoc from the source code.

Use pdftk to grab the bookmark data from your PDF, like so:

pdftk mydoc.pdf dump_data output mydoc_data.txt
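Before converting the report, it can help to confirm that it actually contains bookmark records. The sketch below is only a convenience check; the helper name show_bookmarks is hypothetical, and it assumes that pdftk writes bookmark entries as lines whose keys start with "Bookmark" (for example BookmarkTitle, BookmarkLevel and BookmarkPageNumber), which may vary between pdftk versions.

```shell
# show_bookmarks REPORT : print only the bookmark records from a
# pdftk dump_data report.
# Assumption: bookmark keys begin with "Bookmark" (e.g. BookmarkTitle).
# The function name is made up for this example.
show_bookmarks() {
    grep '^Bookmark' "$1"
}
```

Run it as: show_bookmarks mydoc_data.txt. If it prints nothing, the PDF probably has no bookmarks, and pdftoc will have nothing to turn into a table of contents.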

Next, use pdftoc to convert this plain-text report into HTML:

pdftoc mydoc.pdf < mydoc_data.txt > mydoc_toc.html

Alternatively, you can run these two steps together, like so:

pdftk mydoc.pdf dump_data | pdftoc mydoc.pdf > mydoc_toc.html

The first argument to pdftoc is the document location that you want pdftoc to use in its hyperlinks. The previous example assumes that mydoc.pdf and mydoc_toc.html will be in the same directory. You can also give a relative path to your PDF, like so:

pdftoc ../pdf/mydoc.pdf < mydoc_data.txt > mydoc_toc.html

or a full URL:

pdftoc http://pdfhacks.com/pdf/mydoc.pdf < mydoc_data.txt > mydoc_toc.html
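If you build tables of contents for several PDFs, the piped form can be wrapped in a small shell function. This is a minimal sketch rather than part of pdftk or pdftoc: the name make_toc, the optional second argument and the "_toc.html" output naming are all assumptions chosen to match the examples above. It uses only the two invocations shown in this post (pdftk ... dump_data, and pdftoc with the link location as its first argument).

```shell
# make_toc PDF_FILE [LINK_TARGET]
# Writes PDF_FILE's bookmarks as an HTML table of contents next to the PDF.
# LINK_TARGET is the location pdftoc should use in its hyperlinks; it
# defaults to the bare file name, i.e. the PDF and the HTML page are
# assumed to sit in the same directory.
# The function name and the output naming are hypothetical.
make_toc() {
    pdf=$1
    link=${2:-$(basename "$pdf")}
    out="${pdf%.pdf}_toc.html"
    if [ ! -f "$pdf" ]; then
        echo "make_toc: no such file: $pdf" >&2
        return 1
    fi
    # Pipe the pdftk report straight into pdftoc, skipping the temp file.
    pdftk "$pdf" dump_data | pdftoc "$link" > "$out"
}
```

For example, make_toc mydoc.pdf writes mydoc_toc.html with links to mydoc.pdf, while make_toc mydoc.pdf http://pdfhacks.com/pdf/mydoc.pdf writes the same file with links to the full URL.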

Once readers enter the PDF, they can use its bookmarks for further navigation. To ensure they see your bookmarks, set your PDF to display them upon opening.

You can also add a download link on the web page that prompts the user to save the PDF on her local disk. As a courtesy to the user, mention the download file size, too.
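One way to produce such a link with the file size filled in is sketched below. The function name download_link and the link wording are hypothetical. The sketch relies on the HTML download attribute to hint that the browser should save the file rather than open it; browsers do not honor that hint in every situation (cross-origin links, for instance), so serving the PDF with a Content-Disposition: attachment header is another option.

```shell
# download_link PDF_FILE [HREF]
# Prints an HTML link to the PDF, labeled with its size in kilobytes.
# HREF defaults to the bare file name. The function name and the label
# text are made up for this example.
download_link() {
    file=$1
    href=${2:-$(basename "$file")}
    bytes=$(wc -c < "$file" | tr -d ' ')
    kb=$(( (bytes + 1023) / 1024 ))   # round up to whole kilobytes
    printf '<a href="%s" download>Download PDF (%s KB)</a>\n' "$href" "$kb"
}
```

Calling download_link mydoc.pdf prints a single line of HTML that you can paste below the table of contents in mydoc_toc.html.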