Tuesday, January 29, 2008

Preparation for Rasterizing Intricate Artwork

When distributing a PDF online, some vector drawings outweigh their usefulness.

Vector drawings yield the highest possible quality across all media. For simple illustrations such as charts and graphs, they are also more efficient than bitmaps. However, when preparing a PDF for online distribution, you will sometimes find an intricate vector drawing that has tripled your PDF's file size. With Acrobat and Illustrator (or Photoshop), you can rasterize this detailed drawing in-place and reduce your PDF's file size.

Big Drawings in Little Spaces
How does this happen? Vector artwork scales without any loss of quality. This means a big, detailed, 2 MB vector drawing can be scaled down perfectly to the size of a postage stamp. Even though most of its detail might no longer be visible on a paper printout or on-screen, the drawing is still 2 MB in size. This becomes an issue only when you distribute the file online and want to reduce the document's file size.

Rasterize Intricate Artwork
If you have Adobe Acrobat (6 Pro or 5) and either Adobe Illustrator or Adobe Photoshop, you can rasterize a PDF's drawings. First you must configure Acrobat's TouchUp Object tool to open your PDF selections in Illustrator or Photoshop.

In Acrobat, select Edit > Preferences > General . . . > TouchUp. Click Choose Page/Object Editor and then browse over to Illustrator.exe, which might be located somewhere such as C:\Program Files\Adobe\Illustrator 9.0.1\. Or, use Photoshop instead of Illustrator by browsing over to Photoshp.exe, which might be located somewhere such as C:\Program Files\Adobe\Photoshop 6.0\. Click Open and then click OK to confirm your new Preferences setting.

Improvement

Pass the name of your PDF document and the kw_catcher window size to make_index.sh like so:

make_index.sh mydoc.pdf 12

The script will create a document index named mydoc.index.pdf. Review this index and append it to your PDF document if you desire. The script also creates two intermediate files: mydoc.data.txt and mydoc.txt. If the PDF index is faulty, review these intermediate files for problems. Delete them when you are satisfied with the PDF index.

The second argument to make_index.sh controls the keyword detection sensitivity. Smaller numbers yield fewer keywords at the risk of omitting some keywords; larger numbers admit more keywords and also more noise.

The Code

Of course, the thing to do is to wrap this procedure into a tidy script. Copy the following Bourne shell script into a file named make_index.sh, and make it executable by applying chmod 700. Windows users can get a Bourne shell by installing MSYS.

#!/bin/sh
# make_index.sh, version 1.0
# usage: make_index.sh <PDF filename> <window size>
# requires: pdftk, kw_catcher, page_refs,
# pdftotext, enscript, ps2pdf
#
# by Ross Presser, Imtek.com
# adapted by Sid Steward
# http://www.pdfhacks.com/kw_index/

# strip the .pdf extension so the other filenames can be derived from it
fname=`basename "$1" .pdf`

# dump page labels, extract text, then pipe keywords through to a PDF index
pdftk "${fname}.pdf" dump_data output "${fname}.data.txt" && \
pdftotext "${fname}.pdf" "${fname}.txt" && \
kw_catcher "$2" keywords_only "${fname}.txt" \
| page_refs "${fname}.txt" - "${fname}.data.txt" \
| enscript --columns 2 --font 'Times-Roman@10' \
--header '|INDEX' --header-font 'Times-Bold@14' \
--margins 54:54:36:54 --word-wrap --output - \
| ps2pdf - "${fname}.index.pdf"
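The script above does not check its arguments, so a mistyped filename or window size only surfaces as an error from one of the tools. As a minimal sketch, a small guard could be run first. The function name check_index_args is hypothetical and is not part of the original script.

```shell
#!/bin/sh
# check_index_args PDF WINDOW
# Return 0 if PDF names a readable .pdf file and WINDOW is a positive
# integer; otherwise print a message to stderr and return 1.
# (Hypothetical helper for illustration; not part of make_index.sh.)
check_index_args() {
    case "$1" in
        *.pdf) ;;
        *) echo "expected a .pdf file, got: $1" >&2; return 1 ;;
    esac
    [ -r "$1" ] || { echo "cannot read: $1" >&2; return 1; }
    case "$2" in
        ''|*[!0-9]*) echo "window size must be a positive integer" >&2; return 1 ;;
    esac
    [ "$2" -gt 0 ] || { echo "window size must be a positive integer" >&2; return 1; }
}

# Example:
# check_index_args mydoc.pdf 12 && make_index.sh mydoc.pdf 12
```

Because make_index.sh derives its filenames with basename, it expects to be run from the directory that holds the PDF.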

Thursday, January 24, 2008

The Procedure

First, set your PDF's logical page numbering [Hack #62] to match your document's page numbering. Then, use pdftk to dump this information into a text file, like so:

pdftk mydoc.pdf dump_data output mydoc.data.txt

Next, convert your PDF to plain text with pdftotext:

pdftotext mydoc.pdf mydoc.txt

Create a keyword list from mydoc.txt using kw_catcher, like so:

kw_catcher 12 keywords_only mydoc.txt > mydoc.kw.txt

Edit mydoc.kw.txt to remove duds and add missing keywords. At present, only one keyword is allowed per line. If two or more keywords are adjacent in mydoc.txt, our page_refs program will assemble them into phrases.
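If you remove the same duds every time you rebuild an index, the manual edit can be partly automated. The following is a minimal sketch that assumes you keep your unwanted keywords, one per line, in a separate file. The function name filter_keywords and the file duds.txt are hypothetical.

```shell
#!/bin/sh
# filter_keywords KEYWORDS DUDS
# Print the lines of KEYWORDS (one keyword per line) that do not exactly
# match any line in DUDS. The original order is preserved.
filter_keywords() {
    # -v invert the match, -x match whole lines only,
    # -F treat patterns as fixed strings, -f read patterns from a file
    grep -v -x -F -f "$2" "$1"
}

# Example (duds.txt and mydoc.kw.clean.txt are hypothetical names):
# filter_keywords mydoc.kw.txt duds.txt > mydoc.kw.clean.txt
```

You would then pass the cleaned file to page_refs in place of mydoc.kw.txt.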

Now pull all these together to create a text index using page_refs:

page_refs mydoc.txt mydoc.kw.txt mydoc.data.txt > mydoc.index.txt

Finally, create a PDF from mydoc.index.txt using enscript and ps2pdf:

enscript --columns 2 --font 'Times-Roman@10' \
--header '|INDEX' --header-font 'Times-Bold@14' \
--margins 54:54:36:54 --word-wrap --output - mydoc.index.txt \
| ps2pdf - mydoc.index.pdf

Preparation

Creating a good document Index section is a difficult job performed by professionals. However, an automatically generated index still can be very helpful. Use automatic keywords or select your own keywords. This section will locate their pages, build a reference, and then create PDF pages that you can append to your document. It even uses your PDF's page labels (also known as logical page numbering) to ensure trouble-free lookup.
Download and install pdftotext, kw_index, and pdftk. You must also have enscript (Windows users visit http://gnuwin32.sf.net/packages/enscript.htm) and ps2pdf, which comes with Ghostscript. The kw_index package includes the kw_catcher and page_refs programs (and source code) that we use in the following posts.
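Before starting, it can help to confirm that each of these programs is actually on your PATH. This is a minimal sketch. The function name missing_tools is hypothetical and is not part of any of the packages above.

```shell
#!/bin/sh
# missing_tools NAME...
# Print the name of each listed program that cannot be found on PATH.
# (Hypothetical helper for illustration.)
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can locate the program
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example: check the programs this procedure relies on.
# missing_tools pdftk kw_catcher page_refs pdftotext enscript ps2pdf
```

If the example prints nothing, every tool was found.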

Copy Tables into a New Document

In Microsoft Word, you can use a macro to copy a document's tables into a new document. Create the macro as follows.

Open the Macros dialog box (Tools > Macro > Macros . . . ). Type CopyTablesIntoNewDocument into the "Macro name:" field, set "Macros in:" to Normal.dot, and click Create.

A window will open where you can enter the macro's code. It already will have two lines of code: Sub CopyTablesIntoNewDocument() and End Sub. You don't need to duplicate these lines.

You can download the code from http://www.pdfhacks.com/copytables/.

Run this macro from Word by selecting Tools > Macro > Macros . . . , selecting CopyTablesIntoNewDocument, and clicking Run. A new document will open that contains all the tables from your current document. It will also include the paragraphs immediately before and after each table, which helps readers find the table they want. Modify the macro code to suit your requirements.

Saturday, January 19, 2008

Attachments and Encryption

When you encrypt a PDF, you also encrypt its attachments. The permissions you apply can affect whether users can unpack these attachments.
Once the PDF is open in Acrobat/Reader (which might require a password), any files attached to PDF pages can be unpacked, regardless of the PDF's permissions. This enables you to disable copy/paste features, yet still make select data available to your readers.

Document attachments are more restricted than page attachments. You must grant the ModifyAnnotations permission if you want your readers to be able to unpack and view document attachments.

Attach Files to PDFs with pdftk

pdftk can attach files to PDF documents and pages.

When attaching files to an existing PDF, call pdftk like so:

pdftk <input PDF filename> attach_file <attachment filenames> \
[to_page <page number>] output <output PDF filename>

The output filename must be different from the input filename. For example, attach the file data.xls to the first page of the PDF report.pdf like so:

pdftk report.pdf attach_file data.xls to_page 1 output report.page_attachment.pdf

Attach data.xls to report.pdf as a document attachment instead of a page attachment by simply omitting the to_page parameter:

pdftk report.pdf attach_file data.xls output report.doc_attachment.pdf

You can include additional output parameters, too, such as PDF encryption options.
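As a minimal sketch of combining the two, the function below attaches a file to page 1 and encrypts the output in one pdftk call, using only parameters described in these posts. The function name attach_page_encrypted and the PDFTK variable are introduced here for illustration. Setting PDFTK=echo prints the arguments instead of running pdftk.

```shell
#!/bin/sh
# attach_page_encrypted PDF ATTACHMENT OUTPUT OWNER_PW
# Attach ATTACHMENT to page 1 of PDF and encrypt OUTPUT with 128-bit
# security, granting only the Printing permission.
# PDFTK is a variable introduced for this sketch; set PDFTK=echo to
# preview the arguments without running pdftk.
attach_page_encrypted() {
    ${PDFTK:-pdftk} "$1" attach_file "$2" to_page 1 output "$3" \
        encrypt_128bit allow Printing owner_pw "$4"
}

# Example (the filenames and password are placeholders):
# attach_page_encrypted report.pdf data.xls report.secured.pdf ownpass
```

A page attachment is used here because, as described in these posts, page attachments can still be unpacked once the PDF is open, whatever permissions were granted.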

PDF Attachment

PDF provides a convenient package for your document. A typical PDF contains fonts, images, page streams, annotations, and metadata. It turns out that you can pack anything into a PDF file, even the source document used to create the PDF! These attachments enjoy the benefits of PDF features such as compression, encryption, and digital signatures. Attachments also enable you to provide your readers with document data, such as tables, in a native file format that they can easily use. Attach your document data as HTML or Excel files and give your readers exactly what they need.

Page Attachments Versus Document Attachments
You can attach a file to a particular PDF page, where it is visible as an icon. Or, you can attach a file to the PDF document so that it keeps a lower profile. After encrypting your PDF, document attachments can't be unpacked without the ModifyAnnotations permission. Page attachments, on the other hand, can be unpacked at any time, regardless of the security permissions you imposed. Of course, the PDF must be opened first, which could require a user password.

Attach Files to a PDF with Acrobat
Attach your file to a PDF page using the Attach File commenting tool. In Acrobat 6, access this tool using the Advanced Commenting toolbar or from the Tools > Advanced Commenting > Attach menu. In Acrobat 5, access this tool using the Commenting toolbar. The Attach File tool button hides under the Note tool button; click the little down arrow to discover it.

Activate the Attach File tool and the cursor becomes a push pin. Click the page where you want the attachment's icon to appear and a file selector dialog opens. Select the file to attach. A properties dialog will open, where you can customize the appearance of your attachment's icon.

As we noted, document attachments are different from page attachments. In Acrobat 6, select Document > File Attachments . . . and click Import . . . to add an attachment. In Acrobat 5, select File > Document Properties > Embedded Data Objects . . . and click Import . . . to add an attachment.

Wednesday, January 16, 2008

How to Add the Decrypt PDF Context Menu Item?

Make sure you have downloaded pdftk.
Follow all the steps for adding the Encrypt PDF context menu item, except name the action Decrypt and replace the cmd.exe arguments in step 4 with:

C:\windows\system32\cmd.exe /C C:\windows\system32\pdftk.exe "%1" input_pw PROMPT output "%1.decrypted.pdf"

How to Add the Encrypt PDF Context Menu Item?

Make sure you have downloaded pdftk.

Windows XP and Windows 2000:

  1. In the Windows File Explorer menu, select Tools > Folder Options . . . and click the File Types tab. Select the PDF file type and click the Advanced button.
  2. Click the New . . . button and a New Action dialog appears. Give the new action the name Encrypt.
  3. Give the action an application to open by clicking the Browse . . . button and selecting cmd.exe, which lives somewhere such as C:\windows\system32\ (Windows XP) or C:\winnt\system32\ (Windows 2000).
  4. Add these arguments after cmd.exe, changing the paths to suit, like so:

    C:\windows\system32\cmd.exe /C C:\windows\system32\pdftk.exe "%1" output "%1.encrypted.pdf" encrypt_128bit user_pw PROMPT

  5. Click OK, OK, OK and you should be done with the configuration.
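To try these arguments before wiring them into a menu, or to use them on a system without the File Types dialog, the same pdftk calls can be run from a Bourne shell. This is a minimal sketch. The function names and the PDFTK variable are introduced here for illustration.

```shell
#!/bin/sh
# Command-line counterparts of the Encrypt and Decrypt menu actions.
# PDFTK is a variable introduced for this sketch; set PDFTK=echo to
# print the arguments instead of running pdftk.

# encrypt_pdf FILE: write FILE.encrypted.pdf, prompting for a user password
encrypt_pdf() {
    ${PDFTK:-pdftk} "$1" output "$1.encrypted.pdf" encrypt_128bit user_pw PROMPT
}

# decrypt_pdf FILE: write FILE.decrypted.pdf, prompting for the input password
decrypt_pdf() {
    ${PDFTK:-pdftk} "$1" input_pw PROMPT output "$1.decrypted.pdf"
}

# Example:
# encrypt_pdf report.pdf
```

As in the menu actions, the PROMPT keyword makes pdftk ask for the password rather than taking it from the command line.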

PDF Encryption with pdftk

You can encrypt any PDF created with pdftk by simply adding encryption parameters after the output filename, like so:

... output <output filename> \
[encrypt_40bit | encrypt_128bit] [allow <permissions>] \
[owner_pw <owner password>] [user_pw <user password>]

Here are the details:

[encrypt_40bit | encrypt_128bit]
Specify an encryption strength. If no strength is given along with other encryption parameters, it defaults to encrypt_128bit.

[allow <permissions>]
List the permissions to grant users. If this section is omitted, no permissions are granted. See Table 1 and Table 2 for a complete list of available permissions.

[owner_pw <owner password>]
Use this parameter to set the owner password. It can be omitted, in which case no owner password is set.

[user_pw <user password>]
Use this parameter to set the user password. It can be omitted, in which case no user password is set.

Adding these parameters yields:

pdftk A=in1.pdf B=in2.pdf C=in3.pdf \
cat A1 B1-end C5 output out.pdf \
encrypt_128bit allow CopyContents Printing \
owner_pw ownpass

Monday, January 14, 2008

Standard Security Permissions

Set the user password if you don't want people to see your PDF. If they don't have the user password, it simply won't open.

You also have some control over what people can do with your document once they have it open. The permissions associated with 128-bit security (Acrobat 5 and 6) are more precise than those associated with 40-bit security (Acrobat 3 and 4). Tables 1 and 2 list all available permissions for each security model.

Table 1. Permissions available under 40-bit security

| To allow readers to . . . | Apply this pdftk permission |
| --- | --- |
| Print (pages are top quality) | Printing |
| Modify page or document contents, insert or remove pages, rotate pages or add bookmarks | ModifyContents |
| Copy text and graphics from pages, extract text and graphics data for use by accessibility devices | CopyContents |
| Change or add annotations or fill form fields with data | ModifyAnnotations |
| Reconfigure or add form fields | ModifyContents and ModifyAnnotations |
| All of the above | AllFeatures |

Table 2. Permissions available under 128-bit security

| To allow readers to . . . | Apply this pdftk permission |
| --- | --- |
| Print (pages are top quality) | Printing |
| Print (pages are of lower quality) | DegradedPrinting |
| Modify page or document contents, insert or remove pages, rotate pages or add bookmarks | ModifyContents |
| Insert or remove pages, rotate pages or add bookmarks | Assembly |
| Copy text and graphics from pages | CopyContents |
| Extract text and graphics data for use by accessibility devices | ScreenReaders |
| Change or add annotations or fill form fields with data | ModifyAnnotations |
| Fill form fields with data | FillIn |
| Reconfigure or add form fields | ModifyContents and ModifyAnnotations |
| All of the above, and top-quality printing | AllFeatures |

Comparing these two tables, you can see that Assembly is a weaker version of ModifyContents and FillIn is a weaker version of ModifyAnnotations.

DegradedPrinting sends pages to the printer as rasterized images, whereas Printing sends pages as PostScript. A PostScript stream can be intercepted and turned back into (unsecured) PDF, so the Printing permission is a security risk. However, DegradedPrinting reduces the clarity of printed pages, so you should test your document to make sure DegradedPrinting yields acceptable printed pages.

After setting these permissions and/or a user password, changing them requires the owner password, if it is set.
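When a script assembles an allow list, a misspelled permission name is easy to miss. As a minimal sketch, the function below checks a name against the permissions listed in the two tables. The function name is_pdftk_permission is hypothetical, and the check compares names exactly as they are spelled in the tables.

```shell
#!/bin/sh
# is_pdftk_permission NAME
# Return 0 if NAME is spelled exactly like one of the permissions listed
# in the tables above, 1 otherwise.
# (Hypothetical helper for illustration; not part of pdftk.)
is_pdftk_permission() {
    case "$1" in
        Printing|DegradedPrinting|ModifyContents|Assembly|CopyContents) return 0 ;;
        ScreenReaders|ModifyAnnotations|FillIn|AllFeatures) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
# is_pdftk_permission FillIn && echo "FillIn is a listed permission"
```

Remember that DegradedPrinting, Assembly, ScreenReaders, and FillIn appear only in the 128-bit table.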



PDF Passwords

Acrobat Standard Security enables you to set two passwords on a PDF: the user password and the owner password. In Acrobat 6, these are also called the Open password and the Permissions password, respectively.

The user password, if set, is necessary for viewing the document pages. The PDF encryption key is derived from the user password, so it really is required. When a PDF viewer tries to open a PDF that was secured with a user password, it will prompt the reader to supply the correct password.

The owner password, if set, is necessary for changing the document security settings. A PDF with both its user and owner passwords set can be opened with either password, so you should choose both with equal care.

An owner password by itself does not provide any real PDF security. The content is encrypted, but the key, which is derived from the (empty) user password, is known. By itself, an owner password is a polite but firm request to respect the author's wishes. A rogue program could strip this security in a second.

About PDF Encryption

You can use PDF encryption to lock a file's content behind a password, but more often it is used to enforce lighter restrictions imposed by the author. For example, the author might permit printing pages but prohibit making changes to the document. Here, we explain how pdftk can encrypt and decrypt PDF documents. We'll begin by describing the Acrobat Standard Security model (called Password Security in Acrobat 6) and the permissions you can grant or revoke.

PDF file attachments get encrypted, too. After opening an encrypted PDF, document file attachments can be opened, changed, or deleted only if the owner granted ModifyAnnotations permission.

Page file attachments behave differently than document file attachments. Once you open an encrypted document, you can open files attached to PDF pages regardless of the permissions. Changing or deleting one of these attachments requires the ModifyAnnotations permission. Of course, if you have the owner password, you can do anything you want.


Friday, January 11, 2008

How to Convert Incoming Faxes to PDF on Linux?

Wrap an incoming fax in PDF and deliver it by email.

Before PDF and before email, we had fax. Today, we still have fax. Integrate fax with your 21st-century lifestyle using HylaFAX. HylaFAX turns your Linux box into a fax server. For details, visit http://www.hylafax.org and http://www.ifax.com. Here, we discuss configuring HylaFAX so that it will deliver incoming faxes to a given email address as a PDF attachment.

Install the HylaFAX server package from your favorite Linux distribution. During installation, a FaxMaster email alias should be created that points to the user responsible for maintaining the server. In this hack, all incoming faxes will be emailed to the FaxMaster as PDF. After installation, run faxsetup -server.

After a fax is received, HylaFAX's faxgetty invokes the faxrcvd script, which in turn executes FaxDispatch (typically located in /var/spool/hylafax/etc) to set configuration parameters. FaxDispatch is where you can control how incoming faxes are routed. Your installation might include a sample FaxDispatch file, or you might need to create one. Read man faxrcvd for details about FaxDispatch.

This sample FaxDispatch file configures HylaFAX to email all incoming faxes to the FaxMaster as PDF attachments. Additional, commented-out lines give an idea of what else is possible:

## Default FaxDispatch file - routes all inbound faxes to FaxMaster as PDF
## Consult the faxrcvd(8C) man page for more information
##

SENDTO=FaxMaster; # by default email to FaxMaster
FILETYPE=pdf; # in PDF format

## This excerpt from the man page gives you an idea of what's possible here
##
## You can route by sender's TSI
#case "$SENDER" in
# *1*510*526*1212*) SENDTO=sam;; # Sam's test rig in Berkeley
# *1*415*390*1212*) SENDTO=raster@asd;; # 7L Xerox room, used for scanning
# *5107811212) SENDTO=peebles@mti;; # stuff from home
#esac

## and/or by device
#case "$DEVICE" in
# ttyS1) SENDTO=john;; # all faxes received on ttyS1
# ttyLT0) SENDTO=mary@home;; # all faxes received on ttyLT0
#esac

## and/or by caller id
#case "$CIDNUMBER" in
# 435*) SENDTO=lee; FILETYPE=pdf;; # all faxes from area code 435
# 5059627777) SENDTO=amy; FILETYPE=tif;; # Amy wants faxes in TIFF
#esac
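The case statements in the sample are ordinary Bourne shell, so a routing rule can be rehearsed in a plain shell before it goes into FaxDispatch. The sketch below is a stand-alone function, not something HylaFAX provides. The name route_fax is hypothetical, and the two patterns are copied from the sample above.

```shell
#!/bin/sh
# route_fax SENDER DEVICE
# Rehearse FaxDispatch-style routing outside HylaFAX.
# Prints "SENDTO FILETYPE" for the given sender TSI and device.
# Runs in a subshell, so it leaves the caller's variables alone.
route_fax() (
    SENDER=$1
    DEVICE=$2
    SENDTO=FaxMaster   # default recipient, as in the sample
    FILETYPE=pdf       # default format, as in the sample
    case "$SENDER" in
        *1*510*526*1212*) SENDTO=sam ;;
    esac
    case "$DEVICE" in
        ttyS1) SENDTO=john ;;
    esac
    echo "$SENDTO $FILETYPE"
)

# Example:
# route_fax "+1 510 526 1212" ttyS0
```

Note that the rules are applied in order, so a later match (here, the device) overrides an earlier one (the sender).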

How to Print to Fax on Windows?

Treat fax machines like remote printers instead of remote copiers.

Faxing a document traditionally involves two fax machines: one that scans your document and one that prints your document. If the document in question is already stored on a computer, it makes more sense to print the document from the computer to the target fax machine. This yields a much higher-quality fax, and it is much more convenient. On a Windows machine with a fax modem, you can install a Fax printer that behaves like any other system printer.

Faxes tend to look bad because the process of scanning a document adds noise, skews text, and generally degrades the appearance. Artwork and photographs suffer the most corruption. Printing a document to the target fax machine, on the other hand, dispenses with scanning. Text looks sharp, and images are preserved with dithering.

Windows XP and Windows 2000 will create a Fax printer when you install a fax-capable modem (Start > Settings > Control Panel > Phone and Modem Options > Modems > Add . . . ). Using Acrobat or your authoring program, print your document to this Fax printer and a wizard will open. This fax wizard asks for the recipient's phone number and enables you to fill in a cover page. Upon completion, your modem will dial out to the destination fax machine and send your document.

A useful series of Windows fax articles is available from http://labmice.techtarget.com/windows2000/printing/fax.htm.

If you fax PDFs frequently, consider adding a Print to Fax item to the PDF right-click context menu.

Windows XP and 2000:

  1. In the Windows File Explorer menu, select Tools > Folder Options . . . and click the File Types tab. Select the PDF file type and click the Advanced button.
  2. Click the New . . . button and a New Action dialog appears. Give the new action the name Print to Fax.
  3. Give the action an application to open by clicking the Browse . . . button and selecting Acrobat.exe, which lives somewhere such as C:\Program Files\Adobe\Acrobat 6.0\Acrobat\. Or, use Reader (AcroRd32.exe) instead of Acrobat.
  4. Add arguments after Acrobat.exe or AcroRd32.exe like so: "C:\Program Files\Adobe\Acrobat 6.0\Acrobat\Acrobat.exe" /t "%1" Fax
  5. Click OK, OK, OK and you should be done with the configuration.
To integrate fax features into your network, use HylaFAX. Visit http://www.hylafax.org and http://www.ifax.com, and consult the fa.hylafax newsgroup.

Share a PDF Network Printer with Samba

Share a PDF printer with your entire network using Ghostscript, Samba, and Linux.

Ghostscript lets you freely print to PDF. However, maintaining Ghostscript on every client in your enterprise can be a nuisance. Consider installing it on a single Linux server instead. Then, use Samba to share it as a PDF printer to your entire network.

Before creating a PDF printer server, install a local PDF printer to test Ghostscript and make sure it fits your requirements. Note that some Linux distributions provide GNU Ghostscript (Version 7) instead of the more recent AFPL Ghostscript (Version 8). Factor this into your testing. You will probably want to compile AFPL Ghostscript for your Linux server, later.

The Server
Every Linux distribution should have Samba and Ghostscript packages that you can install painlessly. Use them. Later, consider downloading and compiling the latest AFPL Ghostscript.

Samba is powerful, so its configuration requires some skill and patience. Consult man smb.conf and edit smb.conf to suit your network. Exercise your favorite Internet search engine, and drop by http://us3.samba.org/samba/docs/using_samba/toc.html. When things aren't working, consult the log files (e.g., /var/log/samba). Don't forget to restart the samba service (e.g., /etc/init.d/samba restart) after changing smb.conf.

Create the directory /home/pdf_printer/output, and chmod it to 777. This is where new PDFs will be delivered. Share this directory with your network by adding this section to smb.conf and restarting Samba:

[pdf_output]
comment = Shared PDF Printer Output
path = /home/pdf_printer/output
; this next line is necessary only when security = share
guest ok = yes
browseable = yes
writeable = yes
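The directory setup described above can be sketched as a small shell function. The function name make_pdf_output_dir is hypothetical. The default path is the one used in this post, and you will likely need root privileges to create it.

```shell
#!/bin/sh
# make_pdf_output_dir [DIR]
# Create the PDF delivery directory (and any missing parents) and open
# it to all users. DIR defaults to the path used in this post.
make_pdf_output_dir() {
    dir=${1:-/home/pdf_printer/output}
    mkdir -p "$dir" && chmod 777 "$dir"
}

# Example:
# make_pdf_output_dir
#
# After editing smb.conf, Samba's testparm utility can check the file
# for errors before you restart the service:
# testparm -s
```

Mode 777 lets every user write to the directory, which is what the print script needs here but is worth tightening once the setup works.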

In Windows, this share should be visible from the Network Neighborhood or My Network Places. If not, try digging into Entire Network > Microsoft Windows Network. Also try the Search for Computers or Find Computer features. Sometimes, new resources aren't visible immediately. Sometimes, client configurations must be reviewed and changed, too.

Now, let's add a PDF Printer to Samba. Once you get it working, adapt the settings to your requirements. Maybe these settings are all you will need.

Download samba-print-pdf from http://ranger.dnsalias.com/mandrake/samba/, copy it into your server's /usr/local/bin directory, and chmod it to 755. Open this script in an editor to see what it does, and possibly change things, such as its Ghostscript OPTIONS.

Add the following section to smb.conf. It should work with Samba's share security model (security = share) or user security model (security = user). The user security model requires that a user provide a name and password before accessing the printer.

[pdf_printer]
comment = Shared PDF Printer
path = /tmp
; this next line is necessary only when security = share
guest ok = yes
printable = yes
use client driver = yes
print command = /usr/local/bin/samba-print-pdf %s \
/home/pdf_printer/output //%L/pdf_output %m %I "%J" &
lpq command =
lprm command =

Restart Samba and then try accessing the file share pdf_output from a client machine. If that works, you are ready to install the client printer.

The Windows Client
Install the Virtual Printer Kit (VPK). Right-click our network printer, pdf_printer, under My Network Places in the File Explorer. Select Connect . . . , and click OK. The Add Printer Wizard will open and ask which printer driver to install. Click Have Disk, browse over to the VPK printer driver that suits your client platform, and click OK. Select the Virtual PostScript Printer driver and click OK. Your new PDF network printer will appear in the computer's Printers folder. Print a test page to make sure it works properly.

Later, copy these Virtual PostScript Printer files to the pdf_output share so that you can access them easily across your network.

How to Print PDF over the Internet?

Printing over the Internet brings the way people like to read and write to the way we plumb information in the 21st century. The idea is to enable authors to create documents using their favorite editor and then print them to a web site. Once on the web server, the PostScript print stream can be converted to PDF and posted online for reading or downloading. In this scenario, the author controls the source document and is responsible for maintenance.

This tip uses HTTP file submission to transfer PostScript to a web server. A more formal solution would use CUPS (http://www.cups.org). For a CUPS-based PDF creation server, try Alambic (http://alambic.iroise.net). Alambic supports HTTP and SMTP interfaces.

This tip demonstrates how to "print" a PostScript print stream to a web server. In our examples, we won't be printing to an elaborate document hosting service. Instead, we will print to the simple http://www.ps2pdf.com web site.

Currently, http://www.ps2pdf.com uses an old version of Ghostscript, so printing to your own, local version of Ghostscript will yield a better PDF.

Download and Install
Visit http://www.pdfhacks.com/submit_file/ and download submit_file-1.0.zip. Unzip this archive, and then copy SubmitFile.exe to a convenient location. This is a simple program that uses the Windows WinInet API to submit a local file to a web server. It then opens the default web browser to view the server's response. The source code is available and you should consult it for HTTP submission details.

Install a ps2pdf.com Printer
The procedure for creating an Internet printer is the same as the procedure for creating the PDF printer in [Hack #39], except you don't need to install Ghostscript. The configuration is also a little different.

Follow the Print to PDF instructions, except:

  1. You don't need to install Ghostscript.
  2. Name the new printer ps2pdf.com Printer instead of GS Pdf Printer.
  3. Name the new Redirected Port RPTWEB: instead of RPTPDF:.
  4. When configuring this new Redirected Port, use the settings in the table below.

    Table. RedMon port properties

    | Field | Value |
    | --- | --- |
    | Redirect this port to the program: | C:\redmon17\RedRun.exe |
    | Arguments for this program are: | C:\pdfhacks\SubmitFile.exe /convert/convert.cgi www.ps2pdf.com inputfile %1 |
    | Output: | Program Handles Output |
    | Run: | Minimized |

  5. Name the Redirected Port log file C:\pdfhacks\web_printer.log instead of C:\gs\pdf_printer.log.
  6. Click OK to accept the new port settings.
  7. Click OK to accept the new printer settings and close the dialog.
The RedRun program takes the PostScript print stream and creates a temp file for it. RedRun then runs the program SubmitFile, replacing the %1 with the temp filename. Note that you should not put quotes around this %1, because RedRun seems to pad the temp filename with whitespace that disrupts the SubmitFile arguments.
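For readers without SubmitFile, a similar HTTP file submission can be sketched with curl. This is a rough sketch under stated assumptions. It assumes the server accepts a multipart form upload with a field named inputfile at /convert/convert.cgi, which is our reading of the SubmitFile arguments above and is not a documented interface. The function name submit_ps and the CURL variable are introduced here for illustration.

```shell
#!/bin/sh
# submit_ps FILE
# Upload FILE as a multipart form field to the conversion script.
# Assumptions (inferred from the SubmitFile arguments, not verified):
#   host www.ps2pdf.com, path /convert/convert.cgi, field "inputfile".
# CURL is a variable introduced for this sketch; set CURL=echo to print
# the arguments instead of contacting the server.
submit_ps() {
    ${CURL:-curl} -F "inputfile=@$1" "http://www.ps2pdf.com/convert/convert.cgi"
}

# Example (job.ps is a placeholder filename):
# CURL=echo submit_ps job.ps
```

Unlike SubmitFile, this sketch does not open a browser. The server's response is written to standard output.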



Test Your ps2pdf.com Printer
Open the ps2pdf.com Printer properties dialog, click the General tab, and click Print Test Page. When your PDF is ready for download from http://www.ps2pdf.com, a browser will open with a hyperlink to follow.

If an error occurs, check the log file for feedback from RedRun or SubmitFile.

Note that the previous configuration is tailored to the current state of http://www.ps2pdf.com. The site administrators might choose to alter it at any time, requiring you to change this printer's configuration.

Sunday, January 6, 2008

How to Print to Image and Other Rasterizing Options

Thumbnail the cover or rasterize the entire document.
You might sometimes need to convert PDF to other graphics formats. You can easily add a "Print to Image" printer by following [Hack #39] and changing a few ingredients. Alternatively, rasterize your PDF documents using Adobe Acrobat or Photoshop. Because Photoshop gives you the most power, you might prefer to "Print to PDF" and then open these pages in Photoshop.

Install a PNG (or JPEG or TIFF) Printer
The procedure for creating a bitmap (e.g., TIFF, JPEG, PNG) printer is the same as the procedure for creating the PDF printer. The configuration is just a little different. In this example, we'll configure a PNG printer, but you just as easily can create a JPEG or TIFF printer. The DEVICE option determines what gets created. We discuss alternative devices a little later.

Follow the PDF Printer instructions, except:

Name the new printer GS png16m Printer instead of GS Pdf Printer.

Name the new Redirected Port RPTPNG16M: instead of RPTPDF:.

When configuring this new Redirected Port, name the options file C:\gs\png16m_printer.cfg instead of C:\gs\pdf_printer.cfg.

When configuring this new Redirected Port, name the log file C:\gs\png16m_printer.log instead of C:\gs\pdf_printer.log.

Create the file png16m_printer.cfg, referenced earlier. It is a text file of additional arguments passed to Ghostscript. An example is included with our Virtual Printer Kit. Change the paths to suit your Ghostscript and system setup.

-dSAFER
-dBATCH
-dNOPAUSE
-Ic:\gs\gs8.14\Resource
-Ic:\gs\fonts
-Ic:\gs\gs8.14\lib
-sFONTPATH=c:\WINDOWS\FONTS
-sDEVICE=png16m
-r72
-dTextAlphaBits=4
-dGraphicsAlphaBits=4
-dAlignToPixels=0

Using this procedure, you can create one printer for each image file format you commonly use.

"Print to Image" devices and options
The documentation that comes with Ghostscript (C:\gs\gs8.14\doc\index.htm) explains the available output devices (Devices.htm) and general options (Use.htm) that you can use in the configuration file.

Image output filenames
When printing a multipage document to one of these bitmap printers, the output filename must include the %d page number variable so that each page gets a unique filename. To zero-pad the page number to three digits, use %03d. On the Windows command line, the % must be represented by %%.
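The page number variable follows printf conventions, so the shell's own printf can preview the filenames a pattern will produce. This is a minimal sketch. The function name page_filename and the pattern page-%03d.png are placeholders.

```shell
#!/bin/sh
# page_filename PATTERN PAGE
# Print the filename that a printf-style PATTERN yields for page PAGE.
# (Hypothetical helper for illustration.)
page_filename() {
    # PATTERN is deliberately used as the printf format string
    printf "$1\n" "$2"
}

# Example: preview the name of page 7.
# page_filename 'page-%03d.png' 7
#
# A matching Ghostscript call from a Unix shell might look like this
# (input.pdf is a placeholder; adjust the options to suit):
# gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -r72 \
#    -sOutputFile=page-%03d.png input.pdf
```

With this pattern, page 7 becomes page-007.png, so the files sort in page order.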

Acrobat: Save As Image
Beginning with Acrobat 5, you can open a PDF and then Save As . . . to JPEG, PNG, or TIFF image files. From the Save As . . . dialog, click the Settings . . . button to configure image options. You can set the image resolution, color space, and compression, among other things.

Photoshop: Open PDF
Photoshop is an ideal place to manipulate bitmaps, so it makes sense to open your PDF right in Photoshop. If your original document isn't a PDF, print one using Acrobat Distiller or our GS Pdf Printer. Open it in Photoshop, then Save As . . . to whatever format you want.

Mac OS X: Preview
The Preview application that comes with Mac OS X lets you open PDF files and save them in a variety of graphics formats.

Unlock the secret powers of Distiller and Ghostscript.

Acrobat Distiller creates PDF based on its current profile setting. On Windows, choose a profile when you print by changing the Print >Properties >Adobe PDF Settings tab >Default Settings drop-down box. On a Macintosh, choose PDF Options from the drop-down box that starts out saying Copies & Pages instead of selecting the Adobe PDF Settings tab. When using Ghostscript, you can reference a joboptions file in pdf_printer.cfg.

Whenever you print to an Acrobat PDF printer, you can select a profile that creates the best PDF for your purpose. You can view and edit these profiles using the graphical Distiller application. The surprise is that these profiles, or joboptions files, are plain-text PostScript snippets that give you more control over Distiller than the GUI does. They are also compatible with Ghostscript, although Ghostscript does not implement all the possible settings. Indeed, the joboptions file (and its specification) is a good place to get the straight dope on what Distiller and Ghostscript can really do.

Acrobat Distiller Parameters Tell the Full Story
To fully understand Distiller and Ghostscript features, you must read the Acrobat Distiller Parameters document from Adobe. It is also the definitive guide to joboptions file parameters.

If you have Acrobat on your computer, open Distiller and select Help >Distiller Parameters Guide, or search your disk for distparm.pdf. On the Macintosh, this file is in the Extras folder on the installer CD. The Acrobat 6 version of distparm.pdf is not available online except to paying Adobe ASN Developer Program members. The next best thing is the Acrobat 5 version, which is bundled with the freely downloadable Acrobat 5 SDK:

http://partners.adobe.com/asn/acrobat/download.jsp

Ghostscript users should also read C:\gs\gs8.14\doc\Ps2pdf.htm or, online:

http://www.cs.wisc.edu/~ghost/doc/cvs/Ps2pdf.htm

If you plan to deliver PDF to a service bureau, find out if they have a joboptions file you should use when creating your PDF.

Distiller joboptions Profiles
Acrobat Distiller's joboptions files are easy to view and modify using the Distiller GUI. Launch the Distiller application, and set Default Settings (Acrobat 6) or Job Options (Acrobat 5) to the profile you want to view or edit. Then, select Settings >Edit Adobe PDF Settings (Acrobat 6) or Settings >Job Options (Acrobat 5).

As noted earlier, this graphical interface does not give you access to all the settings documented in Acrobat Distiller Parameters. Because joboptions files are plain text, you can also view or edit them using a text editor.
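Because a joboptions file is plain text, a short script can list its settings. The Python sketch below is a rough illustration, not a PostScript parser. It assumes one /Name value pair per line and ignores nested dictionaries, arrays, and strings. The sample fragment is made up for the example, although the three parameter names are real Distiller parameters.

```python
import re

# A tiny, made-up joboptions fragment for illustration only.
SAMPLE = """\
<<
  % compatibility and compression
  /CompatibilityLevel 1.4
  /CompressPages true
  /AutoRotatePages /None
>> setdistillerparams
"""

# Matches a line holding exactly one '/Name value' pair.
PARAM = re.compile(r"^\s*/(\w+)\s+(\S+)\s*$")


def simple_params(text: str) -> dict:
    """Collect one-per-line '/Name value' pairs, skipping % comments."""
    params = {}
    for line in text.splitlines():
        line = line.split("%", 1)[0]  # drop PostScript comments
        match = PARAM.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    return params


print(simple_params(SAMPLE))
```

A listing like this makes it easy to compare two profiles side by side before you decide which one to edit.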

Ghostscript joboptions Profiles
joboptions files are written in PostScript, so you can pass them to Ghostscript just before your input file using the -f option. Add a joboptions file to your GS Pdf Printer by appending it to the end of the pdf_printer.cfg file you created, like so:

-dSAFER

-dBATCH

-dNOPAUSE

-Ic:\gs\gs8.14\Resource

-Ic:\gs\fonts

-Ic:\gs\gs8.14\lib

-sFONTPATH=c:\WINDOWS\FONTS

-sDEVICE=pdfwrite

-r1200

-c save pop

-f c:\gs\pdfhacks.gs.joboptions


The file pdfhacks.gs.joboptions comes with our Virtual Printer Kit. It is organized and commented to make parameters easy to read and understand. Open it in your text editor and take a look. Edit it to suit your needs. Parameters not supported by Ghostscript are commented out.

If you need to manage a collection of these profiles, consider creating one GS Pdf Printer for each profile. Each printer would have its own Redirected Port, each port using its own cfg file, each cfg file referencing its own joboptions file.
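One way to keep such a collection consistent is to generate each cfg file from a shared list of options. The Python sketch below assumes the option lines shown in the listing above. The profile names, joboptions paths, and cfg filenames are hypothetical, and the script only builds the text in memory rather than writing files or creating printers.

```python
# Options shared by every printer, copied from the pdf_printer.cfg listing.
BASE_OPTIONS = [
    "-dSAFER",
    "-dBATCH",
    "-dNOPAUSE",
    r"-Ic:\gs\gs8.14\Resource",
    r"-Ic:\gs\fonts",
    r"-Ic:\gs\gs8.14\lib",
    r"-sFONTPATH=c:\WINDOWS\FONTS",
    "-sDEVICE=pdfwrite",
    "-r1200",
    "-c save pop",
]


def cfg_text(joboptions_path: str) -> str:
    """Return cfg file contents that end by loading one joboptions file."""
    return "\n".join(BASE_OPTIONS + ["-f " + joboptions_path]) + "\n"


# Hypothetical profiles: one joboptions file per printer.
profiles = {
    "screen": r"c:\gs\screen.joboptions",
    "print": r"c:\gs\print.joboptions",
}

# Map each (hypothetical) cfg filename to its generated contents.
configs = {
    "pdf_printer_%s.cfg" % name: cfg_text(path)
    for name, path in profiles.items()
}

for filename, text in configs.items():
    print(filename)
    print(text)
```

Each generated cfg would then be assigned to its own Redirected Port, as described above.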

Thursday, January 3, 2008

How to Maximize PDF Portability?

PDF version differences can affect you and your readers.
To best serve your readers, you should ensure that your PDF is compatible with their viewers. What PDF viewers are they running? Assume that they have at least upgraded to the previous version of Acrobat/Reader (or another, compatible viewer). PDFs created with the newest Acrobat might be incompatible with previous versions.

PDF Versions Overview
With each new version of Acrobat, Adobe introduces an updated version of the PDF specification. They go together: Acrobat 4 with PDF 1.3, Acrobat 5 with PDF 1.4, and Acrobat 6 with PDF 1.5.
In many cases, an older viewer still can read a newer-version PDF (although the viewer will complain). Its behavior depends on which new features the PDF uses. Which viewers implement newer features? Here are some highlights, selected for their bearing on mass distribution. For complete details, consult the PDF Reference, Versions 1.3, 1.4, and 1.5.

PDF 1.3 (Acrobat 4) introduced:
  • Digital signatures
  • File attachments
  • JavaScript support
  • Logical page numbering

PDF 1.4 (Acrobat 5) introduced:
  • Additional 128-bit encryption option
  • Additional JavaScript trigger events (document close, will save, did save, will print, did print)
  • Enhanced interactive forms

PDF 1.5 (Acrobat 6) introduced:
  • Additional file compression options
  • Additional encryption options

An older viewer can simply ignore many of the things it doesn't understand. The showstoppers are the compression or encryption features, because the viewer can't show the document if it can't read the streams.

If your PDF relies on newer JavaScript or forms features to work properly, prevent older viewers from opening your PDF. Determine the minimum PDF version your document requires and then apply the corresponding encryption using an empty password. Older viewers simply won't be able to read it.
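To see which version a PDF currently declares, you can read the %PDF-1.x header at the start of the file. The Python sketch below is a minimal illustration that maps the header to the Acrobat versions listed above. It assumes the header is authoritative, although a PDF 1.4 or later file may override it with a Version entry in the document catalog, which this sketch ignores. The function names and sample bytes are made up for the example.

```python
import re

# Acrobat release that introduced each PDF version, from the lists above.
ACROBAT_FOR_PDF = {"1.3": "Acrobat 4", "1.4": "Acrobat 5", "1.5": "Acrobat 6"}

HEADER = re.compile(rb"%PDF-(\d\.\d)")


def header_version(data: bytes):
    """Return the version in the %PDF-x.y header, or None if it is absent."""
    match = HEADER.search(data[:1024])  # the header sits near the file start
    return match.group(1).decode("ascii") if match else None


def minimum_viewer(data: bytes) -> str:
    """Name the Acrobat release that matches the declared PDF version."""
    return ACROBAT_FOR_PDF.get(header_version(data), "unknown")


# Made-up opening bytes of a PDF file, for illustration only.
sample = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"
print(header_version(sample), minimum_viewer(sample))
```

For a real document, pass the first kilobyte of the file, read in binary mode, to header_version.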

Create Compatible PDFs
Out of the box, Distiller or PDFMaker yields PDFs that are compatible with the previous version of Acrobat. No problem.

When you open a PDF in Acrobat, modify it, and then save it, your PDF's version is upgraded silently to match Acrobat's. It is no longer compatible with the previous versions of Acrobat/Reader. This happens regardless of whether your PDF uses any of the new features.

Install older versions of Acrobat Reader and test your PDFs, if you are worried about how they'll look or function. Download old installers from http://www.adobe.com/products/acrobat/reader_archive.html or http://www.oldversion.com/program.php?n=acrobat.

One solution is to use the Reduce File Size feature in Acrobat 6 (File >Reduce File Size . . . >Compatible with: Acrobat 5.0 and later), which also lets you set the compatibility level of the resulting PDF. Another solution is to use the PDF Optimizer feature (Advanced >PDF Optimizer . . . ) and set the "Compatible with" field to "Acrobat 5.0 and later." A third option is to refry your PDF, that is, to print it to a new PDF through Distiller.

Tuesday, January 1, 2008

How to Create PDFs quickly and easily from any Macintosh OS X program?

Apple built a "Save As PDF" capability right into the Macintosh OS X Print dialog box. Any time you go to print a document, you can choose Save As PDF . . . from the bottom left of the Print dialog box. This approach provides no options and tends to produce large files, but it is a quick way to produce PDFs.

If you click the Save As PDF . . . button, a file dialog box will ask you where to put the resulting PDF file. Select a location, click OK, and the Mac will print to a PDF file.

There aren't any obvious configuration options for Save As PDF . . . , but if you have Mac OS X 10.3 or later, you can choose settings through the Filters tab of the ColorSync utility's Preferences window. (The ColorSync utility is in MacintoshHD:Applications:Utilities.) If you check PDF Workflow in the Domains tab, you'll be able to change your PDF options from the Print dialog box as well.

Optimization for Acrobat Distiller Profile

Select the best Distiller profile for your purpose.
When you use Acrobat's "Print to PDF" or use the PDFMaker macro for Word, Adobe's Distiller is the engine that creates your PDF. What kind of PDF do you need? You can configure Distiller to create the best PDF for your purpose. The choice is usually between document fidelity and file size. File size becomes an issue only when distributing a PDF electronically. When in doubt, choose fidelity.

Changing the Distiller Profile
Acrobat Distiller comes with a few preconfigured profiles. They are also called Settings or joboptions files. When printing to PDF, change the profile by clicking the Print dialog's Properties button (Windows), or by selecting PDF Options from the drop-down box that starts out saying Copies & Pages (Mac). Select the Adobe PDF Settings tab and select a Distiller profile from the Default Settings: (Acrobat 6) or Conversion Settings: (Acrobat 5) drop-down box. This choice is not permanent. To change the default profile, consult the Adobe PDF (Acrobat 6) or Acrobat Distiller (Acrobat 5) printer properties.

When using PDFMaker on Windows, change its profile inside Word by selecting Adobe PDF >Change Conversion Settings . . . (Acrobat 6) or Acrobat >Change Conversion Settings . . . (Acrobat 5). This choice, shown in Figure 4-6, becomes the default.

When creating a PDF for a print shop or service bureau, ask them for a joboptions file to use. On Windows, move it to Distiller's Settings folder, which is located somewhere such as C:\Program Files\Adobe\Acrobat 5.0\Distillr\Settings\. On the Macintosh, open Distiller and use the Settings >Add Adobe PDF Settings . . . menu option to add it to the list. Then, select the print shop or service bureau's profile when creating your PDF, as described earlier.