Best Way to convert HTML to PDF?

I am doing some research on HTML to PDF conversion. I know there are a million ways to achieve this, but based on some criteria (listed below) I have selected three solutions.

List of criteria:

  1. Rendering
  2. Conversion speed
  3. Memory usage
  4. Streaming
  5. Easy to install

List of solutions:

  1. Using open source plugins
  2. Using Adobe LiveCycle connector
  3. Using Java PDF API (uses Adobe LiveCycle PDF API)
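
To compare candidates on the conversion speed and memory usage criteria, a rough measurement harness along these lines might help. This is only a sketch: `ConversionBenchmark`, `Result`, and `measure` are hypothetical names, the converter is passed in as a plain function from an HTML string to PDF bytes, and the heap delta from `Runtime` is noisy (garbage collection can run at any time), so treat the numbers as indicative only.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class ConversionBenchmark {

    /** Figures collected for a single conversion run. */
    public static final class Result {
        public final long elapsedMillis;
        public final long heapDeltaBytes;
        public final int outputBytes;

        Result(long elapsedMillis, long heapDeltaBytes, int outputBytes) {
            this.elapsedMillis = elapsedMillis;
            this.heapDeltaBytes = heapDeltaBytes;
            this.outputBytes = outputBytes;
        }
    }

    /** Runs the converter once and records wall-clock time and an approximate heap delta. */
    public static Result measure(Function<String, byte[]> converter, String html) {
        Runtime rt = Runtime.getRuntime();
        long heapBefore = rt.totalMemory() - rt.freeMemory();
        long start = System.nanoTime();

        byte[] pdf = converter.apply(html);

        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        long heapAfter = rt.totalMemory() - rt.freeMemory();
        return new Result(elapsedMillis, heapAfter - heapBefore, pdf.length);
    }

    public static void main(String[] args) {
        // Placeholder converter (assumption): returns the HTML bytes unchanged.
        // Replace it with a call into whichever solution is being evaluated.
        Function<String, byte[]> placeholder = html -> html.getBytes(StandardCharsets.UTF_8);

        Result r = measure(placeholder, "<html><body>Hello</body></html>");
        System.out.println("time(ms)=" + r.elapsedMillis
                + " heapDelta(bytes)=" + r.heapDeltaBytes
                + " output(bytes)=" + r.outputBytes);
    }
}
```

Running each solution several times on the same input and averaging would give a fairer comparison than a single run.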

Solution 1: Using open source plugins

I found many open source APIs that could suit our purpose, but I narrowed the choice down to PDFBox, an open source API from the Apache Software Foundation.

I have implemented a custom service in Adobe Experience Manager that converts HTML to PDF and can also modify PDF documents.

The custom service is developed using the Apache PDFBox Java API. Read more about it here: http://pdfbox.apache.org/

This API has the following features:

  • Extract Text: Extract Unicode text from PDF files.
  • Split & Merge: Split a single PDF into many files or merge multiple PDF files.
  • Fill Forms: Extract data from PDF forms or fill a PDF form.
  • Preflight: Validate PDF files against the PDF/A-1b standard.
  • Print: Print a PDF file using the standard Java printing API.
  • Save as Image: Save PDFs as image files, such as PNG or JPEG.
  • Create PDFs: Create a PDF from scratch, with embedded fonts and images.
  • Sign: Digitally sign PDF files.

Solution 2: Using Adobe LiveCycle connector

The AEM LiveCycle Connector enables seamless invocation of Adobe LiveCycle ES4 Document Services from within AEM web apps and workflows. LiveCycle provides a rich client SDK, which allows client applications to invoke LiveCycle services using Java APIs. The AEM LiveCycle Connector simplifies using these APIs within the OSGi environment.

Solution 3: Using Java PDF API

For more details, see: http://help.adobe.com/en_US/livecycle/9.0/programLC/javadoc/com/adobe/idp/Document.html

Process steps:

  1. Include project files.
       • Include client JAR files, such as adobe-generatepdf-client.jar, in your Java project’s class path.
  2. Create a Generate PDF client.
       • Create a GeneratePdfServiceClient object by using its constructor and passing a ServiceClientFactory object that contains connection properties.
  3. Retrieve the HTML content to convert to a PDF document.
       • Retrieve HTML content by creating a string variable and assigning a URL that points to HTML content.
  4. Convert the HTML content to a PDF document.
       • Invoke the GeneratePdfServiceClient object’s htmlToPDF2 method and pass the following values:
           • A java.lang.String object that contains the URL of the HTML file to be converted.
           • A java.lang.String object that contains the file type settings to be used in the conversion.
           • A java.lang.String object that contains the name of the security settings to be used.
           • An optional com.adobe.idp.Document object that contains settings to be applied while generating the PDF document. If this information is not supplied, the settings are automatically chosen based on the previous three parameters.
           • An optional com.adobe.idp.Document object that contains metadata information to be applied to the PDF document.
  5. Retrieve the results.
       • The htmlToPDF2 method returns an HtmlToPdfResult object that contains the new PDF document that was generated. To obtain the newly created PDF document, perform the following actions:
           • Invoke the HtmlToPdfResult object’s getCreatedDocument method. This returns a com.adobe.idp.Document object.
           • Invoke the com.adobe.idp.Document object’s copyToFile method to extract the PDF document from the object created in the previous step.
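
The steps above could be sketched roughly as follows. This assumes the LiveCycle client JARs (such as adobe-generatepdf-client.jar) are on the class path and a LiveCycle server is reachable over SOAP. The endpoint, server type, credentials, input URL, output file name, and the setting names "Standard" and "No Security" are placeholder assumptions, not values from this post, so adjust them to your own installation.

```java
import java.io.File;
import java.util.Properties;

import com.adobe.idp.Document;
import com.adobe.idp.dsc.clientsdk.ServiceClientFactory;
import com.adobe.idp.dsc.clientsdk.ServiceClientFactoryProperties;
import com.adobe.livecycle.generatepdf.client.GeneratePdfServiceClient;
import com.adobe.livecycle.generatepdf.client.HtmlToPdfResult;

public class HtmlToPdfSketch {

    public static void main(String[] args) throws Exception {
        // Connection properties: every value here is a placeholder assumption.
        Properties props = new Properties();
        props.setProperty(ServiceClientFactoryProperties.DSC_DEFAULT_SOAP_ENDPOINT, "http://localhost:8080");
        props.setProperty(ServiceClientFactoryProperties.DSC_TRANSPORT_PROTOCOL,
                ServiceClientFactoryProperties.DSC_SOAP_PROTOCOL);
        props.setProperty(ServiceClientFactoryProperties.DSC_SERVER_TYPE, "JBoss");
        props.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_USERNAME, "administrator");
        props.setProperty(ServiceClientFactoryProperties.DSC_CREDENTIAL_PASSWORD, "password");

        // Step 2: create the Generate PDF client from a ServiceClientFactory.
        ServiceClientFactory factory = ServiceClientFactory.createInstance(props);
        GeneratePdfServiceClient client = new GeneratePdfServiceClient(factory);

        // Step 3: the HTML content is referenced by URL (placeholder).
        String inputUrl = "http://www.example.com/page.html";

        // Step 4: convert. The two setting names are assumed to exist on the server;
        // the two optional Document arguments are left null.
        HtmlToPdfResult result = client.htmlToPDF2(inputUrl, "Standard", "No Security", null, null);

        // Step 5: extract the generated PDF and write it to a local file (placeholder name).
        Document pdf = result.getCreatedDocument();
        pdf.copyToFile(new File("converted.pdf"));
    }
}
```

Passing null for the last two arguments relies on the behaviour described in step 4, where settings are chosen from the first three parameters when no settings document is supplied.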

Can anyone recommend another or better way to convert HTML to PDF, or Word to PDF?

1 Reply

Hi Amit,

I know this is late, but it may be useful for others. I also find PDFBox useful. There is also the PDF Generator service, which could be used.