SOLVED

Importing a PDF document and converting to a web page

Level 1

We have several large legal documents that exist as Adobe PDF files. We'd like to import them into Adobe Experience Manager such that they can be converted to HTML markup and displayed as web pages via AEM.

This seems like it should be an easy task, but I can't find any documentation on how this would be done. I have to imagine there exists some kind of module that can import and 'translate' from the PDF format to HTML markup. Or, from PDF to XML, and then from XML to HTML.

Thanks in advance!

1 Accepted Solution

Correct answer by
Community Advisor

Hi @Nick666,

This is an unrealistic requirement. It is not possible to convert PDFs into web pages unless you develop a special tool, based on computer vision technology, that detects the different components in a PDF and converts them into HTML.

2 Replies

Level 1

PDF is a structured format. HTML is a structured format. In theory, it should be possible to define a transformation that maps one structure onto the other.

Instead of PDF, how about an MS Word document? It is also a structured format.
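
To make the "map one structure onto the other" idea concrete, here is a minimal sketch of just the XML-to-HTML step. It assumes an earlier step (not shown, and not confirmed anywhere in this thread) has already turned the document into an intermediate XML with logical elements. The tag names `document`, `title`, `section`, `heading`, and `para`, the `TAG_MAP` table, and the `xml_to_html` function are all hypothetical, made up for illustration; none of this is an AEM API. It uses only the Python standard library.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical mapping from the assumed intermediate XML tags to HTML tags.
TAG_MAP = {"title": "h1", "heading": "h2", "para": "p"}


def xml_to_html(xml_text: str) -> str:
    """Convert the assumed intermediate XML into a simple HTML fragment."""
    root = ET.fromstring(xml_text)
    parts = []
    # iter() walks the tree in document order, starting with the root.
    for element in root.iter():
        html_tag = TAG_MAP.get(element.tag)
        if html_tag is None:
            # Containers such as <document> and <section> are skipped;
            # their children are still visited by iter().
            continue
        text = html.escape((element.text or "").strip())
        parts.append(f"<{html_tag}>{text}</{html_tag}>")
    return "\n".join(parts)


sample = """<document>
  <title>Terms of Service</title>
  <section>
    <heading>1. Definitions</heading>
    <para>The Service means the hosted product &amp; related tools.</para>
  </section>
</document>"""

print(xml_to_html(sample))
```

This only shows that the mapping itself is simple once logical structure is available; whether a given PDF or Word file can be exported to such a structure is the open part of the question.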