Importing a PDF document and converting to a web page
May 26, 2022
Solved


We have several large legal documents that exist as Adobe PDF files. We'd like to import them into Adobe Experience Manager (AEM) so that they can be converted to HTML markup and displayed as web pages via AEM.

This seems like it should be an easy task, but I can't find any documentation on how it would be done. I have to imagine there is some kind of module that can import a PDF and 'translate' it to HTML markup, or go from PDF to XML and then from XML to HTML.

Thanks in advance!

1 reply

MayurSatav, Community Advisor and Adobe Champion (accepted solution)
May 28, 2022

Hi @nick666,

This is an unrealistic requirement. It is not possible to convert PDFs into web pages unless you develop a special tool, based on computer vision technology, that detects the different components in a PDF and converts them into HTML.

Nick666 (Author)
June 1, 2022

PDF is a structured format. HTML is a structured format. In theory it should be possible to map a transformation between the two structures. 
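
To illustrate what such a mapping could look like, here is a minimal Python sketch of the XML-to-HTML half of the idea. It assumes the PDF content has already been extracted into an intermediate XML; the `document`/`section`/`title`/`para` element names and the `TAG_MAP` table are hypothetical, not anything AEM or Adobe defines, and the PDF-to-XML step itself is not shown.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical intermediate schema (an assumption for illustration):
# <document><section><title>..</title><para>..</para></section></document>
TAG_MAP = {"title": "h2", "para": "p"}


def xml_to_html(xml_text: str) -> str:
    """Map each known XML element inside a <section> to an HTML element."""
    root = ET.fromstring(xml_text)
    parts = []
    for section in root.iter("section"):
        parts.append("<section>")
        for child in section:
            html_tag = TAG_MAP.get(child.tag)
            if html_tag is None:
                continue  # no mapping defined for this element, so skip it
            text = escape((child.text or "").strip())
            parts.append(f"<{html_tag}>{text}</{html_tag}>")
        parts.append("</section>")
    return "\n".join(parts)


SAMPLE = """<document>
  <section>
    <title>1. Definitions</title>
    <para>This Agreement covers the document &amp; its schedules.</para>
  </section>
</document>"""

if __name__ == "__main__":
    print(xml_to_html(SAMPLE))
```

The hard part in practice is producing that intermediate structure: the sketch only works if the source really does carry headings and paragraphs as distinct elements.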

Instead of PDF, how about an MS Word document? It is also a structured format.
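
For what it's worth, a `.docx` file is a ZIP archive whose main body lives in `word/document.xml`, so the structure can be read with the Python standard library alone. The sketch below is only a rough illustration under some assumptions: the function name `docx_to_html` is made up, paragraphs whose style ID starts with "Heading" are treated as headings (style IDs can differ in localized or custom templates), and tables, lists, images, and footnotes are ignored.

```python
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# WordprocessingML namespace used by elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_html(path_or_file) -> str:
    """Convert the paragraphs of a .docx file to simple <h2>/<p> markup."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    lines = []
    for para in root.iter(f"{W}p"):
        # A paragraph's text is split across one or more <w:t> runs.
        text = "".join(t.text or "" for t in para.iter(f"{W}t"))
        if not text.strip():
            continue  # skip empty paragraphs
        style = para.find(f"{W}pPr/{W}pStyle")
        style_id = style.get(f"{W}val", "") if style is not None else ""
        # Assumption: heading styles have IDs such as "Heading1", "Heading2".
        tag = "h2" if style_id.lower().startswith("heading") else "p"
        lines.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(lines)
```

Calling `docx_to_html("contract.docx")` (a placeholder file name) would return an HTML fragment that could then be placed into a page component.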