AEM - how to export PDF Asset content as json

Level 1

Hello Community members, I need to check the feasibility of parsing the content of a PDF asset and exporting it in JSON format. I previously had a task to expose a custom component's node properties, such as Question and Answer and their values, and export them as JSON; I achieved that using a Sling servlet and QueryBuilder. Now I need to do something similar for PDF assets in the DAM: parse the content for plain-text labels such as "Question" and "Answer", then export the related values as JSON. The reference PDFs can be large and are not limited to a simple Q&A sheet; they can contain more headings, paragraphs, images, links, etc. There could also be multiple such reference PDFs on the page component. I would really appreciate any assistance, pointers, or references.
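To make the text-to-JSON step of the question concrete, here is a minimal sketch in plain Java. It assumes the PDF text has already been extracted into a String (for example with Apache PDFBox's PDFTextStripper, which is an assumption here and not something shown in this thread) and that each pair is marked by lines starting with "Question:" and "Answer:". The class and method names (QaTextToJson, QaPair, parse, toJson) are hypothetical, and a real PDF will likely need looser matching than this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns already-extracted PDF text into a JSON array of Q&A pairs.
final class QaTextToJson {

    /** One question/answer pair found in the extracted text. */
    static final class QaPair {
        final String question;
        final String answer;

        QaPair(String question, String answer) {
            this.question = question;
            this.answer = answer;
        }
    }

    /**
     * Scans the text line by line. A line starting with "Question:" opens a new pair,
     * a line starting with "Answer:" starts (or restarts) the answer, and unlabeled
     * non-empty lines are appended to whichever field is currently being read.
     * Anything before the first "Question:" label (headings, intro text) is ignored.
     * The "Question:" / "Answer:" label format is an assumption about the PDF content.
     */
    static List<QaPair> parse(String text) {
        List<QaPair> pairs = new ArrayList<>();
        StringBuilder question = null;
        StringBuilder answer = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.regionMatches(true, 0, "Question:", 0, 9)) {
                if (question != null) {
                    pairs.add(toPair(question, answer));
                }
                question = new StringBuilder(line.substring(9).trim());
                answer = null;
            } else if (question != null && line.regionMatches(true, 0, "Answer:", 0, 7)) {
                answer = new StringBuilder(line.substring(7).trim());
            } else if (question != null && !line.isEmpty()) {
                // Continuation line: join wrapped text with a single space.
                StringBuilder target = (answer != null) ? answer : question;
                if (target.length() > 0) {
                    target.append(' ');
                }
                target.append(line);
            }
        }
        if (question != null) {
            pairs.add(toPair(question, answer));
        }
        return pairs;
    }

    private static QaPair toPair(StringBuilder question, StringBuilder answer) {
        return new QaPair(question.toString(), answer == null ? "" : answer.toString());
    }

    /** Serializes the pairs as a compact JSON array without any external library. */
    static String toJson(List<QaPair> pairs) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < pairs.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"question\":\"").append(escape(pairs.get(i).question))
              .append("\",\"answer\":\"").append(escape(pairs.get(i).answer))
              .append("\"}");
        }
        return sb.append(']').toString();
    }

    /** Escapes quotes, backslashes and control characters for use inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }
}
```

In a servlet like the one already built for the component properties, the idea would be to read the asset's original rendition, extract its text, and write QaTextToJson.toJson(QaTextToJson.parse(text)) to the response. Images and links in the PDF are not handled by this sketch.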

Community Advisor

Take a look at this one:


"id": "pdfviewer-98746e7e01",
"documentPath": "/content/dam/core-components-examples/library/sample-assets/Bodea Brochure.pdf",
"type": "IN_LINE",
"defaultViewMode": "FIT_PAGE",
"borderless": false,
"showAnnotationTools": false,
"showFullScreen": true,
"showLeftHandPanel": true,
"showDownloadPdf": true,
"showPrintPdf": true,
"showPageControls": true,
"dockPageControls": true,
"reportSuiteId": "",
"documentFileName": "Bodea Brochure.pdf",
"viewerConfigJson": "{\"embedMode\":\"IN_LINE\",\"showDownloadPDF\":true,\"showPrintPDF\":true}",
"containerClass": "cmp-pdfviewer__in-line",
"clientId": "630681c60b144a498850fc22c1df83e0",
":type": "core-components-examples/components/pdfviewer",
"dataLayer": {
  "pdfviewer-98746e7e01": {
    "@type": "core-components-examples/components/pdfviewer"