AEM - How to export PDF asset content as JSON
Hello Community members,

I need to check the feasibility of parsing the content of a PDF asset and exporting it in JSON format.

I previously had a task to expose a custom component's node properties, such as Question and Answer, together with their values, and export them as JSON. I achieved this using a Sling servlet and QueryBuilder.

Now I need to do something similar for PDF assets in the DAM: parse the content for plain-text labels such as "Question" and "Answer", extract the related values, and export them as JSON. The reference PDFs can be large and are not limited to a simple "Q & A" sheet; they can contain additional headings, paragraphs, images, links, etc. There can also be multiple such reference PDFs on the page component.

I would really appreciate any assistance, pointers, or references.
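To make the requirement concrete, here is a rough sketch of only the text-to-JSON step. It assumes the plain text has already been extracted from the PDF (for example with a text extractor such as Apache PDFBox's `PDFTextStripper`), and that questions and answers are marked with the literal labels "Question:" and "Answer:" at the start of a line. Both the labels and the class name `PdfQaJsonSketch` are hypothetical placeholders, not anything AEM provides, and real PDFs with extra headings or paragraphs would need stricter rules for where an answer ends.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PdfQaJsonSketch {

    // Assumed labels; the real PDFs may mark questions and answers differently.
    private static final String QUESTION_LABEL = "Question:";
    private static final String ANSWER_LABEL = "Answer:";

    private static boolean startsWithIgnoreCase(String line, String prefix) {
        return line.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    /** Splits already-extracted plain text into question/answer pairs. */
    public static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> pairs = new ArrayList<>();
        Map<String, String> current = null;
        String field = null; // which field continuation lines are appended to
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (startsWithIgnoreCase(line, QUESTION_LABEL)) {
                current = new LinkedHashMap<>();
                current.put("question", line.substring(QUESTION_LABEL.length()).trim());
                current.put("answer", "");
                pairs.add(current);
                field = "question";
            } else if (current != null && startsWithIgnoreCase(line, ANSWER_LABEL)) {
                current.put("answer", line.substring(ANSWER_LABEL.length()).trim());
                field = "answer";
            } else if (current != null) {
                // Wrapped line: append to the field being read. Note that any
                // unrelated paragraph after an answer is appended as well.
                String prev = current.get(field);
                current.put(field, prev.isEmpty() ? line : prev + " " + line);
            }
            // Lines before the first question label are ignored.
        }
        return pairs;
    }

    /** Escapes a string for use inside a JSON string literal. */
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Serializes the pairs as a JSON array of {"question", "answer"} objects. */
    public static String toJson(List<Map<String, String>> pairs) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < pairs.size(); i++) {
            if (i > 0) {
                sb.append(",");
            }
            Map<String, String> p = pairs.get(i);
            sb.append("{\"question\":\"").append(escape(p.get("question")))
              .append("\",\"answer\":\"").append(escape(p.get("answer")))
              .append("\"}");
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        String text = "Intro heading\nQuestion: What is AEM?\nAnswer: A content\n"
                + "management system.\nQuestion: Is it \"free\"?\nAnswer: No.";
        System.out.println(toJson(parse(text)));
    }
}
```

In an AEM servlet the JSON would more likely be built with whatever JSON library the project already uses; the hand-written serializer is only there to keep the sketch free of dependencies.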