OutputService generates damaged PDFs

adamp25373672

24-07-2018

I've written a servlet to generate Document of Record PDFs for our forms. After some work, I've gotten it to output PDFs, but the PDFs are always damaged.

Viewing the PDFs in a text editor doesn't reveal anything obviously wrong. Using an online PDF repair tool shows the following errors:

0x8041010B - E - The 'xref' keyword was not found or the xref table is malformed.

  - File: 20185000044 (4).pdf

Recover XREF table.

0x8A117FFD - E - Error in Flate stream: data error.

  - Object No.: 12

  - File: 20185000044 (4).pdf

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 10

  - File: 20185000044 (4).pdf

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 11

  - File: 20185000044 (4).pdf

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 3

  - File: 20185000044 (4).pdf

0x8A117FFD - E - Error in Flate stream: data error.

  - Object No.: 4

  - File: 20185000044 (4).pdf

Analyze Objects.

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 3

  - File: 20185000044 (4).pdf

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 3

  - File: 20185000044 (4).pdf

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 10

  - File: 20185000044 (4).pdf

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 10

  - File: 20185000044 (4).pdf

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 11

  - File: 20185000044 (4).pdf

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 11

  - File: 20185000044 (4).pdf

Analyze Outlines.

Analyze Pages.

0x80410402 - E - The page or page tree node has a missing or invalid "Type" key.

  - Object No.: 5

  - File: 20185000044 (4).pdf

0x80410113 - E - The file is corrupt and cannot be repaired. Some of the contents can possibly be recovered.

  - File: 20185000044 (4).pdf

Recover Pages.

Save output file.

0x80410306 - E - The "Length" key of the stream object is wrong.

  - Object No.: 11

  - Page No.: 1

  - File: 20185000044 (4).pdf

0x8A117FFD - E - Error in Flate stream: data error.

  - Object No.: 11

  - Page No.: 1

  - File: 20185000044 (4).pdf

0x0A09C006 - W - Invalid content of XMP packet header attribute 'begin': '?'.

  - XPath: /xpacket

  - File: C:\Windows\TEMP\osa\4939-524cbfb612d843ebb66d6ecf364c7ef0repair.tmp

Close file.

I tried outputting a PostScript file instead. Inspecting that file showed that it was attempting to render the template with the given data, but I was also unable to open that file.

I've tried reducing the template to a bare minimum and massaging the data every which way, but I haven't managed to make OutputService give me a single good file. Code follows.

    private void renderDOR(String id, SlingHttpServletRequest request, SlingHttpServletResponse response) {

        Document templateDocument;

        Document xmlXdpDocument;

        Document pdfDocument;

        PDFOutputOptions pdfOptions;

        FormMetadata formMetadata;

        BinaryData formData;

        try {

             formMetadata = formsDatabase.getMetadata(id);

             formData = formsDatabase.getFormData(formMetadata.getUserdataID());

        } catch (SQLException e) {

            log.error("Failed to retrieve form metadata for id " + id, e);

            response.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);

            return;

        }

        if (formData == null) {

            log.error("No formData found for form ID {}", id);

            response.setStatus(HttpServletResponse.SC_BAD_REQUEST);

            return;

        }

        log.debug(formData.getDataAsString());

        templateDocument = getTemplateDocument(formMetadata, request.getResourceResolver());

        xmlXdpDocument = getXmlXdpDocument(formDataToDocument(formData.getDataAsStream()));

        pdfOptions = new PDFOutputOptions();

        pdfOptions.setAcrobatVersion(com.adobe.fd.output.api.AcrobatVersion.Acrobat_11);

        pdfOptions.setLinearizedPDF(true);

        try {

            pdfDocument = outputService.generatePDFOutput(templateDocument, xmlXdpDocument, pdfOptions);

        } catch (OutputServiceException e) {

            log.error("Error generating PDF output", e);

            response.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);

            return;

        }

        try {

            // Write PDF to response

            InputStream in = pdfDocument.getInputStream();

            Writer out = response.getWriter();

            IOUtils.copy(in, out, StandardCharsets.UTF_8.name());

            setHeaders(response, id);

        } catch (MalformedURLException e) {

            log.error("Failed to generate PDF", e);

            response.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);

        } catch (IOException e) {

            log.error("Failed to write PDF to response", e);

            response.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);

        }

    }

    private Document getTemplateDocument(FormMetadata formMetadata, ResourceResolver resourceResolver) {

        String formPath = DEFAULT_FORM_PATH;

        if (StringUtils.isNotBlank(formMetadata.getFormPath())) {

            formPath = formMetadata.getFormPath();

        }

        Resource resource = resourceResolver.getResource(formPath + FORM_METADATA_PATH);

        if (resource == null) {

            log.error("Metadata for form {} not found", formPath);

            return null;

        }

        String templateXdpPath = resource.getValueMap().get(FormsPortalConstants.STR_DOR_TEMPLATE_REF)

                + DOR_TEMPLATE_SUBPATH;

        log.debug("Using template {}", templateXdpPath);

        return new Document(templateXdpPath);

    }

    private Document getXmlXdpDocument(org.w3c.dom.Document afXmlDocument) {

        // Adaptive Form data has additional structure.

        // Get just the XDP data and load into Adobe Document object

        try {

            String xmlDataXDP = XMLUtils.getUnboundDataXmlPart(afXmlDocument);

            log.debug("XMLDataXDP: {}", xmlDataXDP);

            InputStream inStreamXDPXML = IOUtils.toInputStream(xmlDataXDP, "UTF-8");

            return new Document(inStreamXDPXML);

        } catch (IOException e) {

            log.error("Failed to create XML XDP document", e);

        }

        return null;

    }

    private org.w3c.dom.Document formDataToDocument(InputStream data) {

        try {

            Reader reader = new InputStreamReader(data, StandardCharsets.UTF_8.name());

            InputSource source = new InputSource(reader);

            source.setEncoding(StandardCharsets.UTF_8.name());

            DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();

            docFactory.setNamespaceAware(true);

            DocumentBuilder docBuilder = docFactory.newDocumentBuilder();

            org.w3c.dom.Document document = docBuilder.parse(source);

            document.getDocumentElement().normalize();

            return document;

        } catch (ParserConfigurationException e) {

            log.warn("Failed to initialize DocumentBuilder", e);

        } catch (SAXException | IOException e) {

            log.warn("Failed to parse form data", e);

        }

        return null;

    }

    private void setHeaders(SlingHttpServletResponse response, String filename) {

        response.setContentType(Constants.ContentType.PDF);

        response.setCharacterEncoding(StandardCharsets.UTF_8.name());

        response.setHeader("Content-Disposition", "attachment; filename=\""

                + filename + ".pdf\"");

    }

What could I be doing wrong here?

Accepted Solutions (1)

James_R_Green

24-07-2018

Hi,

I'm not at a computer at the moment, but I know this example works. Give it a go:

Adobe Experience Manager Help | Using AEM Document Services Programmatically

Also, looking at your code, a couple of things to check: do you need to set the encoding of your input source? I also don't recall using this setting, so try removing it: pdfOptions.setLinearizedPDF(true);

Thanks,

Jim

Answers (4)

smacdonald2008

25-07-2018

I suspected that was the case (the use of the Document object as shown in James's reply, which is the same piece of code I posted). That's one of the reasons why they put methods on the Document object to get the data.

I did a lot of work with these APIs about 10 years ago, when I was working entirely on LiveCycle ES.

adamp25373672

25-07-2018

Writing to disk worked. Comparing the two outputs, it seems that printing directly to the response carried the wrong encoding. I'm not quite sure why just yet, but there it is.
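
A minimal sketch of the likely cause, using only the JDK. In the servlet above, IOUtils.copy(in, out, "UTF-8") decodes the PDF bytes as UTF-8 text before handing them to the response Writer, and a PDF is binary data, not UTF-8 text. Any byte sequence that is not valid UTF-8 gets replaced during decoding, which would shift stream lengths and xref offsets in the way the repair tool reported. The class and method names below (BinaryCopyDemo, copyAsBytes, copyThroughWriter) are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: compares a raw byte copy with a copy routed through a Reader/Writer.
class BinaryCopyDemo {

    // Copies raw bytes from in to out with no character decoding.
    static void copyBytes(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
    }

    // Byte-for-byte copy: what a binary payload such as a PDF needs.
    static byte[] copyAsBytes(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copyBytes(new ByteArrayInputStream(data), sink);
        return sink.toByteArray();
    }

    // Decodes the bytes as UTF-8 text, then re-encodes them, as a Writer-based copy does.
    static byte[] copyThroughWriter(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
             Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "%PDF" followed by four high bytes that do not form valid UTF-8.
        byte[] pdfLike = {0x25, 0x50, 0x44, 0x46, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        System.out.println("byte copy intact:   " + Arrays.equals(pdfLike, copyAsBytes(pdfLike)));
        System.out.println("writer copy intact: " + Arrays.equals(pdfLike, copyThroughWriter(pdfLike)));
    }
}
```

In the servlet, the corresponding change would presumably be to copy pdfDocument.getInputStream() to response.getOutputStream() instead of response.getWriter(), and to call setHeaders before writing the body rather than after. That part is an assumption and has not been verified against AEM here.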

smacdonald2008

24-07-2018

Try this code exactly as is:

    @Reference
    private OutputService outputService;

    private File generatePDFOutput2(String contentRoot, File inputXML, File templateStr,
            String acrobatVersion, String tagged, String linearized, String locale) {
        String outputFolder = "C:/Output";
        Document doc = null;
        try {
            PDFOutputOptions option = new PDFOutputOptions();
            option.setContentRoot(contentRoot);
            if (locale != null) {
                option.setLocale(locale);
            }
            // Map the version string onto the AcrobatVersion enum
            if (acrobatVersion.equalsIgnoreCase("Acrobat_10")) {
                option.setAcrobatVersion(com.adobe.fd.output.api.AcrobatVersion.Acrobat_10);
            } else if (acrobatVersion.equalsIgnoreCase("Acrobat_10_1")) {
                option.setAcrobatVersion(com.adobe.fd.output.api.AcrobatVersion.Acrobat_10_1);
            } else if (acrobatVersion.equalsIgnoreCase("Acrobat_11")) {
                option.setAcrobatVersion(com.adobe.fd.output.api.AcrobatVersion.Acrobat_11);
            }
            if (tagged.equalsIgnoreCase("true")) {
                option.setTaggedPDF(true);
            }
            if (linearized.equalsIgnoreCase("true")) {
                option.setLinearizedPDF(true);
            }
            InputStream inputXMLStream = new FileInputStream(inputXML);
            InputStream templateStream = new FileInputStream(templateStr);
            doc = outputService.generatePDFOutput(new Document(templateStream),
                    new Document(inputXMLStream), option);
            // Let the Document object write its own bytes to disk
            File toSave = new File(outputFolder, "Output.pdf");
            doc.copyToFile(toSave);
            return toSave;
        } catch (OutputServiceException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // doc is still null if generatePDFOutput threw
            if (doc != null) {
                doc.dispose();
            }
        }
        return null;
    }

Try calling the Document object's copyToFile() method to see whether it produces a valid PDF document.

adamp25373672

24-07-2018

Thanks for the suggestions. I've looked at that page before, and I don't see that it differs from what I'm doing aside from using the filesystem.

As to your other suggestions, the input source encoding is already set with source.setEncoding(...) in formDataToDocument above (line 112 of my original code), and removing setLinearizedPDF didn't make a difference.
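
On the input source encoding: according to the SAX InputSource documentation, setEncoding has no effect once a character stream (a Reader) has been supplied, and the parser then also disregards any encoding declaration in the XML itself. A minimal sketch of the alternative, assuming the form data is available as a byte stream, is to hand the InputStream to the parser and let it detect the encoding from the XML declaration. The class and method names (XmlEncodingDemo, parseBytes) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class: parses XML from raw bytes so the parser reads the declared encoding.
class XmlEncodingDemo {

    // Parses the bytes directly; the encoding comes from the XML declaration (or BOM), not from a Reader.
    static Document parseBytes(byte[] xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml));
    }

    public static void main(String[] args) throws Exception {
        // Sample data declared and encoded as ISO-8859-1 rather than UTF-8.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>Andr\u00e9</name>";
        Document doc = parseBytes(xml.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

This would not explain the damaged PDF by itself, since that turned out to be on the output side, but it avoids forcing UTF-8 onto data that may declare a different encoding.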