PDF size increases dramatically with every submit.

Level 8

I have a PDF form designed using Adobe LiveCycle Designer ES2.

It has a submit button that submits the form to the server (IIS and ASP.NET) using this JavaScript command:

event.target.submitForm( {cURL: "http://server/ASPNETWebPage.ASPX", aPackets:["datasets","pdf"], cSubmitAs: "XDP"});

On the server, from ASP.NET, I use the following code to extract the submitted "chunk" element and convert it from Base64 to Binary PDF File:

            ' Get the chunk element from the submitted XML
            Dim srChunk As New StringReader(mXML.GetElementsByTagName("chunk")(0).InnerXml)
            ' The Using blocks close the writer and the file stream, even if decoding fails
            Using fs As New System.IO.FileStream(mFormFileNameFolder, IO.FileMode.Create)
                Using bw As New System.IO.BinaryWriter(fs)
                    Do While True
                        Dim theChunkLine As String = srChunk.ReadLine()
                        ' Stop at the end of the data or at the first empty line
                        If String.IsNullOrEmpty(theChunkLine) Then Exit Do
                        ' Decode one Base64 line and append the bytes to the PDF file
                        Dim buffer() As Byte = Convert.FromBase64String(theChunkLine)
                        bw.Write(buffer)
                    Loop
                End Using
            End Using

The above code works fine, and the PDF is generated successfully.

I have one problem.

With every submit, the size of the generated PDF increases dramatically. I reported this to Adobe Support, and they confirmed that this is by design: with every submit, the previous PDF state is kept and the new state is added. That is why I get a huge PDF file.

I was told that the only way to solve this problem is to submit the form as PDF ONLY, and after I save the PDF File on a file system, I then must use Adobe Service/Process "exportData" to extract the XML Data from the PDF.

I think this is a really big change for me. I was hoping that there is a way to identify the latest PDF state from the chunk element.

Any help will be greatly appreciated.

Tarek.

1 Accepted Solution

Correct answer by
Level 4

The heart of the problem was that large images were being placed in XFA image fields. Due to the design of PDF and incremental updates, copies of these images were being added to the file on each file save.  I'll write more on this later, most likely on the ADEP product blog.  But for now, the solution is to limit the size of the image in the field.  [As background, the image was used for a 1x1 inch thumbnail of a face, which is well served by a highly compressed 72 DPI JPG of around 20-40 KB or less.  The images in the file were on the order of megabytes, which caused massive issues.]

John Brinkman did a blog post on how to check the image size and generate an error if it is too large.  You can see this on John's Formfeed blog, and it is quite elegant.

28 Replies

Level 4

You sent a file to support that shows the problem well.  The signed file had 7 incremental updates, and each update was about 1.3MB. But I noted that the image size varied significantly. Some GIFs were 3KB while the TIFs were 360KB (all measured on the base64 data).  I would venture to say that you won't have a dramatic issue like this if the files are 1/100th the size.
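Since the sizes above were measured on the base64 data, the actual image bytes are smaller: base64 stores every 3 bytes as 4 characters, so 360KB of base64 is roughly 270KB of image data. As a rough sketch of that arithmetic (the function name `base64DecodedSize` is made up for illustration, and it assumes well-formed, padded base64 text):

```javascript
// Estimates how many bytes a base64 string decodes to.
// Assumes well-formed base64 whose length, ignoring whitespace, is a multiple of 4.
function base64DecodedSize(base64Text) {
    // Line breaks and other whitespace do not carry data
    var clean = base64Text.replace(/\s+/g, "");
    if (clean.length === 0) {
        return 0;
    }
    // Each trailing "=" means one byte less in the final group
    var padding = 0;
    if (clean.charAt(clean.length - 1) === "=") {
        padding++;
    }
    if (clean.charAt(clean.length - 2) === "=") {
        padding++;
    }
    return (clean.length / 4) * 3 - padding;
}
```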

I kicked this around with a key form developer (see his blog) and he had a great idea.  You can check the size of the image that the user has attached and give them an error if they have added an image that is too large; the error message can give them some idea of how to create a thumbnail.  John's words were "Just look at the length of the imagefield.rawValue – will tell them the size of the base64 image. If it’s too big, clear the field." That may be the most effective way to make the size increase less dramatic.  And it should not change your workflow.
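To make the idea concrete, here is a minimal sketch of such a check. It is not the code from John's blog post: the function name `enforceImageLimit`, the 40 KB limit, and the message text are assumptions chosen for illustration, and the field is passed in as a plain object with a `rawValue` property so the logic can be tried outside a form.

```javascript
// Hypothetical limit: about 40 KB of image data, expressed in base64 characters.
// Base64 encodes every 3 bytes as 4 characters.
var MAX_IMAGE_BYTES = 40 * 1024;
var MAX_BASE64_CHARS = Math.ceil(MAX_IMAGE_BYTES / 3) * 4;

// Clears the field and reports an error when the attached image is too large.
// field:    any object with a rawValue string (an image field inside a form)
// maxChars: largest accepted length of the base64 data
// notify:   callback that shows the message to the user
// Returns true when the image was accepted, false when it was cleared.
function enforceImageLimit(field, maxChars, notify) {
    var value = field.rawValue;
    if (value === null || value === undefined) {
        return true; // nothing attached
    }
    if (String(value).length > maxChars) {
        field.rawValue = null; // clear the oversized image
        notify("The image is too large. Please attach a smaller thumbnail.");
        return false;
    }
    return true;
}
```

Inside the form, this could be called from a script on the image field, for example `enforceImageLimit(this, MAX_BASE64_CHARS, function (msg) { xfa.host.messageBox(msg); });`. Which event to attach it to is a design choice I have not verified here.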

Level 8

Thanks a lot Chunk and John.

I think this will control the issues I am facing, and will catch the cause before it hits the server.

I will implement this check in my form ASAP. I tested the sample form from John's blog, and it is working fine.

Tarek.

Employee

Thanks John. That Formfeed blog post is a great answer to this question.

-Jeff

Level 8

Just to confirm: I implemented the JavaScript code to check the image size before submit several weeks ago, and so far I have not faced this problem again.

Many thanks again.

Tarek.

Level 2

I have a similar problem with the size of a pdf increasing.

Using canopener did not reveal much at first, but then I used PDFXplorer and found a difference in results.

Catalog

--- Acroform

------ Fields

------ XFA

With canopener the Fields object is empty.

With PDFXplorer the Fields object is repeated (a lot!) and contains an XFA object. I am guessing this is where the huge size comes from.

Can anybody advise me as to what the Fields object actually represents and how/when it is populated?

If I save the pdf using Acrobat, the size is heavily reduced.

Then if I view in PDFXplorer, the Fields object is empty.

Level 8

Hi Moris,

Thanks for the update on the same problem.

Based on my understanding of the feedback from Adobe staff in this thread, I think those fields and the XFA content are repeating because this is how it was designed, so that copies of the incremental updates are kept.

It would be good if you could post more information, such as the version you are using.

Tarek.

Level 2

Hi Tarek

Appreciate your comment.

I do not know which versions of Reader or Acrobat the users are running when this issue occurs. I have asked 1st level support to gather those details next time if possible.

I have raised this with Adobe Enterprise Support; hopefully they can shed some light.

I did find that Fields object is part of the Interactive Form Dictionary.

Also, following up on your feedback, there are only 2 instances of %%EOF in the PDF at fault.

Moris
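
For anyone who wants to repeat the %%EOF count: each incremental update appended to a PDF ends with its own %%EOF marker, so counting the markers gives a rough idea of how many times the file was saved incrementally. Below is a small sketch for Node.js; the function name `countEofMarkers` is made up for illustration. The count is only an indicator, since, for example, a linearized file can contain an extra marker without any incremental update.

```javascript
// Counts how many times the "%%EOF" marker appears in the raw bytes of a PDF.
// pdfBytes is a Node.js Buffer holding the whole file.
function countEofMarkers(pdfBytes) {
    var marker = Buffer.from("%%EOF", "latin1");
    var count = 0;
    var pos = pdfBytes.indexOf(marker);
    while (pos !== -1) {
        count++;
        // Continue searching after the marker that was just found
        pos = pdfBytes.indexOf(marker, pos + marker.length);
    }
    return count;
}
```

With a file on disk this could be used as `countEofMarkers(require("fs").readFileSync("form.pdf"))`, where `form.pdf` is a placeholder file name.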