Diving Deeper into Document Analytics - Practical Considerations




Author: Deepak Ranjan Kar


Adobe PDF Analytics is here to help you uncover unique data and actionable insights. This post attempts to briefly explain the technology and the often-missed nuances of a document analytics implementation while also covering the best practices from successful real-world deployments.

If you work in the Financial Services sector, you might have sensed your customers' uneasiness with newly deployed digital transformations. Every industry vertical nowadays relies heavily on documents, be it legal documents, software manuals in IT, or educational content. Documents are everywhere. Examples of the challenges this creates are:

  • Being unable to understand a customer's preferences from simple downloads.
  • Struggling to discover how documents perform and how readers engage with them.
  • Lacking actionable insights from those truckloads of PDF documents.

Have you ever wondered how to derive insights from a beautiful PDF in order to take the next best action? For example, consider the questions that may arise from PDF content:

  • Could you send a personalized message to a potential customer who read about a new offering in the brochure?
  • Would a piece of information make more sense at a prominent location on the website rather than buried deep within a voluminous PDF?
  • Could you advise the designer on creating better content, based on insights into what the user zoomed in on?
  • Could document interactions become yet another data source for Adobe Experience Platform and enrich the 360-degree customer profile, helping marketers better understand the behavior of their customers?
  • Could you pressure-test whether an experience design recommended from document interactions holds up once that data is part of the customer profile?

Further, you may want to base marketing investment decisions on questions that the data can answer, such as:

  • How many pages of the document did they look at? Did they make it to the end?
  • What are people looking (searching) for?
  • Is the PDF being downloaded and saved onto desktops?
  • Are people printing it out? How many?

You’re not alone in wanting to derive insights from content delivered as PDF. PDF Analytics, or Document Analytics in a broader sense, has been a path rarely traveled. A downloaded document has had few options for accessing a URL or sending data over the internet, and that has so far restricted Analytics efforts on downloaded documents.

In this blog, we will detail the solution, recommended next steps, and resources for using this new source of content insight to deliver an improved customer experience.

The Solution

As most browsers are now capable of rendering PDF documents, the question becomes simpler for our customers. How do we add Adobe Analytics client-side technologies into the content that’s already sitting there, in the browser, in plain HTML?

The answer is a JavaScript-based PDF parsing and rendering engine. Once the PDF document is rendered in the browser and you get access to the DOM (document object model), you can very easily deploy Adobe Analytics and even Adobe Experience Platform Launch, the next-generation tag management solution. With minimal coding, you can capture the page number and the interactions on each page: zoom, search, select, copy text, and print, to name a few.
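As a rough illustration of the "minimal coding" involved, the sketch below turns an event reported by an embedded viewer into a small, neutral record that a tag manager rule or an Analytics call could then send. The event names (`PAGE_VIEW`, `TEXT_SEARCH`, and so on), the data fields (`pageNumber`, `searchedText`), and the `AnalyticsHit` shape are assumptions made for this example; check your viewer's documentation for the names it actually emits.

```typescript
// Shape of a viewer event as assumed in this sketch: a type plus a loose data bag.
interface ViewerEvent {
  type: string;
  data?: Record<string, unknown>;
}

// Hypothetical, vendor-neutral description of one hit to report.
interface AnalyticsHit {
  action: string;
  document: string;
  page?: number;
  searchTerm?: string;
}

// Assumed viewer event names mapped to report-friendly action names.
const ACTIONS: Record<string, string> = {
  PAGE_VIEW: "pdf page view",
  TEXT_SEARCH: "pdf search",
  TEXT_COPY: "pdf copy",
  DOCUMENT_PRINT: "pdf print",
  DOCUMENT_DOWNLOAD: "pdf download",
};

// Convert a viewer event into a hit, or return null for events we do not report.
function toAnalyticsHit(event: ViewerEvent, documentName: string): AnalyticsHit | null {
  const action = ACTIONS[event.type];
  if (action === undefined) return null;

  const hit: AnalyticsHit = { action, document: documentName };

  // Copy optional details only when they are present and well typed.
  const page = event.data?.["pageNumber"];
  if (typeof page === "number") hit.page = page;

  const term = event.data?.["searchedText"];
  if (typeof term === "string" && term.length > 0) hit.searchTerm = term;

  return hit;
}
```

In practice, a function like this would be called from the viewer's event callback, and the resulting record handed to whichever Analytics or Launch mechanism you already use.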

Embedded PDF viewers have been around for a long time in some form or another, and the advent of the DC View SDK from Adobe has made the game exciting. If you have tried Analytics on a document viewer that was too limiting in nature or lacked documentation, the DC View SDK will be enticing. It’s easy, intuitive, and documented with great simplicity and detail.

The very fact that in-PDF interactions could be measured and viewed on a dashboard blew away the analysts at one of the early adopters of this technology. While a lot of investigation and trial and error had gone into the initial deployments, they are now happier than ever to be switching to the pixel-perfect rendering of the DC View SDK.

If you are considering giving PDF Analytics a try, here are some best practices that I have documented based on my experience of working with various experimental and mainstream document viewers.

Start small with plug-and-play without the need to re-author or re-process PDF files

Well, we love to be ‘non-destructive’ as much as possible, don’t we? You let the viewer parse and render your existing document, and the PDF Analytics solution should provide you with information about the page numbers and most in-PDF interactions out of the box. You can start with a small set of PDFs and gradually increase coverage over time.

Go beyond mere page numbers

Leverage the “Classification” feature of Adobe Analytics to associate metadata with your page numbers, such as the section or topic each page covers. One proof-of-concept even attempted to identify headings based on certain text properties in the parsed HTML.
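To make the idea concrete, here is a minimal sketch that builds a tab-separated lookup table pairing each page key with descriptive metadata. The key convention (`document:pN`), the column names `Section` and `Topic`, and the helper names are all hypothetical; the exact file template a classification import expects should be taken from the Adobe Analytics documentation rather than from this example.

```typescript
// Metadata you want to attach to one page of a document (illustrative fields).
interface PageMetadata {
  page: number;
  section: string;
  topic: string;
}

// Hypothetical key convention: the same value the viewer would send with each page view.
function pageKey(documentName: string, page: number): string {
  return `${documentName}:p${page}`;
}

// Tabs and line breaks would corrupt a tab-separated row, so collapse them to spaces.
function clean(value: string): string {
  return value.replace(/[\t\r\n]+/g, " ").trim();
}

// Build a header line plus one row per page, joined with newlines.
function buildClassificationRows(documentName: string, pages: PageMetadata[]): string {
  const header = ["Key", "Section", "Topic"].join("\t");
  const rows = pages.map((p) =>
    [pageKey(documentName, p.page), clean(p.section), clean(p.topic)].join("\t")
  );
  return [header, ...rows].join("\n");
}
```

The important design point is that the key in the table matches exactly what the viewer reports at collection time; otherwise the metadata never joins up with the page-level data.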

Easily handle direct links (hotlinks, if you like that word) already shared via emails, linked from ads and elsewhere

You should be able to intercept those file downloads via a server-side mechanism (e.g., intercepting access to assets in Adobe Experience Manager) and direct them to the embedded viewer. If not, at the very least you might want to send a server-side Analytics beacon using the Data Insertion API. In some instances, this has been handled via custom components in Adobe Experience Manager.

One question that usually comes up is how to handle Visitor IDs for those direct downloads. A practical and easy solution would be to deploy a short redirect (the kind commonly used for cookie handshakes all over the internet) before visitors get the files. Alternatively, you may want to be a little creative with server-side handling of the Adobe Experience Cloud ID.

The above could prove to be a great value-add beyond the basic metrics around who downloaded what.
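The interception idea in this section can be sketched as a small function that decides where a direct PDF request should be sent. The viewer path `/pdf-viewer`, the `file` and `vid` query parameters, and the function name are hypothetical choices for this example, not defaults of Adobe Experience Manager or any Adobe SDK; how the visitor identifier is obtained is left to your redirect or server-side logic.

```typescript
// Hypothetical location of the page that hosts the embedded viewer.
const VIEWER_PATH = "/pdf-viewer";

// Return the viewer URL for a direct PDF request, or null when the request
// is not for a PDF and should be served as usual.
function viewerRedirectFor(requestPath: string, visitorId?: string): string | null {
  // Ignore any query string when checking the file extension.
  const queryStart = requestPath.indexOf("?");
  const path = queryStart === -1 ? requestPath : requestPath.slice(0, queryStart);
  if (!/\.pdf$/i.test(path)) return null;

  // Pass the original file path to the viewer as an encoded parameter.
  let target = `${VIEWER_PATH}?file=${encodeURIComponent(path)}`;

  // Optionally carry a visitor identifier so the viewer can tie the session
  // to a known visitor (assumed parameter name).
  if (visitorId) target += `&vid=${encodeURIComponent(visitorId)}`;
  return target;
}
```

A server-side filter or servlet would call something like this for every asset request and issue a redirect whenever the result is not null.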

Connect to Adobe Experience Platform

With Experience Platform taking center stage, the rich interaction data is yet another data-source candidate. By deploying the Adobe Experience Platform Web SDK (alloy.js) on the viewer, you should be able to send the interaction events on the fly to Experience Platform and to other Adobe and non-Adobe systems as well.
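A minimal sketch of what sending one interaction through the Web SDK could look like follows. It assumes the SDK's `sendEvent` command with an `xdm` payload; the `pdf.interaction` event type, the `pdf` field group, and the tenant ID are hypothetical and would need to match a schema you define in Experience Platform. The command function is passed in as a parameter so the sketch stays self-contained.

```typescript
// Minimal type for the Web SDK command function as it is used in this sketch.
type Alloy = (command: string, options: Record<string, unknown>) => Promise<unknown>;

// One in-PDF interaction to report (illustrative fields).
interface PdfInteraction {
  action: string;
  document: string;
  page?: number;
}

// Build the XDM payload. Custom fields are placed under a tenant namespace
// ("_" + tenant ID); the field names inside it are assumptions.
function buildXdm(interaction: PdfInteraction, tenantId: string): Record<string, unknown> {
  const details: Record<string, unknown> = {
    action: interaction.action,
    document: interaction.document,
  };
  if (interaction.page !== undefined) details["page"] = interaction.page;

  return {
    eventType: "pdf.interaction", // hypothetical custom event type
    [`_${tenantId}`]: { pdf: details },
  };
}

// Send the interaction using the supplied Web SDK command function.
function sendPdfInteraction(
  alloy: Alloy,
  interaction: PdfInteraction,
  tenantId: string
): Promise<unknown> {
  return alloy("sendEvent", { xdm: buildXdm(interaction, tenantId) });
}
```

On a real page, the first argument would be the global command function that the Web SDK installs, and the call would sit inside the viewer's event callback.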

Consider the impact to UX

Another interesting consideration has been around the user experience. How do you ensure there is minimal disruption to document usage? In one instance, users noticed that the option to right-click and print was missing. Some changes of this kind may be worth accepting, and one possible solution is educating users about them. (Additional customization may make for a better experience.)

We may be forcing a particular view when the intent might have been to download a copy. We need to consider the impact to UX when implementing this or any new technology.

There will still be users who would rather download the file. Major viewers, including the DC View SDK, do provide a clearly visible download button should the need arise. However, be cognizant of the additional click this introduces in the browser before the user is prompted to download.

In another instance, we found that users loved an outdated version of a browser. Make sure you check your Analytics data and ensure technology compatibility for the majority of your users. If required, exclude an audience from the “forced” rendering.

Think through your technology adoption strategy

“There are people in my organization who don’t like changing the way things are,” you might say.

Well, change is inevitable. Introducing it in a way that makes the benefits obvious will make all the difference. You may want to limit exposure to certain audiences should there be a technical and/or business justification.

Final Thoughts

While we’re talking about PDF documents, the same (or similar) logic applies to other document formats and/or images. All you need is the right embedded viewer and some amount of integration effort.

PDF Analytics may soon become the new norm. The great thing is that it can be done in a non-destructive way. The plugin is easy to deploy and manage. Try it today and have fun!




Originally published: Sep 17, 2020