
Bulk Migration of Articles HTML Pages into AEM XML Documentation




Company Name: Bounteous

Company URL: https://www.bounteous.com/

Your Name: Mayur Satav

Your Title: Senior Software Developer


Describe your company, the customer experience and business challenge(s) you set out to solve with Adobe Experience Cloud products, and how long your company/organization has been using Adobe Experience Cloud products.


Founded in 2003 in Chicago, Bounteous is a leading digital innovation partner that co-innovates with the world's most ambitious brands to create transformative digital experiences. With services in Strategy, Experience Design, Technology, Analytics, and Marketing, Bounteous elevates brand experiences through technology partnerships and unparalleled platform expertise.


Describe how you have integrated and used multiple Adobe Experience Cloud products to solve these challenges to improve and personalize the customer experience/journey.


Many of us have worked on migration projects, but migrating thousands of pages from one format to another is a real challenge. It takes a lot of cleaning and validation to make those pages fit the new format. Recently we worked on such a project, in which we migrated more than 6,000 HTML articles into AEM XML Documentation (AEM Guides). In this blog I will discuss how we migrated those articles, the challenges we faced, and how we resolved them. I hope it gives you a head start on your upcoming DITA project.


These are the challenges we faced initially and then resolved:

  • Fetching HTML articles from APIs
  • Validating HTML syntax
  • Cleaning up unwanted characters
  • Converting HTML articles into DITA
  • Bulk importing the converted DITA articles into AEM
  • Creating a DITA Map for each article to maintain one-to-one mapping
  • Generating site pages in bulk

To handle these challenges we used several technologies and tools. We followed a divide-and-conquer approach, breaking the problem into a series of steps, then integrated the outputs and proceeded to site generation. I have listed only the basic implementation steps here; the actual implementation involved a lot of fine filtering.
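To illustrate the clean-up and validation challenges listed above, here is a minimal sketch using only the Python standard library. It is not our production script: the function names (`clean_text`, `find_tag_errors`) and the exact set of characters removed are assumptions made for this example. The tag check is deliberately strict (it flags end tags that HTML allows you to omit), because the converted output has to be well-formed XML.

```python
import re
import unicodedata
from html.parser import HTMLParser

# HTML elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def clean_text(raw: str) -> str:
    """Remove characters that commonly break downstream XML tooling."""
    text = unicodedata.normalize("NFC", raw)
    text = text.replace("\u00a0", " ")                      # non-breaking space
    text = re.sub(r"[\u200b\u200c\u200d\ufeff]", "", text)  # zero-width chars, BOM
    # Control characters other than tab, newline and carriage return.
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", text)
    return text


class TagBalanceChecker(HTMLParser):
    """Tracks open tags on a stack and records mismatched end tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def find_tag_errors(html: str) -> list:
    """Return a list of tag-balance problems; an empty list means balanced."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + [f"unclosed <{t}>" for t in checker.stack]
```

Articles for which `find_tag_errors` returns a non-empty list can be set aside for manual review instead of being passed on to conversion.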


We created a Python script that performs the following operations for each article:

  1. Reading each article's data from the API and creating a JSON file.
  2. Creating an HTML file named after the article, with the article snippet placed under the body tag.
  3. Updating relative paths to match the newly decided AEM folder structure.
  4. Validating and cleaning up the HTML syntax.
  5. Feeding the final HTML to the H2D plugin to convert the articles into DITA format.
  6. Creating a DITA Map for each DITA article.
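Steps 2, 3 and 6 above can be sketched roughly as follows. This is an illustration under assumptions, not the script we ran: the folder `/content/dam/articles/images` and the function names are hypothetical, and rewriting `src` attributes with a regular expression is a simplification that only handles double-quoted values.

```python
import html
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical target folder; the real AEM structure is project-specific.
AEM_IMAGE_ROOT = "/content/dam/articles/images"

SRC_ATTR = re.compile(r'\bsrc="([^"]+)"')
# URLs with a scheme (https:, data:) or a leading slash are left untouched.
ABSOLUTE_URL = re.compile(r"^(?:[a-zA-Z][a-zA-Z0-9+.-]*:|/)")


def build_html(title: str, snippet: str) -> str:
    """Step 2: wrap the article snippet in a minimal HTML document."""
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f"<body>\n{snippet}\n</body></html>\n"
    )


def rewrite_image_paths(markup: str, new_root: str = AEM_IMAGE_ROOT) -> str:
    """Step 3: point relative src values at the new folder."""
    def repl(match):
        url = match.group(1)
        if ABSOLUTE_URL.match(url):
            return match.group(0)
        filename = url.rsplit("/", 1)[-1]
        return f'src="{new_root}/{filename}"'
    return SRC_ATTR.sub(repl, markup)


def build_ditamap(title: str, topic_href: str) -> str:
    """Step 6: one DITA Map referencing exactly one topic."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">\n'
        "<map>\n"
        f"  <title>{escape(title)}</title>\n"
        f"  <topicref href={quoteattr(topic_href)}/>\n"
        "</map>\n"
    )
```

Generating one map per topic keeps the one-to-one mapping between an article and its published page, at the cost of producing as many maps as there are articles.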

Once we had a DITA file and a DITA Map for every article, we used AEM Bulk Asset Importer to upload the DITA files, DITA Maps, and images, and AEM Tag Maker to create tags. We created a separate CSV for each upload:

  1. DITA bulk upload.
  2. Image bulk upload.
  3. Tag Maker.
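Producing such CSVs from the script's output can be sketched as below. The column names `source` and `target` are placeholders chosen for this example, not the importer's actual schema; replace them with whatever columns your AEM Bulk Asset Importer or Tag Maker configuration expects.

```python
import csv
import io
from pathlib import PurePosixPath


def dita_rows(slugs, source_dir, target_dir):
    """Yield one row per DITA topic and one per DITA Map for each article."""
    for slug in slugs:
        for ext in (".dita", ".ditamap"):
            name = slug + ext
            yield {
                "source": str(PurePosixPath(source_dir) / name),
                "target": str(PurePosixPath(target_dir) / name),
            }


def write_manifest(rows, fieldnames, out):
    """Write rows to an open text file (or StringIO) as CSV with a header."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Example usage with hypothetical paths:
buf = io.StringIO()
write_manifest(
    dita_rows(["my-article"], "out", "/content/dam/articles"),
    ["source", "target"],
    buf,
)
```

Keeping one manifest per asset type makes it possible to re-run a failed upload (for example, images only) without touching the others.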

After uploading the DITA files, DITA Maps, and images, we used the AEM XML Documentation Map Dashboard for site generation. AEM provides different output presets, such as:

  • AEM Site
  • PDF
  • HTML
  • Custom

You can also create your own output preset. For bulk site generation we used the AEM Map Collection tool, which comes with the AEM XML Documentation package.


Based on your successful use and integration of multiple Adobe Experience Cloud products, describe how it has transformed the customer experience/journey, and the value, business impact, and results your company/organization has realized.


It is difficult to manage 6k+ articles, and even more difficult to update and export them. With AEM XML Documentation, the customer can now easily author articles, maintain different versions of them, export into different formats, create review tasks, create a separate glossary, generate a site, and more. AEM provides many useful features and customizable functionality for DITA, which help them boost their business and productivity.
