Best practices for creating a site hierarchy from data?


Level 3

I'm running AEM 6.5 on Adobe Managed Services. I need to build a new site based on data from an existing database-driven site. I'm looking for the current best practice for importing that data and creating a page hierarchy from it.

The site lets users search for items, and the detail pages for those items sit at the bottom of a category hierarchy. I want to create intermediate pages for the hierarchy, and item pages for the leaf nodes. I can export the item and hierarchy data from the database to delimited files.
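
To make the shape of the problem concrete, here is a minimal Python sketch of deriving the page hierarchy from a delimited export. The column names (`category_path`, `item_id`, `title`), the `/`-separated category format, the `|` delimiter, and the `/content/mysite` root are all assumptions for illustration, not details of the actual database.

```python
import csv
import io
import re

SITE_ROOT = "/content/mysite"  # hypothetical site root


def slugify(label):
    """Turn a display label into a lowercase, hyphenated node name."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def plan_pages(delimited_text, delimiter="|"):
    """Return (path, kind, title) tuples, with parents listed before children."""
    pages = {}  # path -> (kind, title); dicts preserve insertion order
    reader = csv.DictReader(io.StringIO(delimited_text), delimiter=delimiter)
    for row in reader:
        path = SITE_ROOT
        # One intermediate page per category level, recorded only once.
        for label in row["category_path"].split("/"):
            path = f"{path}/{slugify(label)}"
            pages.setdefault(path, ("category", label))
        # One leaf page per item, under its deepest category.
        item_path = f"{path}/{slugify(row['item_id'])}"
        pages[item_path] = ("item", row["title"])
    return [(p, kind, title) for p, (kind, title) in pages.items()]


SAMPLE = """category_path|item_id|title
Tools/Hand Tools|H-100|Claw Hammer
Tools/Hand Tools|H-200|Hand Saw
Tools/Power Tools|P-300|Cordless Drill
"""

for path, kind, title in plan_pages(SAMPLE):
    print(kind, path, title)
```

Because each category path is recorded the first time it is seen, every parent appears in the plan before its children, which is the order a page-creation script would need.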

I'm quite comfortable in the Groovy console, and my first thought was to use it to process the data and create corresponding nodes in the JCR for it, specifying the appropriate page templates and components. But I'm wondering if there are any other tools in common use to do this sort of thing? What's the best way to do this in 2023?

3 Replies


Community Advisor

Hello @valcohen - 


  • If you need to build a new site based on data from an existing database-driven site, there is no specific out-of-the-box tool in AEM that directly handles this scenario.
  • However, you can still leverage the Content Transfer Tool in conjunction with other tools or custom development to import content from a database into AEM.

Here's a possible approach:


  1. Extract Data from the Database: Export the data from your database instance into a format that is compatible with the Content Transfer Tool. This could involve transforming the database data into a supported format like XML, CSV, or JSON.

  2. Map Database Fields to AEM Structure: Analyze the exported data and map the database fields to the corresponding structure in AEM. Determine how the data will be mapped to AEM pages, components, templates, and properties.

  3. Pre-process the Data: If needed, you may have to pre-process the exported data to align it with the expected format for the Content Transfer Tool. This could involve restructuring the data or transforming it to match the expected structure in AEM.

  4. Use Content Transfer Tool for Import: Once the exported data is prepared, you can utilize the Content Transfer Tool to import the data into AEM. Configure the import options, including the import structure definition file, mapping the exported data to the AEM page hierarchy and content.

  5. Custom Development: If the mapping between the database data and the AEM structure is complex or requires additional transformations, you may need to develop custom code or scripts to handle the data transformation before using the Content Transfer Tool.
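
The field-mapping step (step 2) can be sketched independently of whichever import mechanism is used. The Python below is only an illustration: `jcr:primaryType`, `jcr:title`, `jcr:description`, `cq:template`, and `sling:resourceType` are standard AEM page properties, but the template paths, resource types, and input column names are hypothetical placeholders.

```python
# Hypothetical template and component paths; substitute your project's own.
TEMPLATES = {
    "category": ("/conf/mysite/settings/wcm/templates/category-page",
                 "mysite/components/structure/category-page"),
    "item": ("/conf/mysite/settings/wcm/templates/item-page",
             "mysite/components/structure/item-page"),
}

# Hypothetical mapping from exported column names to jcr:content properties.
FIELD_MAP = {"title": "jcr:title", "summary": "jcr:description"}


def to_page_node(record, kind):
    """Describe a cq:Page and its jcr:content child as nested dicts."""
    template, resource_type = TEMPLATES[kind]
    content = {
        "jcr:primaryType": "cq:PageContent",
        "cq:template": template,
        "sling:resourceType": resource_type,
    }
    for column, prop in FIELD_MAP.items():
        value = record.get(column)
        if value:  # skip empty or missing columns rather than writing blanks
            content[prop] = value
    return {"jcr:primaryType": "cq:Page", "jcr:content": content}


node = to_page_node({"title": "Claw Hammer", "summary": ""}, "item")
print(node["jcr:content"]["jcr:title"])
```

Keeping the mapping in plain data (`TEMPLATES`, `FIELD_MAP`) means it can be reviewed and adjusted without touching the code that writes to the repository.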




Level 3

Hi Tanika, thanks for the reply. You didn't link to the tool you describe, but I found this under AEM as a Cloud Service Migration Journey -- I assume it's what you're recommending? That tool looks designed to extract content from an existing AEM site (not from a set of files) into a "migration set" and then ingest it into AEM as a Cloud Service, whereas my target is an AEM Managed Services instance. So it doesn't look very promising for what I'm trying to do. I'll take a closer look, but at first glance it looks like more trouble than it's worth for my use case.

Thanks though!


Level 3

For anyone in a similar situation, I ended up writing a Groovy script for this task. I got an extract of the legacy DB in XML format with some embedded JSON data. With the Groovy Console's support for XmlSlurper, JsonSlurper, pageBuilder, and the JCR APIs, I was able to process the data, create the site hierarchy, instantiate the required content fragments, and set node properties such as resource types and templates.
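
A rough sketch of the parsing step, written in Python rather than Groovy so it can run outside AEM; in the Groovy Console the same shape would use `XmlSlurper` and `JsonSlurper`. The element and attribute names (`item`, `id`, `name`, `details`) and the JSON keys are invented for illustration and are not the real export format.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample data: one well-formed record, one with broken embedded JSON.
SAMPLE_XML = """<items>
  <item id="H-100">
    <name>Claw Hammer</name>
    <details>{"weight_g": 450, "tags": ["hand", "steel"]}</details>
  </item>
  <item id="P-300">
    <name>Cordless Drill</name>
    <details>not-json</details>
  </item>
</items>"""


def parse_items(xml_text):
    """Yield one dict per <item>, decoding the JSON embedded in <details>."""
    root = ET.fromstring(xml_text)
    for element in root.iter("item"):
        raw = element.findtext("details", default="")
        try:
            details = json.loads(raw)
        except json.JSONDecodeError:
            # Keep going so one malformed record doesn't stop the whole run.
            details = {}
        if not isinstance(details, dict):
            details = {}
        yield {
            "id": element.get("id"),
            "name": element.findtext("name"),
            "details": details,
        }


items = list(parse_items(SAMPLE_XML))
for item in items:
    print(item["id"], item["name"], item["details"].get("tags", []))
```

Falling back to an empty dict for bad JSON is one possible choice; logging the record id and reviewing it afterwards may suit a one-off migration better.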

It's been a pretty productive workflow. I built the script up piece by piece: run it, check the results in CRXDE Lite, tweak the script to fix problems, and re-run. That loop was a pretty quick way to get the job done. If I had to do it again, I'd pick the same approach.

I'm not on Cloud Service, so I can run the script directly on my AEM instances -- otherwise I'd have to do it on a local instance and then export the results as content packages, and would have to modify the approach to deal with lack of content (particularly images) on the local instance. So, not sure how well this will work if Cloud Service is your target; happily I didn't need to deal with that.