We are trying to extract component and template structure as JSON from AEM 6.5. For templates, we need structural details such as field names, field types, and any datasource URL attached to a field. For components, we need the template name along with the content used to build the component.
Currently, for page content extraction we use the JSON exporter (.model.json), and for template extraction we tried QueryBuilder:
/bin/querybuilder.json?path=/conf/we-retail/settings/wcm/templates/hero-page/initial/jcr:content/root&type=nt:unstructured&p.hits=selective&p.properties=jcr:path&p.limit=-1
Is there a direct way to fetch this information?
Hi @BhuvaneshwariG ,
Unfortunately, AEM 6.5 does not provide a direct OOTB API to extract component/template structure as JSON including field types and datasources.
Recommended Approach:
1. Use QueryBuilder with enriched properties for partial info.
2. Write a Groovy script or Sling Servlet for detailed, structured extraction.
3. If frequently needed, consider creating a custom tool/endpoint that recursively parses component/template folders and outputs structured JSON.
1. JSON Exporter (Sling Model Exporter)
Use Case: Extracting page content as JSON.
Limitation: Primarily for content rendering, not for template or component metadata.
Your Current Use: Correct for page content but not suitable for extracting field configurations or datasources from templates/components.
2. QueryBuilder Approach for Template Fields
You’re using something like:
/bin/querybuilder.json?path=/conf/we-retail/settings/wcm/templates/hero-page/initial/jcr:content/root&type=nt:unstructured&p.hits=selective&p.properties=jcr:path&p.limit=-1
This helps extract node structure, but you might miss granular field-level details, like:
- Field Names
- Field Types (e.g., text, pathbrowser, multifield)
- Datasource paths (if any)
Suggestion:
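- Enhance the query by listing more properties in p.properties, such as sling:resourceType, name, fieldLabel, and fieldDescription. Separate the property names with spaces (encoded as + in a URL).
- In Granite UI dialogs, the datasource is a child node named datasource rather than a property of the field, so read its sling:resourceType from that child node.
Example:
p.properties=jcr:path+name+sling:resourceType+fieldLabel+fieldDescription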
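To avoid encoding mistakes, the query string can be assembled programmatically. The following is a minimal sketch, not an official client: the helper name build_querybuilder_url and the host http://localhost:4502 are assumptions for illustration, and it follows the QueryBuilder documentation in treating p.properties as a space-separated list.

```python
from urllib.parse import urlencode


def build_querybuilder_url(host, path, properties, node_type="nt:unstructured"):
    """Build a QueryBuilder JSON URL that returns only the listed properties.

    Hypothetical helper for illustration; it only builds the URL and does
    not contact AEM.
    """
    params = {
        "path": path,
        "type": node_type,
        "p.hits": "selective",
        # Space-separated list; urlencode turns each space into "+"
        "p.properties": " ".join(properties),
        "p.limit": "-1",
    }
    return f"{host}/bin/querybuilder.json?{urlencode(params)}"


url = build_querybuilder_url(
    "http://localhost:4502",  # assumed local author instance
    "/conf/we-retail/settings/wcm/templates/hero-page/initial/jcr:content/root",
    ["jcr:path", "name", "sling:resourceType", "fieldLabel", "fieldDescription"],
)
print(url)
```

The resulting URL can be requested with any HTTP client using credentials that have read access to the queried path.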
3. JCR Node Traversal (Custom Servlet or Script)
For detailed extraction, you may write a custom servlet or use Groovy scripts (for example, via the AEM Groovy Console, which is a separate project from ACS AEM Tools) to recursively traverse:
/conf/<your-site>/settings/wcm/templates → for templates
/apps/<your-project>/components → for component dialogs
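You can access:
- the cq:dialog node of each component (serialized as a _cq_dialog folder in the project source)
- field names (the name property), field types (the sling:resourceType of each Granite UI field), and datasource values (the datasource child node of a field)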
Example JSON structure output (custom):
{
  "component": "my-project/components/content/hero",
  "template": "my-project/templates/hero-template",
  "fields": [
    {
      "name": "title",
      "type": "text",
      "label": "Hero Title"
    },
    {
      "name": "image",
      "type": "pathbrowser",
      "datasource": "/mnt/overlay/dam/gui/content/assets.html"
    }
  ]
}
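The recursive traversal that produces a structure like this can be sketched in Python over a nested dict of nodes, for example one decoded from the JSON rendering of a dialog node, if that rendering is enabled on the instance. This is an illustration under assumptions, not the servlet itself: the function collect_fields, the sample dialog, and the datasource resource type my-project/datasources/styles are all hypothetical.

```python
import json


def collect_fields(node):
    """Recursively collect dialog fields from a nested dict of JCR nodes.

    A child dict counts as a field when it has both "name" and
    "sling:resourceType"; every child dict is also searched for nested fields.
    """
    fields = []
    for child in node.values():
        if not isinstance(child, dict):
            continue  # plain property, not a child node
        if "name" in child and "sling:resourceType" in child:
            name = child["name"]
            field = {
                # Dialog field names are usually stored as "./title"
                "name": name[2:] if name.startswith("./") else name,
                # Use the last path segment of the resource type as the type
                "type": child["sling:resourceType"].rsplit("/", 1)[-1],
            }
            if "fieldLabel" in child:
                field["label"] = child["fieldLabel"]
            datasource = child.get("datasource")
            if isinstance(datasource, dict):
                field["datasource"] = datasource.get("sling:resourceType")
            fields.append(field)
        fields.extend(collect_fields(child))
    return fields


# Hypothetical dialog tree, shaped like a decoded JSON node rendering
sample_dialog = {
    "jcr:primaryType": "nt:unstructured",
    "sling:resourceType": "cq/gui/components/authoring/dialog",
    "content": {
        "items": {
            "title": {
                "sling:resourceType": "granite/ui/components/coral/foundation/form/textfield",
                "name": "./title",
                "fieldLabel": "Hero Title",
            },
            "style": {
                "sling:resourceType": "granite/ui/components/coral/foundation/form/select",
                "name": "./style",
                "datasource": {"sling:resourceType": "my-project/datasources/styles"},
            },
        }
    },
}

fields = collect_fields(sample_dialog)
print(json.dumps({"component": "my-project/components/content/hero", "fields": fields}, indent=2))
```

The same walk translates directly to a Sling Servlet or Groovy script that iterates over child resources instead of dict values.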
4. AEM Groovy Console (Quick Extraction)
- URL: /etc/groovyconsole.html
- Write a script to traverse templates/components, read dialogs, and output JSON.
- Ideal for one-time extraction or prototyping.
Regards,
Amit
Consider writing an external script if you don't need to get the result from AEM directly. I previously wrote a Python script that lists the components and templates in a project, with the output formatted as CSV:
Title,Resource Type
Text (v2),core/wcm/components/text/v2/text
Breadcrumb (v1),core/wcm/components/breadcrumb/v1/breadcrumb
The code can be easily updated to generate JSON and parse dialog files to fetch all properties.
import os
import xml.etree.ElementTree as ET


def find_xml_components(search_path, searchComponents=True):
    """
    Recursively searches within a specified path for XML files named ".content.xml" that meet certain criteria.
    If searchComponents is True, it looks for components (jcr:primaryType="cq:Component") with a componentGroup other than ".hidden".
    If searchComponents is False, it looks for templates (jcr:primaryType="cq:Template").
    Prints a CSV row for each matching file, with the jcr:title value and the resource type derived from the file path.

    Args:
        search_path (str): The path in which to search for XML files.
        searchComponents (bool): If True, searches for components; if False, searches for templates.
    """
    # Expand a leading "~" so that home-relative paths resolve correctly
    search_path = os.path.expanduser(search_path)
    # Check if the passed path is a valid directory
    if not os.path.isdir(search_path):
        print(f"Error: The path '{search_path}' is not a valid directory.")
        return
    csv_data = []
    for root, _, files in os.walk(search_path):
        for file in files:
            if file == ".content.xml":
                file_path = os.path.join(root, file)
                # Skip files in header or footer folders
                if 'header' in file_path or 'footer' in file_path:
                    continue
                try:
                    # Extract the resource type from the file path
                    resource_type = file_path.split('/apps/')[1].replace('/.content.xml', '')
                    tree = ET.parse(file_path)
                    root_element = tree.getroot()
                    # Discover the namespace prefixes declared in the file
                    ns = {}
                    for _event, (prefix, uri) in ET.iterparse(file_path, events=("start-ns",)):
                        ns[prefix] = uri
                    # Fall back to the standard jcr and cq namespace URIs if they are not declared
                    jcr_ns = ns.get("jcr", "http://www.jcp.org/jcr/1.0")
                    cq_ns = ns.get("cq", "http://www.day.com/jcr/cq/1.0")
                    # Get the values of the attributes
                    primary_type = root_element.attrib.get(f"{{{jcr_ns}}}primaryType")
                    component_group = root_element.attrib.get(f"{{{cq_ns}}}componentGroup")
                    jcr_title = root_element.attrib.get(f"{{{jcr_ns}}}title")
                    if jcr_title is None:
                        continue
                    if searchComponents and primary_type == "cq:Component":
                        # Keep the component unless its componentGroup is ".hidden"
                        if component_group != ".hidden":
                            csv_data.append((resource_type, jcr_title))
                    elif not searchComponents and primary_type == "cq:Template":
                        csv_data.append((resource_type, jcr_title))
                except ET.ParseError as e:
                    print(f"Error parsing XML file '{file_path}': {e}")
                except Exception as e:
                    print(f"Error handling file '{file_path}': {e}")
    # Print the results in CSV format
    if csv_data:
        print("Title,Resource Type")
        for row in csv_data:
            print(f"{row[1]},{row[0]}")
    else:
        print("No files matching the criteria found")


if __name__ == "__main__":
    print("Components CSV:")
    for search_directory in [
        '~/aem-core-cif-components/ui.apps/src/main/content/jcr_root/apps/core/cif'
    ]:
        find_xml_components(search_directory, True)
    print("Templates CSV:")
    for search_directory in [
        '~/aem-core-wcm-components/content/src/content/jcr_root/apps/core'
    ]:
        find_xml_components(search_directory, False)
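As a rough illustration of the dialog-parsing extension mentioned above, the sketch below reads a dialog definition in its serialized XML form, where child nodes appear as nested elements and properties as attributes, and emits JSON. It is an assumption-laden example rather than part of the original script: the function dialog_fields, the sample XML, and the datasource resource type my-project/datasources/styles are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

SLING_NS = "http://sling.apache.org/jcr/sling/1.0"
RESOURCE_TYPE = f"{{{SLING_NS}}}resourceType"

# Hypothetical content of a component's _cq_dialog/.content.xml
SAMPLE_DIALOG_XML = """
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0"
          xmlns:jcr="http://www.jcp.org/jcr/1.0"
          jcr:primaryType="nt:unstructured"
          jcr:title="Hero"
          sling:resourceType="cq/gui/components/authoring/dialog">
  <content jcr:primaryType="nt:unstructured">
    <items jcr:primaryType="nt:unstructured">
      <title jcr:primaryType="nt:unstructured"
             sling:resourceType="granite/ui/components/coral/foundation/form/textfield"
             fieldLabel="Hero Title"
             name="./title"/>
      <style jcr:primaryType="nt:unstructured"
             sling:resourceType="granite/ui/components/coral/foundation/form/select"
             name="./style">
        <datasource jcr:primaryType="nt:unstructured"
                    sling:resourceType="my-project/datasources/styles"/>
      </style>
    </items>
  </content>
</jcr:root>
"""


def dialog_fields(xml_text):
    """Return one dict per element that has both a name and a resource type."""
    fields = []
    for elem in ET.fromstring(xml_text).iter():
        name = elem.attrib.get("name")
        resource_type = elem.attrib.get(RESOURCE_TYPE)
        if name is None or resource_type is None:
            continue
        field = {"name": name, "type": resource_type.rsplit("/", 1)[-1]}
        if "fieldLabel" in elem.attrib:
            field["label"] = elem.attrib["fieldLabel"]
        # A datasource is a direct child element of the field
        datasource = elem.find("datasource")
        if datasource is not None:
            field["datasource"] = datasource.attrib.get(RESOURCE_TYPE)
        fields.append(field)
    return fields


print(json.dumps(dialog_fields(SAMPLE_DIALOG_XML), indent=2))
```

To use it with the script above, the same os.walk loop could look for _cq_dialog/.content.xml files and pass their contents to dialog_fields.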