The majority of our assets are ingested through RESTful APIs, and the metadata often fails to meet compliance standards. I am seeking recommendations for frameworks or approaches that can be used to validate this metadata for compliance.
Thanks @Tethich @Fanindra_Surat @A_H_M_Imrul @narendragandhi for your responses. All the approaches seem valid for different use cases.
Sharing my thought process, after borrowing your amazing suggestions.
The validation approach must cover existing data, incoming data, and evolving rules.
Validation efforts should serve the needs of both technical and business teams:
Reports and Visualizations
Scheduled Health Reports: Automate periodic checks using schedulers to assess overall system health.
Extend ACS Commons Reports to validate assets and generate targeted reports by query, path, or metadata filters.
External Validations before ingesting data: This would definitely have been a good approach to lighten the AEM load. However, there are a few challenges:
Dependency on another team to update the rules
We would still need an approach to identify issues in existing assets, in assets updated manually, and for gradual clean-up of medium/low-priority data.
I am inclined towards Custom Reporting and Storage:
Store validation results for non-compliant assets in /var nodes, enabling reports and visualizations for both technical and business teams.
Please do share your thoughts if you see any challenges/improvements.
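To make the /var storage idea concrete, a hypothetical node layout could look like the sketch below. All names here (the /var/dam-compliance root, the property names, the date bucketing) are invented for illustration, not an AEM convention:

```
/var/dam-compliance
  /results
    /2024                       <- bucketed by review period to keep child-node counts manageable
      /asset-1234
        assetPath        = /content/dam/...
        failedRules      = ["title-missing", "license-invalid"]
        lastReviewDate   = (date of the scheduled check)
```

A structure like this can back both ad-hoc queries for technical teams and aggregated dashboards for business teams.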
Is it possible to expand a little more on the subject? What does your current import process look like? How do all the parties involved in the ingestion integrate? When and where do you expect the regular quality checks on the content to happen? What does it mean for you that metadata compliance is failing? Some diagrams or screenshots also would not hurt.
Thanks for the queries. Sharing details below:
Non-compliance can mean metadata falling outside a specific set of allowed values, not matching the required format, etc.
For the regular scheduled quality check, it can be a scheduled job that fetches the list of assets updated in a given time period, performs the specified checks, and then adds/updates a metadata field that tracks the last review date.
For the API to fetch metadata of assets, you can explore this - https://developer.adobe.com/experience-cloud/experience-manager-apis/api/experimental/assets/author/
Hope this helps!
Narendra
I am thinking that you might lift some of the burden from AEM and have a separate app, maybe built with microservices, that runs periodically, checks the objects' metadata in Amazon S3 against your criteria, and marks the metadata accordingly. That way, when AEM pulls the assets, it will already know whether the data was validated before ingesting it.
Some good ways to implement a validator were already posted here.
Hello @aanchal-sikka ,
I hope you're doing well.
Given the additional computational overhead, it might be better to perform validation or sanity checks outside of AEM, prior to asset ingestion, if feasible.
If the compliance violations occur when assets are exposed to traffic from the publisher, could we consider using an asset replication interceptor (such as a replication preprocessor) to validate and allow only compliant assets to be replicated?
To monitor faulty assets, we could set up a scheduled Sling job to generate reports that identify non-compliant entries.
Let me know if this approach makes sense.
Hi @aanchal-sikka -
My thoughts -
From what you shared below, you have a custom process set up in AEM that reads assets and metadata from an S3 location. Are you referring to issues within the metadata that is stored separately, or to metadata like XMP that is extracted from the asset itself?
If it is the separately managed metadata: can you not validate or run your compliance check during the ingestion phase of your custom process?
If it is the XMP metadata: you will need to create a custom process and configure it to be invoked as part of the DAM metadata writeback workflow itself, or as a separate scheduler, as per the need.
Regards,
Fani
@aanchal-sikka Did you find the suggestions helpful? Please let us know if you require more information. Otherwise, please mark the answer as correct for posterity. If you've discovered a solution yourself, we would appreciate it if you could share it with the community. Thank you!