SOLVED

How does DAM Parse Word Documents create renditions/modify asset properties?

I've uploaded two Word documents (both .docx) to my local AEM instance and to an Adobe AMS-hosted AEM instance. Both documents contain a form.

 

In both instances, these things happened:

1. The DAM Parse Word Documents workflow was triggered.

2. Images present in the Word document were identified as sub-assets.

 

But something differs between the two documents. When inspecting the properties of each:

- document1 has "Referenced by: Page extract from Word"

- document2 has no "Referenced by: Page extract from Word"

 

According to this page (https://helpx.adobe.com/au/experience-manager/6-4/assets/using/managing-assets-touch-ui.html#main-pa...), a cq:Page node is created for each document, and that matches what I see. I found these:

- http://localhost:4502/content/dam/mytest/document1.docx/jcr%3acontent/renditions/page.html

- http://localhost:4502/content/dam/mytest/document2.docx/jcr%3acontent/renditions/page.html
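One way to narrow down a difference like this is to export both asset nodes as nested JSON (for example via Sling's JSON rendering of each asset path, which is an assumption about how you would obtain the data) and diff the two trees. The sketch below only does the diff, using the Python standard library. The sample dictionaries and the `example:marker` property are hypothetical placeholders, not actual AEM property names.

```python
def flatten(node, prefix=""):
    """Flatten a nested dict into {"a/b/c": value} pairs."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into child nodes, extending the path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_nodes(left, right):
    """Return (paths only in left, paths only in right, paths whose values differ)."""
    a, b = flatten(left), flatten(right)
    only_left = sorted(set(a) - set(b))
    only_right = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_left, only_right, changed


# Hypothetical sample data standing in for the two exported asset trees.
doc1 = {
    "jcr:content": {
        "renditions": {"page": {"jcr:primaryType": "cq:Page"}},
        "example:marker": "present",  # made-up property for illustration
    }
}
doc2 = {
    "jcr:content": {
        "renditions": {"page": {"jcr:primaryType": "cq:Page"}},
    }
}

only_in_doc1, only_in_doc2, changed = diff_nodes(doc1, doc2)
print("Only in document1:", only_in_doc1)
print("Only in document2:", only_in_doc2)
print("Different values:", changed)
```

Any path that shows up for only one document is a candidate for what drives the difference in the properties view.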

 

Any ideas what's happening in relation to "Referenced by: Page extract from Word"? I would've expected both to have it or not have it.

 

Thanks

1 Accepted Solution

Correct answer by
Employee

Generally, this is done by Apache Tika.

 

https://tika.apache.org/
