How does Indexing happen in CQ 5.6.1 | Community
Level 9
October 16, 2015
Solved

How does Indexing happen in CQ 5.6.1


Hi All,

Details are as below:

1] Suppose I upload a DAM asset in CQ with name xyz.jpg

2] The very next moment I can make use of that asset in my page. Go to content finder in my page, search by name xyz.jpg and drag the asset onto the relevant component in my page.

3] How is it that CQ indexes the DAM asset so quickly and makes it available for searching? What exactly is the process flow that happens in the background?

4] Can someone please provide a brief description of this and a few good references for it?

6 replies

joerghoh
Adobe Employee (Accepted solution)
October 16, 2015

Hi,

The indexing happens as part of the write action to the repository. When you upload large assets, it's likely that it's done asynchronously; for smaller writes it happens synchronously. This is part of the repository implementation, and I don't know if there's good documentation on that.
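
The synchronous/asynchronous split described above can be pictured with a toy model. This is only a sketch for illustration, not CRX or Jackrabbit code: the class `ToySearchIndex`, its methods, and the size cut-off `SYNC_LIMIT_BYTES` are all hypothetical, and the real repository decides what to index in the background by its own rules.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Toy model only: not the CRX/Jackrabbit implementation.
// Class name, method names and the threshold are hypothetical.
class ToySearchIndex {
    // Hypothetical cut-off, chosen purely for illustration.
    static final int SYNC_LIMIT_BYTES = 1024;

    private final Set<String> indexed = new HashSet<>();
    private final Queue<String> pending = new ArrayDeque<>();

    // Called as part of a write: small items are indexed before the
    // write returns, large items are queued to be indexed later.
    void onWrite(String name, int sizeBytes) {
        if (sizeBytes <= SYNC_LIMIT_BYTES) {
            indexed.add(name);
        } else {
            pending.add(name);
        }
    }

    // Stands in for the background step that catches up on queued items.
    void drainPending() {
        while (!pending.isEmpty()) {
            indexed.add(pending.poll());
        }
    }

    boolean isSearchable(String name) {
        return indexed.contains(name);
    }

    public static void main(String[] args) {
        ToySearchIndex index = new ToySearchIndex();
        index.onWrite("xyz.jpg", 200);       // small write: indexed immediately
        index.onWrite("video.mov", 50_000);  // large write: queued
        System.out.println(index.isSearchable("xyz.jpg"));
        System.out.println(index.isSearchable("video.mov"));
        index.drainPending();                // background indexing catches up
        System.out.println(index.isSearchable("video.mov"));
    }
}
```

In this model a small write is searchable as soon as the write returns, while a large one only becomes searchable once the background step has run.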

kind regards,
Jörg

askdctmAuthor
Level 9
October 16, 2015

Hi Jorg,

Thank you for your reply.

I have heard a couple of terms related to indexing, as below:

- Workspace index

- Repository index

- Version history reindex

I do not understand what exactly these mean. A brief description of each will be helpful.

Adobe Employee
October 16, 2015

Just as an addendum to what Joerg explained, please refer to http://jackrabbit.apache.org/how-jackrabbit-works.html and look for the links to Query Manager on that page.

joerghoh
Adobe Employee
October 16, 2015

Hi,

OK, some more details (assuming that we are talking about TarPM here):

  • Each workspace consists of a bunch of tar files, where changes are just appended. So to find the latest entry for any given item, you need to maintain a kind of "HEAD" pointer. These pointers are maintained in the workspace index files. These are the files named "index_0.tar", "index_1.tar" etc. right next to the "data*.tar" files. This index is maintained as part of the transaction.
  • To support JCR queries, a separate Lucene index is maintained (this is the index I referred to in my first response). Depending on the change, this index is updated either within the transaction (synchronous) or outside the transaction (async).
  • The term "repository index" isn't clearly defined :-) It can refer to either of the two indexes I just mentioned.
  • "Version history reindex": There is a dedicated workspace to keep the versions, and as just described, this workspace has its own index.
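
The append-only files plus "HEAD" pointer idea from the first bullet can be sketched roughly as follows. This is a toy in-memory model under stated assumptions, not TarPM code: `AppendOnlyStore` and its methods are hypothetical names, a list stands in for the "data*.tar" files, and a map stands in for the "index_*.tar" files.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: not Jackrabbit/TarPM code. All names are hypothetical.
class AppendOnlyStore {
    // Stands in for the data*.tar files: records are only ever appended.
    private final List<String[]> log = new ArrayList<>();
    // Stands in for the index_*.tar files: item id -> position of its latest record.
    private final Map<String, Integer> head = new HashMap<>();

    // Append a new revision and move the "HEAD" pointer in the same step,
    // mirroring the idea that the index is updated as part of the write.
    void write(String itemId, String value) {
        log.add(new String[] {itemId, value});
        head.put(itemId, log.size() - 1);
    }

    // Read the latest revision through the pointer instead of scanning the log.
    String read(String itemId) {
        Integer pos = head.get(itemId);
        return pos == null ? null : log.get(pos)[1];
    }

    // Number of records ever appended, including superseded revisions.
    int logSize() {
        return log.size();
    }

    public static void main(String[] args) {
        AppendOnlyStore store = new AppendOnlyStore();
        store.write("/content/dam/xyz.jpg", "rev1");
        store.write("/content/dam/xyz.jpg", "rev2");
        System.out.println(store.read("/content/dam/xyz.jpg")); // latest revision
        System.out.println(store.logSize());                    // old revision is still in the log
    }
}
```

Old revisions stay in the log, and only the pointer moves, which is why the pointer index has to be kept in step with every write.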

Is that sufficient?

Please note that this has completely changed with AEM 6.0 and Oak as the repository.

kind regards,
Jörg

askdctmAuthor
Level 9
October 16, 2015

Hi Kalyanar,

Thanks a lot for the reference link you provided.

askdctmAuthor
Level 9
October 16, 2015

Hi Kalyanar/Jorg,

Also, can you please let me know the difference between the index folders present in the two locations below?

crx-quickstart/repository/workspaces/crx.default/index/

crx-quickstart/repository/repository/index/