SOLVED

Indexing the content fragment's data that every page has referenced.

Level 2

Hi there,

We have received a request to build a suggestive search service that fetches a list of pages based on a string received as a parameter (the searched word). The problem arises when a page contains a content fragment: the received word also needs to be searched for inside that content fragment. For example:

Calling service -> localhost:4502/bin/service.json?searchedString=My text

Searching for the string "My text" in the following structure:

/content
     /project
          /page-of-my-text
               /jcr:content
                    property: content-fragment = /content/dam/project/content-fragment

/content
     /dam
          /project
               /content-fragment
                    property: data = "Hey, My text is here"

The result should list the hits in which that string appears:

     title: My Test Page
     path: /content/project/page-of-my-text
     string-searched: My text

We already know how to search all nodes under /content, but we don't know how to follow the content-fragment property so that the content fragment referenced by each page is indexed along with the page.

Do you have any ideas? We are not using Solr (the requirement is to use only Adobe's out-of-the-box (OOB) features).

Regards.

1 Accepted Solution

Correct answer by
Level 10

I think the flow would be:

1) Navigate to the page path under /content, find the 'content-fragment' node, and read its fileReference property to identify the fragment's path.

2) Navigate to the fragment path, find the appropriate master/variation node, and read its data property.

You can achieve this with a nested query, a join query, or separate queries.

#1 would have its own index under /oak:index, which could be a simple property index and doesn't need to be full-text.

#2 would have its own index under /oak:index/damAssetLucene, which would be a full-text index in which you specify the relevant property names.

When you run the query (or queries) with the explain plan, you can verify whether both indexes are being used.
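
The "separate queries" variant of the flow above could look roughly like the following sketch, which only builds the two JCR-SQL2 statements as strings. It is an illustration under assumptions, not a tested AEM implementation: the class and method names are hypothetical, the page property is taken to be `content-fragment` on `jcr:content` as in the question (swap in `fileReference` if that is what your component stores), and the full-text query assumes the index aggregates the fragment's data into the `dam:Asset` node.

```java
final class FragmentSearchQueries {

    // Wraps a value in single quotes for JCR-SQL2, doubling embedded quotes.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Full-text query (#2): fragments under damRoot that contain the phrase.
    // Assumption: a Lucene index with aggregation covers dam:Asset nodes.
    static String fragmentQuery(String damRoot, String phrase) {
        // Double quotes inside CONTAINS request a phrase match,
        // so any double quotes sent by the caller are dropped first.
        String cleaned = phrase.replace("\"", "");
        return "SELECT * FROM [dam:Asset] AS cf"
             + " WHERE ISDESCENDANTNODE(cf, " + quote(damRoot) + ")"
             + " AND CONTAINS(cf.*, " + quote("\"" + cleaned + "\"") + ")";
    }

    // Property query (#1): pages under pagesRoot that reference the fragment.
    // Assumption: the reference is stored in a property named content-fragment.
    static String pageQuery(String pagesRoot, String fragmentPath) {
        return "SELECT * FROM [cq:PageContent] AS page"
             + " WHERE ISDESCENDANTNODE(page, " + quote(pagesRoot) + ")"
             + " AND page.[content-fragment] = " + quote(fragmentPath);
    }

    public static void main(String[] args) {
        System.out.println(fragmentQuery("/content/dam/project", "My text"));
        System.out.println(pageQuery("/content/project",
                "/content/dam/project/content-fragment"));
    }
}
```

Here the full-text query runs first and the property query is then run once per matching fragment path, which is the reverse of the navigation order described above but suits a search that starts from the text. Each generated statement can be pasted into the explain plan tool to see which index it picks.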

If you plan to use a single index for both scenarios, then try the following:

Create a non-root index under your content path:

/content
     /project_root
          /page1    // content page
          /oak:index
               /mydamAssetLucene  // type lucene - replicate the /oak:index/damAssetLucene node structure here and delete unwanted nodes/properties
                    /aggregates
                         /dam:Asset  // for CF
                    /indexRules
                         /dam:Asset  // for CF
                         /cq:PageContent  // for content pages under project_root
                              /properties  // properties of content pages

I have never done it this way, but it is worth trying. This structure would also enable full-text indexing on your content page properties, which is not required, but you may be able to live with that. Check the performance and the explain plan thoroughly.

Jackrabbit Oak – Lucene Index

3 Replies

Level 10

What is the cardinality or the relationship between content pages and content fragment(s) - many-to-one or many-to-many?

If you already have the code to search under the /content path hierarchy, then you can read the content-fragment property defined on each page and search the data property of the referenced fragment. This would take care of the search itself.
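
As a rough illustration of that two-step lookup, the sketch below models the repository with two plain maps instead of real JCR nodes. Everything in it is hypothetical (the class name, the maps, and the case-insensitive match, which is an assumption suggested by the question using both "My text" and "My Text"); in AEM the maps would be replaced by reads of the page's content-fragment property and the fragment's data property.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class FragmentLookupSketch {

    // Returns the paths of pages whose referenced fragment contains the string.
    static List<String> search(Map<String, String> pageToFragment,
                               Map<String, String> fragmentToData,
                               String searched) {
        String needle = searched.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> page : pageToFragment.entrySet()) {
            // Step 1: follow the fragment path stored on the page.
            String data = fragmentToData.get(page.getValue());
            // Step 2: check the fragment's data for the searched string.
            if (data != null && data.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(page.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample data copied from the structure in the question.
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("/content/project/page-of-my-text",
                  "/content/dam/project/content-fragment");
        Map<String, String> fragments = new LinkedHashMap<>();
        fragments.put("/content/dam/project/content-fragment",
                      "Hey, My text is here");
        System.out.println(search(pages, fragments, "My text"));
    }
}
```

With the sample data from the question, the search returns the single page path /content/project/page-of-my-text. A plain substring scan like this does not use any index, so it only shows the logic; for real content volumes the full-text index discussed in this thread is what keeps the lookup fast.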

You mentioned that it's a suggestive search, which means it would require a full-text search index. Refer to 'The Lucene Full Text Index' in Oak Queries and Indexing.

Since it's a CF index, it must be created under /oak:index/damAssetLucene. Refer to 'Manual Addition of an Oak Index' in Content Fragment Updates and Content Services - Feature Pack Release Notes.

If it's a single CF referenced in multiple pages and specific requirements prevent you from creating the index under /oak:index/damAssetLucene, then you could also create the index under the '/content/dam/project/content-fragment' node.

Level 2

Hi gauravb10066713,

The cardinality is one page to one content fragment. The code to search under /content is ready. The problem is how to get the data property of the content fragment associated with each page in the same search, that is, by searching the same Oak index. We have created a custom index to index some properties.

We would like to create a custom index that covers this association: the data of the content fragment referenced by a page.

Regards.
