SOLVED

DAM Assets Issue on disabling Google search

BhargavThogata
Level 1

Hi,


The requirement is:

We need to make a CQ page non-searchable by Google. For that, we have added <meta name="robots" content="noindex"> at the template level and provided a checkbox in page properties so that an author can disable search for a specific page.
Now the challenge is with assets (PDFs). I added a custom checkbox, but I don't know where to place this <meta> tag, since assets don't have templates the way pages do. I need help with this.

 

Thanks,
Bhargav
1 Accepted Solution
edubey
Correct answer by
Level 10

Yes, 

As you said, you have implemented a checkbox for the asset. Your next task is to implement an event handler, workflow, or scheduler that updates the robots.txt file dynamically.

Example: 

Whenever a user checks or unchecks the custom checkbox, the value gets stored in the JCR. The event handler, workflow, or scheduler then reads that stored property and updates the robots.txt file accordingly.
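
The regeneration step could be sketched roughly as below. This is only an illustration in plain Python, not AEM code: the function name build_robots_txt, the assets dictionary standing in for the stored checkbox values, and the Other.pdf path are all assumptions made for the example. In AEM the same logic would sit inside whichever event handler, workflow, or scheduler you choose.

```python
def build_robots_txt(blocked_paths, user_agent="*"):
    """Return robots.txt text with one Disallow line per blocked asset path."""
    lines = [f"User-agent: {user_agent}"]
    # Sort and de-duplicate so the output is stable between runs.
    unique_paths = sorted(set(blocked_paths))
    if not unique_paths:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    for path in unique_paths:
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


# Hypothetical stand-in for the checkbox values stored in the repository:
# asset path -> True when the "hide from search" checkbox is ticked.
assets = {
    "/content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf": True,
    "/content/dam/geometrixx/documents/Other.pdf": False,
}

blocked = [path for path, hidden in assets.items() if hidden]
print(build_robots_txt(blocked))
```

Rebuilding the whole file from the stored values each time, rather than patching single lines, keeps robots.txt consistent when a checkbox is unticked again.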

Have doubts? Let me know.


7 Replies
edubey
Level 10

Hi Bhargav,

A similar question was asked some time back, where the user didn't want a PDF to be searchable.

Please see the thread and let me know if you have any doubts about it.

Thread : http://help-forums.adobe.com/content/adobeforums/en/experience-manager-forum/adobe-experience-manage...

Thanks

BhargavThogata
Level 1

If we need to test whether it's working properly, how can we do that?

 

Thanks,

Bhargav

BhargavThogata
Level 1

So I need to update the robots.txt at /content/<project> with the asset path (for example, /content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf)?

One more doubt: the <meta> tag I mentioned in the question is no longer needed, right, since we are controlling this from the checkbox?

Please correct me if I'm wrong.

 

Thanks,

Bhargav

edubey
Level 10

Yes, you are on the right track. You need to mention your PDF file path.

For PDFs we do not have a <meta> tag like we have for pages, but as you mentioned, you can use these tags for pages.

Here is more for you:

1) Use robots.txt to block the files from search engine crawlers:

User-agent: *
Disallow: /pdfs/   # Block the /pdfs/ directory.
Disallow: *.pdf    # Block PDF files. Non-standard, but works for major search engines.

2) Use rel="nofollow" on links to those PDFs

<a href="something.pdf" rel="nofollow">Download PDF</a>
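
To check that a set of robots.txt rules blocks a given asset path, one option is a quick local test with Python's standard urllib.robotparser module. This is a minimal sketch under some assumptions: the helper name is_blocked and the page path /content/geometrixx/en.html are made up for the example, and the rules are pasted in as a string instead of being fetched from the site. Note that this parser matches plain path prefixes only, so it does not evaluate the non-standard *.pdf wildcard rule; the sketch therefore uses a directory rule and an explicit asset path.

```python
from urllib.robotparser import RobotFileParser

# Rules to verify. The asset path is the one discussed in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
Disallow: /content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf
"""


def is_blocked(robots_txt, path, user_agent="*"):
    """Return True when the rules forbid user_agent from fetching path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


# The flagged PDF should be reported as blocked.
print(is_blocked(ROBOTS_TXT, "/content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf"))
# A page that is not listed (hypothetical path) should not be blocked.
print(is_blocked(ROBOTS_TXT, "/content/geometrixx/en.html"))
```

A local check like this only confirms that the rules are written as intended. How an individual search engine treats them is up to that engine.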

Complete Documentation: http://www.robotstxt.org/robotstxt.html

Any doubts? Let me know.

thanks