Hi,
The requirement is:
We need to keep a CQ page from being indexed by Google. For that, we added <meta name="robots" content="noindex"> at the template level and provided a checkbox in the page properties so that an author can disable search for a specific page.
The challenge now is assets (PDFs). I added a custom checkbox, but I don't know where to place this <meta> tag, since assets, unlike pages, have no templates. I need help with this.
Yes,
As you said, you have implemented a checkbox for the asset. Your next task is to implement an event handler, workflow, or scheduler that updates the robots.txt file dynamically.
Example:
Whenever a user checks or unchecks the custom checkbox, the value is stored in the JCR. An event handler, workflow, or scheduler then reads that stored property and updates the robots.txt file accordingly.
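A minimal sketch of the regeneration step, written in Python only for brevity (in CQ the same logic would sit in a Java event handler, workflow step, or scheduler). The function name render_robots_txt and the flagged list are hypothetical; the list stands in for the asset paths whose checkbox property is set in the JCR, which you would collect with a query.

```python
def render_robots_txt(blocked_asset_paths, user_agent="*"):
    """Build robots.txt text with one Disallow rule per flagged asset path."""
    lines = [f"User-agent: {user_agent}"]
    # Sort and de-duplicate so the generated file is stable between runs.
    for path in sorted(set(blocked_asset_paths)):
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


# Hypothetical input: paths of assets whose "hide from search" checkbox is ticked.
flagged = ["/content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf"]
robots_txt = render_robots_txt(flagged)
print(robots_txt)
```

Rebuilding the whole file from the current set of flagged assets, instead of patching single lines, means that unchecking the box removes the rule on the next run without any extra bookkeeping.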
Have doubts? Let me know.
You're welcome :)
Hi Bhargav,
A similar question was asked some time back, where a user did not want a PDF to be searchable.
Please see that thread and let me know if you have any doubts about it.
Thanks
If we need to test whether it's working properly, how can we do that?
Thanks,
Bhargav
So I need to update the robots.txt at /content/<project> with the asset path (for example, /content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf)?
One more doubt: the <meta> tag I mentioned in the question is no longer needed, right, since we are controlling this from the checkbox?
Please correct me if I am wrong.
Thanks,
Bhargav
You should probably take a look at: https://support.google.com/webmasters/answer/6062598?hl=en
Yes, you are on the right track. You need to list your PDF file path.
PDFs have no <meta> tag as pages do, but as you mentioned, you can keep using the tag for pages.
Here is more for you:
1) Use robots.txt to block the files from search engine crawlers:
User-agent: *
Disallow: /pdfs/ # Block the /pdfs/ directory.
Disallow: *.pdf # Block PDF files. Non-standard, but works for major search engines.
2) Use rel="nofollow" on links to those PDFs
<a href="something.pdf" rel="nofollow">Download PDF</a>
Complete Documentation: http://www.robotstxt.org/robotstxt.html
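To check locally whether a given path is blocked by your rules, here is a small sketch using Python's standard urllib.robotparser module. The rules and the page path /content/geometrixx/en.html are only illustrative assumptions. As far as I know, this parser does plain prefix matching, so it will not evaluate wildcard rules such as *.pdf the way the major search engines do; test those with the search engine's own tools.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: block one asset path for every crawler.
robots_lines = [
    "User-agent: *",
    "Disallow: /content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf",
]

parser = RobotFileParser()
parser.parse(robots_lines)  # parse in-memory lines, no network access needed

# can_fetch returns True when a crawler is allowed to request the path.
pdf_allowed = parser.can_fetch(
    "*", "/content/dam/geometrixx/documents/GeoSphere_Datasheet.pdf"
)
page_allowed = parser.can_fetch("*", "/content/geometrixx/en.html")
print(pdf_allowed, page_allowed)
```

To test the deployed file instead, you could call parser.set_url() with your site's robots.txt URL and then parser.read() in place of parse().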
Any doubts? Let me know.
Thanks