Hi
I'm running CQ 5.6.0 and currently have PDF DAM assets that are externally searchable via internet search engines. I would like to be able to customise the DAM so that I can define at an asset level whether or not a particular asset should be searchable in Google, Bing etc.
My plan is to create a property at the asset level which, once selected, adds the path to a NoRobots.txt file.
My questions are:
Many thanks in advance
Fabio
Hi,
Have you tried changing the permissions on the assets or folders? Try removing the permissions for the Everyone group; that should work.
Thanks
Hi
Thanks for the fast response!
I'm looking for a way to control which PDFs are searchable and which aren't, and it needs to be something I can hand over to the users of the system. Defining permissions on a folder sounds like a superuser role, which my users won't be allowed to have.
Is there another way or have I misunderstood?
thanks
Fabio
Hi Fabio,
I just want to be clear on your exact requirement, so here is what I have understood (correct me if I am wrong):
You have some PDFs in your DAM that are not in a single folder (they may be distributed across folders).
Do you plan to toggle this hide-from-search-engines setting dynamically? That is, PDF "A" could be hidden from search today and visible again tomorrow.
[OR]
A particular PDF will be hidden forever, and PDFs that are not hidden will never be hidden.
Please correct me if I am wrong.
Almost ;)
The DAM folder structure reflects the company's product structure with one folder per product. Within a Product's folder there are PDFs which describe the product. The users would like to add new versions of the PDFs with new URLs so that they can direct users to the PDF most appropriate to a customer when they signed up to a product. However, they only want the latest version of a PDF to be externally searchable by Google, Bing, Yahoo etc.
To solve this we have directed the users to upload new versions of the PDF, warning them to make sure the file name is different so that we get a new path. What I'm stuck on is how to stop a search engine from finding a particular, defined asset in the DAM.
So the requirements are:
Thanks in advance
Fabio
Two Things Here:
You can create versions of a PDF as well, so that only one PDF is present for every product. These versions are similar to the versions we have for pages.
Then you can use the same name for the new file. With this approach you will not need to hide previous files for the same product explicitly, as they will not be visible.
If you still want to go for a new file with every version, then I recommend following a naming convention for the latest file, so that your Java code can skip those files under each product and collect the names of all the other files. Once you have all the names, you can add those paths to robots.txt. Make sure you follow the robots.txt syntax conventions when updating it programmatically.
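As a rough illustration of the naming-convention idea, here is a minimal sketch in plain Java. It assumes a hypothetical convention where the current version of each PDF ends in "-latest.pdf", and it takes the asset paths as a simple list; how you collect those paths from the DAM is left out. The class name, the suffix and the example paths are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class RobotsTxtByNaming {
    // Hypothetical convention: the current version of a PDF ends in "-latest.pdf".
    static final String LATEST_SUFFIX = "-latest.pdf";

    /** Builds robots.txt content that disallows every PDF not marked as the latest. */
    static String build(List<String> assetPaths) {
        StringBuilder sb = new StringBuilder("User-agent: *\n");
        for (String path : assetPaths) {
            String lower = path.toLowerCase();
            boolean isPdf = lower.endsWith(".pdf");
            boolean isLatest = lower.endsWith(LATEST_SUFFIX);
            if (isPdf && !isLatest) {
                // One Disallow line per older PDF; the file itself stays reachable by direct link.
                sb.append("Disallow: ").append(path).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example paths are invented for this sketch.
        List<String> paths = Arrays.asList(
            "/content/dam/products/widget/guide-2012.pdf",
            "/content/dam/products/widget/guide-latest.pdf");
        System.out.print(build(paths));
    }
}
```

Note that robots.txt only asks crawlers not to fetch those paths; anyone with the direct URL can still open the older PDFs.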
Let me know what you think.
"You can create versions of a pdf as well, this way only one pdf will be present for every product. [...] Then you can use same name for new file. This approach will not require to hide previous files for same product explicitly as they will not be visible."
That won't work, as one of my requirements is that the old versions stay accessible if someone follows a direct link to an old PDF. Only the new version needs to be searchable by Google.
"Once you have all the name, you can add those path in robots.txt. Make sure you follow naming conversion of robots.txt while updating it programatically."
This is the way I was envisaging it working. The question is how to create and update the robots.txt file. Is there a way to add a property to a DAM asset so that I can programmatically create the robots.txt file with the required paths?
If you want to add a property to a DAM asset, you can probably provide a dialog box [OR] customize the OOTB DAM fields to add an extra property to assets.
Otherwise, as I suggested, you can adopt a naming convention for the files (this could take less time and would be easier).
You can create the robots.txt file via Java with its MIME type set to "text/plain" and upload it to your content/[project]/robots.txt path.
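For the property-based route, a minimal sketch of the file-generation step could look like the following. It is plain Java and does not touch the repository: the map of asset path to flag stands in for whatever property you add to the asset (called "hideFromSearch" here purely as a hypothetical name), and reading that property from the DAM and storing the result are not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RobotsTxtByProperty {
    /**
     * Builds robots.txt content from a map of asset path to a hide flag.
     * The flag stands in for a hypothetical "hideFromSearch" asset property;
     * assets with a false or missing flag stay crawlable.
     */
    static String build(Map<String, Boolean> hideFlagByPath) {
        StringBuilder sb = new StringBuilder("User-agent: *\n");
        for (Map.Entry<String, Boolean> entry : hideFlagByPath.entrySet()) {
            if (Boolean.TRUE.equals(entry.getValue())) {
                sb.append("Disallow: ").append(entry.getKey()).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example paths and flags are invented for this sketch.
        Map<String, Boolean> flags = new LinkedHashMap<>();
        flags.put("/content/dam/products/widget/guide-v1.pdf", true);
        flags.put("/content/dam/products/widget/guide-v2.pdf", false);
        System.out.print(build(flags));
    }
}
```

The returned string is the body you would then store as robots.txt with the "text/plain" MIME type. Regenerating it whenever an asset's flag changes would let users switch a PDF between hidden and searchable without needing folder permissions.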
I do not think there is a way to prevent a Google crawl on PDF 1 while allowing a crawl on PDF 2 in same location. I will double check.
Perfect, thank you.