Our Security team has flagged the following issue.
They have written a Python script that can crawl any of our sites and fetch JCR data such as user IDs. Our sites are public.
Here is a summary of the test; the details are attached.
The Python project is on GitHub for your testing and reference:
https://github.com/ilatypov/aem-hacker/blob/master/aem_slurper.py
$ time python3 aem_slurper.py www.manulifebank.ca 2>&1 | tee manulifebank.txt
Connecting to www.manulifebank.ca...
2020-09-28 09:23:15-0400 /content/manulife-bank/en_CA/jcr:content admin {"id": "jcr:content", "uri": "/content/manulife-bank/en_CA/jcr:content", "jcr:primaryType": "cq:PageContent", "jcr:mixinTypes": ["mix:lockable", "mix:versionable", "cq:LiveSync", "cq:PropertyLiveSyncCancelled"], "chatButton": "default", "imageSelect": "ui.icon.tab.select.value", "dateFormatField": "MMMM yyyy", "jcr:createdBy": "admin", "jcr:title": "English", "contentCategory": "none", "resourceCardType":
[...]
^C
real 1m19.645s
user 0m0.567s
sys 0m0.346s
Our security concern is that the script can see user IDs when it crawls pages. The user IDs are also visible on nodes in CRX when I browse the publisher as an anonymous user.
Question: How can I hide the user IDs from external scripts that crawl the site?
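To make the concern concrete, here is a minimal sketch of how a crawler could pick user IDs out of a JSON response like the one above. The function name `find_user_ids` and the key list are my own assumptions for illustration; only `jcr:createdBy` actually appears in the output shown, and the sample is a trimmed copy of that output.

```python
import json

# Property names that can hold a user ID on AEM content nodes.
# Only "jcr:createdBy" appears in the crawl output above; the rest are assumed.
USER_ID_KEYS = {
    "jcr:createdBy",
    "jcr:lastModifiedBy",
    "cq:lastModifiedBy",
    "cq:lastReplicatedBy",
}


def find_user_ids(node, path=""):
    """Walk parsed JSON and return (property path, user ID) pairs."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}"
            if key in USER_ID_KEYS and isinstance(value, str):
                found.append((child, value))
            else:
                found.extend(find_user_ids(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_user_ids(item, f"{path}[{index}]"))
    return found


# Trimmed sample based on the response shown above.
sample = json.loads(
    '{"jcr:primaryType": "cq:PageContent", '
    '"jcr:createdBy": "admin", "jcr:title": "English"}'
)
print(find_user_ids(sample))
```

Any anonymous client that can fetch the JSON rendition of a node can do the same, which is why I am asking how to block it.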
You cannot hide the user IDs themselves.
You can, however, deny external systems access to those nodes by updating your dispatcher rules.
Apply dispatcher filters such as these:
## Deny content reading by queries and prevent un-intended self DOS attacks
/0033 { /type "deny" /selectors '(feed|pages|rss|blueprint|infinity|tidy|sysview|docview|query|[0-9-]+|jcr:content)' /extension '(json|xml|html|feed)' }
/0322 { /type "deny" /suffix '(.*infinity.*|.*children.*|.*tidy.*)' }
/0323 { /type "deny" /url '.*/[.][.];/.*' }
See the DoS (Denial of Service) rules and the security checklist for more details.
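If it helps to reason about which requests these three rules would catch, here is a rough Python sketch. It is not how the dispatcher is implemented: `decompose` and `is_denied` are hypothetical helpers, the URL split is a simplified take on Sling-style decomposition (first dot in the path starts the selectors and extension, the next slash starts the suffix), and I am assuming each selector is matched in full against the pattern. The regular expressions are copied from the rules above.

```python
import re

# Patterns copied from rules /0033, /0322 and /0323 above.
DENY_SELECTORS = re.compile(
    r"(feed|pages|rss|blueprint|infinity|tidy|sysview|docview|query|[0-9-]+|jcr:content)"
)
DENY_EXTENSIONS = re.compile(r"(json|xml|html|feed)")
DENY_SUFFIX = re.compile(r"(.*infinity.*|.*children.*|.*tidy.*)")
DENY_URL = re.compile(r".*/[.][.];/.*")


def decompose(path):
    """Simplified split of a URL path into (selectors, extension, suffix)."""
    dot = path.find(".")
    if dot == -1:
        return [], "", ""
    slash = path.find("/", dot)
    tail = path[dot + 1:] if slash == -1 else path[dot + 1:slash]
    suffix = "" if slash == -1 else path[slash:]
    parts = tail.split(".")
    return parts[:-1], parts[-1], suffix


def is_denied(path):
    """Return True if this simplified model says one of the rules matches."""
    selectors, extension, suffix = decompose(path)
    # /0033: a denied selector combined with a denied extension (assumption:
    # every selector is compared in full against the pattern).
    if DENY_EXTENSIONS.fullmatch(extension) and any(
        DENY_SELECTORS.fullmatch(s) for s in selectors
    ):
        return True
    # /0322: suffix containing infinity, children or tidy.
    if DENY_SUFFIX.fullmatch(suffix):
        return True
    # /0323: path traversal of the form /..;/
    return bool(DENY_URL.fullmatch(path))


for candidate in (
    "/content/manulife-bank/en_CA.infinity.json",
    "/content/manulife-bank/en_CA/jcr:content.1.json",
    "/content/manulife-bank/en_CA/jcr:content.json",
    "/content/manulife-bank/en_CA.html",
):
    print(candidate, is_denied(candidate))
```

Under this simplified model, a plain `.../jcr:content.json` request has no selector, so /0033 on its own would not catch it. Test against your real dispatcher after deploying the rules, and add a further rule if such requests still get through.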
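## You can try something like this
## Block Python script
RewriteRule ^.*py - [F,NC,L]
or
*.py
Note that a RewriteRule pattern is matched against the request URL path, so this rule forbids any URL containing "py" (case-insensitive); it does not identify requests sent by a Python client.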