AEM 6.3: Getting 403 error on dispatcher and blank page

Level 4

Hi,

I have configured Apache web server and dispatcher. Here are the rewrite rules:

RewriteRule ^/$ /content/aemsite/en.html [PT,L]

  RewriteCond %{REQUEST_URI} ^/content
  RewriteCond %{REQUEST_URI} !^/content/campaigns
  RewriteCond %{REQUEST_URI} !^/content/dam 
  RewriteRule !^/content/aemsite/en - [R=404,L,NC]

  RewriteCond %{REQUEST_URI} !^/apps
  RewriteCond %{REQUEST_URI} !^/content
  RewriteCond %{REQUEST_URI} !^/etc
  RewriteCond %{REQUEST_URI} !^/home
  RewriteCond %{REQUEST_URI} !^/libs
  RewriteCond %{REQUEST_URI} !^/tmp
  RewriteCond %{REQUEST_URI} !^/var
  RewriteRule ^/(.*)$ /content/aemsite/en/$1 [PT,L]

I have also configured sling mappings:

{
  "jcr:primaryType": "sling:Mapping",
  "www_aemsite_com": {
    "sling:internalRedirect": [
      "/content/aemsite/en.html"
    ],
    "jcr:primaryType": "sling:Mapping",
    "sling:match": "www.aemsite.com/$"
  },
  "www.aemsite.com": {
    "sling:internalRedirect": [
      "/content/aemsite/en",
      "/"
    ],
    "jcr:primaryType": "sling:Mapping"
  }
}

I have also configured the "Day CQ Link Checker Transformer" to strip the .html extension:

<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0"
  jcr:primaryType="sling:OsgiConfig"
  linkcheckertransformer.strictExtensionCheck="{Boolean}false"
  linkcheckertransformer.rewriteElements="[a:href,area:href,form:action]"
  linkcheckertransformer.disableRewriting="{Boolean}false"
  linkcheckertransformer.disableChecking="{Boolean}false"
  linkcheckertransformer.stripHtmltExtension="{Boolean}true"
  linkcheckertransformer.mapCacheSize="{Long}5000"/>

A trailing slash gets appended to the URL. For example, a link that points to http://www.aemsite.com/articles has a "/" appended, and I see a blank page. In the logs I see a 403 at the web server level when trying to access pages under http://www.aemsite.com/, as shown below.

127.0.0.1 - - [09/Mar/2018:22:26:44 -0500] "GET /articles HTTP/1.1" 302

127.0.0.1 - - [09/Mar/2018:22:26:44 -0500] "GET /articles/ HTTP/1.1" 403 1

How can I make sure that http://www.aemsite.com/articles gets redirected to http://www.aemsite.com/articles.html internally?

Thanks in advance

1 Accepted Solution

Correct answer by
Employee Advisor

Hi,

To me it looks like the problem in this case is in the web server/dispatcher configuration. Is this request sent to AEM at all? If you enable debug logging on the dispatcher, you can see whether the request is handled by the dispatcher or only by the web server. If it is handled by the dispatcher, the log also shows whether the request is sent to AEM.

Jörg

19 Replies

Administrator

Jörg Hoh, requesting your help!

Kautuk Sahni

Level 4

Hi Jörg Hoh, thank you for your response. The logs

127.0.0.1 - - [09/Mar/2018:22:26:44 -0500] "GET /articles HTTP/1.1" 302

127.0.0.1 - - [09/Mar/2018:22:26:44 -0500] "GET /articles/ HTTP/1.1" 403 1

are from dispatcher.log, and yes, the request is not reaching the publish environment. I have checked the rules under /filter but can't tell whether any of them is responsible. Here is the whole dispatcher file:

# Each farm configures a set of load balanced renders (i.e. remote servers)

/farms

  {

  # First farm entry

  /wwwaemsitecom

    { 

    # Request headers that should be forwarded to the remote server.

    /clientheaders

      {

      # Forward all request headers that are end-to-end. If you want

      # to forward a specific set of headers, you'll have to list

      # them here.

      "*"

      }

     

    # Hostname globbing for farm selection (virtual domain addressing)

    /virtualhosts

      {

      # Entries will be compared against the "Host" request header

      # and an optional request URL prefix.

      #

      # Examples:

      #

      #   "www.aemsite.com"

      #   intranet.*

      #   myhost:8888/mysite

      "*"

      }

     

    # The load will be balanced among these render instances

    /renders

      {

      /rend01

        {

        # Hostname or IP of the render

        /hostname "127.0.0.1"

        # Port of the render

        /port "4503"

        # Connect timeout in milliseconds, 0 to wait indefinitely

        # /timeout "0"

        }

      }

     

    # The filter section defines the requests that should be handled by the dispatcher.

    #

    # Entries can be either specified using globs, or elements of the request line:

    #

    # (1) globs will be compared against the entire request line, e.g.:

    #

    #       /0001 { /type "deny" /glob "* /index.html *" }

    #

    #     denies request "GET /index.html HTTP/1.1" but not "GET /index.html?a=b HTTP/1.1".

    #

    # (2) method/url/query/protocol/path/selectors/extension/suffix will be compared

    #     against the respective elements of  the request line, e.g.:

    #

    #       /0001 { /type "deny" /method "GET" /url "/index.html" }

    #

    #     denies both "GET /index.html" and "GET /index.html?a=b HTTP/1.1".

    #

    # (3) all elements of the request line can also be specified as regular expressions,

    #     which are identified by using single quotes, e.g.

    #

    #       /0001 { /type "allow" /method '(GET|HEAD)' }

    #

    #     allows GET or HEAD requests, or:

    #

    #       /0002 { /type "deny" /extension '()' }

    #

    #     denies requests having no extension.

    #

    # Note: specifying elements of the request line is the preferred method.

    #

    /filter

{

      # Deny everything first and then allow specific entries

      /0001 { /type "deny" /glob "*" }

 

      

      # Open consoles

#     /0011 { /type "allow" /url "/admin/*"  }  # allow servlet engine admin

#     /0012 { /type "allow" /url "/crx/*"    }  # allow content repository

#     /0013 { /type "allow" /url "/system/*" }  # allow OSGi console

  /0014 { /type "allow" /glob "* /*" }  # allowing /

      # Allow non-public content directories

#     /0021 { /type "allow" /url "/apps/*"   }  # allow apps access

#     /0022 { /type "allow" /url "/bin/*"    }

#     /0023 { /type "allow" /url "/content*" }  # disable this rule to allow mapped content only

      

#     /0024 { /type "allow" /url "/libs/*"   }

#     /0025 { /type "deny"  /url "/libs/shindig/proxy*" } # if you enable /libs close access to proxy

#     /0026 { /type "allow" /url "/home/*"   }

#     /0027 { /type "allow" /url "/tmp/*"    }

#     /0028 { /type "allow" /url "/var/*"    }

  /0029 { /type "deny" /url "/etc/"    }

  /0030 { /type "allow" /url "/etc.clientlibs/*"    }

  /0031 { /type "allow" /url "/content/*" }

  /0032 { /type "allow" /url "/etc/designs/*" }

  /0033 { /type "allow" /url "/etc/clientlibs/*" }

 

  /0034 { /type "allow" /path "/libs/granite/csrf/token"  /extension '(json)' }

      # Enable extensions in non-public content directories, using a regular expression

      /0041

        {

        /type "allow"

        /extension '(css|gif|ico|js|png|swf|jpe?g)'

        }

      # Enable features

      /0062 { /type "allow" /url "/libs/cq/personalization/*"  }  # enable personalization

      # Deny content grabbing, on all accessible pages, using regular expressions

      /0081

        {

        /type "deny"

        /selectors '((sys|doc)view|query|[0-9-]+)'

        /extension '(json|xml)'

        }

      # Deny content grabbing for /content

      /0082

        {

        /type "deny"

        /path "/content"

        /selectors '(feed|rss|pages|languages|blueprint|infinity|tidy)'

        /extension '(json|xml|html)'

        }

/0083 { /type "deny"  /glob "GET *.sysview.xml*"   }

      /0084 { /type "deny"  /glob "GET *.docview.json*"  }

      /0085 { /type "deny"  /glob "GET *.docview.xml*"  }

      /0086 { /type "deny"  /glob "GET *.*[0-9].json*" }

#     /0087 { /type "allow" /method "GET" /extension 'json' "*.1.json" }  # allow one-level json requests

}

    # The cache section regulates what responses will be cached and where.

    /cache

      {

      # The docroot must be equal to the document root of the webserver. The

      # dispatcher will store files relative to this directory and subsequent

      # requests may be "declined" by the dispatcher, allowing the webserver

      # to deliver them just like static files.

      /docroot "C:/Apache2.2/htdocs"

      # Sets the level upto which files named ".stat" will be created in the

      # document root of the webserver. When an activation request for some

      # page is received, only files within the same subtree are affected

      # by the invalidation.

      /statfileslevel "2"

     

      # Flag indicating whether to cache responses to requests that contain

      # authorization information.

      /allowAuthorized "0"

     

      # Flag indicating whether the dispatcher should serve stale content if

      # no remote server is available.

      /serveStaleOnError "1"

     

      # The rules section defines what responses should be cached based on

      # the requested URL. Please note that only the following requests can

      # lead to cacheable responses:

      #

      # - HTTP method is GET

      # - URL has an extension

      # - Request has no query string

      # - Request has no "Authorization" header (unless allowAuthorized is 1)

      /rules

        {

        /0000

          {

          # the globbing pattern to be compared against the url

          # example: *             -> everything

          #        : /foo/bar.*    -> only the /foo/bar documents

          #        : /foo/bar/*    -> all pages below /foo/bar

          #        : /foo/bar[./]* -> all pages below and /foo/bar itself

          #        : *.html        -> all .html files

          /glob "*"

          /type "allow"

          }

        }

       

      # The invalidate section defines the pages that are "invalidated" after

      # any activation. Please note that the activated page itself and all

      # related documents are flushed on an modification. For example: if the

      # page /foo/bar is activated, all /foo/bar.* files are removed from the

      # cache.

      /invalidate

        {

        /0000

          {

          /glob "*"

          /type "deny"

          }

        /0001

          {

          # Consider all HTML files stale after an activation.

          /glob "*.html"

          /type "allow"

          }

        /0002

          {

          /glob "/etc/segmentation.segment.js"

          /type "allow"

          }

        /0003

          {

          /glob "*/analytics.sitecatalyst.js"

          /type "allow"

          }

        }

      # The allowedClients section restricts the client IP addresses that are

      # allowed to issue activation requests.

      /allowedClients

        {

        # Uncomment the following to restrict activation requests to originate

        # from "localhost" only.

        #

        #/0000

        #  {

        #  /glob "*"

        #  /type "deny"

        #  }

        #/0001

        #  {

        #  /glob "127.0.0.1"

        #  /type "allow"

        #  }

        }

       

      # The ignoreUrlParams section contains query string parameter names that

      # should be ignored when determining whether some request's output can be

      # cached or delivered from cache.

      #

      # In this example configuration, the "q" parameter will be ignored.

      #/ignoreUrlParams

      #  {

      #  /0001 { /glob "*" /type "deny" }

      #  /0002 { /glob "q" /type "allow" }

      #  }

      # Cache response headers next to a cached file. On the first request to

      # an uncached resource, all headers matching one of the values found here

      # are stored in a separate file, next to the cache file. On subsequent

      # requests to the cached resource, the stored headers are added to the

      # response.

      #

      # Note, that file globbing characters are not allowed here.

      #

      #/headers

      #  {

      #  "Cache-Control"

      #  "Content-Disposition"

      #  "Content-Type"

      #  "Expires"

      #  "Last-Modified"

      #  "X-Content-Type-Options"

      #  }

      # A grace period defines the number of seconds a stale, auto-invalidated

      # resource may still be served from the cache after the last activation

      # occurring. Auto-invalidated resources are invalidated by any activation,

      # when their path matches the /invalidate section above. This setting

      # can be used in a setup, where a batch of activations would otherwise

      # repeatedly invalidate the entire cache.

      #

      #/gracePeriod "2"

      # Enable TTL evaluates the response headers from the backend, and if they

      # contain a Cache-Control max-age or Expires date, an auxiliary, empty file

      # next to the cache file is created, with modification time equal to the

      # expiry date. When the cache file is requested past the modification time

      # it is automatically re-requested from the backend.

      #

      # /enableTTL "1"

      }

     

    # The statistics sections dictates how the load should be balanced among the

    # renders according to the media-type.

    /statistics

      {

      /categories

        {

        /html

          {

          /glob "*.html"

          }

        /others

          {

          /glob "*"

          }

        }

      }

    # Authorization checker: before a page in the cache is delivered, a HEAD

    # request is sent to the URL specified in /url with the query string

    # '?uri='. If the response status is 200 (OK), the page is returned

    # from the cache. Otherwise, the request is forwarded to the render and

    # its response returned.

    #

    # Only pages matching the /filter section below are checked, all other pages

    # get delivered unchecked.

    #

    # All header lines returned from the auth_checker's HEAD request that match

    # the /headers section will be returned as well.

    #

    #/auth_checker

    #  {

    #  /url "/bin/permissioncheck.html"

    #  /filter

    #    {

    #    /0000

    #      {

    #      /glob "*"

    #      /type "deny"

    #      }

    #    /0001

    #      {

    #      /glob "*.html"

    #      /type "allow"

    #      }

    #    }

    #  /headers

    #    {

    #    /0000

    #      {

    #      /glob "*"

    #      /type "deny"

    #      }

    #    /0001

    #      {

    #      /glob "Set-Cookie:*"

    #      /type "allow"

    #      }

    #    }

    #  }

    }

  }
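For readers following along: the /filter section above is evaluated in order, and the last rule that matches a request decides whether it is allowed. The Python sketch below is only a rough model of that behaviour, not the dispatcher's own matcher. It assumes that `fnmatchcase` is close enough to dispatcher globbing for these patterns, and it uses a hand-picked subset of the rules; the `RULES` list and the `evaluate` helper are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hand-picked subset of the /filter rules from the file above.
# kind "glob" is compared against the whole request line,
# kind "url" only against the URL part.
RULES = [
    ("0001", "deny", "glob", "*"),
    ("0014", "allow", "glob", "* /*"),
    ("0029", "deny", "url", "/etc/"),
    ("0031", "allow", "url", "/content/*"),
    ("0083", "deny", "glob", "GET *.sysview.xml*"),
    ("0086", "deny", "glob", "GET *.*[0-9].json*"),
]


def evaluate(method, url, protocol="HTTP/1.1"):
    """Return (decision, rule_id) of the last rule that matches the request."""
    request_line = f"{method} {url} {protocol}"
    decision, rule_id = "deny", None
    for rid, rule_type, kind, pattern in RULES:
        subject = request_line if kind == "glob" else url
        if fnmatchcase(subject, pattern):
            # A later match overrides any earlier one.
            decision, rule_id = rule_type, rid
    return decision, rule_id


print(evaluate("GET", "/articles/"))
print(evaluate("GET", "/content/aemsite/en/articles.sysview.xml"))
```

Under these assumptions, `GET /articles/` ends up allowed by rule /0014, so the filter section by itself would not explain the 403.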

In httpd.conf I am also using DirectorySlash Off in the virtual host configuration, but the "/" is still getting appended to all URLs under http://www.aemsite.com. I don't know which setting in the dispatcher configuration or httpd.conf is responsible for this behavior.

Thanks

Employee Advisor

If the dispatcher denies access to that URL, you should be able to see that when you enable DEBUG logging for the dispatcher. That's the first thing you should validate. Also check the Apache error.log.

regards,
Jörg

Level 4

I have debug logging enabled for the dispatcher, and these are the entries from dispatcher.log:

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] response.headers[Date] = "Wed, 14 Mar 2018 01:57:19 GMT"

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] response.headers[X-Content-Type-Options] = "nosniff"

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] response.headers[Location] = "http://www.aemsite.com/articles/"

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] response.headers[Content-Length] = "0"

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Storing socket for later reuse: 127.0.0.1:4503

[Tue Mar 13 21:57:19 2018] [I] [pid 7048 (tid 1184)] "GET /articles" 302 - 43ms [wwwaemsitecom/rend01]

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Found farm wwwaemsitecom for www.aemsite.com

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] checking [/articles/]

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] request URL has no extension: /articles/

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] cache-action for [/articles/]: NONE

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Reusing socket: 127.0.0.1:4503

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Connected to backend rend01 (127.0.0.1:4503)

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: Host

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: User-Agent

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: Accept

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: Accept-Language

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: Accept-Encoding

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: Referer

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: Upgrade-Insecure-Requests

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: Via

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: X-Forwarded-For

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Adding request header: Server-Agent

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] response.status = 403

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] response.headers[Content-Type] = "text/html; charset=ISO-8859-1"

[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] Storing socket for later reuse: 127.0.0.1:4503

[Tue Mar 13 21:57:19 2018] [I] [pid 7048 (tid 1184)] "GET /articles/" 403 1 78ms [wwwaemsitecom/rend01]
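The two [I] lines are the per-request summaries in this debug output. As a small aid for reading such a log, here is a Python sketch that pulls out method, URL, status and the trailing farm/render pair. The regular expression is written against the two lines shown above only; treating the bracketed pair as "farm/render" is my reading of the log, and the `summarize` helper and `SAMPLE` string are hypothetical.

```python
import re

# Written against dispatcher summary lines such as:
# [...] [I] [pid 7048 (tid 1184)] "GET /articles" 302 - 43ms [wwwaemsitecom/rend01]
INFO_LINE = re.compile(
    r'\[I\] \[pid \d+ \(tid \d+\)\] '
    r'"(\S+) (\S+)" (\d{3}) \S+ \d+ms '
    r'\[([^/\]]+)/([^\]]+)\]'
)

SAMPLE = '''
[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] response.headers[Content-Length] = "0"
[Tue Mar 13 21:57:19 2018] [I] [pid 7048 (tid 1184)] "GET /articles" 302 - 43ms [wwwaemsitecom/rend01]
[Tue Mar 13 21:57:19 2018] [D] [pid 7048 (tid 1184)] response.status = 403
[Tue Mar 13 21:57:19 2018] [I] [pid 7048 (tid 1184)] "GET /articles/" 403 1 78ms [wwwaemsitecom/rend01]
'''


def summarize(log_text):
    """Return (method, url, status, farm, render) for every [I] line."""
    rows = []
    for line in log_text.splitlines():
        m = INFO_LINE.search(line)
        if m:
            method, url, status, farm, render = m.groups()
            rows.append((method, url, int(status), farm, render))
    return rows


for row in summarize(SAMPLE):
    print(row)
```

Both summary lines name rend01, which suggests that the 302 and the 403 were both answered by the publish instance behind the dispatcher rather than served from the cache.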

Employee Advisor

Hi,

the dispatcher forwards the request for "/articles/" to AEM, which then responds with this 403. To investigate further, have a look at the details of this request in the "Recent Requests" console (part of the OSGi web console) of the AEM publish instance.

On the other hand, I don't know what causes the initial redirect from "/articles" to "/articles/". I would assume that this is some kind of Apache rule, but it does not append an extension, which I would have expected. So you should add a rewrite rule that adds an ".html" extension if no extension is present, and only then forward the request to the dispatcher/AEM.

Jörg

Level 4

Hi,

I am using the LinkCheckerTransformerFactory to strip off the .html extension. I want to map /articles to /articles.html without actually changing the URL. I have checked the Recent Requests console, and here is the output:

0 TIMER_START{Request Processing}

  1 COMMENT timer_end format is {<elapsed microseconds>,<timer name>} <optional message>

  7 LOG Method=GET, PathInfo=null

  11 TIMER_START{handleSecurity}

  3586 TIMER_END{3573,handleSecurity} authenticator org.apache.sling.auth.core.impl.SlingAuthenticator@6f5ad765 returns true

  3938 TIMER_START{ResourceResolution}

  8125 TIMER_END{4184,ResourceResolution} URI=/articles/ resolves to Resource=JcrNodeResource, type=cq:Page, superType=null, path=/content/aemsite/en/articles

  8132 LOG Resource Path Info: SlingRequestPathInfo: path='/content/aemsite/en/articles', selectorString='null', extension='null', suffix='/'

  8132 TIMER_START{ServletResolution}

  8136 TIMER_START{resolveServlet(/content/aemsite/en/articles)}

  8213 TIMER_END{76,resolveServlet(/content/aemsite/en/articles)} Using servlet org.apache.sling.servlets.get.DefaultGetServlet

  8227 TIMER_END{94,ServletResolution} URI=/articles/ handled by Servlet=org.apache.sling.servlets.get.DefaultGetServlet

  8234 LOG Applying Requestfilters

  8237 LOG Calling filter: com.adobe.granite.resourceresolverhelper.impl.ResourceResolverHelperImpl

  8245 LOG Calling filter: org.apache.sling.i18n.impl.I18NFilter

  8248 LOG Calling filter: com.adobe.granite.httpcache.impl.InnerCacheFilter

  8255 LOG Calling filter: org.apache.sling.rewriter.impl.RewriterFilter

  8259 LOG Calling filter: com.adobe.cq.mcm.campaign.servlets.CampaignCopyTracker

  8261 LOG Calling filter: com.day.cq.wcm.core.impl.WCMRequestFilter

  8266 LOG Calling filter: com.adobe.fd.core.security.internal.CurrentUserServiceImpl

  9157 LOG Calling filter: com.adobe.cq.wcm.core.components.internal.servlets.CoreFormHandlingServlet

  9162 LOG Calling filter: com.adobe.granite.optout.impl.OptOutFilter

  9185 LOG Calling filter: com.day.cq.wcm.foundation.forms.impl.FormsHandlingServlet

  9188 LOG Calling filter: com.adobe.cq.social.commons.cors.CORSAuthenticationFilter

  9195 LOG Calling filter: com.adobe.livecycle.dsc.clientsdk.internal.ResourceResolverHolderFilter

  9206 LOG Calling filter: org.apache.sling.engine.impl.debug.RequestProgressTrackerLogFilter

  9212 LOG Calling filter: com.adobe.livecycle.content.appcontext.impl.AppContextFilter

  9218 LOG Calling filter: org.apache.sling.dynamicinclude.CacheControlFilter

  9225 LOG Calling filter: com.day.cq.wcm.mobile.core.impl.redirect.RedirectFilter

  9239 LOG Calling filter: com.day.cq.wcm.core.impl.AuthoringUIModeServiceImpl

  9243 LOG Calling filter: org.apache.sling.security.impl.ContentDispositionFilter

  9245 LOG Calling filter: com.adobe.granite.csrf.impl.CSRFFilter

  9248 LOG Calling filter: com.adobe.granite.rest.assets.impl.AssetContentDispositionFilter

  9253 LOG Calling filter: com.adobe.granite.requests.logging.impl.RequestLoggerImpl

  9257 LOG Calling filter: com.adobe.granite.rest.impl.servlet.ApiResourceFilter

  9264 LOG Calling filter: com.day.cq.dam.core.impl.assetlinkshare.AdhocAssetShareAuthHandler

  9266 LOG Calling filter: com.adobe.cq.social.ugcbase.security.impl.SaferSlingPostServlet

  9268 LOG Calling filter: com.day.cq.dam.core.impl.servlet.ActivityRecordHandler

  9281 LOG Calling filter: org.apache.sling.dynamicinclude.SyntheticResourceFilter

  9288 LOG Applying Componentfilters

  9289 LOG Calling filter: com.day.cq.personalization.impl.TargetComponentFilter

  9291 LOG Calling filter: com.day.cq.wcm.core.impl.WCMComponentFilter

  9735 LOG Calling filter: com.day.cq.wcm.core.impl.WCMDebugFilter

  9755 TIMER_START{org.apache.sling.servlets.get.DefaultGetServlet#0}

  9767 LOG Using org.apache.sling.servlets.get.impl.helpers.StreamRendererServlet to render for extension=null

  10085 LOG Applying Error filters

  10087 LOG Calling filter: org.apache.sling.i18n.impl.I18NFilter

  10088 LOG Calling filter: org.apache.sling.rewriter.impl.RewriterFilter

  10141 TIMER_START{handleError:status=403}

  26833 TIMER_END{16689,handleError:status=403} Using handler /apps/sling/servlet/errorhandler/default.jsp

  54132 LOG Found processor for post processing ProcessorConfiguration: {contentTypes=[text/html], order=-1, active=true, valid=true, processErrorResponse=true, pipeline=(generator=Config(type=htmlparser, config=JcrPropertyMap [node=Node[NodeDelegate{tree=/apps/aemsite-project/config/rewriter/default/generator-htmlparser: { jcr:primaryType = nt:unstructured, includeTags = [A, /A, IMG, AREA, FORM, BASE, LINK, SCRIPT, BODY, /BODY]}}], values={jcr:primaryType=nt:unstructured, includeTags=[Ljava.lang.String;@4764b191}]), transformers=(Config(type=linkchecker, config={}), Config(type=mobile, config=JcrPropertyMap [node=Node[NodeDelegate{tree=/apps/aemsite-project/config/rewriter/default/transformer-mobile: { jcr:primaryType = nt:unstructured, component-optional = true}}], values={jcr:primaryType=nt:unstructured, component-optional=true}]), Config(type=mobiledebug, config=JcrPropertyMap [node=Node[NodeDelegate{tree=/apps/aemsite-project/config/rewriter/default/transformer-mobiledebug: { jcr:primaryType = nt:unstructured, component-optional = true}}], values={jcr:primaryType=nt:unstructured, component-optional=true}]), Config(type=contentsync, config=JcrPropertyMap [node=Node[NodeDelegate{tree=/apps/aemsite-project/config/rewriter/default/transformer-contentsync: { jcr:primaryType = nt:unstructured, component-optional = true}}], values={jcr:primaryType=nt:unstructured, component-optional=true}]), Config(type=versioned-clientlibs, config={}), serializer=Config(type=htmlwriter, config={}))}

  54597 TIMER_END{44455,handleError:status=403} Error handler finished

  54753 LOG Filter timing: filter=org.apache.sling.rewriter.impl.RewriterFilter, inner=45, total=45, outer=0

  54760 TIMER_END{45004,org.apache.sling.servlets.get.DefaultGetServlet#0}

  54768 LOG Filter timing: filter=com.day.cq.wcm.core.impl.WCMDebugFilter, inner=45, total=45, outer=0

  54770 LOG Filter timing: filter=com.day.cq.wcm.core.impl.WCMComponentFilter, inner=45, total=45, outer=0

  54784 TIMER_END{54783,Request Processing} Request Processing

  55493 LOG Filter timing: filter=org.apache.sling.dynamicinclude.SyntheticResourceFilter, inner=45, total=45, outer=0

  55496 LOG Filter timing: filter=com.day.cq.dam.core.impl.servlet.ActivityRecordHandler, inner=45, total=45, outer=0

  55499 LOG Filter timing: filter=com.adobe.cq.dam.webdav.impl.io.DamWebdavRequestFilter, inner=45, total=45, outer=0

  55502 LOG Filter timing: filter=com.adobe.cq.social.ugcbase.security.impl.SaferSlingPostServlet, inner=45, total=45, outer=0

  55505 LOG Filter timing: filter=com.day.cq.dam.core.impl.assetlinkshare.AdhocAssetShareAuthHandler, inner=45, total=45, outer=0

  55508 LOG Filter timing: filter=com.day.cq.dam.core.impl.servlet.DamContentDispositionFilter, inner=45, total=45, outer=0

  55511 LOG Filter timing: filter=com.adobe.granite.rest.impl.servlet.ApiResourceFilter, inner=45, total=45, outer=0

  55513 LOG Filter timing: filter=com.adobe.granite.requests.logging.impl.RequestLoggerImpl, inner=45, total=45, outer=0

  55517 LOG Filter timing: filter=com.adobe.granite.rest.assets.impl.AssetContentDispositionFilter, inner=45, total=45, outer=0

  55520 LOG Filter timing: filter=com.adobe.granite.csrf.impl.CSRFFilter, inner=45, total=45, outer=0

  55523 LOG Filter timing: filter=org.apache.sling.security.impl.ContentDispositionFilter, inner=45, total=45, outer=0

  55526 LOG Filter timing: filter=com.day.cq.wcm.core.impl.AuthoringUIModeServiceImpl, inner=45, total=45, outer=0

  55529 LOG Filter timing: filter=com.day.cq.wcm.mobile.core.impl.redirect.RedirectFilter, inner=45, total=45, outer=0

  55532 LOG Filter timing: filter=org.apache.sling.dynamicinclude.CacheControlFilter, inner=45, total=45, outer=0

  55535 LOG Filter timing: filter=com.adobe.livecycle.content.appcontext.impl.AppContextFilter, inner=45, total=45, outer=0

  55538 LOG Filter timing: filter=org.apache.sling.engine.impl.debug.RequestProgressTrackerLogFilter, inner=45, total=46, outer=1

  55540 LOG Filter timing: filter=com.adobe.livecycle.dsc.clientsdk.internal.ResourceResolverHolderFilter, inner=46, total=46, outer=0

  55543 LOG Filter timing: filter=com.adobe.cq.social.commons.cors.CORSAuthenticationFilter, inner=46, total=46, outer=0

  55546 LOG Filter timing: filter=com.day.cq.wcm.foundation.forms.impl.FormsHandlingServlet, inner=46, total=46, outer=0

  55549 LOG Filter timing: filter=com.adobe.granite.optout.impl.OptOutFilter, inner=46, total=46, outer=0

  55553 LOG Filter timing: filter=com.adobe.cq.wcm.core.components.internal.servlets.CoreFormHandlingServlet, inner=46, total=46, outer=0

  55556 LOG Filter timing: filter=com.adobe.fd.core.security.internal.CurrentUserServiceImpl, inner=46, total=47, outer=1

  55558 LOG Filter timing: filter=com.day.cq.wcm.core.impl.WCMRequestFilter, inner=47, total=47, outer=0

  55562 LOG Filter timing: filter=com.adobe.cq.mcm.campaign.servlets.CampaignCopyTracker, inner=47, total=47, outer=0

  55580 LOG Filter timing: filter=org.apache.sling.rewriter.impl.RewriterFilter, inner=47, total=47, outer=0

  55583 LOG Filter timing: filter=com.adobe.granite.httpcache.impl.InnerCacheFilter, inner=47, total=47, outer=0

  55586 LOG Filter timing: filter=org.apache.sling.i18n.impl.I18NFilter, inner=47, total=47, outer=0

  55590 LOG Filter timing: filter=org.apache.sling.distribution.servlet.DistributionAgentCreationFilter, inner=47, total=47, outer=0

From the logs it looks like /articles/ is being resolved to path=/content/aemsite/en/articles, but the request still ends in a 403 error.
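The "Resource Path Info" line in the output above shows how the request was split up: no selectors, no extension, and a suffix of "/". The following Python sketch imitates that split for a URL whose resource path is already known. It is a simplified model of Sling's URL decomposition written from memory, not Sling code, and the `decompose` function is hypothetical.

```python
def decompose(url, resource_path):
    """Split a request URL Sling-style, given the resource path it resolved to."""
    if not url.startswith(resource_path):
        raise ValueError("resource path must be a prefix of the URL")
    rest = url[len(resource_path):]

    # Everything from the first "/" in the remainder onwards is the suffix.
    slash = rest.find("/")
    if slash == -1:
        dotted, suffix = rest, None
    else:
        dotted, suffix = rest[:slash], rest[slash:]

    # The dotted part has the form ".selector1.selector2.extension".
    parts = [p for p in dotted.split(".") if p]
    extension = parts[-1] if parts else None
    selectors = ".".join(parts[:-1]) or None
    return {
        "path": resource_path,
        "selectors": selectors,
        "extension": extension,
        "suffix": suffix,
    }


page = "/content/aemsite/en/articles"
print(decompose(page + "/", page))
print(decompose(page + ".html", page))
```

With the trailing slash the model yields extension None and suffix "/", matching the log; with ".html" it yields extension "html" and no suffix.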

Thank you so much for having patience to look into this issue.

Employee Advisor

  8125 TIMER_END{4184,ResourceResolution} URI=/articles/ resolves to Resource=JcrNodeResource, type=cq:Page, superType=null, path=/content/aemsite/en/articles

  8132 LOG Resource Path Info: SlingRequestPathInfo: path='/content/aemsite/en/articles', selectorString='null', extension='null', suffix='/'

  8132 TIMER_START{ServletResolution}

  8136 TIMER_START{resolveServlet(/content/aemsite/en/articles)}

  8213 TIMER_END{76,resolveServlet(/content/aemsite/en/articles)} Using servlet org.apache.sling.servlets.get.DefaultGetServlet

That means the request is handled by the DefaultGetServlet: without an extension, Sling does not find the script/component that should render the page. When you request http://${hostname}/content/aemsite/en/articles.html, this section will look different.

The problem in your case is that the ".html" extension is missing; you need to extend the rewrite rules on the dispatcher so that they also append the "html" extension if no other extension is present.

RewriteRule ^/$ /content/aemsite/en.html [PT,L]

  RewriteCond %{REQUEST_URI} ^/content
  RewriteCond %{REQUEST_URI} !^/content/campaigns
  RewriteCond %{REQUEST_URI} !^/content/dam 
  RewriteRule !^/content/aemsite/en - [R=404,L,NC]

  RewriteCond %{REQUEST_URI} !^/apps
  RewriteCond %{REQUEST_URI} !^/content
  RewriteCond %{REQUEST_URI} !^/etc
  RewriteCond %{REQUEST_URI} !^/home
  RewriteCond %{REQUEST_URI} !^/libs
  RewriteCond %{REQUEST_URI} !^/tmp
  RewriteCond %{REQUEST_URI} !^/var

  RewriteCond %{REQUEST_URI} !\.html$
  RewriteRule ^/(.*)$ /content/aemsite/en/$1.html [PT,L]

(I modified the last 2 lines)
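To see what the last rule block does to a few sample URIs, here is a small Python model of it. It is a sketch of the matching logic only, not of mod_rewrite itself; it skips the /content 404 block, and the `rewrite` function and `EXCLUDED_PREFIXES` tuple are names invented for this illustration.

```python
import re

# Prefixes excluded by the chain of negated RewriteCond lines.
EXCLUDED_PREFIXES = ("/apps", "/content", "/etc", "/home", "/libs", "/tmp", "/var")


def rewrite(uri):
    """Model of the rules above: map the site root and append .html."""
    if uri == "/":
        return "/content/aemsite/en.html"
    if uri.startswith(EXCLUDED_PREFIXES):
        return uri  # a condition fails, the URI is left unchanged
    if re.search(r"\.html$", uri):
        return uri  # the URI already ends in .html
    # Corresponds to: RewriteRule ^/(.*)$ /content/aemsite/en/$1.html
    return "/content/aemsite/en/" + uri[1:] + ".html"


for uri in ("/articles", "/articles/", "/etc/designs/site.css"):
    print(uri, "becomes", rewrite(uri))
```

In this model /articles is mapped as intended, but a URI that already carries a trailing slash, such as /articles/, gets ".html" appended after the slash.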

Jörg

Level 4

Hi Jörg,

I added the new rules to the httpd.conf but it somehow still adds the trailing slash.

Thanks

Employee Advisor

Then you need to enable the logging for mod_rewrite and find out which rule introduces the trailing slash.

Jörg

Level 4

Logging was already enabled for mod_rewrite, but for some reason nothing was being written. I fixed that issue and put the rules you had provided in place. What happened was that /articles was first turned into /articles/, and /articles/ was then rewritten to /articles/.html. So I modified the rules a bit:

# Define virtual host for aemsite.com

<VirtualHost *:80>

  ServerAdmin webmaster@localhost

  ServerName aemsite.com

  DocumentRoot "C:\Apache2.2\htdocs"

  DirectorySlash Off

   RewriteEngine On

   RewriteLog "C:\Apache2.2\logs\rewrite.log"

   RewriteLogLevel 9

  

   RewriteRule ^/$ /content/aemsite/en.html [PT,L]

   RewriteCond %{REQUEST_URI} !^/apps

   RewriteCond %{REQUEST_URI} !^/content

   RewriteCond %{REQUEST_URI} !^/etc

   RewriteCond %{REQUEST_URI} !^/home

   RewriteCond %{REQUEST_URI} !^/libs

   RewriteCond %{REQUEST_URI} !^/tmp

   RewriteCond %{REQUEST_URI} !^/var

   RewriteCond %{REQUEST_URI} !(.*)/$

   RewriteRule ^/(.*)$ /content/aemsite/en/$1.html [PT,L]

 

</VirtualHost>

(I changed the last two lines, expecting that a request for /products would be rewritten to /content/aemsite/en/products.html.)
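As a quick cross-check of these conditions, the Python sketch below models the virtual host rules as posted. It only imitates the regular-expression tests, not Apache's request processing, and the `rewrite` function and the two compiled patterns are hypothetical names.

```python
import re

# The seven negated prefix conditions, folded into one pattern.
EXCLUDED = re.compile(r"^/(apps|content|etc|home|libs|tmp|var)")
# The condition !(.*)/$ : skip URIs that end in a slash.
TRAILING_SLASH = re.compile(r"(.*)/$")


def rewrite(uri):
    """Model of the rule set in the VirtualHost block above."""
    if uri == "/":
        return "/content/aemsite/en.html"
    if EXCLUDED.search(uri) or TRAILING_SLASH.search(uri):
        return uri  # a condition fails, the URI is left unchanged
    m = re.match(r"^/(.*)$", uri)
    return f"/content/aemsite/en/{m.group(1)}.html" if m else uri


for uri in ("/products", "/products/", "/products.html"):
    print(uri, "becomes", rewrite(uri))
```

In this model /products is rewritten to /content/aemsite/en/products.html and /products/ is left alone. Note that a URI that already ends in .html would gain a second extension here, because the .html check is no longer among the conditions.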

Here is the complete log from rewrite.log:

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (2) init rewrite engine with requested uri /products

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (3) applying pattern '^/$' to uri '/products'

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (3) applying pattern '^/(.*)$' to uri '/products'

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (4) RewriteCond: input='/products' pattern='!^/apps' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (4) RewriteCond: input='/products' pattern='!^/content' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (4) RewriteCond: input='/products' pattern='!^/etc' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (4) RewriteCond: input='/products' pattern='!^/home' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (4) RewriteCond: input='/products' pattern='!^/libs' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (4) RewriteCond: input='/products' pattern='!^/tmp' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (4) RewriteCond: input='/products' pattern='!^/var' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (4) RewriteCond: input='/products' pattern='!(.*)/$' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (2) rewrite '/products' -> '/content/aemsite/en/products.html'

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353f0f8/initial] (2) forcing '/content/aemsite/en/products.html' to get passed through to next API URI-to-filename handler

-----

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (2) init rewrite engine with requested uri /products/

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (3) applying pattern '^/$' to uri '/products/'

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (3) applying pattern '^/(.*)$' to uri '/products/'

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (4) RewriteCond: input='/products/' pattern='!^/apps' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (4) RewriteCond: input='/products/' pattern='!^/content' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (4) RewriteCond: input='/products/' pattern='!^/etc' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (4) RewriteCond: input='/products/' pattern='!^/home' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (4) RewriteCond: input='/products/' pattern='!^/libs' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (4) RewriteCond: input='/products/' pattern='!^/tmp' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (4) RewriteCond: input='/products/' pattern='!^/var' => matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (4) RewriteCond: input='/products/' pattern='!(.*)/$' => not-matched

127.0.0.1 - - [16/Mar/2018:23:52:08 --0400] [aemsite.com/sid#304ab50][rid#353b0e8/initial] (1) pass through /products/

It looks like there are two requests, /products and /products/. The rule works fine as long as the request is /products, but it fails once the request is /products/. I checked the requests in the browser's network tab; the screenshot is below:

products.PNG

I don't know what exactly causes the request to /products/.
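
One way to let a single rule cover both /products and /products/ is to make the trailing slash optional in the pattern and keep it out of the captured group. This is only a sketch, assuming that every path without an .html extension should map to a page below /content/aemsite/en. It reuses the conditions already shown in this thread and does not explain where the redirect to /products/ comes from.

```apache
RewriteCond %{REQUEST_URI} !^/apps
RewriteCond %{REQUEST_URI} !^/content
RewriteCond %{REQUEST_URI} !^/etc
RewriteCond %{REQUEST_URI} !^/home
RewriteCond %{REQUEST_URI} !^/libs
RewriteCond %{REQUEST_URI} !^/tmp
RewriteCond %{REQUEST_URI} !^/var
# Leave requests that already carry the .html extension untouched
RewriteCond %{REQUEST_URI} !\.html$
# (.+?) is a lazy capture, so an optional trailing slash stays
# outside $1: /products and /products/ both give $1 = products
RewriteRule ^/(.+?)/?$ /content/aemsite/en/$1.html [PT,L]
```

With this pattern both request forms would be rewritten to /content/aemsite/en/products.html, and the root request / is still left to the first rule.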

Thank you again for all the help and effort.

Employee Advisor

Strange. In your first log we see a rewrite from /products to /content/aemsite/en/products.html, but in the request view we also see a request to /products, which results in a redirect to /products/.

Have you made the changes on all dispatcher instances of the environment you were testing with?

Jörg

Level 4

I am using only one dispatcher. Are there any other logs I need to check?

Thanks

Employee Advisor

This rule prevents /products/ from being rewritten:

   RewriteCond %{REQUEST_URI} !(.*)/$

But I cannot get a clear picture from it.

1) Your screenshot of the browser suggests that /products is redirected to /products/

2) You posted a log extract, which shows that /products is being rewritten to /content/aemsite/en/products.html

3) You posted a log extract, which indicates, that /products/ cannot be handled.

So 1 and 2 contradict each other, and 3 supports the behaviour you are analyzing.

Can you validate and confirm my 3 statements here?

Jörg

Level 4

The problem is that even if I comment out all the rules, /products still somehow gets converted to /products/.

1) Your screenshot of the browser suggests that /products is redirected to /products/. That is correct.

2) You posted a log extract, which shows that /products is being rewritten to /content/aemsite/en/products.html

3) You posted a log extract, which indicates, that /products/ cannot be handled.

I have been trying a lot of rules to make it work, and the log extract must have come from one of those attempts. Sorry for the confusion. The current set of rules looks like this:

RewriteRule ^/$ /content/aemsite/en.html [PT,L]

RewriteCond %{REQUEST_URI} !^/apps
RewriteCond %{REQUEST_URI} !^/content
RewriteCond %{REQUEST_URI} !^/etc
RewriteCond %{REQUEST_URI} !^/home
RewriteCond %{REQUEST_URI} !^/libs
RewriteCond %{REQUEST_URI} !^/tmp
RewriteCond %{REQUEST_URI} !^/var
RewriteRule ^/(.*)$ /content/aemsite/en/$1 [PT,L]

This is the log extract:

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (2) init rewrite engine with requested uri /products

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (3) applying pattern '^/$' to uri '/products'

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (3) applying pattern '^/(.*)$' to uri '/products'

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (4) RewriteCond: input='/products' pattern='!^/apps' => matched

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (4) RewriteCond: input='/products' pattern='!^/content' => matched

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (4) RewriteCond: input='/products' pattern='!^/etc' => matched

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (4) RewriteCond: input='/products' pattern='!^/home' => matched

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (4) RewriteCond: input='/products' pattern='!^/libs' => matched

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (4) RewriteCond: input='/products' pattern='!^/tmp' => matched

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (4) RewriteCond: input='/products' pattern='!^/var' => matched

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (2) rewrite '/products' -> '/content/aemsite/en/products'

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (2) forcing '/content/aemsite/en/products' to get passed through to next API URI-to-filename handler

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (2) init rewrite engine with requested uri /products/

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (3) applying pattern '^/$' to uri '/products/'

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (3) applying pattern '^/(.*)$' to uri '/products/'

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (4) RewriteCond: input='/products/' pattern='!^/apps' => matched

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (4) RewriteCond: input='/products/' pattern='!^/content' => matched

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (4) RewriteCond: input='/products/' pattern='!^/etc' => matched

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (4) RewriteCond: input='/products/' pattern='!^/home' => matched

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (4) RewriteCond: input='/products/' pattern='!^/libs' => matched

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (4) RewriteCond: input='/products/' pattern='!^/tmp' => matched

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (4) RewriteCond: input='/products/' pattern='!^/var' => matched

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (2) rewrite '/products/' -> '/content/aemsite/en/products/'

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (2) forcing '/content/aemsite/en/products/' to get passed through to next API URI-to-filename handler

-----------------------------

I am unable to find out why /products is being rewritten to /products/.

I think that's the reason behind this whole issue. Please correct me if I am wrong.

Thanks

Employee Advisor

Hi,

I see this

127.0.0.1 - - [19/Mar/2018:21:23:42 --0400] [aemsite.com/sid#356c590][rid#36080c8/initial] (2) forcing '/content/aemsite/en/products' to get passed through to next API URI-to-filename handler

and this

127.0.0.1 - - [19/Mar/2018:21:23:43 --0400] [aemsite.com/sid#356c590][rid#3592190/initial] (2) init rewrite engine with requested uri /products/

But I don't know if these two lines belong to the same request at all. The sid# is identical, but the rid# values differ (36080c8 vs. 3592190).

Sorry, I am out of ideas atm.

Jörg

Level 4

The request to /products gets rewritten to /products/.

I understand; I am out of ideas too. I tried everything, from debugging the virtual host configuration to the dispatcher, but nothing seems to resolve this issue.

Thank you for looking into this issue, and apologies for taking up so much of your time.

Level 3

Hi,

Use DirectorySlash Off to configure mod_dir not to append /.
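
For illustration, a sketch of where the directive can go. DirectorySlash belongs to mod_dir and is valid at server, virtual host, and directory level. The <Directory> path below is taken from the DocumentRoot posted earlier in this thread and is an assumption about the setup. mod_dir only issues the trailing-slash redirect when the requested path maps to a directory on disk.

```apache
<VirtualHost *:80>
  ServerName aemsite.com

  # mod_dir: do not redirect /products to /products/
  DirectorySlash Off

  # A <Directory> section can set its own value, so repeating the
  # directive here rules out an override (path is an assumption)
  <Directory "C:/Apache2.2/htdocs">
    DirectorySlash Off
  </Directory>
</VirtualHost>
```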

Regards

Kanika

Employee

I realise this is long after the fact, but I figured I would add this in case others experience similar issues. I am not sure exactly *where* the forward slash redirect comes from, but to address the issue, follow the steps below.

If you have confirmed that the vhost contains

DirectorySlash Off

then also check that dispatcher_vhost.conf has the following configured correctly (read more here):

Options Indexes

If you are still getting 403 errors on URLs with a trailing slash, the most likely cause is the Sling Default GET servlet configuration. This configuration has an "Auto Index" property. If this option is not checked, the request to the resource is forbidden and results in a status 403/FORBIDDEN. You most likely want to leave this unchecked.

Confirm that you have a dispatcher rule in place that catches paths with a trailing slash and appends .html. Pay close attention to your rewrite flags and the order of the rules, as both can affect how your rule is processed. Also add any RewriteCond directives that might be applicable. Example rewrite rule:

RewriteRule ^(.+[^/])(/?)$ $1.html [L,R=301]
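
As a hedged sketch of the kind of RewriteCond that may be needed with the example rule: without a guard, a path that already ends in .html also matches the pattern, so the redirect target would get .html appended again on the next request. The condition below is an assumption, not part of the original rule, and the sketch assumes the rule sits in the virtual host context, where the matched path starts with a slash.

```apache
# Skip paths that already end in .html; without this condition
# /products.html would be redirected to /products.html.html
RewriteCond %{REQUEST_URI} !\.html$
# /products and /products/ are both redirected to /products.html
RewriteRule ^(.+[^/])(/?)$ $1.html [L,R=301]
```

The root request / does not match the pattern, so it is left to whatever rule handles the home page.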

  

Lastly, you mentioned that you are stripping the .html extension using the Link Checker Transformer. Make sure this is really what you want to do in the first place.