Dispatcher cache invalidation not working - it starts working after removing /statfileslevel

Hi all,

Hope everyone is doing fine!

We are facing an issue related to dispatcher cache invalidation.

Activation to the publish servers works perfectly fine. When we add the /statfileslevel entry to dispatcher.any with any value (1, 2 or 3), dispatcher cache invalidation does not work. We can see the ".stat" file timestamp change, but that does not invalidate the dispatcher cache either. When we remove the /statfileslevel entry from dispatcher.any, everything starts working fine.

Is there any reason why this is happening? Obviously we want to leverage the /statfileslevel entry. I am attaching dispatcher.any for your reference.

Please let me know if you have encountered a similar issue.

Thanks in advance

 

# Each farm configures a set of load balanced renders (i.e. remote servers)
/farms
  {
  # First farm entry
  /website 
    {  
    # Request headers that should be forwarded to the remote server.
    /clientheaders
      {
      # Forward all request headers that are end-to-end. If you want
      # to forward a specific set of headers, you'll have to list
      # them here.
#      "*"
      "referer"
      "user-agent"
      # "authorization"
      "from"
      "content-type"
      "content-length"
      "accept-charset"
      "accept-encoding"
      "accept-language"
      "accept"
      "host"
      "if-match"
      "if-none-match"
      "if-range"
      "if-unmodified-since"
      "max-forwards"
      "proxy-authorization"
      "proxy-connection"
      "range"
      "cookie"
      "cq-action"
      "cq-handle"
      "handle"
      "action"
      "cqstats"
      "PATH"
      "depth"
      "translate"
      "expires"
      "date"
      "dav"
      "ms-author-via"
      "if"
      "lock-token"
      "x-expected-entity-length"
      "destination"
      }
      
    # Hostname globbing for farm selection (virtual domain addressing)
    /virtualhosts
      {
      # Entries will be compared against the "Host" request header
      # and an optional request URL prefix.
      #
      # Examples:
      #
      #   www.company.com
      #   intranet.*
      #   myhost:8888/mysite
      "*"
      }
      
    # The load will be balanced among these render instances
    
    /renders
      {
      /publish1
        {
        # Hostname or IP of the render
        #/hostname "127.0.0.1"
        /hostname "my.web.com"
        # Port of the render
        /port "4503"
        # Connect timeout in milliseconds, 0 to wait indefinitely
        # /timeout "0"
        }
      }
      
    # The filter section defines the requests that should be handled by the dispatcher.
    # The globs will be compared against the request line, e.g. "GET /index.html HTTP/1.1".
    /filter
      {
      # Deny everything first and then allow specific entries
       /0000 { /type "deny" /glob "*" }
       #/0011 { /type "allow" /glob "* /libs/*"   }  # allow apps access
       #/0012 { /type "allow" /glob "* /bin/*"    }
       /0013 { /type "allow" /glob "* /content/thread/*" }  # disable this rule to allow mapped content only
       /0014 { /type "allow" /glob "* /content/dam/thread/*" }  # disable this rule to allow mapped content only
       /0015 { /type "allow" /glob "* /etc/clientlibs/thread/*" }
       /0016 { /type "allow" /glob "* /etc/designs/thread/*" }  
       #/0017 { /type "allow" /glob "* /etc/map/http/*" }
       #/0018 { /type "allow" /glob "* /etc/map/removeddesigns/*" }
       /0019 { /type "allow" /glob "* /etc/tags/thread/*" }
       /0020 { /type "allow" /glob "* /apps/thread/*"   }
       /0021 { /type "allow" /glob "* /dam/thread*" } 
       /0027 { /type "allow" /glob "* /thread*" }
       
       
       #/0033 { /type "allow" /glob "* /home/*"   }
       #/0034 { /type "allow" /glob "* /tmp/*"    }
       #/0035 { /type "allow" /glob "* /var/*"    }
       
       
       # Enable specific mime types in non-public content directories 
       /0041 { /type "allow" /glob "* *.css *"   }  # enable css
       /0042 { /type "allow" /glob "* *.gif *"   }  # enable gifs
       /0043 { /type "allow" /glob "* *.ico *"   }  # enable icos
       /0044 { /type "allow" /glob "* *.js *"    }  # enable javascript
       /0045 { /type "allow" /glob "* *.png *"   }  # enable png
       /0046 { /type "allow" /glob "* *.swf *"   }  # enable flash
       /0047 { /type "allow" /glob "* *.jpg *"   }  # enable jpg
       /0048 { /type "allow" /glob "* *.jpeg *"  }  # enable jpeg
       /0049 { /type "allow" /glob "* *.webm *"  }  # enable webm
       /0050 { /type "allow" /glob "* *.ogg *"  }  # enable ogg
       /0051 { /type "allow" /glob "* *.mp4 *"  }  # enable mp4
       /0052 { /type "allow" /glob "* *.mp3 *"  }  # enable mp3
       /0053 { /type "allow" /glob "* *.wav *"  }  # enable wav
       /0054 { /type "allow" /glob "* *.svg *"  }  # enable svg
       # Added while testing - pg1675
       /0060 { /type "allow" /glob "* *.htm* *"  }  # enable html
       # Enable features 
       #/0062 { /type "allow" /glob "* /libs/cq/personalization/*"  }  # enable personalization
       # Deny content grabbing
       #/0081 { /type "deny"  /glob "GET *.infinity.json*" }
       #/0082 { /type "deny"  /glob "GET *.tidy.json*"     }
       #/0083 { /type "deny"  /glob "GET *.sysview.xml*"   }
       #/0084 { /type "deny"  /glob "GET *.docview.json*"  }
       #/0085 { /type "deny"  /glob "GET *.docview.xml*"  }
       #/0086 { /type "deny"  /glob "GET *.*[0-9].json*" }
       #/0087 { /type "allow" /glob "GET *.1.json*" }          # allow one-level json requests
       # Deny query
       #/0090 { /type "deny"  /glob "* *.query.json*" }
      }

    # The cache section regulates what responses will be cached and where.
    /cache
      {
      # The docroot must be equal to the document root of the webserver. The
      # dispatcher will store files relative to this directory and subsequent
      # requests may be "declined" by the dispatcher, allowing the webserver
      # to deliver them just like static files.
      # /docroot "/opt/communique/dispatcher/cache"
      /docroot "/opt/app/workload/opt/apache/htdocs"

      # Sets the level up to which files named ".stat" will be created in the 
      # document root of the webserver. When an activation request for some 
      # page is received, only files within the same subtree are affected 
      # by the invalidation.
      /statfileslevel "1"
      
      # Flag indicating whether to cache responses to requests that contain
      # authorization information.
      #/allowAuthorized "0"
      
      # Flag indicating whether the dispatcher should serve stale content if
      # no remote server is available.
      /serveStaleOnError "1"
      
      # The rules section defines what responses should be cached based on
      # the requested URL. Please note that only the following requests can
      # lead to cacheable responses:
      #
      # - HTTP method is GET
      # - URL has an extension
      # - Request has no query string
      # - Request has no "Authorization" header (unless allowAuthorized is 1)
      /rules
        {
        /0000
          {
          # the globbing pattern to be compared against the url
          # example: *             -> everything
          #        : /foo/bar.*    -> only the /foo/bar documents
          #        : /foo/bar/*    -> all pages below /foo/bar
          #        : /foo/bar[./]* -> all pages below and /foo/bar itself
          #        : *.html        -> all .html files
          /glob "*"
          /type "allow"
          }
        }
        
      # The invalidate section defines the pages that are "invalidated" after
      # any activation. Please note that the activated page itself and all 
      # related documents are flushed on a modification. For example: if the 
      # page /foo/bar is activated, all /foo/bar.* files are removed from the
      # cache.
      /invalidate
        {
        /0000
          {
          /glob "*"
          /type "deny"
          }
        /0001
          {
          # Consider all HTML files stale after an activation.
          /glob "*.html"
          /type "allow"
          }
        /0002
          {
          /glob "/etc/segmentation.segment.js"
          /type "allow"
          }
        /0003
          {
          /glob "*/analytics.sitecatalyst.js"
          /type "allow"
          }
        }

      # The allowedClients section restricts the client IP addresses that are
      # allowed to issue activation requests.
      /allowedClients
        {
        # Uncomment the following to restrict activation requests to originate
        # from "localhost" only.
        #
        #/0000
        #  {
        #  /glob "*"
        #  /type "deny"
        #  }
        #/0001
        #  {
        #  /glob "127.0.0.1"
        #  /type "allow"
        #  }
        }
        
      # The ignoreUrlParams section contains query string parameter names that
      # should be ignored when determining whether some request's output can be
      # cached or delivered from cache.
      #
      # In this example configuration, the "q" parameter will be ignored. 
      #/ignoreUrlParams
      #  {
      #  /0001 { /glob "*" /type "deny" }
      #  /0002 { /glob "q" /type "allow" }
      #  }
      
      }
      
    # The statistics section dictates how the load should be balanced among the
    # renders according to the media-type. 
    /statistics
      {
      /categories
        {
        /html
          {
          /glob "*.html"
          }
        /others
          {
          /glob "*"
          }
        }
      }
    }
  }

2 Replies

Have you tried turning up the logging level on the dispatcher and seeing what it thinks it's doing when the flush requests come in? That's usually the most informative option. 

When you see this sort of issue, where everything works fine with a stat file level of zero and stops working with any other value, the most likely cause is a mismatch between the directory structure in your dispatcher cache root and your repository structure. Keep in mind what increasing the stat file level is intended to do: it limits the scope of auto-invalidation based on the cache root's directory structure and the path of the item being flushed. Are you using either the JCR Resource Resolver or /etc/map to rewrite your incoming URLs? If so, that may be the cause of the problem. 

  • Simplest example - assume you are removing /content from all your URLs, so /content/mysite/en/home.html becomes /mysite/en/home.html. If you aren't using Apache to do your rewrites, that means the cached version of the home page is stored in {cacheroot}/mysite/en/home.html. 
  • When you activate /content/mysite/en/home, dispatcher gets the flush request and looks for /content/mysite/en/home.html. If you have set the stat file level to 1, it touches {cacheroot}/content/.stat and {cacheroot}/.stat. {cacheroot}/mysite/.stat is unchanged. 
  • When the next request comes in for /mysite/en/home.html, the cached copy is served because {cacheroot}/mysite/.stat hasn't changed.
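
The three steps above can be modelled with a small Python sketch. This is only a simplified model of the stat file behaviour described in this example, not dispatcher code; the function names are made up for illustration, and it assumes a flush touches the .stat files from the cache root down to the configured level along the flushed path, while a cached file is checked against the .stat file at that same level along its own path.

```python
from pathlib import PurePosixPath


def stat_dirs_touched(flush_path: str, level: int) -> list[str]:
    """Directories (relative to the cache root) whose .stat file a flush touches."""
    # Folders that contain the flushed resource, e.g. ("content", "mysite", "en").
    folders = PurePosixPath(flush_path).parts[1:-1]
    depth = min(level, len(folders))
    # Level 0 is the cache root itself; each further level goes one folder deeper.
    return ["/" + "/".join(folders[:i]) for i in range(depth + 1)]


def stat_dir_consulted(cached_path: str, level: int) -> str:
    """Directory whose .stat file decides whether a cached file is stale."""
    folders = PurePosixPath(cached_path).parts[1:-1]
    depth = min(level, len(folders))
    return "/" + "/".join(folders[:depth])


def is_invalidated(flush_path: str, cached_path: str, level: int) -> bool:
    """True if flushing flush_path touches the .stat file governing cached_path."""
    return stat_dir_consulted(cached_path, level) in stat_dirs_touched(flush_path, level)
```

With these assumptions, `is_invalidated("/content/mysite/en/home", "/mysite/en/home.html", 1)` is false, because the flush touches the .stat files in `/` and `/content` while the cached page is governed by `/mysite`. With level 0 both sides use the .stat file in the cache root, which matches the observation that everything works once the stat file level is removed.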

That's a simple example, but there may be other reasons that the path of the activated item doesn't match your cache root structure, so turning up the dispatcher logging and watching what it's doing both when the flush request comes in and after the next page request will likely be informative. 
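
Alongside the logs, it can help to check directly which .stat files a flush actually touched. The following is a minimal sketch, assuming the cache root is readable from where the script runs; the helper names are hypothetical, and it relies on the usual rule that a cached file older than its governing .stat file is treated as stale.

```python
from pathlib import Path


def list_stat_files(docroot: str) -> list[tuple[str, float]]:
    """Return (path relative to docroot, modification time) for every .stat file."""
    root = Path(docroot)
    return sorted(
        (str(p.relative_to(root)), p.stat().st_mtime)
        for p in root.rglob(".stat")
    )


def newer_than_cached_file(docroot: str, stat_rel: str, cached_rel: str) -> bool:
    """True if the given .stat file was modified after the given cached file."""
    root = Path(docroot)
    return (root / stat_rel).stat().st_mtime > (root / cached_rel).stat().st_mtime
```

Running `list_stat_files` against the docroot from the posted dispatcher.any before and after an activation shows which timestamps moved. If only `.stat` and `content/.stat` change while the pages are cached under another top-level folder, that points to the path mismatch described above.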

Hi Ortas,

First of all, thanks for taking the time to write this detailed and informative reply.

1) Yes, I see that a URL rewrite is happening, so my URL does not contain "/content". Even if you add "/content" to it, it is removed automatically. 

2) Also, in the dispatcher logs I see that the request for my HTML file does not have "/content" in it. See the logs below.

For example:

[Thu Jun 26 05:03:39 2014] [D] [759(140264572188416)] checking [/dir1/sub-dir1/home.html]
[Thu Jun 26 05:03:39 2014] [D] [759(140264572188416)] cachefile does not exist or is a directory: /opt/app/workload/opt/apache/htdocs/dir1/sub-dir1/home.html
[Thu Jun 26 05:03:39 2014] [D] [759(140264572188416)] try to create new cachefile: /opt/app/workload/opt/apache/htdocs/dir1/sub-dir1/home.html
[Thu Jun 26 05:03:39 2014] [D] [759(140264572188416)] cache-action for [/thread/simplify/kids.html]: CREATE

 

So it seems like your analysis is correct. What do you think?

 

3) I will try disabling the JCR Resource Resolver or /etc/map mapping and see how it works.

4) So for the rewrite, do you recommend doing it at the Apache level instead of the CQ level?

Thanks in advance.