Our site serves custom 404 pages correctly from our language directories: the error page is displayed properly and a 404 status is returned on the initial request to the URL in question. However, if we request an invalid link at the root of the site, the custom error page is still displayed, but the browser's network tab shows a 301 redirect first, before the custom error page at /en-us/error/404 is hit (so the URL does end up changing).
Looking into this, I found that I could get the correct behavior by editing our dispatcher farm file and changing the first rule in our filter section (1001) from deny to allow.
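For illustration, the change might look roughly like the sketch below. The rule name /1001 comes from the description above; the `/url "*"` pattern is an assumption, since the actual rule body is not shown here.

```
/filter
  {
  # Before: the first rule denies everything, and later rules
  # (not shown) allow specific paths.
  /1001 { /type "deny" /url "*" }

  # After: the variant that made the root 404 behave correctly,
  # at the cost of allowing everything by default.
  # /1001 { /type "allow" /url "*" }
  }
```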
We do not want to allow everything by default, as this seems to be a security risk. However, we were wondering whether it is possible to get the 404 to behave the same way as it does for the language paths with a less permissive rule.
As a side note, I noticed that several large sites running AEM (Intel, McDonalds, Cisco) seem to have the same issue when a 404 occurs at the root of the site. I'm not sure whether this is just a known/accepted issue with the way the dispatcher works or whether some sites have solved it.
In my view, sites are easier to maintain if the dispatcher is set to "allow" content by default, so authors can deploy new language folders under /content without requiring any adjustments to dispatcher rules. Ideally, the dispatcher rules are never a bottleneck for authors who are pushing out new content. You don't want to be in a position where you have to update your dispatcher rules while you are on vacation because authors urgently want to deploy some new language pages. Of course, you always have to be aware of security, and you can (and must) set the dispatcher to "deny" for folders where you don't want people snooping around, such as /apps, /libs, /var, /system, etc. I think this approach would require fewer dispatcher rules than your current approach.
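A minimal sketch of that approach, assuming the standard dispatcher /filter syntax where the last matching rule wins. The rule names and exact patterns are hypothetical and not taken from anyone's real configuration; a production filter would normally also restrict things like selectors, extensions, and methods.

```
/filter
  {
  # Allow everything by default so new language folders under
  # /content need no dispatcher change.
  /0001 { /type "allow" /url "*" }

  # Then deny the areas that should never be reachable from outside.
  # Later rules override earlier ones, so these take precedence.
  /0010 { /type "deny" /url "/apps/*" }
  /0011 { /type "deny" /url "/libs/*" }
  /0012 { /type "deny" /url "/var/*" }
  /0013 { /type "deny" /url "/system/*" }
  }
```

The trade-off is that anything not explicitly denied is exposed, so the deny list has to be reviewed whenever new sensitive paths are introduced.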
In my view, your web server rules should match your dispatcher rules. Configure the global error page at the web server level (it can be a 301 or an internal rewrite), and handle the locale-based error pages at the AEM level.