Character # in URL from Facebook share is rejected by dispatcher and URL is getting truncated

Level 3

I have this issue on my dispatcher. Our site has an option that lets a visitor share the page they are viewing to their Facebook account.

When a visitor shares the page, a link to the shared content is created on Facebook.

The URL for the shared content on Facebook looks like this --> https://example.domain.com/products/ex-email-security-products.html#.VzRmmrMTLds.facebook

Note the special character # after .html in the share URL above.

When we click the shared content on Facebook, the browser is sent back to my site, but the URL is truncated and redirected to the incorrect URL below:

 https://example.domain.com/products/.VzRmmrMTLds.facebook

Because of this truncation, users who click the shared content on Facebook cannot reach the correct shared page.

Any idea how to correct this URL truncation, which appears to be caused by the dispatcher? What filter rule should I add to fix it?

Currently we have the rule below in the filters section of publish-farm.any:

/0002 { /type "allow" /glob "* /*.html*"    }    
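
As a rough illustration of how that filter glob treats the two URLs, the sketch below approximates the glob check with Python's fnmatch. It assumes the Dispatcher matches /glob against the whole request line (method, path, protocol) and that fnmatch wildcards behave closely enough for this pattern. The helpers request_line and allowed_by_glob are hypothetical names, not Dispatcher APIs.

```python
from fnmatch import fnmatchcase

# Glob taken from the /0002 filter rule quoted above.
FILTER_GLOB = "* /*.html*"


def request_line(path: str, method: str = "GET") -> str:
    """Build the HTTP/1.1 request line a server would receive for this path."""
    return f"{method} {path} HTTP/1.1"


def allowed_by_glob(path: str, glob: str = FILTER_GLOB) -> bool:
    """Approximate the filter check with fnmatch (an assumption, not Dispatcher code)."""
    return fnmatchcase(request_line(path), glob)


if __name__ == "__main__":
    # Path of the page that was shared (the fragment is not part of the path).
    print(allowed_by_glob("/products/ex-email-security-products.html"))
    # Path of the truncated URL the visitor ends up on.
    print(allowed_by_glob("/products/.VzRmmrMTLds.facebook"))
```

Under those assumptions the original page path passes the rule while the truncated path does not, since it no longer contains .html. The fragment itself is never part of the request line, so a filter rule cannot match on it.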

7 Replies

Level 10

I am checking with internal Adobe people to see if anyone has seen this issue before.

Level 3

I found another reason for this behaviour.

I hit the publish instance directly. This URL --> http://<ipaddress>:4503/products/nx-network-security-products.html#.VzRmmrMTLds.facebook

gets incorrectly redirected to --> http://<ipaddress>:4503/products/.VzRmmrMTLds.facebook

And this URL --> http://<ipaddress>:4503/products/nx-network-security-products.html#VzRmmrMTLds.facebook is incorrectly redirected to --> http://<ipaddress>:4503/products/VzRmmrMTLds.facebook

----------------

But there is no change for this URL --> /products/nx-network-security-products.html#VzRmmrMTLds
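
The pattern in the first two results can be reproduced by resolving the text after # as a relative reference against the page URL. The sketch below shows that with Python's urllib.parse. Here host.example stands in for the elided IP address, resolve_fragment_as_path is a hypothetical helper, and nothing in this thread confirms that AEM or any script on the page actually does this.

```python
from urllib.parse import urldefrag, urljoin


def resolve_fragment_as_path(url: str) -> str:
    """Resolve the fragment text as if it were a relative path on the same page.

    This only illustrates the observed pattern; it is an assumption,
    not a description of what the publish instance really does.
    """
    base, fragment = urldefrag(url)  # split "page.html" from the text after '#'
    return urljoin(base, fragment)   # standard relative-reference resolution


if __name__ == "__main__":
    # host.example is a placeholder for the elided <ipaddress>.
    page = "http://host.example:4503/products/nx-network-security-products.html"
    for frag in (".VzRmmrMTLds.facebook", "VzRmmrMTLds.facebook", "VzRmmrMTLds"):
        print(resolve_fragment_as_path(page + "#" + frag))
```

The first two cases come out as the truncated URLs reported above. The third would resolve to /products/VzRmmrMTLds under this reading, whereas no change was observed for it, so this is at most a partial explanation.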

Administrator

Hi 

Just for information:

Browsers treat # in a URL specially.

Browsers only send the part before the # to the server. The part after it is used for auto-scrolling to elements via their id or name attribute. I guess that in your case, too, only "http://<ipaddress>:4503/products/nx-network-security-products.html" is sent as the request.
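
To make the split concrete, here is a minimal sketch using Python's urllib.parse on the share URL from the question. The helper names request_target and fragment_of are hypothetical. The sketch only shows which part of the URL a client would put on the request line, not what any particular browser or AEM does afterwards.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query, if any) that goes on the HTTP request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


def fragment_of(url: str) -> str:
    """Return the text after '#', which stays in the browser."""
    return urlsplit(url).fragment


if __name__ == "__main__":
    share_url = (
        "https://example.domain.com/products/"
        "ex-email-security-products.html#.VzRmmrMTLds.facebook"
    )
    print("sent to server:", request_target(share_url))
    print("kept by browser:", fragment_of(share_url))
```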

Reference link:- https://blog.httpwatch.com/2011/03/01/6-things-you-should-know-about-fragment-urls/

I hope this would be helpful.

Thanks and Regards

Kautuk Sahni

Level 3

Hi Kautuk Sahni,

I have read up on what # does: everything after the # in a URL is called the fragment.

There is more information on the role of # at this link -> https://en.wikipedia.org/wiki/Uniform_Resource_Locator#Syntax.

Different browsers may handle the request URL in different ways, but no browser will truncate what is typed in the address bar. In our case, however, the URL itself gets truncated.

In the same browser, open any other site and type the path of a resource hosted on AEM with a # in its path; the browser never truncates the URL itself. (I am not worried about the response; errors in the response are OK.)

I have observed that in our case the truncation happens at the publish instance, not at the dispatcher.

So I would like to know how to fix the publish instance behaviour.

We have a resource resolver mapping in our publish instance that causes long URLs to be shortened. I wanted to know if that mapping is causing this issue.

Level 10

"We have a resource resolver mapping in our publish instance that causes long URLs to be shortened. I wanted to know if that mapping is causing this issue."

Support replied with this: 

I have used mod_rewrite to handle my URL rewrites in the past.  Is that what is truncating the URL here?  

You may be able to play with your RewriteRules to trim off everything after the # instead. 

Administrator

Hi subramanya vl,

You are correct that no browser would truncate the URL. It could be that the URL you mentioned is being redirected, and that the redirected URL invokes a different script (see Sling Resource Resolution).

If the original URL is sent to the server and then redirected (301/302/307), the fragment can be lost, because the server never receives it and so cannot include it in the redirect target. One way to keep the fragment is for the server to respond with a JS-based redirect instead of an HTTP redirect code: the JS can read the fragment from the browser URL and carry it over. If Apache or AEM redirects with an HTTP 301/302/307 code, the fragment may be lost.
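
A JS-based redirect of the kind described above could look roughly like the page generated below. This is a minimal sketch in Python that only builds the HTML; build_redirect_page and the example target path are hypothetical, and how such a page would be served from Apache or AEM is left open.

```python
import json


def build_redirect_page(target_path: str) -> str:
    """Return a small HTML page that redirects in the browser and keeps the fragment.

    window.location.hash is either empty or starts with '#', so appending it
    to the target carries the original fragment over to the new URL.
    """
    # json.dumps yields a valid JavaScript string literal; "</" is escaped so
    # the literal cannot close the surrounding <script> element.
    js_target = json.dumps(target_path).replace("</", "<\\/")
    return (
        "<!DOCTYPE html>\n"
        "<html><head><script>\n"
        f"window.location.replace({js_target} + window.location.hash);\n"
        "</script></head><body></body></html>\n"
    )


if __name__ == "__main__":
    # Hypothetical shortened target path, used only for illustration.
    print(build_redirect_page("/products/nx-network-security-products.html"))
```

Because the script runs in the browser, it can see the fragment that the server never receives.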

 

I hope this would be of some help to you.

Thanks and Regards

Kautuk Sahni