SOLVED

Chinese URL parameters do not appear correctly in JSP


Level 4

Hi,

 

I have the URL

http://localhost:4503/content/abc.html?q=安全

When I try to print the parameter with

<c:out value="${param.q}" />

The result is a completely different set of characters instead of "安全". Please let me know how to solve it.

In the JSP I tried the following, which does not work:

<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<%@page contentType="text/html;charset=utf-8" pageEncoding="utf-8"%>
<%
    slingRequest.setCharacterEncoding("utf-8");
    out.print(slingRequest.getParameter("q"));
    out.print(java.net.URLDecoder.decode(request.getParameter("q"), "utf-8"));
%>

 

Thanks

1 Accepted Solution


Correct answer by
Administrator

Hi 

The Chinese characters used in the query parameter are Unicode characters. They lie outside the 128 code points of ASCII and cannot be represented in a single-byte encoding such as ISO-8859-1 (256 values).

To explain a bit: when you send these characters, or inspect the request in the browser console, they are converted to percent-encoded UTF-8.

Example:-

URL:- abc.com/about?q=安

Internally :- abc.com/about?q=%E5%AE%89

//Here %E5%AE%89 is % encoded UTF-8 for 安. 
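To make the mapping above concrete, here is a minimal sketch that percent-encodes the UTF-8 bytes of a string. The class and method names are made up for illustration, and unlike a browser it escapes every byte, including plain ASCII.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class PercentEncodeDemo {

    // Turns each UTF-8 byte of the input into a %XX escape.
    static String percentEncodeUtf8(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // b & 0xFF converts the signed byte to an unsigned int value.
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u5B89 is the character 安 (written as an escape so the
        // source file encoding does not matter).
        System.out.println(percentEncodeUtf8("\u5B89")); // prints %E5%AE%89
    }
}
```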

Online tool to get these values :- http://www.endmemo.com/unicode/unicodeconverter.php

Solution (workaround) :- Use final String param = new String(request.getParameter("param").getBytes("iso-8859-1"), "UTF-8"); [Note that this is valid only if the server's decoding charset (URIEncoding) is iso-8859-1; otherwise, that charset must be passed to getBytes() instead.]
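The workaround can be tried outside a servlet container with a small sketch like the one below. It assumes the server decoded the UTF-8 bytes as ISO-8859-1. The class name and the misdecode() helper are hypothetical and only simulate that step.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any servlet API.
final class Latin1Repair {

    // Simulates the assumed server behaviour: UTF-8 bytes read as ISO-8859-1.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
    }

    // The workaround: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u5B89\u5168"; // the two characters 安全
        String garbled = misdecode(original);
        System.out.println(garbled.length());                  // 6, one char per byte
        System.out.println(repair(garbled).equals(original));  // true
    }
}
```

The round trip is lossless because ISO-8859-1 maps every byte value to exactly one character. With a different server charset that guarantee may not hold.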

Actual Solution :- Decode the query string yourself and parse it manually into a parameter map (as opposed to using the ServletRequest APIs).

The problem is that the submitted query string is getting mutilated on the way into your server-side script, because getParameter() uses ISO-8859-1 instead of UTF-8. This stems from Ancient Times before the web settled on UTF-8 for URI/IRI, but it's rather pathetic that the Servlet spec hasn't been updated to match reality, or at least provide a reliable, supported option for it.

(There is request.setCharacterEncoding in Servlet 2.3, but it doesn't affect query string parsing, and if a single parameter has been read before, possibly by some other framework element, it won't work at all.)
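As a rough sketch of the "decode it yourself" approach, the helper below parses a raw query string (for example the value of request.getQueryString()) into a parameter map using UTF-8. QueryStringParser is a hypothetical name and the code skips edge cases a real parser would need to handle.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class QueryStringParser {

    // Splits "a=1&b=2" style input and decodes names and values as UTF-8.
    static Map<String, List<String>> parse(String rawQuery) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = decode(eq < 0 ? pair : pair.substring(0, eq));
            String value = decode(eq < 0 ? "" : pair.substring(eq + 1));
            params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

In a JSP this could be called as QueryStringParser.parse(request.getQueryString()).get("q"), assuming the class is on the classpath.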

Reference links :- 

Link:-http://stackoverflow.com/questions/469874/how-do-i-correctly-decode-unicode-parameters-passed-to-a-s...

Link:- http://stackoverflow.com/questions/3029401/request-getquerystring-seems-to-need-some-encoding

Link:- http://stackoverflow.com/questions/3029401/request-getquerystring-seems-to-need-some-encoding/302940...

 

Option 3:-

Send the URL to the server as a percent-encoded string.

URL:- abc.com/about?q=安

Before sending it to the server, convert this to abc.com/about?q=%E5%AE%89 using encodeURIComponent("安") [a JavaScript function, which returns %E5%AE%89].

Then, at server side, use:

protected static final String CHARSET_FOR_URL_ENCODING = "UTF-8";
String uname = URLDecoder.decode("%E5%AE%89", CHARSET_FOR_URL_ENCODING);
System.out.println("query string decoded : " + uname); // query string decoded : 安

I hope this would help you.

 

Thanks and Regards

Kautuk Sahni




9 Replies


Community Advisor
Hi srinivas, encode the value before sending it as a query parameter and decode it when you read it. I think this may work.


Level 4

Thanks for the input.

 

It still does not work. When I encode the URL as

http://localhost:4503/content/abc.html?q=%E5%AE%89%E5%85%A8

and then decode the parameter in the JSP with

out.print(java.net.URLDecoder.decode(request.getParameter ("q"), "utf-8"));

 

it still prints "å®�å�¨" instead of 安全
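A possible explanation for this result, sketched below under the assumption that the container has already percent-decoded the value as ISO-8859-1: by the time getParameter() returns, the string contains no % escapes any more, so a second URLDecoder.decode() call leaves it unchanged. The class name is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, for illustration only.
final class DoubleDecodeDemo {

    // Thin wrapper so callers do not have to handle the checked exception.
    static String decodeAs(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Step 1 (assumed container behaviour): escapes decoded as ISO-8859-1.
        String fromContainer = decodeAs("%E5%AE%89%E5%85%A8", "ISO-8859-1");
        // Step 2 (the JSP code): decoding again as UTF-8 has nothing to decode.
        String second = decodeAs(fromContainer, "UTF-8");
        System.out.println(second.equals(fromContainer)); // true
        System.out.println(second.equals("\u5B89\u5168")); // false
    }
}
```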


Level 9

Hi Sri,

The root cause of this issue is that you are not setting the encoding on the response (UTF-8 in the response content type). Use the code below and check the response.

<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<%@page contentType="text/html;charset=utf-8" pageEncoding="utf-8"%>
<%
    slingResponse.setContentType("text/html;charset=UTF-8");
    out.print(slingRequest.getParameter("q"));
    out.print(java.net.URLDecoder.decode(request.getParameter("q"), "utf-8"));
%>

Jitendra


Level 4

The issue is still not resolved after making the changes you suggested above.



Level 9

May I have the whole script (JSP) that is invoked by your request? Also, can you share a browser console snapshot (response header information)?

Jitendra


Administrator

Jitendra S.Tomar wrote...

May I have the whole script (jsp) which gets invoked on your request and also can you give me browser console snapshot (response header information)?.

Jitendra

 

Hi Jitendra

The problem is that the submitted query string is getting mutilated on the way into your server-side script, because getParameter() uses ISO-8859-1 instead of UTF-8. This stems from Ancient Times before the web settled on UTF-8 for URI/IRI, but it's rather pathetic that the Servlet spec hasn't been updated to match reality, or at least provide a reliable, supported option for it.

Link:-http://stackoverflow.com/questions/469874/how-do-i-correctly-decode-unicode-parameters-passed-to-a-s...

Link:- http://stackoverflow.com/questions/3029401/request-getquerystring-seems-to-need-some-encoding

Link:- http://stackoverflow.com/questions/3029401/request-getquerystring-seems-to-need-some-encoding/302940...

 

Thanks and Regards

Kautuk Sahni





Employee Advisor

You need to add a special parameter named "_charset_" to the URL, with the value UTF-8; the following JSP code will then work. Your URL should be abc.com/about?q=安&_charset_=UTF-8

<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<% out.print(java.net.URLDecoder.decode(request.getParameter("q"), "utf-8")); %>

If the _charset_ parameter is not passed in the URL, Sling decodes the request parameters using the default ISO-8859-1 encoding. For more details, see the Character Encoding section in the following documentation: https://sling.apache.org/documentation/the-sling-engine/request-parameters.html

Up to and including Sling Engine 2.2.2 request parameters are always decoded with ISO-8859-1 encoding if the _charset_ request parameter is missing. As of Sling Engine 2.2.4 the _charset_ request parameter is optional. As of this version the Sling Main Servlet supports a configuration setting which allows changing the default character encoding used if the _charset_ request parameter is missing. To enable this functionality set the sling.default.parameter.encoding parameter of the Sling Main Servlet (PID org.apache.sling.engine.impl.SlingMainServlet) configuration to the desired encoding, which of course must be supported by the actual Java Platform.
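For the _charset_ approach described above, a link with the extra parameter could be built along these lines. This is only a sketch: CharsetUrlBuilder and its build() method are hypothetical names, and the base URL is whatever page you are linking to.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
final class CharsetUrlBuilder {

    // Appends a UTF-8 percent-encoded q parameter plus _charset_=UTF-8.
    static String build(String baseUrl, String query) {
        try {
            return baseUrl + "?q=" + URLEncoder.encode(query, "UTF-8")
                    + "&_charset_=UTF-8";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u5B89\u5168 is the string 安全.
        System.out.println(build("http://localhost:4503/content/abc.html", "\u5B89\u5168"));
        // prints http://localhost:4503/content/abc.html?q=%E5%AE%89%E5%85%A8&_charset_=UTF-8
    }
}
```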


Level 4

Thanks a lot, Kautuk. It worked.

Since I am using this in a JSP, can I use a taglib/JSTL instead of a scriptlet? Please advise. I am using the following:

<c:out value="${param.q}" />

 

The following works using a scriptlet together with JSTL, but can it be done entirely in JSTL, without any scriptlet?

<% final String param = new String(request.getParameter("q").getBytes("iso-8859-1"), "utf-8");

 pageContext.setAttribute("paramq", param  ,PageContext.PAGE_SCOPE);%>

<c:out value="fff=${pageScope.paramq}"/>