SOLVED

Facing special characters on datasource node rendering in place of "-"

Level 3

 

keshava219_0-1657298081575.png

 

In the image above, the data comes from third-party API calls and is rendered by name. Garbled special characters appear in place of "–".

But in the API response itself I can see the correct value:

"Welcome to Progress – Image Film".

1 Accepted Solution

Correct answer by
Community Advisor

Making sure you are encoding your strings as UTF-8 in Java should do the trick.

 

Encoding With Commons-Codec

 

<dependency>
    <groupId>commons-codec</groupId>
    <artifactId>commons-codec</artifactId>
    <version>1.14</version>
</dependency>

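import org.apache.commons.codec.binary.StringUtils;

String rawString = "Welcome to Progress - Image Film";
// Encode the string to UTF-8 bytes, then decode those bytes back into a String.
byte[] bytes = StringUtils.getBytesUtf8(rawString);
String utf8EncodedString = StringUtils.newStringUtf8(bytes);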


What is UTF-8?

 

UTF-8 is a variable-width character encoding used for electronic communication. Defined by the Unicode Standard, the name is derived from Unicode Transformation Format – 8-bit. UTF-8 is capable of encoding all 1,112,064 valid character code points in Unicode using one to four one-byte code units.
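
To make the variable-width point concrete, here is a minimal sketch using only the JDK. It shows that the en dash (U+2013) from the question takes three bytes in UTF-8, and that decoding those bytes with a different charset produces garbled characters. The class and method names are illustrative and do not come from the thread, and ISO-8859-1 is only an assumed example of a wrong charset; the charset actually involved in the screenshot is not known.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Number of bytes the given text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes the text as UTF-8, then decodes the bytes with the given charset.
    // With any charset other than UTF-8, non-ASCII characters come back garbled.
    static String decodeUtf8BytesAs(String text, Charset charset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, charset);
    }

    public static void main(String[] args) {
        String original = "Welcome to Progress \u2013 Image Film";
        System.out.println(utf8Length("\u2013")); // the en dash alone: 3 bytes
        System.out.println(decodeUtf8BytesAs(original, StandardCharsets.ISO_8859_1));
        System.out.println(decodeUtf8BytesAs(original, StandardCharsets.UTF_8));
    }
}
```

Decoding with the same charset that was used for encoding restores the original text, so what matters is the point where raw bytes are first turned into a String.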

Employee Advisor

Hi @keshava219,

Is character encoding enabled for the third-party API response?

Thanks,

Milind

Level 3

Hi @milind_bachani,

I'm not using a response object to get the API details; I read directly from getInputStream.

Here is the snippet I'm using:

conn.setRequestProperty("Content-Type", "application/json");
conn.setRequestMethod(ABBConstants.GET_REQUEST);

BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));

I get the JSON output and pass it to the datasource.
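
One thing worth checking in a snippet like this: `new InputStreamReader(stream)` with no charset argument decodes the bytes using the JVM's default charset, which on many setups is not UTF-8. A minimal sketch of reading the stream with an explicit charset follows. It assumes the API actually returns UTF-8 (not confirmed in this thread), and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

class Utf8ResponseReader {
    // Reads the whole stream as text, decoding the bytes as UTF-8 explicitly
    // instead of relying on the JVM's default charset.
    // Assumption: the remote API sends its body encoded as UTF-8.
    static String readUtf8(InputStream stream) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            return in.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            // Closing the reader can fail; rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
    }
}
```

With the connection from the snippet above, usage would look like `String json = Utf8ResponseReader.readUtf8(conn.getInputStream());` before the JSON is passed on to the datasource.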

Community Advisor

Hi @keshava219, this is caused by character encoding/decoding in the API response (as a matter of fact, in any string-based response).

I see your response comes from a BufferedReader, so convert its contents to a String and then pass it to the method below to get properly decoded values.

Please use the following code after you receive the response, and pass each string through it to get proper values.

~Aditya.Ch

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Percent-encodes the trimmed input as UTF-8; returns null if encoding fails.
private String allowSplChars(String incomingString) {
    String encodedValue = null;
    try {
        encodedValue = URLEncoder.encode(incomingString.trim(), "UTF-8");
    } catch (UnsupportedEncodingException uee) {
        log.error("UnsupportedEncodingException", uee);
    }
    return encodedValue;
}
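
For anyone trying the method above, it may help to see what `URLEncoder` actually returns for the string from the question. This is a small sketch with hypothetical class and method names: the result is a percent-encoded form intended for URLs and form data, in which spaces become `+`, so it needs a matching `URLDecoder.decode` before it can be shown as display text.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UrlEncodingDemo {
    // Percent-encodes text as UTF-8 (application/x-www-form-urlencoded rules).
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported by the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Reverses encode(): turns '+' back into spaces and %XX into characters.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "Welcome to Progress \u2013 Image Film";
        String encoded = encode(original);
        System.out.println(encoded); // Welcome+to+Progress+%E2%80%93+Image+Film
        System.out.println(decode(encoded).equals(original)); // true
    }
}
```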

 

Level 3

Hi @Aditya_Chabuku,

Thanks for the reply.

 

keshava219_0-1657300915276.png

 

After trying the method above, for the spaces as well, and also trying a decoder, nothing has changed; the same characters still appear as before.

Community Advisor

@keshava219 What is your content type? Is it "application/x-www-form-urlencoded"?

Community Advisor

@keshava219 Please try setting the properties below in the servlet when making the request to the API.

    HttpURLConnection con = (HttpURLConnection) obj.openConnection();
    con.setRequestMethod("GET");
    con.setRequestProperty("accept-charset", "UTF-8");
    con.setRequestProperty("content-type", "application/x-www-form-urlencoded; charset=utf-8");

 Hope this helps!

Thanks
