Adobe Experience Manager Sites & More

LyonMartin · 1/18/24

Hi, I would like to ask how I can retrieve Japanese (or other double-byte) characters and symbols (such as the degree sign) and save them to a CSV. When I tried the approach below, they ended up displaying as question marks.

Here is the code snippet I'm using:

InputStream in = JcrUtils.readFile(reportNode);
InputStreamReader reader = new InputStreamReader(in);
CSVParser parser =  CSVReportPrinter.parse(reader);
List<CSVRecord> records = parser.getRecords();

I also tried specifying the character encoding:

InputStreamReader reader = new InputStreamReader(in, "UTF-8");

but I still got the same question marks for those characters in the CSV.

LyonMartin · 1/23/24

So, I've found out why the text wasn't encoded properly when exporting it to CSV.
After setting UTF-8 on the OutputStreamWriter, I still encountered the issue: incorrect characters, though different ones, still showed up in the CSV.
After some checking, I found an article about opening UTF-8 CSV files in Excel:
https://answers.microsoft.com/en-us/msoffice/forum/all/how-to-open-utf-8-csv-file-in-excel-without-m...
That partially solves the problem, though it's still a hassle if you want the file to open correctly by default.

With that, I need to add a UTF-8 BOM to the CSV.
https://stackoverflow.com/questions/4389005/how-to-add-a-utf-8-bom-in-java
I'm still torn on whether I should add it through a BufferedWriter, but for now, this works for me:

ByteArrayOutputStream os = new ByteArrayOutputStream();
OutputStreamWriter osw = new OutputStreamWriter(os, StandardCharsets.UTF_8);
// Write the BOM (U+FEFF) first, before any CSV content
osw.write('\ufeff');
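
For anyone who wants to try this outside of AEM, here is a minimal, self-contained sketch of the same idea. The class name, the `toUtf8WithBom` helper, and the sample CSV text are made up for illustration and are not part of the original report code; the only assumption is that the CSV content is already available as a String.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class Utf8BomCsvSketch {

    // Hypothetical helper: encodes CSV text as UTF-8 with a leading BOM.
    static byte[] toUtf8WithBom(String csvText) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (Writer osw = new OutputStreamWriter(os, StandardCharsets.UTF_8)) {
            // U+FEFF must be the very first character written;
            // in UTF-8 it is encoded as the three bytes EF BB BF.
            osw.write('\ufeff');
            osw.write(csvText);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed (and therefore flushed) before the bytes are read.
        return os.toByteArray();
    }

    public static void main(String[] args) {
        // Sample row with Japanese text and a degree sign, written with
        // Unicode escapes so the source file encoding does not matter.
        byte[] bytes = toUtf8WithBom("city,temp\n\u6771\u4eac,25\u00b0C\n");
        System.out.printf("%02X %02X %02X%n",
                bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF);
    }
}
```

On the BufferedWriter question: wrapping the OutputStreamWriter in a BufferedWriter only changes buffering, not the bytes produced, so the BOM can be written either way as long as it is the first thing written and the writer is flushed or closed before the bytes are used.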


arunpatidar · 1/18/24

Ensure that the encoding used to read the input stream matches the actual encoding of your CSV file. If you also write the CSV back out, use an encoding that can represent double-byte characters and symbols, such as UTF-8, for the writer as well.
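
As a rough illustration of this point, the sketch below decodes and encodes with an explicit UTF-8 charset, and also shows one common way question marks appear: encoding to a charset that cannot represent the characters. The class and helper names are hypothetical, and an in-memory stream stands in for the stream returned by JcrUtils.readFile.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

class CharsetRoundTripSketch {

    // Hypothetical helper: decodes the stream as UTF-8 instead of the platform default.
    static String readUtf8(InputStream in) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: encodes to US-ASCII, which cannot represent
    // Japanese characters or the degree sign, then decodes again.
    static String lossyAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Japanese text plus a degree sign, as Unicode escapes.
        String original = "\u6771\u4eac,25\u00b0C";

        // Same charset on both sides: the text survives the round trip.
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String decoded = readUtf8(new ByteArrayInputStream(utf8));
        System.out.println(decoded.equals(original)); // prints "true"

        // Unmappable characters are replaced during encoding.
        System.out.println(lossyAscii(original)); // prints "??,25?C"
    }
}
```

If the question marks are already present in the bytes, no choice of reader charset will bring the characters back, so it is worth checking every step that converts between text and bytes, not only the final writer.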

Arun Patidar

Japanese/double byte Values becomes question mark while using JCRUtils.readFile
