SOLVED

Translate Content Fragment's JSON Properties With MS Machine Translation

Hello,

 

I'm currently working on a requirement that involves translating content fragments. The setup is pretty straightforward, and translating the content fragments with Microsoft Machine Translation basically works. However, for some custom multi-fields with a complex structure, the translation fails because the field values are stored as an array of JSON data. Translation is applied to the entire JSON content, including the keys and quotes. The resulting content is no longer valid JSON, and the content fragment editor is therefore broken.

 

For example, translating the following content into French produces the broken result shown further down.

{
  "data": {
    "articleByPath": {
      "item": {
        "_path": "/content/dam/wknd/en/magazine/sample-article",
        "main": {
          "json": [
            {
              "nodeType": "paragraph",
              "content": [
                {
                  "nodeType": "text",
                  "value": "This is a paragraph that includes "
                },
                {
                  "nodeType": "text",
                  "value": "important",
                  "format": {
                    "variants": [
                      "bold"
                    ]
                  }
                },
                {
                  "nodeType": "text",
                  "value": " content. "
                }
              ]
            }
          ]
        }
      }
    }
  }
}

 

Translation Result FR

{
  « données » : {
    « articleByPath » : {
      « item » : {
        « _path » : « /content/dam/wknd/en/magazine/sample-article »,
        « main » : {
          « json » : [
            {
              « nodeType » : « paragraphe »,
              « contenu » : [
                {
                  « nodeType » : « texte »,
                  « value » : « Il s’agit d’un paragraphe qui inclut »
                },
                {
                  « nodeType » : « texte »,
                  « value » : « important »,
                  « format » : {
                    « variantes » : [
                      « Audacieux »
                    ]
                  }
                },
                {
                  « nodeType » : « texte »,
                  « value » : " contenu. »
                }
              ]
            }
          ]
        }
      }
    }
  }
}
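
A quick way to confirm that a translated value is no longer valid JSON is to try parsing it. The following is a minimal Python sketch for illustration only; an AEM implementation would more likely be written in Java. The names `is_valid_json`, `source_value`, and `translated_value` are hypothetical, and the sample strings are shortened from the example above.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Shortened samples based on the example above (hypothetical variable names).
source_value = '{"nodeType": "text", "value": "important"}'
translated_value = '{« nodeType » : « texte », « value » : « important »}'

print(is_valid_json(source_value))
print(is_valid_json(translated_value))
```

The guillemets that replaced the double quotes are enough to make the parser reject the translated string, which is probably why the content fragment editor can no longer load the field.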

 

How can this issue be fixed? Is there any configuration in AEM to properly handle JSON input during translation? I found no appropriate option in translation_rules.xml. Is this something the translator has to handle?

 

Any input is highly appreciated.

 

4 Replies

Community Advisor

Hi

 

Are you storing the values as a JSON object or as a JSON string representation? I recommend checking with the translator to see if they can handle this scenario. Translation service providers sometimes have solutions for such issues, such as setting up a specific translation dictionary. This dictionary can be configured to ignore words from the JSON structure that do not need translation. If this is not an option, you might need to add an extra step to clean up the translation. For instance, you could adjust the data post-translation, or selectively pass only the fields from the JSON that require translation and then reassemble the JSON. Consider exploring these approaches to address the issue.
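
The "pass only the fields that require translation, then reassemble" idea could look roughly like the Python sketch below. It is an illustration under assumptions, not an AEM or Microsoft Translator API: it assumes that only the `value` of nodes with `"nodeType": "text"` is translatable, and `collect_text_nodes`, `translate_json_string`, and `fake_translate` are hypothetical names. `fake_translate` stands in for the real translation call.

```python
import json
from typing import Any, Callable, List


def collect_text_nodes(node: Any, found: List[dict]) -> None:
    """Recursively collect dicts that have nodeType == "text" and a string value."""
    if isinstance(node, dict):
        if node.get("nodeType") == "text" and isinstance(node.get("value"), str):
            found.append(node)
        for child in node.values():
            collect_text_nodes(child, found)
    elif isinstance(node, list):
        for child in node:
            collect_text_nodes(child, found)


def translate_json_string(raw: str, translate: Callable[[List[str]], List[str]]) -> str:
    """Translate only the text values inside a JSON string and keep the rest untouched."""
    data = json.loads(raw)
    nodes: List[dict] = []
    collect_text_nodes(data, nodes)
    translated = translate([n["value"] for n in nodes])
    if len(translated) != len(nodes):
        raise ValueError("translator returned a different number of strings")
    for n, text in zip(nodes, translated):
        n["value"] = text
    # ensure_ascii=False keeps accented characters readable in the stored string.
    return json.dumps(data, ensure_ascii=False)


def fake_translate(texts: List[str]) -> List[str]:
    """Stand-in for the real translation service (assumption for this sketch)."""
    table = {
        "This is a paragraph that includes ": "Il s'agit d'un paragraphe qui inclut ",
        " content. ": " contenu. ",
    }
    return [table.get(t, t) for t in texts]


raw_value = json.dumps([{
    "nodeType": "paragraph",
    "content": [
        {"nodeType": "text", "value": "This is a paragraph that includes "},
        {"nodeType": "text", "value": "important", "format": {"variants": ["bold"]}},
        {"nodeType": "text", "value": " content. "},
    ],
}])

print(translate_json_string(raw_value, fake_translate))
```

Because the keys, the `nodeType` values, and the `variants` entries never reach the translator, they cannot be altered. A real service may trim or change leading and trailing spaces in the submitted strings, so that would need separate handling.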

 

Hope this helps



Esteban Bustamante

Hey Esteban,

 

Thanks for the reply. The values are stored as a JSON string. Currently, it would be hard to set up a dictionary on the provider side that could handle the data properly. In my opinion, the best way would be a combination of both approaches: submitting only the values that need translation to the translation service and adjusting the data post-translation. But this does not sound like an easy task.
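
One possible post-translation step is a check that the translated JSON still has the same skeleton as the source before it is written back. The Python sketch below is only an illustration; `TRANSLATABLE_KEYS` and `same_structure` are hypothetical names, and it assumes that `value` is the only key whose string content may legitimately change.

```python
import json
from typing import Any

# Assumption: only strings stored under these keys may differ after translation.
TRANSLATABLE_KEYS = {"value"}


def same_structure(source: Any, translated: Any) -> bool:
    """Return True if translated matches source everywhere except translatable strings."""
    if isinstance(source, dict):
        if not isinstance(translated, dict) or source.keys() != translated.keys():
            return False
        for key, src_val in source.items():
            if key in TRANSLATABLE_KEYS and isinstance(src_val, str):
                # Translatable text may change, but it must remain a string.
                if not isinstance(translated[key], str):
                    return False
                continue
            if not same_structure(src_val, translated[key]):
                return False
        return True
    if isinstance(source, list):
        return (
            isinstance(translated, list)
            and len(source) == len(translated)
            and all(same_structure(s, t) for s, t in zip(source, translated))
        )
    # Scalars outside translatable keys must be unchanged.
    return source == translated


source_doc = json.loads(
    '[{"nodeType": "text", "value": "important", "format": {"variants": ["bold"]}}]'
)
good_doc = json.loads(
    '[{"nodeType": "text", "value": "important (FR)", "format": {"variants": ["bold"]}}]'
)
bad_doc = json.loads(
    '[{"nodeType": "texte", "value": "important", "format": {"variantes": ["Audacieux"]}}]'
)

print(same_structure(source_doc, good_doc))
print(same_structure(source_doc, bad_doc))
```

A failed check could be used to reject the translated value and keep the source content instead of storing something the editor cannot load.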

Our current workaround is to get rid of the custom multi-field with JSON data storage and just use simple fields. But this also means a big step backwards for the data structure.

 

Cheers

Louis

Correct answer by
Community Advisor

Correct, it seems the original problem lies in how the data is initially stored. I imagine that, because it's a JSON string, the translation provider can't distinguish between this JSON and regular text; both look the same to it. I agree with you: the best option seems to be migrating that custom multifield to a regular one that stores the data in nodes. Any path you choose will require some amount of work, but getting rid of the custom multifield and leveraging the out-of-the-box functionality sounds cleaner. I understand that the 'custom multifield' was needed because of the limitations of the multifield in older AEM versions.

 

Weigh the options and go for it, good luck.



Esteban Bustamante

Administrator

@louisn54620083 Did you find the suggestions from users helpful? Please let us know if more information is required. Otherwise, please mark the answer as correct for posterity. If you have found a solution yourself, please share it with the community.



Kautuk Sahni