SOLVED

Formatting JSON so that mapping for Data Landing Zone ingestion detects the structure of object arrays

Level 2

I am facing a problem where the structure of object arrays is not correctly detected when ingesting data into a Dataset from a JSON file through Data Landing Zone.

Basically, we have object array fields where the size is not fixed, and in some (most) rows of data, the array is empty. Here is a simplified example.

 

When given the following JSON: 
[{},{},{},{},{"cooperations": [{"a": 1,"b": 2}]},{"cooperations": [{"a": 1,"b": 2},{"a": 3,"b": 4}]}],
the mapping detects the object structure, as shown here.

DuckAndChips_0-1694518427863.png

 

However, when I increase the number of empty objects at the beginning of the above JSON to 90 empty objects, the mapping does not detect the object structure.

DuckAndChips_1-1694518514419.png
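For anyone who wants to reproduce the two cases above, here is a minimal Python sketch that builds the same sample records with a configurable number of leading empty objects. The function name `build_sample` is made up for this illustration; it only generates the JSON and does not talk to AEP.

```python
import json


def build_sample(leading_empty: int) -> list:
    """Return the sample records from this post, preceded by `leading_empty` empty objects."""
    populated = [
        {"cooperations": [{"a": 1, "b": 2}]},
        {"cooperations": [{"a": 1, "b": 2}, {"a": 3, "b": 4}]},
    ]
    return [{} for _ in range(leading_empty)] + populated


# 4 leading empty objects gives the first case, 90 gives the second case.
small_case = json.dumps(build_sample(4))
large_case = json.dumps(build_sample(90))
```

Saving `small_case` and `large_case` to separate `.json` files gives two inputs that differ only in the number of empty objects before the first populated record.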

 

When explicitly using empty arrays, like so:
[{"cooperations": []},{"cooperations": [{"a": 1,"b": 2}]},{"cooperations": [{"a": 1,"b": 2},{"a": 3,"b": 4}]}]
the object structure is not detected. The same happens when I use null instead.
Moreover, an attempt to do object-to-object mapping returns an error saying that there are no compatible fields.

DuckAndChips_2-1694518689518.png

 

Please advise on the formatting of the JSON to be able to cater for my use case.

 

1 Accepted Solution

Correct answer by
Level 2

Further investigation shows that the first item in the outermost array must contain every field, each with a non-empty value of the correct type. In other words, AEP uses the first element to detect the type definition of the JSON.
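One way to act on this finding is to preprocess the file so that the most completely populated record comes first. The Python sketch below is only an illustration under that assumption: the helper names (`is_populated`, `most_complete_first`) are made up here, it only looks at top-level fields, it changes the record order, and it cannot help if no single record has every field filled in.

```python
def is_populated(value) -> bool:
    """True when a value carries enough data to reveal its type."""
    if value is None:
        return False
    if isinstance(value, (str, list, dict)) and len(value) == 0:
        return False
    if isinstance(value, dict):
        return all(is_populated(v) for v in value.values())
    if isinstance(value, list):
        return all(is_populated(v) for v in value)
    return True  # numbers and booleans, including 0 and False


def populated_field_count(record: dict) -> int:
    """Count the top-level fields of a record that hold non-empty values."""
    return sum(1 for v in record.values() if is_populated(v))


def most_complete_first(records: list) -> list:
    """Move the record with the most populated top-level fields to the front."""
    if not records:
        return []
    best = max(range(len(records)), key=lambda i: populated_field_count(records[i]))
    return [records[best]] + records[:best] + records[best + 1:]


records = [
    {},
    {"cooperations": []},
    {"cooperations": [{"a": 1, "b": 2}]},
]
reordered = most_complete_first(records)
```

Here `reordered` starts with the record whose "cooperations" array is non-empty, followed by the other records in their original order.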

5 Replies

Level 2

Hi @DuckAndChips ,

 

The JSON format you're using is not valid. To fit your needs, please use the JSON structure provided below. Save the sample data as a JSON file and then proceed with ingesting it:

[{"<<tenantId>>":{"cooperations":[{"a":"n","b":"12"},{"a":"i","b":"11"}]}},
{"<<tenantId>>":{"cooperations":[{"a":"m","b":"12"},{"a":"j","b":"11"}]}},
{"<<tenantId>>":{"cooperations":[{"a":"","b":""},{"a":"","b":""}]}},
{"<<tenantId>>":{"cooperations":[{"a":"","b":""},{"a":"","b":""}]}},
{"<<tenantId>>":{"cooperations":[{"a":"","b":""},{"a":"","b":""}]}}]

 

Krishna

Level 2

Thanks for your reply.

However, in our use case, the sizes of the "cooperations" arrays are not fixed. In your example, every "cooperations" array has a size of 2. Could you advise on how to preserve the array structure when many of the "cooperations" arrays are empty?

Also, is the tenant_id necessary? I should be able to map fields one-to-one just by using different paths.

Level 2

Hi @DuckAndChips ,

 

I've sent you a sample file to test the ingestion. You can add as many objects as you need to the 'cooperations' array.

For the second case, where the array has no data, you can only add objects whose fields hold blank values, like this: {"a":" ","b":" "}, and not completely empty ones like {}.

 

'tenant_id' is used to ensure that resources you create are namespaced properly and contained within your organization. If you do not know your ID, you can access it by performing the following GET request:

 

curl -X GET \
https://platform.adobe.io/data/foundation/schemaregistry/stats \
-H 'Authorization: Bearer {ACCESS_TOKEN}' \
-H 'x-api-key: {API_KEY}' \
-H 'x-gw-ims-org-id: {ORG_ID}' \
-H 'x-sandbox-name: {SANDBOX_NAME}'
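The same request can be sketched in Python with the standard library. The function names are made up for this example, the four arguments stand in for the curl placeholders, and reading the ID from a "tenantId" key in the response body is an assumption to check against the linked documentation.

```python
import json
import urllib.request

STATS_URL = "https://platform.adobe.io/data/foundation/schemaregistry/stats"


def build_stats_request(access_token: str, api_key: str, org_id: str, sandbox_name: str):
    """Build the GET request with the same headers as the curl command."""
    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gw-ims-org-id": org_id,
        "x-sandbox-name": sandbox_name,
    }
    return urllib.request.Request(STATS_URL, headers=headers, method="GET")


def fetch_tenant_id(request):
    """Send the request and return the tenant ID, or None if the key is absent."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: the response body exposes the ID under "tenantId".
    return body.get("tenantId")
```

Calling `fetch_tenant_id(build_stats_request(token, key, org, sandbox))` with real credentials performs the network call; building the request alone does not.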

 

https://experienceleague.adobe.com/docs/experience-platform/xdm/api/getting-started.html?lang=en#kno...

 

Krishna

 

 

Employee

We are facing a similar issue while ingesting data from JSON through S3.
Scenario: our JSON includes an attribute only when data is present for it.
For example, if the first record in the file has 5 attributes with values and the second record has 10, AEP picks up only the 5 attributes when mapping source to target.

Is this a known issue? How should we handle it?
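If the behaviour described in the accepted answer also applies here (which this thread does not confirm for S3 sources), a quick check before ingestion is to list the attributes that appear somewhere in the file but not in the first record. A minimal Python sketch, with a made-up function name and top-level attributes only:

```python
def attributes_missing_from_first(records: list) -> list:
    """Return top-level attribute names seen in any record but absent from the first one."""
    if not records:
        return []
    all_keys = set()
    for record in records:
        all_keys.update(record.keys())
    return sorted(all_keys - set(records[0].keys()))


records = [
    {"id": 1, "name": "x"},
    {"id": 2, "name": "y", "email": "y@example.com", "city": "z"},
]
missing = attributes_missing_from_first(records)
```

With these sample records, `missing` is `["city", "email"]`, the attributes that would not be visible from the first record alone.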