How to do batch ingestion for an array type field ? | Community
AtulChavan
Community Advisor
April 18, 2021
Solved

How to do batch ingestion for an array type field ?

  • April 18, 2021
  • 2 replies
  • 4454 views

Hi there,

 

I have defined a schema containing an array of objects and created a profile dataset.

I have a single attribute attached to the profile that holds multiple values.

Now, ingesting data in batch mode with x values causes the array to retain only the last value, not all of them. Ideally, the next batch ingestion with x values should replace the older x values; instead, the values within the same batch overwrite each other and only the last element is kept.

Sample data (.csv):

MV
-----
valueFirst
valueSecond
valueThird

What is happening:

AEP Profile Attribute
--------------------------- 
sandbox.data.0.MV = valueThird    // replaces its own batch data valueFirst->valueSecond->valueThird

What I want is:
sandbox.data.0.MV = valueFirst

sandbox.data.1.MV = valueSecond

sandbox.data.2.MV = valueThird
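
For context, the desired profile payload (omitting the tenant/sandbox prefix specific to this example) would presumably be an array of objects along these lines:

{"data":[{"MV":"valueFirst"},{"MV":"valueSecond"},{"MV":"valueThird"}]}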

Thoughts please...... 


2 replies

NimashaJain
Adobe Employee
July 20, 2022

@atulchavan Is it resolved?

AtulChavan
Community Advisor
July 20, 2022

Hi @nimashajain, yes, it's resolved. CSV files don't accommodate array values, hence AEP behaved as it should, i.e. overwriting the existing value with the latest one, since this was profile data.

Danny-Miller
Adobe Employee
Accepted solution
July 20, 2022

@atulchavan Two options:

  1. Limit the Array to some amount (say 3).  Then pass all three values in as different fields.  Then use a mapper function to construct an Array. (I've done this and it works)
  2. Pass the multiple values in as a pipe delimited string in one field.  Then use a mapper function to construct an Array. (I haven't done this, but I think it will work)
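
For illustration only, the Data Prep mapping expressions for the two options might look something like the sketch below. The source field names (MV_1, MV_2, MV_3, MV_PIPED) are made up for the example, and the exact signatures should be checked against the Data Prep function reference: to_array is assumed to take an include-nulls flag followed by the values, and explode is assumed to treat its separator as a regular expression, hence the escaped pipe.

Option 1, three separate CSV columns combined into one array:
to_array(false, MV_1, MV_2, MV_3)

Option 2, one pipe-delimited CSV column split into an array:
explode(MV_PIPED, "\\|")

Either expression would then be mapped to the array field in the target schema.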
Level 6
April 7, 2023

Hi @danny-miller, can you provide an example of how to use the mapper function? I am facing the same issue with a BigQuery source and cannot map to a query. Thanks!

Level 6
April 13, 2023

Below is an example for a CSV file, but it is fundamentally the same for a table.

CSV file creating the identityMap from one field, one namespace:
to_object('CUSTID',to_object('id',trim(CUST_ID),'primary',true))

 

Another doing multiple fields with the same namespace:
to_object('CUSTID',arrays_to_objects('id',explode(concat(CUST_ID,",",CRM_I),",") , 'primary', explode('true|false', '.')))
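
If that second expression behaves as intended, then once mapped to the identityMap field the result should look roughly like this (the id values here are illustrative, and the true/false flags mark which identity is primary):

{"identityMap":{"CUSTID":[{"id":"12345","primary":true},{"id":"67890","primary":false}]}}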


@danny-miller how do you then store the values from the source object?

 

 

For example, I have this as a calculated field in the source data:

to_object("cust_array",arrays_to_objects("bus_cust_id",explode(bus_cust_id,",")))

It outputs multiple rows of objects, with output like this:

{"cust_array":[{"bus_cust_id":"1"},{"bus_cust_id":"2"},{"bus_cust_id":"3"},{"bus_cust_id":"4"},{"bus_cust_id":"5"}]}

 

How can I map it so that the calculated values go into the right array and the right index level? I was thinking of something like this, but the * doesn't work.

to_object("cust_array",arrays_to_objects("bus_cust_id",explode(bus_cust_id,","))).cust_array[*].bus_cust_id