
SOLVED

How to do batch ingestion for an array-type field?


Level 5

Hi there,

 

I have defined a schema containing an array of objects and created a profile dataset.

 

I have a single attribute attached to the profile that holds multiple values.

 

Now, ingesting data in batch mode with x values causes the array to retain only the last value, not all of them. Ideally, the next batch ingestion with x values should replace the older x values. Instead, within the same batch the values overwrite one another, and only the last element is kept.

Data (.csv):

MV
-----
valueFirst
valueSecond
valueThird

What is happening:

AEP Profile Attribute
--------------------------- 
sandbox.data.0.MV = valueThird    // replaces its own batch data valueFirst->valueSecond->valueThird

What I want is:
sandbox.data.0.MV = valueFirst

sandbox.data.1.MV = valueSecond

sandbox.data.2.MV = valueThird
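
In other words, the profile fragment I am after would look roughly like this when written out as a JSON document (same paths as above, just nested):

{
  "sandbox": {
    "data": [
      { "MV": "valueFirst" },
      { "MV": "valueSecond" },
      { "MV": "valueThird" }
    ]
  }
}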

Thoughts, please?


8 Replies


Level 5

Hi @NimashaJain, yes, it's resolved. A CSV file doesn't accommodate array values, so AEP behaved as it should, i.e. overwriting the existing value with the latest one, since it is profile data.


Correct answer by
Employee Advisor

@AtulChavan Two options:

  1. Limit the array to some amount (say 3). Then pass all three values in as different fields. Then use a mapper function to construct an array. (I've done this and it works.)
  2. Pass the multiple values in as a pipe-delimited string in one field. Then use a mapper function to construct an array. (I haven't done this, but I think it will work.)
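
To sketch how the mapper step could look as Data Prep calculated fields (only a sketch; this reuses the MV field and the sandbox.data target path from the question, MV1/MV2/MV3 are hypothetical column names, and the exact expressions would need to be verified in your environment):

Option 1, three separate source columns combined into the target array:
arrays_to_objects('MV', to_array(false, MV1, MV2, MV3))

Option 2, one column holding "valueFirst|valueSecond|valueThird", split on the pipe (the delimiter is treated as a regex, so the pipe may need escaping):
arrays_to_objects('MV', explode(MV, "\\|"))

Either result would then be mapped to the destination array field sandbox.data.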


Level 7

Hi @Danny-Miller, can you provide an example of how to use the mapper function? I am facing the same issue with a BigQuery source and cannot map to a query. Thanks!


Employee Advisor

Below is an example for a CSV, but it is fundamentally the same for a table.

CSV file, creating the identityMap from one field and one namespace:
to_object('CUSTID',to_object('id',trim(CUST_ID),'primary',true))

 

Another example, mapping multiple fields with the same namespace:
to_object('CUSTID',arrays_to_objects('id',explode(concat(CUST_ID,",",CRM_I),",") , 'primary', explode('true|false', '.')))
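
For a rough sense of the shapes these produce (illustrative values only): with CUST_ID = "123", the first expression builds an object like

{"CUSTID": {"id": "123", "primary": true}}

and, assuming the second expression's explode calls yield ["123", "456"] and ["true", "false"], it builds an array of identities like

{"CUSTID": [{"id": "123", "primary": "true"}, {"id": "456", "primary": "false"}]}

each of which is then mapped to the identityMap field.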


Level 7

@Danny-Miller how do you then store the values from the source object?

 


 

For example, I have this as a calculated field in the source data:

to_object("cust_array",arrays_to_objects("bus_cust_id",explode(bus_cust_id,",")))

It outputs multiple rows of objects, like this:

{"cust_array":[{"bus_cust_id":"1"},{"bus_cust_id":"2"},{"bus_cust_id":"3"},{"bus_cust_id":"4"},{"bus_cust_id":"5"}]}

 

How can I map it so that the calculated values go into the right array and at the right index level? I was thinking of something like this, but the * doesn't work.

to_object("cust_array",arrays_to_objects("bus_cust_id",explode(bus_cust_id,","))).cust_array[*].bus_cust_id


Employee Advisor

@akwankl Can you post this as a new question so others can see it? It looks like you're asking a different question from the original one. I'd also include more detail, with data as an example. Maybe others can help. I don't dive into the Mapper very often.