Hi,
The client has multiple sub-businesses (about 4 or 5) and was initially bringing data for only one business into AEP RTCDP. Very recently, the client wanted to bring in the 2nd business's data as well. To do that, I updated the existing schema as per the client's request and added the fields needed for the 2nd business.
For data ingestion, I used the existing scheduled daily dataflow and added the 2nd business's new fields to it. I also created another dataset, containing the schema PK plus the 2nd business fields, to do a one-time load of historical data for the 2nd business. The client also wants to use this data in CJA. With two datasets and very little overlapping data between them, this will be an issue.
To fix this, I have two approaches:
- Approach 1: Use the existing dataset. The one-time historical load will be done into the existing dataset, and the scheduled dataflow will keep running as it is. (The new dataset that was created earlier for the 2nd business will be deleted.) How should I perform this one-time load? Should it contain all fields (1st business + 2nd business fields), or only the 2nd business fields?
- Approach 2: Use the new dataset. The historical load into this dataset has already been done. As the next step, a new scheduled dataflow will be set up for the new dataset, and an on-demand load will be done to cover the gap between the one-time historical load and the first run of the new scheduled dataflow.
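To make the gap in Approach 2 concrete, here is a rough Python sketch of how I would pick the records for the on-demand load. It is not AEP code and calls no Adobe API. The function name `select_gap_records`, the `last_modified` field, and the sample dates are all hypothetical, and it assumes the source records carry a modification timestamp that can be compared with the historical-load cutoff.

```python
from datetime import datetime, timezone


def select_gap_records(records, historical_cutoff, scheduled_start,
                       ts_field="last_modified"):
    """Return records changed after the historical load but before the
    first scheduled run (half-open window: cutoff <= ts < start).

    All names here are hypothetical; adapt them to the real source.
    """
    return [
        r for r in records
        if historical_cutoff <= r[ts_field] < scheduled_start
    ]


# Hypothetical sample data for the 2nd business.
records = [
    {"pk": "A1", "last_modified": datetime(2024, 1, 5, tzinfo=timezone.utc)},
    {"pk": "A2", "last_modified": datetime(2024, 1, 12, tzinfo=timezone.utc)},
    {"pk": "A3", "last_modified": datetime(2024, 1, 20, tzinfo=timezone.utc)},
]

# Assumed dates: when the one-time historical extract was taken, and
# when the new scheduled dataflow makes its first run.
historical_cutoff = datetime(2024, 1, 10, tzinfo=timezone.utc)
scheduled_start = datetime(2024, 1, 15, tzinfo=timezone.utc)

gap = select_gap_records(records, historical_cutoff, scheduled_start)
print([r["pk"] for r in gap])
```

The half-open window is there so that a record falling exactly on the scheduled start is left to the scheduled dataflow and is not loaded twice.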
Please let me know the correct and best approach.