Expand my Community achievements bar.

Help shape the future of AI assistance by participating in this quick card sorting activity. Your input will help create a more effective system that better serves your needs and those of your colleagues.
SOLVED

Query service - Access system-generated values

Avatar

Level 2

Hi, 

I'm trying to query a dataset using the query service. I would like to query only the latest batch data received. 

We are using a fully custom schema (_tenant...).

 

I've tried accessing the system-generated such as createdByBatchID, but it seems that it is not accessible in the dataset.

ex.

SELECT createdByBatchID  

FROM <dataset>

 

Do I have to activate it some ways or is it automatically available following a batch ingestion?

 

Any leads on how I could query based on latest batch would be great.

Thanks 

Topics

Topics help categorize Community content and increase your ability to discover relevant content.

1 Accepted Solution

Avatar

Correct answer by
Employee

When ingesting data into a dataSet which is not timebase I would recommend placing a timestamp field upon the schema.  This will allow you to rank over agains the dataSet.  

View solution in original post

6 Replies

Avatar

Employee

The batch_id generated from the system is a text generated value, I would not recommend using this to look for a batch which has been ingested.  I would recommend using the rank function and partition by fields to look at data which has been ingested over times within the datalake.

 

rank() OVER(PARTITION BY <field1> order by <field>)

Avatar

Level 2

@brekrut in order to rank something I have to have a field to rank it over. My question is how can I access such value. Either something as Batch_id or CreatedDate that would allow me to distinguish one batch from another. I do not have access to such field at the moment using the query service.

Thanks 

Avatar

Correct answer by
Employee

When ingesting data into a dataSet which is not timebase I would recommend placing a timestamp field upon the schema.  This will allow you to rank over agains the dataSet.  

Avatar

Employee

_acp_batchid will allow you to identify a specific batch which has been ingested in to the AEP datalake, but this is a string value and it is not sequential.

Avatar

Level 4

Have you tried this?

select _ACP_BATCHID .... 


We are filling the field

_repo.modifyDate

during ingestion process to have a timestamp available.

Avatar

Employee

Did you find the suggested solutions helpful? It would be great if you can  mark the answer as correct for posterity. If you have found out solution yourself, share it with wider audience in the community.