SOLVED

How to download data of a Dataset in csv file using API

Level 2

Hi All,

 

Is there a way to download a dataset as a .csv file using the API?

We can already download it in Parquet format.

 

Thanks

Gouri

1 Accepted Solution

Correct answer by
Level 4

Hi @gugun,

 

Downloading the batch file in CSV format is not supported via the API at the moment.

We had a similar requirement: we used the Data Access APIs to download the Parquet file and wrote a Python script to convert it into CSV and JSON. You can follow the same approach.
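
For illustration, the conversion step could look roughly like the sketch below. This is not the original script from this answer; it assumes pandas with a Parquet engine such as pyarrow is installed, and the function names `output_paths` and `convert_parquet` are hypothetical.

```python
from pathlib import Path


def output_paths(parquet_path):
    """Derive the CSV and JSON output paths next to the Parquet file."""
    src = Path(parquet_path)
    return src.with_suffix(".csv"), src.with_suffix(".json")


def convert_parquet(parquet_path):
    """Convert one downloaded Parquet file into CSV and line-delimited JSON."""
    # Imported here so the path helper above works without pandas installed.
    # Assumption: pandas plus a Parquet engine (pyarrow or fastparquet).
    import pandas as pd

    csv_path, json_path = output_paths(parquet_path)
    df = pd.read_parquet(parquet_path)
    df.to_csv(csv_path, index=False)
    df.to_json(json_path, orient="records", lines=True)
    return csv_path, json_path
```

Calling `convert_parquet("part-0001.parquet")` would write `part-0001.csv` and `part-0001.json` in the same folder.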

 

Hope this helps.

 

With Regards,

Amit 

8 Replies

Level 3

Hi @gugun,

I'm not sure about the API, but if you have a SQL client connected to Query Service, you can query your dataset there and export the result in CSV format, provided the SQL client supports an export feature.
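
If the SQL client has no export feature, the same idea can be scripted. The sketch below is only an illustration: it assumes a Python DB-API 2.0 cursor (for example from a PostgreSQL driver connected to Query Service, which is an assumption, not something confirmed in this thread), and `export_query_to_csv` is a hypothetical helper name.

```python
import csv


def export_query_to_csv(cursor, sql, csv_path, batch_size=1000):
    """Run a query on a DB-API cursor and write the result set to a CSV file.

    Returns the number of data rows written (the header row is not counted).
    """
    cursor.execute(sql)
    # cursor.description holds one tuple per column; item 0 is the column name.
    columns = [col[0] for col in cursor.description]
    rows_written = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)
        # Fetch in batches so large result sets are not held in memory at once.
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            writer.writerows(rows)
            rows_written += len(rows)
    return rows_written
```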

 

Regards,

Vikash

Employee

Data Access API currently allows the file to be generated in a parquet format only. 

Community Advisor

Hi @gugun ,

 

Did you try the Data Access API? https://experienceleague.adobe.com/docs/experience-platform/data-access/api.html?lang=en

Looking at the documentation, I see it also mentions CSV file format support (e.g. profile.csv).

 

Thanks,

Chetanya

 

 

Community Advisor and Adobe Champion

I was talking about this recently. If I'm not wrong, it can only be in Parquet format for now.

The tutorial is here:

https://experienceleague.adobe.com/docs/experience-platform/data-access/tutorials/dataset-data.html?...

Level 10

Try this API:

https://platform.adobe.io/data/foundation/export/files/{FILE_ID}
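
A minimal sketch of calling that endpoint from Python with only the standard library is shown below. Several details are assumptions rather than something stated in this thread: the header names follow the usual Experience Platform API conventions (`Authorization`, `x-api-key`, `x-gw-ims-org-id`, `x-sandbox-name`), the `path` query parameter selects one file inside the dataset file as described in the Data Access API documentation, and the function names are hypothetical. Check the current API reference before relying on it.

```python
import shutil
import urllib.parse
import urllib.request

# Base URL taken from the endpoint quoted above.
BASE_URL = "https://platform.adobe.io/data/foundation/export/files"


def build_file_request(file_id, file_name, access_token, api_key, org_id, sandbox):
    """Build a GET request for one file of a dataset file (no network call)."""
    query = urllib.parse.urlencode({"path": file_name})  # assumed query parameter
    url = f"{BASE_URL}/{urllib.parse.quote(file_id, safe='')}?{query}"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,          # assumed header name
        "x-gw-ims-org-id": org_id,     # assumed header name
        "x-sandbox-name": sandbox,     # assumed header name
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def download_file(request, dest_path):
    """Send the request and stream the response body (Parquet bytes) to disk."""
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

The downloaded file is still Parquet, so it needs a separate conversion step to become CSV.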

Level 1

Hi Amit,

I've hit a roadblock with parsing the response and downloading the Parquet files.
How do you download them using the Data Access API in the aepp Python library?

Level 3

You can download the dataset content as CSV and JSON using the Python SDK and the pandas library. Try the following code snippet in a JupyterLab notebook in Data Science Workspace:

 

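# Runs inside a Data Science Workspace JupyterLab notebook, where
# get_platform_sdk_client_context() is assumed to be available.
from platform_sdk.dataset_reader import DatasetReader
import pandas as pd

# Read up to 200,000 records of the dataset into a pandas DataFrame.
dataset_reader = DatasetReader(get_platform_sdk_client_context(), dataset_id="<datasetid>")
df = dataset_reader.limit(200000).read()

# Write the records as CSV and as line-delimited JSON.
df.to_csv("eventData.csv", sep=",", encoding="utf-8", index=False, header=True)
df.to_json("eventData.json", orient="records", lines=True)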

 

Adjust the limit value to match your dataset's record count. Use "offset()" together with "limit()" when you have a large dataset.
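
The offset/limit paging can be sketched as a small loop. This is an illustration under assumptions, not SDK code: `iter_pages` is a hypothetical helper, and `read_page` stands for any callable that returns one page of records, for example a wrapper around `dataset_reader.limit(n).offset(k).read()`.

```python
def iter_pages(read_page, page_size):
    """Yield successive pages until the source is exhausted.

    read_page(offset, limit) must return a sized collection (a list or a
    pandas DataFrame) holding at most `limit` records starting at `offset`.
    """
    offset = 0
    while True:
        page = read_page(offset, page_size)
        count = len(page)
        if count == 0:
            break  # nothing left to read
        yield page
        if count < page_size:
            break  # short page means this was the last one
        offset += count


# Hypothetical usage with the DatasetReader from the snippet above:
#   pages = iter_pages(lambda off, lim: dataset_reader.limit(lim).offset(off).read(), 50000)
#   for i, df in enumerate(pages):
#       df.to_csv("eventData.csv", mode="w" if i == 0 else "a", header=(i == 0), index=False)
```

Writing each page in append mode keeps memory use bounded by the page size rather than by the whole dataset.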

Thanks.