Hi All,
Is there a way to download a dataset as a .csv file using the API?
We can already download it in Parquet format.
Thanks
Gouri
Hi @gugun,
Downloading the batch file in CSV format is not supported via the API at the moment.
We had a similar requirement. We used the Data Access APIs to download the Parquet file, then wrote a Python script to convert the Parquet file into CSV and also into JSON format. You can follow the same approach.
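A minimal sketch of that conversion step, assuming the Parquet file has already been downloaded and that pandas plus a Parquet engine (pyarrow or fastparquet) are installed. The function names and output file naming here are illustrative, not the original script:

```python
from pathlib import Path
from typing import Tuple

import pandas as pd


def export_frame(df: pd.DataFrame, csv_path: Path, json_path: Path) -> None:
    """Write one DataFrame as CSV and as newline-delimited JSON."""
    df.to_csv(csv_path, index=False, encoding="utf-8")
    # orient="records" with lines=True emits one JSON object per row.
    df.to_json(json_path, orient="records", lines=True)


def convert_parquet(parquet_path: str, out_dir: str = ".") -> Tuple[Path, Path]:
    """Convert a downloaded Parquet file into a CSV file and a JSON file."""
    source = Path(parquet_path)
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    csv_path = target / f"{source.stem}.csv"
    json_path = target / f"{source.stem}.json"
    # read_parquet needs a Parquet engine such as pyarrow or fastparquet.
    df = pd.read_parquet(source)
    export_frame(df, csv_path, json_path)
    return csv_path, json_path
```

Calling `convert_parquet("part-0000.parquet", "out")` (a hypothetical file name) would write `out/part-0000.csv` and `out/part-0000.json`.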
Hope this helps.
With Regards,
Amit
Hi @gugun,
I'm not sure about the API, but if you have a SQL client connected to Query Service, you can query your dataset from there and export the result in CSV format, provided the SQL client supports an export feature.
Regards,
Vikash
The Data Access API currently allows the file to be generated in Parquet format only.
Hi @gugun ,
Have you tried the Data Access API? https://experienceleague.adobe.com/docs/experience-platform/data-access/api.html?lang=en
Looking at the documentation, I see it mentions CSV file format support as well (e.g. profile.csv).
Thanks,
Chetanya
I was discussing this recently; if I'm not wrong, it can only be in Parquet format for now.
tutorial here.
Try this API:
https://platform.adobe.io/data/foundation/export/files/{FILE_ID}
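As a rough sketch, a call to that endpoint could be put together as below using only the Python standard library. The header names follow the usual Experience Platform API conventions; the `path` query parameter for selecting a single file, the default sandbox name, and both function names are assumptions to check against the Data Access API documentation:

```python
import shutil
import urllib.parse
import urllib.request

BASE_URL = "https://platform.adobe.io/data/foundation/export/files"


def build_file_request(file_id, access_token, api_key, org_id,
                       sandbox="prod", path=None):
    """Build a GET request for one dataset file (hypothetical helper)."""
    url = f"{BASE_URL}/{urllib.parse.quote(file_id, safe='')}"
    if path:
        # Assumption: ?path=<file name> selects one file inside the file ID.
        url += "?" + urllib.parse.urlencode({"path": path})
    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gw-ims-org-id": org_id,
        "x-sandbox-name": sandbox,
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def download_file(request, destination):
    """Stream the response body to a local file."""
    with urllib.request.urlopen(request) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
```

The file that comes back is still Parquet, so it would need a separate conversion step to become CSV.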
Hi Amit,
I've hit a roadblock with parsing the response and downloading the Parquet files.
How do you download them using the Data Access API in the aepp Python library?
You can download the dataset content as CSV and JSON using the Python SDK and the pandas library. Try the following code snippet in a JupyterLab notebook in Data Science Workspace:
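from platform_sdk.dataset_reader import DatasetReader
import pandas as pd

# get_platform_sdk_client_context() is assumed to be predefined in the Data Science Workspace notebook.
dataset_reader = DatasetReader(get_platform_sdk_client_context(), dataset_id="<datasetid>")
# Read up to 200,000 records into a pandas DataFrame.
df = dataset_reader.limit(200000).read()
# Write the records as CSV and as newline-delimited JSON.
df.to_csv("eventData.csv", sep=',', encoding='utf-8', index=False, header=True)
df.to_json("eventData.json", orient='records', lines=True)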
Adjust the limit value to match your dataset's record count. Use "offset()" together with "limit()" when you have a large dataset.
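For a large dataset, the offset/limit idea can be sketched as a loop that appends one page at a time to the same CSV file. `export_in_pages` and `read_page` are hypothetical names; `read_page` stands in for whatever returns one page as a DataFrame, so the loop itself does not depend on the SDK:

```python
from typing import Callable

import pandas as pd


def export_in_pages(read_page: Callable[[int, int], pd.DataFrame],
                    csv_path: str, page_size: int = 200000) -> int:
    """Append pages to one CSV file and return the number of rows written."""
    offset = 0
    total = 0
    while True:
        page = read_page(offset, page_size)
        if len(page) == 0:
            break  # nothing left to read
        first = offset == 0
        # Write the header only once, then append the later pages.
        page.to_csv(csv_path, mode="w" if first else "a", header=first,
                    index=False, encoding="utf-8")
        total += len(page)
        if len(page) < page_size:
            break  # a short page means this was the last one
        offset += page_size
    return total


# In the notebook this might be wired up as (assumption, not verified here):
# export_in_pages(
#     lambda offset, limit: dataset_reader.limit(limit).offset(offset).read(),
#     "eventData.csv")
```

Writing page by page keeps only one page in memory at a time, which is the main reason to prefer it over a single very large limit().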
Thanks.