Adobe Experience Platform

SaurabhM · 11/7/24

Case Study Overview

We’ll be using the Data Distiller Query Pro Mode Editor, a web-based editor that lets you view results of a query with limited options- 50, 100, 150, 300, or 500. While this may seem limiting, it’s important to remember that the editor operates within your browser, which has memory constraints for displaying results.

If you use a dedicated client installed locally on your machine, such as DBVisualizer, you can handle much larger datasets—up to 100,000 rows or more. However, even with local clients, you’ll eventually hit application memory limits.

This brings us to an interesting challenge: how can we efficiently paginate through the result set when the client editor imposes a limit on how much data can be displayed at once?

The answer is Data Distiller Cursors.

What is a Data Distiller Cursor?

A cursor in Data Distiller is a database object used to retrieve, manipulate, and traverse through a set of rows returned by a query one row at a time. Cursors are particularly useful when you need to process each row individually, allowing for row-by-row processing that can be controlled programmatically.

1. Sequential Processing: Cursors allow you to sequentially access rows in a result set, which is helpful when you need to handle each row individually.

2. Row-by-Row Operations: Unlike standard SQL, which typically processes whole result sets, a cursor can fetch a limited number of rows at a time, letting you work with rows individually in a controlled manner.

3. Memory Efficiency: When working with very large datasets, fetching rows in smaller chunks (instead of loading all at once) can help manage memory usage and improve performance.

Common Uses of Cursors

Batch Processing: When you need to process rows in smaller batches, especially with large datasets.
Row-by-Row Operations: For complex operations that require checking or modifying each row one at a time.
Data Migration or Transformation: Cursors can be helpful when copying data from one table to another while applying transformations.
Procedural Logic: Used in stored procedures or scripts where specific row-based logic or conditions need to be applied iteratively.

How a Data Distiller Cursor Works

DECLARE: Defines the cursor data_cursor with a specific SQL query.
FETCH: Retrieves a specified number of rows (e.g., 5) from the cursor.
CLOSE: Releases the cursor when no longer needed.

This process is especially valuable when working with large datasets, as it helps control memory usage by processing smaller chunks of data at a time.

Sample Code

-- Declare the cursor
DECLARE data_cursor CURSOR FOR
SELECT id, name, value FROM sample_dataset;

-- Fetch the first 500 rows
FETCH 500 FROM data_cursor;

-- To fetch the next 500 rows, repeat this command
FETCH 500 FROM data_cursor;

-- Close the cursor when done
CLOSE data_cursor;

Tutorial

Detailed link is here

Adobe Experience Platform

Incremental Data Extraction with Data Distiller Cursors

Case Study Overview

What is a Data Distiller Cursor?

Common Uses of Cursors

How a Data Distiller Cursor Works

Sample Code

Tutorial

Learn

Documentation

Community

Support

Resources

Adobe account

Adobe