Trying to replace a legacy data warehouse job that uses a "Previous Month" time range with a query in AEP. I can get the correct data back from my query if I manually set the start/end dates, but I can't generate working SQL for the rolling previous month time range so I can automate the job. ChatGPT gave me 4 different SQL variations to try but all failed, something like this
AND `timestamp` >= date_trunc(
'month',
date_sub(date_trunc('month', current_timestamp()), 1)
)
AND `timestamp` < date_trunc('month', current_timestamp())
gives this error ErrorCode: 42601 queryId: 09e1ee39-c4b9-4e9f-bedd-d6c0e5ff18af Syntax error encountered. Reason: [line 13:56: missing INTEGER_VALUE at ')']
Has anyone developed working AEP SQL for this?
Solved! Go to Solution.
Topics help categorize Community content and increase your ability to discover relevant content.
Views
Replies
Total Likes
This version is superior:
AND timestamp >= DATE_TRUNC('month', DATE_ADD(CURRENT_DATE, -DAY(CURRENT_DATE)))
AND timestamp < DATE_TRUNC('month', CURRENT_DATE)
as it allows the queries to run more efficiently using the physical partitioning inherent in how timeseries-based datasets are stored on disk. Otherwise the queries must perform an entire dataset scan just to find the date range that you want. So the net is to never wrap the timestamp itself in any functions... leave it by itself to the left of the comparison operator.
I would probably also combine these two conditions as a single BETWEEN clause for better readability.
try
AND MONTH (timestamp) = MONTH (DATE_ADD (CURRENT_DATE, - DAY (CURRENT_DATE)))
AND YEAR (timestamp) = YEAR (DATE_ADD (CURRENT_DATE, - DAY (CURRENT_DATE)))
or the below if you prefer to use start/end dates
AND timestamp >= DATE_TRUNC('month', DATE_ADD(CURRENT_DATE, -DAY(CURRENT_DATE)))
AND timestamp < DATE_TRUNC('month', CURRENT_DATE)
bonus if you want to declare the dates
SELECT
DATE_TRUNC('month', DATE_ADD(CURRENT_DATE, -DAY(CURRENT_DATE))) AS prev_month_start,
DATE_TRUNC('month', CURRENT_DATE) AS current_month_start
Thanks,
David
This version is superior:
AND timestamp >= DATE_TRUNC('month', DATE_ADD(CURRENT_DATE, -DAY(CURRENT_DATE)))
AND timestamp < DATE_TRUNC('month', CURRENT_DATE)
as it allows the queries to run more efficiently using the physical partitioning inherent in how timeseries-based datasets are stored on disk. Otherwise the queries must perform an entire dataset scan just to find the date range that you want. So the net is to never wrap the timestamp itself in any functions... leave it by itself to the left of the comparison operator.
I would probably also combine these two conditions as a single BETWEEN clause for better readability.
Views
Likes
Replies