We have a model service set up through AEP DSW that ran perfectly for 38 weeks. For the last two weeks, the service has failed to train or score (both parts start with the same base dataset). Roughly 13MM records are added to the base dataset each week. Since I cannot find anywhere in AEP where the exact size of a dataset is given, I have estimated the current size to be somewhere between 4GB and 15GB, depending on which math I use.
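For anyone who wants to sanity-check a size estimate like this, here is a minimal sketch of the arithmetic. The function name and the bytes-per-record values are placeholders I made up for illustration, not measured numbers from AEP; the real on-disk size depends on the schema and on compression.

```python
def estimate_dataset_gb(records_per_week, weeks, bytes_per_record):
    """Rough dataset size in GB (1 GB = 1e9 bytes).

    bytes_per_record is an assumed average, not a value reported by AEP.
    """
    total_records = records_per_week * weeks
    return total_records * bytes_per_record / 1e9


if __name__ == "__main__":
    # 13MM records per week comes from the post; the week count and the
    # per-record sizes below are illustrative guesses only.
    for assumed_bytes in (10, 30):
        size_gb = estimate_dataset_gb(13_000_000, 40, assumed_bytes)
        print(f"{assumed_bytes} bytes/record -> about {size_gb:.1f} GB")
```

The result scales linearly with the assumed record size, which is why different assumptions give such a wide range.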
In the first step of the training (and scoring), we are getting the following error: ErrorCode: 53400 Size of results for interactive query exceeds max value of: [32GB]. Consider running this statement with CTAS syntax.
The query is set to pull only the most recent 20MM records (since, as previously mentioned, roughly 13MM new records are added to the base dataset each week). I assume the space error happens while the query is working out which records are the most recent 20MM. I deleted the three most recent batches to decrease the size of the base dataset, but I am still getting the space error. Since the code was running fine previously, I would have thought deleting three batches would be enough. (Aside: I would ideally like to delete ALL previous batches, but only the last month of batches can be removed via the AEP Datasets tab (GUI), and I do not understand APIs well enough to do the batch deletion elsewhere.) Based on my dataset calculations, the base dataset should be well under 32GB, but I imagine the space needed increases as the query sorts the dataset to find the most recent records.
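To make the two query shapes concrete, here is a minimal sketch that builds an interactive "most recent N records" SELECT and the CTAS form the error message suggests. The table and column names are hypothetical placeholders, not the real ones from our service, and I am assuming the generic `CREATE TABLE ... AS SELECT` form; I have not confirmed that it avoids the limit in this case.

```python
def recent_records_select(table, ts_column, limit):
    """Build a SELECT that returns the `limit` most recent rows by timestamp."""
    return f"SELECT * FROM {table} ORDER BY {ts_column} DESC LIMIT {limit}"


def as_ctas(new_table, select_sql):
    """Wrap a SELECT in CREATE TABLE ... AS so the result is written to a new table
    instead of being returned as an interactive result set."""
    return f"CREATE TABLE {new_table} AS {select_sql}"


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    select_sql = recent_records_select("base_dataset", "event_timestamp", 20_000_000)
    print(select_sql)
    print(as_ctas("base_dataset_recent", select_sql))
```

The training and scoring steps would then read from the new table rather than from the interactive query result.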
Has something changed in the space limits that is keeping this from running? Or does anyone have other ideas on how to get this service running again?