I had a case like this measuring video view time. In reality most people who viewed a long video/movie for 10 minutes was enough signal to state they engaged with it. Another case sent signals at intervals: Started watching, Watch 25%, Watch 50%, Watched 100%. Perhaps consider discrete approxima...
Still need to have the Common SDK plug in on the site. Then API call to send event data to streaming HTTP endpoint in AEP. That can be lots of work as the communications, as in my case, involved multiple Adobe Orgs. Sounds like you may have the same issue. We ended up sending profile data via AP...