We too are using hourly data feeds and pulling that data into our Data Lake... like you, our files are .tsv and zipped... and no one on our Data Engineering team has run into issues like the ones you are describing above.
Which is why I am wondering whether your data might contain characters outside the basic Latin alphabet... for example a Cyrillic language like Russian or Ukrainian, or a glyph-based language like Mandarin, Cantonese, or Japanese... those can cause trouble if the files aren't being read as UTF-8 somewhere in your pipeline.
Adobe data, aside from some of the standard date-based columns, should all be string values (all props and dimensions are text strings)... so unless there is an encoding issue tied to another language, it seems more likely that something is happening in your ETL process... if you want to rule out encoding, a quick check along the lines of the sketch below may help.
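This is just a minimal sketch, assuming your feed is gzipped and you have Python handy... the file name hit_data.tsv.gz is a placeholder, so swap in your own path (and adjust if your feeds arrive as plain .zip instead of .gz):

```python
# Minimal sketch: scan a zipped .tsv feed for lines that fail UTF-8 decoding.
# "hit_data.tsv.gz" is a placeholder -- point this at your own feed file.
import gzip

bad_lines = 0
with gzip.open("hit_data.tsv.gz", "rb") as f:
    for line_no, raw in enumerate(f, start=1):
        try:
            raw.decode("utf-8")  # raises if the line contains non-UTF-8 bytes
        except UnicodeDecodeError as err:
            bad_lines += 1
            print(f"line {line_no}: {err}")  # shows the offending byte and position

print(f"{bad_lines} line(s) failed to decode as UTF-8")
```

If that reports zero bad lines, encoding probably isn't the culprit and I'd look harder at the ETL side (column splitting on tabs, quoting, etc.).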
Without seeing it, however, it's very hard to diagnose... perhaps Customer Care would be better, as they have more information about your data and you can share details there that you can't otherwise share on a public forum.