If you decide to stick with the token model, would it be possible to define it so that, for every user of the API, we can request the number of tokens that user will consume? This would let you estimate the load on the API while still giving us enough flexibility to make as many requests per user ...
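
To make the idea concrete, here is a rough sketch of what I have in mind, not a description of how the API works today. All names (`TokenPool`, `request_tokens`, `consume`, `outstanding`) and the fixed overall capacity are my own assumptions for illustration: a client asks up front for a token budget per user, each request spends from that user's budget, and the sum of unspent budgets gives the provider an upper bound on the load still to come.

```python
class TokenPool:
    """Hypothetical per-user token budgets; every name here is an assumption."""

    def __init__(self, capacity):
        # Assumed provider-side cap on the total number of unspent tokens.
        self.capacity = capacity
        self.remaining = {}  # user_id -> tokens that user can still spend

    def outstanding(self):
        # Upper bound on future load: all tokens granted but not yet consumed.
        return sum(self.remaining.values())

    def request_tokens(self, user_id, amount):
        # Grant a budget for one user if it fits under the overall cap.
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.outstanding() + amount > self.capacity:
            return False
        self.remaining[user_id] = self.remaining.get(user_id, 0) + amount
        return True

    def consume(self, user_id, cost=1):
        # Spend tokens for one API call; refuse if the user's budget is too small.
        left = self.remaining.get(user_id, 0)
        if left < cost:
            return False
        self.remaining[user_id] = left - cost
        return True
```

With something like this, a client could call `request_tokens("alice", 60)` once and then make up to 60 single-token calls for that user, while `outstanding()` tells the provider how much load has been promised in total.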