Recently I found that the rate limit is calculated and enforced per calendar minute. So if you issue ~30 requests, you have to wait until the next minute to get an API reply without an error, regardless of the actual delay between your calls. It goes like this: say you started your requests at 12:10:01 and issued 30 of them with a one-second delay in between; you would get good data for most of them, but after that, no matter what you do, you would get errors or a short reply saying you have reached the limit. If you wait until the next minute (12:11:00 in this example), you can start issuing requests again with a decent chance of getting good data. A throttling sketch based on this behavior follows below.
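In case it helps, here is a minimal Python sketch of that throttling scheme. It assumes the `requests` library and a quota of 30 requests keyed to the wall-clock minute; that matches what I observed, but none of it is officially documented, so treat the numbers as guesses:

```python
import time
import requests  # assumed HTTP client; any would do

PER_MINUTE_LIMIT = 30  # observed quota, not documented anywhere

def fetch_all(urls):
    """Throttle requests to the observed per-calendar-minute quota.

    The counter is keyed to the wall-clock minute, because the quota
    appears to reset at :00 of each minute rather than on a rolling
    60-second window.
    """
    current_minute = int(time.time() // 60)
    sent = 0
    for url in urls:
        minute = int(time.time() // 60)
        if minute != current_minute:
            # New calendar minute: the quota has reset.
            current_minute, sent = minute, 0
        if sent >= PER_MINUTE_LIMIT:
            # Quota exhausted: sleep until the top of the next minute.
            time.sleep(60 - time.time() % 60)
            current_minute = int(time.time() // 60)
            sent = 0
        resp = requests.get(url, timeout=20)
        sent += 1
        yield resp
```

The key point is resetting the counter at :00 rather than spacing calls 60/30 = 2 seconds apart; even evenly spaced calls get rejected once the current minute's quota is gone.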
BTW, the limit seems to be per IP. So, in theory, if you need to issue lots of requests, you could use a different outgoing IP for each call. But the system appears to have very limited bandwidth; issuing many requests from different IPs would likely make it unusable for everybody, so I would refrain from such experiments.
Also, since the system is based on a Heroku cluster (or at least uses Heroku's cloud gateway load balancers), those add their own delay to handling the calls. Your request may be handled from A to Z in 1-2 seconds, or it may take up to 12-15 seconds to get an actual response. I suspect the load balancer introduces its own random delay when handling requests (by what logic, who knows; I don't believe it's based on current system load). Given that variance, it's worth setting a generous client-side timeout, as in the sketch below.
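Practically, that means a short client-side timeout will abort requests that were actually going to succeed. A rough sketch (again using `requests`; the 20-second timeout and retry count are just my guesses, with headroom over the worst delay I've seen):

```python
import time
import requests

def fetch_patiently(url, attempts=3):
    """Allow for the load balancer's unpredictable handling time.

    Responses were observed anywhere from 1-2 s up to 12-15 s, so the
    timeout needs headroom above the worst observed case.
    """
    for attempt in range(attempts):
        try:
            return requests.get(url, timeout=20)
        except requests.Timeout:
            if attempt == attempts - 1:
                raise
            time.sleep(2)  # brief pause before retrying
```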