I run a web app called leenk.me, a Social Media Optimization application for WordPress. Basically, it publishes your WordPress content to Twitter, Facebook, and Google Buzz whenever you publish new content to your website. A lot of “advertisers” have been signing up for the service, and one in particular has been hitting it hard, really hard: about 4000 API requests in an hour.
This is a pretty big problem. All of these social networks rate limit their connections (close to 300 requests per hour), so making 4000 requests in an hour could cause them to ban the user or, worse, ban leenk.me. So I’ve had to implement a rate limit of my own. I thought about doing this with iptables to block anyone who exceeds a certain number of requests per hour, but the violating users wouldn’t know what happened. I really needed a way to prevent them from over-connecting to the social networks.
This is where the WordPress Transients API comes in. The transients API is very similar to the Options API but with the added feature of an expiration time. This is basically how the transients API works in WordPress:
```php
// Save a transient to the DB
set_transient( $transient_name, $transient_value, $expiration );
```

- $transient_name: A unique identifier for your cached data.
- $transient_value: Data to save, either a regular variable or an array/object. The API will handle serialization of complex data for you.
- $expiration: Number of seconds to keep the data before refreshing.

```php
// Get value of a transient from the DB
$transient_value = get_transient( $transient_name );
```

```php
// Delete a transient from the DB
delete_transient( $transient_name );
```
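A common way to combine these calls is a fetch-or-compute pattern: try `get_transient()`, and only when it returns `false` (missing or expired) do the expensive work and save the result. The sketch below is just an illustration. `demo_cached_value()` and the transient name are made up for the example, and the small in-memory stand-ins exist only so the snippet runs outside WordPress; they are not how WordPress stores transients.

```php
<?php
// Stand-ins so the sketch runs outside WordPress. Inside WordPress the real
// functions already exist and this whole block is skipped.
if ( ! function_exists( 'get_transient' ) ) {
    $GLOBALS['demo_transients'] = array();

    function set_transient( $name, $value, $expiration = 0 ) {
        $expires = $expiration > 0 ? time() + $expiration : 0; // 0 = never expires
        $GLOBALS['demo_transients'][ $name ] = array( $value, $expires );
        return true;
    }

    function get_transient( $name ) {
        if ( ! isset( $GLOBALS['demo_transients'][ $name ] ) ) {
            return false;
        }
        list( $value, $expires ) = $GLOBALS['demo_transients'][ $name ];
        if ( 0 !== $expires && time() >= $expires ) {
            unset( $GLOBALS['demo_transients'][ $name ] );
            return false; // Expired entries behave like missing ones
        }
        return $value;
    }
}

// Hypothetical helper: return the cached value, or compute and cache it.
function demo_cached_value( $name, $compute, $expiration ) {
    $value = get_transient( $name );
    if ( false === $value ) {
        $value = call_user_func( $compute );
        set_transient( $name, $value, $expiration );
    }
    return $value;
}

// The callback only runs when nothing is cached under this name.
$follower_count = demo_cached_value( 'demo_follower_count', function () {
    return 42; // Imagine an expensive API call here
}, 60 * 60 );
```

Because `false` doubles as the "not found" signal, this pattern assumes the cached value itself is never `false`.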
The only problem with WP Transients is that there isn’t a good way to create “rolling transients” (as I call them). A rolling transient is a transient whose window rolls forward in time. In other words, I want to rate-limit a user’s connection based on the hour leading up to their current API call, but if the user does not make any API calls within the hour, the transient should expire.
This is how I implemented “rolling transients”:
```php
$call_limit     = 350;                   // API calls (in an hour)
$time_limit     = 60 * 60;               // 1 hour (in seconds)
$transient_name = $host . '_rate_limit'; // Using their host name as the unique identifier

// Check to see if there are any transients that match the name, if not create a new one
if ( false === ( $calls = get_transient( $transient_name ) ) ) {

    $calls = array( time() ); // Use an array of time() stamps for rolling effect
    set_transient( $transient_name, $calls, $time_limit );

} else { // There is already a transient with this name

    $calls[] = time(); // Add a new time() stamp to the $calls array
    set_transient( $transient_name, $calls, $time_limit ); // Reset the transient (w/ expiration time)
    $call_count = count( $calls ); // How many calls have been made

    if ( $call_limit < $call_count ) { // If we're over the call limit, remove expired timestamps

        // Shift time from first element of array
        while ( $call = array_shift( $calls ) ) {

            // If time is >= current time - time limit, then it belongs in the array
            // Add it back and reset the transient
            if ( $call >= ( time() - $time_limit ) ) {
                array_unshift( $calls, $call );
                set_transient( $transient_name, $calls, $time_limit );
                break; // Stop processing, we're within the time_limit time now.
            }
        }

        // If we're still over the call limit, they've made too many requests in the time limit
        if ( $call_limit < count( $calls ) ) {
            // The session needs to be killed
            die( 'Error: You have exceeded your rate limit for API calls, only ' . $call_limit . ' API calls are allowed every ' . $time_limit . ' seconds.' );
        }
    }
}
```
With this code, you now have rolling transients: if your users exceed the allowed number of calls within the past hour, they will be rejected. If not, the transient will expire as it should. Let me know if you found this useful or if you have any tips for making it better (it seems a little “hacky” to me).
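One possible way to make the rolling window a little less “hacky” is to pull the timestamp bookkeeping into a small pure function, so it can be tried out without a database. This is only a sketch under my own assumptions: `demo_rolling_limit()` is a made-up name, not part of WordPress, and unlike the code above it does not record rejected calls.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): record one call in a rolling
 * window and report whether it is allowed.
 *
 * @param array $calls      Timestamps of earlier calls.
 * @param int   $now        Current Unix timestamp.
 * @param int   $call_limit Maximum calls allowed per window.
 * @param int   $time_limit Window length in seconds.
 * @return array            array( bool $allowed, array $calls ).
 */
function demo_rolling_limit( array $calls, $now, $call_limit, $time_limit ) {
    $cutoff = $now - $time_limit;

    // Keep only the timestamps that are still inside the window.
    $calls = array_values( array_filter( $calls, function ( $call ) use ( $cutoff ) {
        return $call >= $cutoff;
    } ) );

    if ( count( $calls ) >= $call_limit ) {
        return array( false, $calls ); // Over quota: the call is not recorded
    }

    $calls[] = $now;
    return array( true, $calls );
}

// Usage: a limit of 3 calls per 60 seconds, with simulated timestamps.
$calls = array();
foreach ( array( 0, 10, 20, 30, 61 ) as $now ) {
    list( $allowed, $calls ) = demo_rolling_limit( $calls, $now, 3, 60 );
    echo $now . ': ' . ( $allowed ? 'allowed' : 'rejected' ) . "\n";
}
```

In this run the call at 30 is rejected, and the call at 61 is allowed again because the timestamp 0 has rolled out of the window. In WordPress you would feed it the array from `get_transient()` (or an empty array when that returns `false`) and save the returned array with `set_transient()`. Skipping rejected calls keeps the stored array from growing past the limit, which seemed like a reasonable trade-off here.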
If you are already tracking the number of requests made in the last hour, why not push overflow requests onto a heap and, when the quota has room (i.e., traffic lightens up), pull requests from the heap and post them?
Since you have serialized data already, it should be pretty cheap (resource-wise) to write the queued requests (and timestamp) to a temp file using the hostname.
To make it more efficient, it might be worthwhile to move on to other users’ queues. That way you could make the most of every run.
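A rough sketch of that queueing idea, with assumptions spelled out: `demo_drain_queues()` is a made-up function, a plain first-in, first-out array per host stands in for the “heap”, and persistence (the temp file per hostname) is left out entirely. On each run it posts as many queued requests per host as that host’s quota has room for, then moves on to the next host.

```php
<?php
/**
 * Hypothetical sketch: drain per-host queues of deferred requests.
 *
 * @param array $queues Map of host => list of queued requests (oldest first).
 * @param array $room   Map of host => number of calls the quota still allows.
 * @return array        array( $sent, $queues ), where $sent maps
 *                      host => requests that can be posted now.
 */
function demo_drain_queues( array $queues, array $room ) {
    $sent = array();

    foreach ( $queues as $host => $pending ) {
        // Hosts with no known quota room get nothing sent this run.
        $allowed = isset( $room[ $host ] ) ? max( 0, (int) $room[ $host ] ) : 0;

        // Take the oldest requests that fit; leave the rest queued.
        $sent[ $host ]   = array_slice( $pending, 0, $allowed );
        $queues[ $host ] = array_slice( $pending, $allowed );
    }

    return array( $sent, $queues );
}

// Usage with made-up hosts and request labels.
$queues = array(
    'example.com' => array( 'post-1', 'post-2', 'post-3' ),
    'example.org' => array( 'post-9' ),
);
$room = array( 'example.com' => 2, 'example.org' => 0 );

list( $sent, $queues ) = demo_drain_queues( $queues, $room );
// 'post-1' and 'post-2' can go out now; 'post-3' and 'post-9' stay queued.
```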