How to Process Same-Key Requests Sequentially Using Redis and Cloud Tasks

Suppose you receive thousands of requests of the same category that you cannot process in parallel, because they all try to update the same database entity, or because they depend on an external API call with a rate limit.
But you still need to process all the requests. How would you do that?
Well, there is no single answer. The solution differs based on your specific use case and constraints.
For me, I recently encountered this problem in a multi-tenant architecture where each tenant could have thousands of users generating similar requests that impacted shared tenant-level resources. I needed to ensure all these requests were processed successfully, but in a controlled, sequential manner to avoid exceeding external API rate limits and risking failed or out-of-order processing.
Finding the Solution
Multiple Queues
First, it hit me that all I need to do is process one tenant's requests serially, which I could do simply by creating a Cloud Tasks queue for each tenant when I get their first request. But this comes with its own problems:
- Managing hundreds of thousands of queues programmatically
- Cost: yes, this is going to be the most important factor in anything you build, and creating and managing thousands of queues will not be cheap on any cloud service provider.
Distributed Locking
Second, since it looks like we can't create multiple queues, how about creating just one? When a request comes in, create a distributed lock on the tenant; if the lock already exists, simply reschedule the request for a minute later.
But if you are trying to handle 1,000 simultaneous calls, you would be rescheduling 999 of them, then 998 the next minute, 997 the minute after, and so on. Doesn't look efficient, right?
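A quick back-of-the-envelope calculation (a Python sketch; the function name is mine) shows how many wasted wake-ups 1,000 simultaneous calls would generate under this lock-and-reschedule approach:

```python
def naive_reschedules(n_calls: int) -> int:
    # Each minute one caller wins the lock; everyone else is rescheduled again.
    # Minute 0: n-1 reschedules, minute 1: n-2 reschedules, ... down to 1.
    return sum(n_calls - 1 - minute for minute in range(n_calls - 1))

print(naive_reschedules(1000))  # 499500
```

Roughly half a million rescheduled tasks just to drain a single burst for one tenant.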
Sequential Processing with Redis and Single Task Queue
Finally, what if we combine the good parts of both solutions, i.e.:
- Ability to process calls for the same tenant sequentially.
- Ability to manage this with a single task queue.
For this, we will intelligently determine when to schedule the next call for the same tenant. If we can do that, we can make sure each request is scheduled exactly once, and always in the same task queue, avoiding both problems:
- Having multiple task queues
- And a loop of endless rescheduling.
To do this, we are going to use Redis not only to maintain lightweight coordination but also to keep track of the next available processing slot for each tenant, along with an accurate queue count. This allows us to determine precisely when to schedule the next request for that tenant, ensuring efficient and orderly processing without unnecessary rescheduling.
The idea is to have a stream of incoming API requests flowing through Redis, where each request is tagged with a key such as `tenantId-productId-<any-other-field>`, queued intelligently using Cloud Tasks, and processed one by one in a controlled order, ensuring no race conditions, respecting external API rate limits, and maintaining high concurrency for unrelated keys.
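As a concrete sketch of that tagging scheme, the per-key Redis state could be laid out like this (the helper and field names are illustrative assumptions, not a fixed spec):

```python
def state_keys(tenant_id: str, product_id: str) -> dict:
    # One coordination key per tenant/product pair, with two state fields.
    base = f"serialise-{tenant_id}-{product_id}"
    return {
        "next_slot": f"{base}:next_slot",       # timestamp of the next free slot
        "queue_count": f"{base}:queue_count",   # how many requests are queued
    }

print(state_keys("acme", "sku42"))
```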
Let’s understand this with an example
Walkthrough and pseudocode
Thousands of requests come in. For this example, we will assume all the calls arrive for the same key, `serialise-tenantId-productId`.
    // On each incoming request (for key: serialise-tenantId-productId)
    function handleIncomingRequest(key):
        now = current_timestamp()
        // Fetch Redis state for this key
        next_slot = Redis.get(key + ":next_slot", default=now)
        queue_count = Redis.get(key + ":queue_count", default=0)
        if now >= next_slot:
            // It's time to process immediately
            Redis.set(key + ":next_slot", now + 1 minute)
            // queue_count stays unchanged: nothing was queued
            processRequest()
        else:
            // Not our turn yet: claim the next free slot for this key
            my_slot = next_slot + (queue_count * 1 minute)
            // Create a Cloud Task to run at my_slot
            scheduleCloudTask(key, my_slot)
            // Increment the queue count so the next caller gets a later slot
            Redis.incr(key + ":queue_count")
    // When a scheduled Cloud Task executes
    function processScheduledTask(key):
        now = current_timestamp()
        // Fetch Redis state
        next_slot = Redis.get(key + ":next_slot")
        queue_count = Redis.get(key + ":queue_count")
        if now >= next_slot:
            // Safe to process
            processRequest()
            // Reserve the next slot and release our place in the queue
            Redis.set(key + ":next_slot", now + 1 minute)
            if queue_count > 0:
                Redis.decr(key + ":queue_count")
        else:
            // This task fired too early; simply reschedule it for next_slot
            scheduleCloudTask(key, next_slot)
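The pseudocode above can be sketched in Python, with an in-memory dict standing in for Redis and a plain list standing in for Cloud Tasks (illustrative only; in production the read-and-update steps should be made atomic, e.g. with a Lua script or a transaction):

```python
SLOT = 60  # one processing slot per key per minute, in seconds

redis = {}          # stand-in for Redis: key -> value
cloud_tasks = []    # stand-in for Cloud Tasks: (key, run_at) pairs
processed = []      # keys processed so far, for illustration

def handle_incoming_request(key, now):
    next_slot = redis.get(f"{key}:next_slot", now)
    queue_count = redis.get(f"{key}:queue_count", 0)
    if now >= next_slot:
        # Free slot: process immediately and reserve the next minute.
        redis[f"{key}:next_slot"] = now + SLOT
        processed.append(key)
    else:
        # Busy: claim the next free slot and schedule a task for it.
        my_slot = next_slot + queue_count * SLOT
        cloud_tasks.append((key, my_slot))
        redis[f"{key}:queue_count"] = queue_count + 1

def process_scheduled_task(key, now):
    next_slot = redis.get(f"{key}:next_slot", now)
    queue_count = redis.get(f"{key}:queue_count", 0)
    if now >= next_slot:
        # Our slot has arrived: process, reserve the next minute, dequeue.
        processed.append(key)
        redis[f"{key}:next_slot"] = now + SLOT
        if queue_count > 0:
            redis[f"{key}:queue_count"] = queue_count - 1
    else:
        # Fired too early: push the task back to the recorded slot.
        cloud_tasks.append((key, next_slot))

# Two simultaneous calls at t=0: the first processes, the second queues for t=60.
handle_incoming_request("serialise-t1-p1", now=0)
handle_incoming_request("serialise-t1-p1", now=0)
print(processed, cloud_tasks)
```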
T = 0: Initial state
For each key, we set an initial state when it is created:
- `next_slot = now` (available immediately)
- `queue_count = 0` (no requests queued yet)
T = 0: Call #1 arrives
Check the next available slot: it's now. We start processing this request immediately and update the following:
- `next_slot = now + 1 minute` (we want to process the next request no sooner than a minute from now)
- `queue_count = 0` (we did not add anything to the queue yet)
T = 0: Call #2 arrives (simultaneously)
Check the next available slot: it's 1 minute from now. We have to:
- Reschedule this request for a later time: `my_slot = next_slot + (queue_count * 1 minute) = T + 1 minute` (since `queue_count` is still 0).
- Increment `queue_count` to 1, to determine the slot for the next call.
T = 0: Calls #3 to #1000 arrive (simultaneously)
Each call does the same as call #2:
- Compute its slot, `my_slot = next_slot + (queue_count * 1 minute)`, and reschedule itself at `my_slot`.
- Update `queue_count = queue_count + 1`.
Redis state after scheduling the thousand tasks:
- `next_slot = T + 1 minute`
- `queue_count = 999`
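That burst is easy to replay with a small simulation (an in-memory stand-in for Redis; the variable names are mine):

```python
# 1,000 simultaneous calls for one key at T = 0, replaying the walkthrough.
SLOT = 60
T = 0
next_slot = None        # unset until the key's first request
queue_count = 0
slots = []              # slots claimed by the rescheduled calls

for call in range(1000):
    if next_slot is None or T >= next_slot:
        next_slot = T + SLOT                    # call #1 processes right away
    else:
        my_slot = next_slot + queue_count * SLOT
        slots.append(my_slot)                   # calls #2..#1000 reschedule
        queue_count += 1

print(queue_count)          # 999
print(next_slot)            # 60, i.e. T + 1 minute
print(slots[0], slots[-1])  # 60 59940, i.e. tasks land at T+1 min ... T+999 min
```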
T + 1 minute: Call #2 comes again
Now we check with Redis: is it time to process a new call? Yes, it is. We update Redis:
- `next_slot = now + 1 minute`
- `queue_count = 998` (since we already know from the task's metadata that this request was queued, we decrease the queue count as well)
This cycle repeats until all queued requests for that tenant are processed.
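The drain phase can be sketched the same way: starting from the post-burst state, one queued task fires each minute and the count ticks down (again an in-memory stand-in, not production code):

```python
# State after the 1,000-call burst at T = 0: queue of 999, next slot at T+1 min.
SLOT = 60
next_slot, queue_count = 60, 999
handled = 0

# Tasks were scheduled at T+1 min, T+2 min, ..., T+999 min; replay them in order.
for minute in range(1, 1000):
    now = minute * SLOT
    if now >= next_slot:           # this task's slot has arrived
        handled += 1
        next_slot = now + SLOT     # reserve the following minute
        if queue_count > 0:
            queue_count -= 1

print(handled, queue_count)        # 999 0: every queued request got processed
```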
Scaling the final solution
What if I want to process two concurrent requests for the same key at the same time?
You can. In the real world, different cases will come up that require you to build something that suits your requirements, for example processing two, or n, requests at the same time. Think about the tweaks you would need to scale this solution to match those requirements. We will talk about it some other day.
Conclusion
This simple approach helped me handle thousands of requests per key in a controlled and scalable way, without having to create separate queues for each tenant or deal with crazy rescheduling loops. By using Redis to manage the next slot and queue count, and Cloud Tasks to handle retries and timing, I was able to make sure all requests get processed in order, without hitting external rate limits.
Of course, you can tweak this further and scale based on your system and requirements. This basic idea works for any situation where you need to process related requests one at a time, but still keep your system highly concurrent. Hopefully this gives you some ideas if you run into a similar problem.