No One Knows These Advanced SQS Concepts: Delay Queues and Polling Strategies

September 12, 2023

Hey everyone, in this lesson, I want to delve into the world of Amazon SQS, and I’ll do my best to cover all the essential details you might encounter in an exam.

Let’s start by discussing decoupling, a crucial concept when working with SQS. If you’re not familiar with decoupling, think of it as the opposite of direct integration. Imagine you have a web tier with EC2 instances that need to pass information to an app tier for processing.

In a direct integration scenario, the web tier sends data directly to the app tier. The challenge here is that the app tier must keep up with the incoming workload. If, suddenly, your application experiences a massive surge in traffic, and the app tier can’t process the data quickly enough, you’ll encounter failures. In this situation, data can be lost, and you risk losing critical information, like customer orders or vital business data.

To address these issues, we can implement a decoupled integration. In this setup, we still have the web tier and the app tier, but now we introduce an SQS (Simple Queue Service) queue in between. The web tier sends messages to the SQS queue, and then the app tier retrieves and processes these messages from the queue. Here’s the key difference: the app tier actively polls the queue to check if there are new messages. The queue doesn’t push messages to the app tier; instead, the app tier pulls messages from the queue when it’s ready to process them.

This approach solves the previous problem. When the web tier experiences high traffic, it simply adds more messages to the queue. If the app tier can’t keep up with the processing demands, that’s fine because the information isn’t lost. As long as the data isn’t time-critical, the app tier will eventually get to it when it can allocate resources for processing.
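The pull model described above can be sketched as a small consumer step. This is only an illustration: it assumes `sqs` is a boto3 SQS client, and `drain_once`, `queue_url`, and the `handle` callback are hypothetical names chosen for the example.

```python
def drain_once(sqs, queue_url, handle):
    """Pull one batch from the queue, process it, and delete what was handled.

    `sqs` is assumed to be a boto3 SQS client; `handle` is a hypothetical
    callback supplied by the app tier. Returns the number of messages processed.
    """
    response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10)
    processed = 0
    # The "Messages" key is absent when the call returned nothing.
    for message in response.get("Messages", []):
        handle(message["Body"])
        # Deleting tells SQS the message was handled; if we skip this, the
        # message becomes visible again later and is delivered once more.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        processed += 1
    return processed

# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   sqs = boto3.client("sqs")
#   drain_once(sqs, queue_url, print)
```

The app tier calls this whenever it has capacity, so a traffic spike on the web tier only makes the queue longer; it never forces the app tier to accept work it can't handle.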

SQS offers two types of queue, standard and FIFO:

Standard Queue:

The standard queue offers what we call “best-effort ordering.” SQS tries to deliver messages in the order they were sent, but it doesn’t guarantee it, so consumers might receive and process messages out of order. If maintaining a strict order is crucial for your application, you have a couple of options. You can include something in the message itself, such as a sequence number, that the processing layer can use to restore the correct order, or you can opt for a different type of queue known as a FIFO (First-In-First-Out) queue. In FIFO queues, you are guaranteed first-in-first-out delivery, meaning the first message added to the queue will be the first to be processed.

Differences Between Standard and FIFO:

Standard queues support a nearly unlimited number of transactions per second for each API action. FIFO queues, on the other hand, offer high throughput but come with a limit. By default, they support up to 300 API calls per second for each action (send, receive, or delete). Because a batch operation can carry up to 10 messages, batching lets a FIFO queue handle up to 3,000 messages per second.
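To make the batching arithmetic concrete, here is a minimal sketch that groups message bodies into lists shaped like the `Entries` argument of the SQS `SendMessageBatch` call. The helper name `build_batches` is hypothetical, and for a FIFO queue each entry would also need a `MessageGroupId`, which is left out here to keep the example short.

```python
def build_batches(bodies, batch_size=10):
    """Group message bodies into SendMessageBatch-style entry lists."""
    if not 1 <= batch_size <= 10:
        raise ValueError("SQS accepts between 1 and 10 messages per batch call")
    batches = []
    for start in range(0, len(bodies), batch_size):
        chunk = bodies[start:start + batch_size]
        # Each entry needs an Id that is unique within its own batch.
        batches.append(
            [{"Id": str(i), "MessageBody": body} for i, body in enumerate(chunk)]
        )
    return batches

# 300 API calls per second x 10 messages per call = 3,000 messages per second.
MAX_CALLS_PER_SECOND = 300
print(MAX_CALLS_PER_SECOND * 10)

# Usage with a real client (requires boto3 and AWS credentials):
#   for entries in build_batches(bodies):
#       sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)
```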

With standard queues, you get “at least once” delivery, which means a message is guaranteed to be delivered at least once. However, occasionally, it might be delivered more than once. If processing the same information multiple times is problematic for your application, you should either add application logic to handle duplicates or use a FIFO queue.
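One way to picture the “application logic to handle duplicates” option is a consumer that remembers which message IDs it has already handled. This is a sketch under assumptions: the `process_once` name is hypothetical, and the in-memory set stands in for whatever durable store (a database table, for example) a real application would use.

```python
def process_once(messages, handle, seen_ids):
    """Run `handle` on each message body, skipping MessageIds already seen.

    `messages` are dicts shaped like the entries SQS returns from a receive
    call ("MessageId", "Body"). `seen_ids` is an in-memory set here, which is
    an assumption for illustration only.
    """
    handled = 0
    for message in messages:
        message_id = message["MessageId"]
        if message_id in seen_ids:
            continue  # a redelivery of something we already processed
        handle(message["Body"])
        seen_ids.add(message_id)
        handled += 1
    return handled
```

Note that this only catches the same message being delivered again. If a producer sends the same content twice, SQS assigns two different message IDs, so you would need to deduplicate on something inside the message body instead.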

In contrast, FIFO queues provide “exactly once” processing, ensuring that each message is delivered only once, and duplicates are not introduced into the queue. This is ideal for scenarios where message duplication must be avoided at all costs.

Let’s look at a couple of important concepts related to SQS:

Message Group ID and Deduplication ID:

FIFO queues add two parameters to messages: a “message group ID” and a “message deduplication ID.” The message group ID is a tag that indicates which group a message belongs to. Messages within the same group are processed in strict FIFO (first-in-first-out) order, while ordering is not guaranteed across different groups. The deduplication ID is a token used to prevent duplicate messages from being accepted within the deduplication interval, which is five minutes, and that is what makes “exactly once” processing possible.
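As a rough illustration, the two parameters show up as `MessageGroupId` and `MessageDeduplicationId` when you send to a FIFO queue. The helper below is hypothetical; it derives the deduplication ID from a SHA-256 hash of the body, which is one reasonable choice when identical bodies should count as duplicates, not the only one.

```python
import hashlib

def fifo_send_params(queue_url, body, group_id):
    """Build keyword arguments for sending one message to a FIFO queue."""
    # Identical bodies produce the same token, so a retry of the same send
    # inside the deduplication interval is treated as a duplicate.
    dedup_id = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        "MessageGroupId": group_id,
        "MessageDeduplicationId": dedup_id,
    }

# Usage with a real client (requires boto3 and AWS credentials):
#   sqs.send_message(**fifo_send_params(queue_url, '{"order": 42}', "customer-7"))
```

Using something like a customer ID as the group ID keeps each customer's messages in order while still letting different customers be processed in parallel.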

Dead Letter Queue (DLQ):

Now, let’s discuss the concept of a dead letter queue. It’s important to note that a dead letter queue is not a distinct type of queue but rather a configuration for a queue. So, how does it work? Imagine we have a web tier, an application tier, and an SQS queue in between. Sometimes, messages might not be successfully processed. For instance, a message’s receive count could exceed the maximum configured for the queue, indicating that consumers have tried and failed to process it multiple times. In such cases, it’s essential to understand why the failure occurred. This is where a dead letter queue comes into play. It’s essentially another queue, configured to receive the messages that failed to process in the main queue used by your application. SQS moves those messages to the dead letter queue rather than copying them, and the dead letter queue must be the same type as the source queue: standard for a standard queue, FIFO for a FIFO queue.

The primary purpose of a DLQ is to handle message failures and provide an opportunity to isolate and analyze these problematic messages later. Remember, it’s not a distinct queue type; rather, it’s a standard or FIFO queue that you designate as a dead letter queue in the configuration of another queue. To set this up, you’ll use something called a “redrive policy” on the source queue, specifying the dead letter queue’s ARN and the maximum number of times a message can be received (the “maxReceiveCount”) before it is moved to the dead letter queue. This configuration helps you efficiently manage and troubleshoot message processing issues in your application.
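For reference, a redrive policy is a small JSON document stored in the source queue's `RedrivePolicy` attribute; in the API it identifies the dead letter queue by its ARN (`deadLetterTargetArn`) and carries the receive limit (`maxReceiveCount`). The sketch below builds that attribute. The helper name and the ARN in the usage note are made up for illustration.

```python
import json

def redrive_attributes(dlq_arn, max_receive_count):
    """Build the queue attributes that attach a dead letter queue."""
    if max_receive_count < 1:
        raise ValueError("maxReceiveCount must be at least 1")
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the policy as a JSON string, not a nested dict.
    return {"RedrivePolicy": json.dumps(policy)}

# Usage with a real client (requires boto3 and AWS credentials):
#   sqs.set_queue_attributes(
#       QueueUrl=source_queue_url,
#       Attributes=redrive_attributes(
#           "arn:aws:sqs:us-east-1:123456789012:orders-dlq", 5),
#   )
```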

Now let us explore a few more concepts related to SQS:

Delay Queue:

Now, there’s another interesting concept known as a delay queue. In a delay queue scenario, we have a producer adding messages to the queue. The unique feature here is that we can postpone the delivery of new messages for a specific period. In simpler terms, during this delay period, the message remains hidden from consumers. If, for example, a Lambda function attempts to process this message within the configured delay time, it won’t be able to see it; it’s effectively invisible. However, once the delay period expires, the message becomes visible, and our Lambda function can then process it. We configure this delay using the queue’s “delivery delay” setting (the DelaySeconds attribute). For instance, if we set a delivery delay of 30 seconds, the message won’t be visible for the first 30 seconds after it arrives in the queue. The default delay is 0 seconds, and you can extend it up to a maximum of 15 minutes. Don’t confuse this with the “visibility timeout,” which hides a message after a consumer has received it so that other consumers don’t process it at the same time; the visibility timeout defaults to 30 seconds and can be set as high as 12 hours.
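In the SQS API, the queue-level delay is the `DelaySeconds` attribute, which accepts 0 to 900 seconds (15 minutes) and is separate from `VisibilityTimeout`. A minimal sketch of building that attribute follows; the helper name and the queue name in the usage note are hypothetical.

```python
MAX_DELAY_SECONDS = 900  # 15 minutes, the upper bound for DelaySeconds

def delay_queue_attributes(delay_seconds):
    """Build the queue attributes that turn a queue into a delay queue."""
    if not 0 <= delay_seconds <= MAX_DELAY_SECONDS:
        raise ValueError("DelaySeconds must be between 0 and 900")
    # Queue attribute values are passed to SQS as strings.
    return {"DelaySeconds": str(delay_seconds)}

# Usage with a real client (requires boto3 and AWS credentials):
#   sqs.create_queue(QueueName="orders-delayed",
#                    Attributes=delay_queue_attributes(30))
```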

Short Polling and Long Polling:

Now, let’s talk about how we fetch messages from the queue, which involves polling. There are two types of polling: short polling and long polling. When your consumer is looking for messages in the queue, it’s essentially polling the queue to see what’s available.

  • Short Polling: In short polling, SQS checks only a subset of its servers and returns right away, so the response may not include all the messages in the queue and can even come back empty when messages exist. It’s quick, but it might not provide a comprehensive view of the queue.
  • Long Polling: On the other hand, long polling involves the consumer waiting for a specified duration, known as the “wait time seconds,” to see if any messages arrive during that time. SQS checks all of its servers and responds as soon as a message is available, which helps eliminate empty responses. Essentially, you have two options here: either you make an API call, which you will be billed for, and get a quick result (whether there’s a message or not), or you use long polling, where you wait and see if a message arrives within the defined wait time seconds. Long polling can reduce the number of API operations from your consumer, potentially lowering your overall costs.

Long polling becomes effective when the “receive message wait time” is set to a value greater than 0 seconds and up to a maximum of 20 seconds. For example, if you configure a “receive message wait time” of 20 seconds, the API operation will wait for up to 20 seconds for a message to appear in the queue before returning a result. If you set it to zero seconds, you effectively switch to short polling, where the API returns immediately, whether or not there are messages in the queue.
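To see why long polling cuts down on API calls, here is a rough sketch. `receive_params` builds arguments for a receive call using the `WaitTimeSeconds` parameter, and `empty_calls_while_idle` does the arithmetic for an idle queue. Both helper names are hypothetical, and the one-second pause between short-polling calls is an assumption made for the comparison.

```python
import math

def receive_params(queue_url, wait_seconds=20):
    """Build keyword arguments for a receive call; 0 means short polling."""
    if not 0 <= wait_seconds <= 20:
        raise ValueError("WaitTimeSeconds must be between 0 and 20")
    return {
        "QueueUrl": queue_url,
        "MaxNumberOfMessages": 10,
        "WaitTimeSeconds": wait_seconds,
    }

def empty_calls_while_idle(idle_seconds, seconds_per_call):
    """Count the receive calls that come back empty while the queue is idle."""
    return math.ceil(idle_seconds / seconds_per_call)

# Queue idle for one minute: a short-polling loop that pauses 1 second
# between calls, compared with long polling that waits 20 seconds per call.
print(empty_calls_while_idle(60, 1))   # 60
print(empty_calls_while_idle(60, 20))  # 3

# Usage with a real client (requires boto3 and AWS credentials):
#   response = sqs.receive_message(**receive_params(queue_url, 20))
```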

These concepts might seem detailed, but they’re important to understand for the exam. I hope this information proves helpful, and I look forward to seeing you in the next lesson.