Lessons From Using Amazon MQ

I'm not sure what rationale Amazon Web Services employs to name its offerings (AWS DeepLens and AWS AppSync vs. Amazon MQ and Amazon Lex, etc.), but Amazon MQ still appears under the AWS namespace in the SDK. For what it's worth, it's basically ActiveMQ 5.15.0 (at the time of writing) plus monitoring plus pay-as-you-go storage (which doesn't matter much).

Imagine booting up a t2.micro instance on EC2, installing ActiveMQ, setting up the appropriate security groups and firewall rules, and wiring up monitoring (CloudWatch or otherwise) by yourself. Yes, if all you need is a simple, single broker, you could set this all up on your own; it's really not that difficult. Amazon decided to provide a turn-key solution to something that's not much of a problem...but anyway...

How is it different from running ActiveMQ on an EC2 instance? One glaring difference: you cannot temporarily stop the ActiveMQ instance. You either leave it running 24x7 or delete the broker. On the bright side, you get cool CloudWatch metrics and queue-level stats, so you can set up alarms for when certain queues cross unexpected thresholds. There's also an Active/Standby deployment option for reliable uptime.

In our case, the requirement for Amazon MQ grew out of our app initially using SQS + Lambda (workers). Starting with a single FIFO queue, we decided to reduce overall processing time by sharding jobs across multiple FIFO queues, only to realize we needed to scale a whole lot more. Amazon MQ launched around the time we needed that scale and seemed like a viable solution given its low monitoring and upkeep overhead.

Some of our lessons from using Amazon MQ are as follows:

If you can, use the official Java Client

Although ActiveMQ supports clients in a variety of languages, the Java client is your best bet. Amazon MQ comes with SSL mandatorily turned on (and rightly so), and connecting to it using Python or Node.js clients can be a pain. We got stompy to play well after some hacks, but the Java client uses the OpenWire protocol and works much better.
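As a minimal sketch, connecting over SSL with the Java client looks something like this. The broker URL, credentials, and queue name below are placeholders, not real endpoints; you'd copy the actual OpenWire SSL endpoint from the Amazon MQ console:

```java
import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import org.apache.activemq.ActiveMQConnectionFactory;

public class MqHello {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint; Amazon MQ's OpenWire SSL listener is on port 61617.
        ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
                "ssl://b-example.mq.us-east-1.amazonaws.com:61617");
        Connection connection = factory.createConnection("mqUser", "mqPassword");
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createQueue("jobs"));
        TextMessage message = session.createTextMessage("{\"jobId\": 42}");
        producer.send(message);
        connection.close();  // closing the connection also closes the session and producer
    }
}
```

Since Amazon MQ brokers present CA-signed certificates, the plain ActiveMQConnectionFactory with an ssl:// URL works without any custom trust-store setup.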

Additionally, it's possible to monitor queue stats and manage queues using the JMX APIs readily available in Java.
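For instance, reading a queue's depth over JMX might look like the sketch below. The JMX endpoint, broker name, and queue name are placeholders (adjust them to however your broker exposes JMX); the MBean naming follows the ActiveMQ 5.x convention:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class QueueDepth {
    public static void main(String[] args) throws Exception {
        // Placeholder JMX endpoint.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker-host:1099/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            // ActiveMQ 5.x naming convention for per-queue MBeans.
            ObjectName queue = new ObjectName(
                    "org.apache.activemq:type=Broker,brokerName=localhost,"
                    + "destinationType=Queue,destinationName=jobs");
            Long depth = (Long) mbsc.getAttribute(queue, "QueueSize");
            System.out.println("jobs queue depth: " + depth);
        }
    }
}
```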

In our case, we repurposed our existing SQS-polling worker code on Lambda/Node.js to respond to async invocations and removed all polling code. We then created thin Java workers (AWS SDK + ActiveMQ libs + Jackson + log4j => CloudWatch) that poll ActiveMQ queues and asynchronously invoke our earlier Lambda function.
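A thin worker along those lines might look like the following sketch, using the AWS SDK for Java (v1). The broker endpoint, credentials, queue name, and the "processJob" function name are all hypothetical placeholders; the key detail is InvocationType.Event, which makes the Lambda invocation asynchronous:

```java
import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;
import javax.jms.TextMessage;
import org.apache.activemq.ActiveMQConnectionFactory;
import com.amazonaws.services.lambda.AWSLambda;
import com.amazonaws.services.lambda.AWSLambdaClientBuilder;
import com.amazonaws.services.lambda.model.InvocationType;
import com.amazonaws.services.lambda.model.InvokeRequest;

public class ThinWorker {
    public static void main(String[] args) throws Exception {
        // Placeholder broker endpoint and credentials.
        ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
                "ssl://b-example.mq.us-east-1.amazonaws.com:61617");
        Connection connection = factory.createConnection("mqUser", "mqPassword");
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(session.createQueue("jobs"));

        AWSLambda lambda = AWSLambdaClientBuilder.defaultClient();
        Message message;
        // Drain available messages, handing each one off to Lambda.
        while ((message = consumer.receive(1000)) != null) {
            String payload = ((TextMessage) message).getText();
            // InvocationType.Event = fire-and-forget async invocation;
            // the worker does not wait for the Lambda result.
            lambda.invoke(new InvokeRequest()
                    .withFunctionName("processJob")  // hypothetical function name
                    .withInvocationType(InvocationType.Event)
                    .withPayload(payload));
        }
        connection.close();
    }
}
```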

Deleting ActiveMQ Messages Is a Pain

For our use-case, at the beginning of the day, we need to purge all queues that could contain unprocessed jobs from the previous day. We then use Lambda to replenish all queues with fresh jobs.

Destroying a queue with ActiveMQConnection.destroyDestination(queue) does not delete its messages. Who would've thunk it? If a queue with the same name ends up being created later, the old messages reappear like zombies.

There are two ways to purge a queue. One is a somewhat involved removeAllMessages operation sent over an MBeanServerConnection; this seems to be what the web UI uses internally. The other is to...well, drain the queue! Obviously, this is the riskier alternative:

while ((message = consumer.receive(100)) != null) { }
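A sketch of the JMX route follows. ActiveMQ's QueueViewMBean interface exposes this as a purge() operation, which the standard JMX proxy machinery makes fairly painless; the JMX endpoint, broker name, and queue name below are placeholders:

```java
import javax.management.JMX;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import org.apache.activemq.broker.jmx.QueueViewMBean;

public class PurgeQueue {
    public static void main(String[] args) throws Exception {
        // Placeholder JMX endpoint; adjust to your broker's JMX setup.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker-host:1099/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            ObjectName queueName = new ObjectName(
                    "org.apache.activemq:type=Broker,brokerName=localhost,"
                    + "destinationType=Queue,destinationName=jobs");
            QueueViewMBean queue =
                    JMX.newMBeanProxy(mbsc, queueName, QueueViewMBean.class, true);
            queue.purge();  // removes all messages without destroying the queue
        }
    }
}
```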

Polling Without Prefetch

We had clients running in Lambda, triggered by a periodic CloudWatch event, polling the MQ broker, but they didn't seem to receive a message if the queue held just one message. Setting the prefetch value to 0 fixed the issue.

ActiveMQPrefetchPolicy prefetchPolicy = new ActiveMQPrefetchPolicy();
prefetchPolicy.setAll(0);

From the docs:

Specifying a prefetch limit of zero will cause the consumer to poll for messages, one at a time, instead of the message being pushed to the consumer.

In our case, this is exactly the behaviour we intended for our stateless polling Lambda consumers that close the consumer connection as soon as they process a single message.
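Put together, a stateless poll-once consumer might look like this sketch (the prefetch policy must be attached to the connection factory before the connection is created; endpoint, credentials, and queue name are placeholders):

```java
import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;
import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.ActiveMQPrefetchPolicy;

public class PollOnce {
    public static void main(String[] args) throws Exception {
        ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
                "ssl://b-example.mq.us-east-1.amazonaws.com:61617");  // placeholder endpoint
        ActiveMQPrefetchPolicy prefetchPolicy = new ActiveMQPrefetchPolicy();
        prefetchPolicy.setAll(0);  // prefetch 0: each receive() pulls exactly one message
        factory.setPrefetchPolicy(prefetchPolicy);

        Connection connection = factory.createConnection("mqUser", "mqPassword");
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(session.createQueue("jobs"));

        Message message = consumer.receive(100);  // short timeout; null if the queue is empty
        if (message != null) {
            // process the single message, then exit
        }
        connection.close();
    }
}
```

With the default prefetch (1000 for queues), the broker eagerly pushes batches of messages to whichever consumer connects first, which is exactly wrong for short-lived consumers that take one message and disconnect.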

The combination of AWS Lambda, ActiveMQ, and DynamoDB makes for a pretty great "serverless" stack that is relatively painless...if your use-case fits the products. There are no server alerts to rush to, and the Lambda paradigm allows for quickly iterating over product features without breaking existing, working code. Until Aurora Serverless becomes generally available for tackling more serious use-cases, the current stack offers a cost-effective, maintenance-free way for high-velocity iterative development.