Christian Posta

Field CTO at solo.io, author Istio in Action and Microservices for Java Developers, open-source enthusiast, cloud application development, committer @ Apache, Serverless, Cloud, Integration, Kubernetes, Docker, Istio, Envoy #blogger

So you want to do publish-subscribe with ActiveMQ across distributed topics and be reliable. You can just use durable subscriptions, right? Well, you could, but if you're using clustering with ActiveMQ, you may run into unintended behavior. I noticed this recently at a client, and I also noticed the same behavior when using WebLogic JMS clustering. So what are the problems, and what does ActiveMQ bring to the table to solve them? Well, I'm assuming you read the title, so you may have an idea... but I'll keep going anyway...

Consider this scenario. You have broker A and broker B. They are networked together in both directions to form a full-mesh Network of Brokers, aka a cluster. Then you have a subscriber, FOO, using the failover URL failover:(tcp://hostA,tcp://hostB). This means the subscriber will pick a URL at random from the list in the failover transport and connect to that broker. So far so good. Let's say the failover transport picks broker B, and the subscriber creates a durable subscription to topic TEST.TOPIC on broker B. Under the covers, the network of brokers determines that there is a new consumer for TEST.TOPIC on B, so if there are any producers on broker A, broker A forwards their messages to broker B (where the consumer is).

Now let's say that subscriber FOO decides to disconnect from broker B. This is fine: since this is a durable subscription, the messages are persisted, and if subscriber FOO comes back, it should be able to retrieve any messages that were delivered to topic TEST.TOPIC while it was away. But what happens if subscriber FOO, using the failover transport, reconnects to broker A instead? It will end up creating a new durable subscription (it won't use the same durable subscription that the network created, because that one's tied to the network bridge), and it will not be able to access any messages that are sitting on broker B. And when a producer sends more messages to A, A will still forward them to broker B, which is where the subscriber was originally. Now imagine multiple subscribers that can connect, disconnect, and reconnect to different brokers in the cluster. You will end up with leaked subscriptions and missing messages all over the place. Yuck. The same applies to true fault scenarios where subscriber FOO doesn't willingly disconnect, but broker B goes down and subscriber FOO is forced to reconnect to broker A. You still end up with missing messages.
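If it helps to see the failure mode in miniature, here is a toy model of it in plain Java. It is not ActiveMQ code: ToyBroker and its methods are made up for illustration, and the only thing it assumes is what's described above, namely that each broker keeps its own backlog per durable subscription.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: each broker keeps its own durable-subscription backlog. Not ActiveMQ code. */
class DurableSubscriptionGap {

    /** A hypothetical broker that stores pending messages per durable subscription name. */
    static class ToyBroker {
        final String name;
        private final Map<String, List<String>> backlog = new HashMap<>();

        ToyBroker(String name) {
            this.name = name;
        }

        /** Registers the durable subscription locally if this broker has not seen it yet. */
        void subscribe(String subscription) {
            backlog.computeIfAbsent(subscription, k -> new ArrayList<>());
        }

        /** Stores a message for every durable subscription known to this broker. */
        void deliver(String message) {
            for (List<String> pending : backlog.values()) {
                pending.add(message);
            }
        }

        /** Returns and clears whatever is pending for the subscription on this broker. */
        List<String> drain(String subscription) {
            List<String> pending = backlog.getOrDefault(subscription, new ArrayList<>());
            List<String> copy = new ArrayList<>(pending);
            pending.clear();
            return copy;
        }
    }

    /** Replays the scenario: subscribe on B, go offline, reconnect to A. */
    static List<String> reconnectToOtherBroker() {
        ToyBroker a = new ToyBroker("A");
        ToyBroker b = new ToyBroker("B");

        b.subscribe("FOO");    // FOO's durable subscription lives on B
        b.deliver("m1");       // FOO is offline; messages forwarded from A pile up on B
        b.deliver("m2");

        a.subscribe("FOO");    // FOO reconnects to A: a brand-new subscription
        return a.drain("FOO"); // nothing to read here; m1 and m2 are still parked on B
    }

    public static void main(String[] args) {
        System.out.println("Received after reconnecting to A: " + reconnectToOtherBroker());
    }
}
```

Running it prints an empty list: the backlog exists, just not on the broker FOO is now talking to.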

Well, turns out, this behavior is also present on WebLogic with clustered JMS servers and distributed topics (roughly equivalent to the network of brokers I described above). So are we not going to be able to do distributed pub-sub with reliability?

Well, I'm not sure how WLS solves it, but ActiveMQ has a great solution called Virtual Topics. It's a great feature for distributed pub-sub, but it also gets around the limitations of durable subscribers even in a single-broker deployment.

Limitations? What limitations?

Consider this... You're using a durable subscriber on topic TEST.TOPIC. Turns out the processing of the messages is taking quite a long time, so you want to have another subscriber listening on that same stream of messages to "load balance" the workload. Try it. You'll find out quickly that the JMS spec won't allow it: a durable subscription is identified by a unique clientId and durableName, and you cannot have multiple subscribers active on the same subscription at once.
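The rule is easy to picture as a registry keyed on clientId plus durableName. The sketch below is a toy stand-in, not the JMS API: DurableSubscriberRegistry and the name "test-sub" are hypothetical, and a real provider would refuse the second subscriber with a JMS exception rather than return false.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the "one active consumer per durable subscription" rule. Not the JMS API. */
class DurableSubscriberRegistry {

    // Subscriptions that currently have a consumer attached, keyed by clientId/durableName.
    private final Set<String> active = new HashSet<>();

    /** Returns true if the consumer was attached, false if the subscription is already in use. */
    boolean attach(String clientId, String durableName) {
        return active.add(clientId + "/" + durableName);
    }

    /** Frees the subscription so a consumer can attach again later. */
    void detach(String clientId, String durableName) {
        active.remove(clientId + "/" + durableName);
    }

    public static void main(String[] args) {
        DurableSubscriberRegistry broker = new DurableSubscriberRegistry();
        System.out.println(broker.attach("FOO", "test-sub")); // first consumer attaches
        System.out.println(broker.attach("FOO", "test-sub")); // second one is refused
    }
}
```

So the second worker you wanted for load balancing never gets onto the subscription; it would need its own clientId and durableName, which gives it its own copy of every message instead of a share of the work.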

Or what if you're deploying your subscriber in an application server that pools connections, and you're trying to share connections among multiple durable subscribers? Well, again you'll hit the limitations on a durable subscriber's clientId and durableName.

So, back to our distributed pub-sub problem. If you use virtual topics, you get around the problems I described above. Virtual Topics use queues under the covers on the consumer side (please read the Apache docs on Virtual Topics; they have some great examples), and with queues you can get messages replayed back to you. For example, when the subscriber reconnects to A, its messages are not lost, because they can be replayed. And if broker B happens to go down for some reason, you can have a slave waiting in its place to replay the otherwise potentially lost messages.
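To make the "queues under the covers" part concrete, here is a small in-memory sketch. It assumes ActiveMQ's default naming convention, where producers publish to a topic named VirtualTopic.<something> and each logical consumer reads from a queue named Consumer.<name>.VirtualTopic.<something>. The ToyVirtualTopic class is a hypothetical stand-in for the broker, not ActiveMQ code.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of a virtual topic: one queue per logical consumer. Not ActiveMQ code. */
class VirtualTopicSketch {

    /** Queue a consumer reads from, assuming the default Consumer.<name>.<topic> convention. */
    static String consumerQueue(String consumerName, String virtualTopic) {
        return "Consumer." + consumerName + "." + virtualTopic;
    }

    /** Hypothetical stand-in for the broker side of a single virtual topic. */
    static class ToyVirtualTopic {
        private final String topic;
        private final Map<String, Queue<String>> queues = new LinkedHashMap<>();

        ToyVirtualTopic(String topic) {
            this.topic = topic;
        }

        /** Creates the consumer queue if needed and returns its name. */
        String register(String consumerName) {
            String queueName = consumerQueue(consumerName, topic);
            queues.computeIfAbsent(queueName, k -> new ArrayDeque<>());
            return queueName;
        }

        /** Publishing to the topic copies the message onto every registered consumer queue. */
        void publish(String message) {
            for (Queue<String> queue : queues.values()) {
                queue.add(message);
            }
        }

        /** Any worker may take the next message from a queue; null means it is empty. */
        String poll(String queueName) {
            Queue<String> queue = queues.get(queueName);
            return queue == null ? null : queue.poll();
        }
    }

    public static void main(String[] args) {
        ToyVirtualTopic topic = new ToyVirtualTopic("VirtualTopic.TEST.TOPIC");
        String fooQueue = topic.register("FOO");

        topic.publish("m1"); // both messages wait on the queue while nobody is connected
        topic.publish("m2");

        System.out.println(fooQueue);
        System.out.println("worker-1 got " + topic.poll(fooQueue));
        System.out.println("worker-2 got " + topic.poll(fooQueue));
    }
}
```

Two workers polling the same queue split the messages between them, which is exactly the load balancing a durable subscriber can't give you, while a second logical consumer (say BAR) would get its own queue and its own copy of each message.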

For more details, take a look at the ActiveMQ FAQ on distributed topics with durable subscribers, because I'm going to go write it right now :)

If you have any questions about the details in this post, please don't hesitate to ask. I'm more than happy to answer.