October 31, 2018

34. Event-Driven Architecture (Part 2)


Chris Hickman and Jon Christensen of Kelsus and Rich Staats from Secret Stache continue their micro series on event-driven architecture. In this episode, they discuss practical examples used at Kelsus.

Some of the highlights of the show include:

Event-driven architecture can be used for clients of all sizes
Pub/Sub is a common design pattern used at any software level; many applications and frameworks have it built in (e.g., JavaScript)
Same features of the language are used to extend it to the architectural level, but the mechanics of how it works are different
Publishers and subscribers are needed; publishers emit events, and subscribers listen to them
Message queues are robust architectural components; they can be self-hosted, consumed as a managed service that someone else runs, or built into a cloud provider
Kelsus uses Simple Notification Service (SNS) as the event emitter and SQS queues to receive, listen to, and act on messages
Kelsus has an extensible and loose coupling system where things happen in parallel and fan out
Code can send a message to SNS; SNS fanout enables multiple subscribers that are SQS queues, where events are stored until ready to be processed
Lambda can be a subscriber to an SNS topic; processing an SQS queue, by contrast, used to mean writing, deploying, and maintaining a background service
Lambda can now read from SQS; if the worker fits in a Lambda function, you don’t have to worry about servers, infrastructure, or deploying and hosting the code


Rich: In episode 34 of Mobycast, we dive into part 2 of our micro-series on Event-Driven Architecture. In particular, we discuss a few practical examples used at Kelsus. Welcome to Mobycast, a weekly conversation about containerization, Docker, and modern software deployment. Let’s jump right in.

Jon: Hello, welcome Chris and Rich. What a week.

Chris: Hi, guys.

Rich: Hey.

Jon: Alright, I’m excited for another episode of Mobycast. I think we’ll jump right in because I feel like we started to make some good progress describing event-driven architectures and what they’re good for last week. I want to just get right into the making, how we in particular at Kelsus are building them for clients of our size, which range from startups to clients that need to serve hundreds of thousands of users. Let’s go right into it.

Chris, maybe you can talk just a little bit about how a Pub/Sub design pattern might be implemented at the application level.

Chris: Pub/Sub is a very common design pattern and it can be used at just about any level of software. A lot of applications and frameworks have this built in, as just features of the language. JavaScript is a great example where the language itself has the concept of events. You can emit events and then you can set up event listeners that subscribe to those events. We do this all the time when writing JavaScript applications. JavaScript itself is already an event-driven language because the world of web apps is very event-driven.

Just about anything in a modern JavaScript client application is based upon what’s happening in the browser. When a page loads, that’s an event. When page loading is complete, that’s an event and you can subscribe to that and then do certain things. When a key is pressed on the keyboard, that’s an event, so you can set up a subscriber to listen to that event and then take an action when a certain key is pressed. Again, it’s just a very common design pattern that most developers will be very familiar with and are used to doing at the application level. So, extending it to the architectural level is really a very straightforward next step of the abstraction and it should feel very familiar to a lot of folks.
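A minimal sketch of the application-level pattern Chris describes, written in Python for illustration rather than JavaScript. The `EventEmitter` class and the event names are hypothetical; the point is only that the emitter keeps a list of listeners per event name and calls each one when the event fires.

```python
from collections import defaultdict


class EventEmitter:
    """Tiny in-process pub/sub: listeners subscribe by event name."""

    def __init__(self):
        # Maps an event name to the list of callables subscribed to it.
        self._listeners = defaultdict(list)

    def on(self, event, listener):
        """Subscribe a listener to an event name."""
        self._listeners[event].append(listener)

    def emit(self, event, payload=None):
        """Call every listener for the event; return how many were notified."""
        listeners = list(self._listeners.get(event, []))
        for listener in listeners:
            listener(payload)
        return len(listeners)


emitter = EventEmitter()
pressed_keys = []

# Subscriber: record every key that is pressed.
emitter.on("keydown", pressed_keys.append)

notified = emitter.emit("keydown", "Enter")   # one listener is called
unheard = emitter.emit("load")                # nobody subscribed, nothing happens
```

The emitter never needs to know what its listeners do, which is the same loose coupling the architectural version is after.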

Jon: Perfect, I guess we could just go straight there. Do we use the same features of the language in order to extend it to the architectural level? How do we make that leap?

Chris: Here we’ll keep the same concepts but the practicality, the actual mechanics of how it works is definitely different, because now you’re no longer inside a piece of software. You have to now make those remote procedure calls or some way of talking to other systems typically over a wire or over a network type thing.

Basically, you need the ability to have publishers and subscribers. The publishers are emitting events; subscribers are listening to those events. There are so many different techniques and pieces of architecture you can use to implement this pattern out in the real world. You could use systems like ZeroMQ or even Redis databases to implement Pub/Sub. You can use things like push notifications, whether they be the native notifications that Apple has built into its platform or one of the many other platforms out there that implement push notifications on a more general basis.

Then you have things like message queues. There are so many different platforms; this is a very robust architectural component. Message queues have been a key part of internet architectures now for 20-plus years, easily.

Jon: Then should we go to the project lead or CIO and say “Okay, we need three weeks to evaluate all the different message queues in order to figure out which one is the best for our project”. Since there’s so much choice out there, I guess what I’m getting at is, how do we make a decision?

Chris: Sure. This is the age that we live in, where there are usually many different options out there, many different competing products, solutions, and whatnot. You just do what feels right for your situation. As far as message queues go, there are a lot of products out there. There’s the do-it-yourself, host-it-yourself. You can get it as a service, let someone else run it for you, or it can be just built into your actual cloud provider as well.

Folks like Amazon, Microsoft Azure, and Google Cloud all have this building block as a fundamental piece of their systems. I would say, for the most part, take the most integrated path that makes sense for your architecture. If you’re building something in AWS and you need message queues, you’d have to have a good reason not to use SQS, which is their Simple Queue Service that implements message queues in AWS, or at least it’s one of them. They have expanded out. There are a few other ones that have more robust capabilities, but again, you’d have to have a good reason for not using SQS if you’re building inside Amazon. Likewise if you’re in Azure.

Look at Azure as your go to before you start thinking about standing up your own version of ActiveMQ or ZeroMQ, or RabbitMQ, or any other of the MQ systems that are out there.

Jon: Yeah and it’s like, either you’re going to use a managed system or you’re going to use one that you have to stand up yourself. If you stand it up yourself, then you are going to have to pay the tax of making sure your computers are running and that they’re patched, and all that stuff. If you use the managed system, then you just need to choose one that you trust is going to be there for the duration of your system. So, going with some new kid on the block that has an awesome cool managed queuing service is probably not a good idea. Of course, you talked about integration. Your own cloud has definitely done a lot of work to integrate their queuing system with their own services so, go with it. Alright so, that’s what we use, we use SQS. How do we use it? Let’s talk about how that works.

Chris: Yes, maybe we could get into the details, the practical example of doing a Pub/Sub type architecture using things like SQS. Roughly, one of the applications that we have is a form-based application, and when forms are filled out, there are certain things that then have to happen after that. At first, it was really just like, “Okay, when we first built this system, there’s just one thing that had to happen,” so really, we didn’t have a need for an event-driven system, and initially it wasn’t one. When a form was filled out, then go ahead and do this follow-up task, and that was all just part of the one system.

Once we got to the situation where it’s like, “Oh, you know what, in addition to that, we also want to do this other thing,” now it was no longer just the one follow-up, one task. Now, there were these two separate things and they were really not related at all. That’s when it was like, “Okay, this is now a time to do something different and we need to think about becoming more event-driven.” Really, what that ended up being is, we decided to use SNS as our event emitter approach and then use SQS as a way of listening to those messages, receiving them, and then acting upon them.

The way that it works is, when one of these forms is filled out, once that’s been validated, and processed, and stored, it then will emit an SNS message to a specific topic. SNS, Simple Notification Service, is a way of just broadcasting a message to zero to n subscribers. So, anyone that cares about that can subscribe to that particular topic. Again, very loosely coupled: your event emitter doesn’t have to know who’s subscribed to it or if there’s even anyone that’s subscribed to it. All it cares about is, it just sends a message saying, “Here’s the message, here’s what happened, here’s some relevant data about it,” and then it can go on its way.
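As a rough sketch of the emitting side, assuming Python and a boto3-style SNS client: the function below publishes a JSON event to a topic and returns. The event fields, the topic ARN, and the `RecordingSnsClient` stand-in are all hypothetical (they are not from the episode); the stand-in exists only so the example runs without AWS credentials. With real AWS you would pass `boto3.client("sns")` instead.

```python
import json


def publish_form_submitted(sns_client, topic_arn, form_id, submitted_by):
    """Emit a 'form submitted' event to an SNS topic and return the message ID.

    The publisher knows only the topic, not who (if anyone) is subscribed.
    """
    event = {"type": "form.submitted", "formId": form_id, "submittedBy": submitted_by}
    response = sns_client.publish(TopicArn=topic_arn, Message=json.dumps(event))
    return response["MessageId"]


class RecordingSnsClient:
    """Stand-in for boto3.client('sns'); records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def publish(self, **kwargs):
        self.calls.append(kwargs)
        return {"MessageId": f"msg-{len(self.calls)}"}


client = RecordingSnsClient()
message_id = publish_form_submitted(
    client,
    "arn:aws:sns:us-east-1:123456789012:form-events",  # hypothetical topic ARN
    form_id="f-42",
    submitted_by="jon",
)
```

Nothing in `publish_form_submitted` mentions a queue or a worker, so subscribers can be added or removed without changing it.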

Once we had our SNS message being emitted, we then wired up an SQS queue to be a listener to that. This is a principle called SNS fanout. Again, you have one topic and you can fan out to as many different types of subscribers as you want. Subscribers can be Lambda functions if you want. You can wire up an SQS queue to be a listener, or you can use SNS to send push notifications to mobile devices like iOS or Android devices.

In our particular case, we created an SQS queue to be the recipient of these messages and then we had that queue subscribe to this particular topic, so that whenever one of these SNS messages is emitted, it basically creates an SQS message on that queue. Now, we have another software component that’s basically just a background process that is monitoring its queue. As messages come in on that queue, it just pulls them off, opens up the envelope, and then performs whatever action it needs to do on them.
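The background process described here could look roughly like the sketch below, again assuming Python and a boto3-style SQS client. It assumes the subscription uses the default (non-raw) delivery, in which SNS wraps the published text in a JSON envelope under a `Message` key, and that the published text is itself JSON. The queue URL, the payload shape, and the `StubSqsClient` stand-in are hypothetical; the stub only makes the example runnable without AWS and does not model visibility timeouts.

```python
import json


def unwrap_sns_envelope(sqs_body):
    """Open the SNS envelope found in the SQS body and parse the published JSON."""
    envelope = json.loads(sqs_body)
    return json.loads(envelope["Message"])


def poll_once(sqs_client, queue_url, handle):
    """Receive one batch, process each message, and delete it only after success."""
    response = sqs_client.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    processed = 0
    # The 'Messages' key is absent when the queue had nothing to deliver.
    for message in response.get("Messages", []):
        handle(unwrap_sns_envelope(message["Body"]))
        sqs_client.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        processed += 1
    return processed


class StubSqsClient:
    """Stand-in for boto3.client('sqs') so the sketch runs without AWS."""

    def __init__(self, bodies):
        self._pending = [
            {"Body": body, "ReceiptHandle": f"rh-{i}"} for i, body in enumerate(bodies)
        ]
        self.deleted = []

    def receive_message(self, **kwargs):
        batch, self._pending = self._pending[:10], self._pending[10:]
        return {"Messages": batch} if batch else {}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.deleted.append(ReceiptHandle)


body = json.dumps({"Type": "Notification", "Message": json.dumps({"formId": "f-42"})})
stub = StubSqsClient([body])
seen = []
first_batch = poll_once(stub, "https://example.invalid/form-queue", seen.append)
second_batch = poll_once(stub, "https://example.invalid/form-queue", seen.append)
```

A real worker would call `poll_once` in a loop; deleting only after the handler returns means a message whose handler raised stays on the queue to be retried.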

Now, we have a very extensible system. When we want to go from two follow-up steps to three, that can just be another subscriber to the topic, and it will receive the message as well. All these things are happening in parallel; it’s fanning out. We have extensibility, and we have the loose coupling. We can rapidly iterate on this.

Jon: I think, Chris, the thing I’m having a hard time getting my head around is that it feels like there’s two message systems involved here and I don’t quite understand why. Why is it that we have to send a message through SNS then receive it through SQS and then have people subscribe or have functions or whatever, subscribed to the SQS system? Why can’t we just use SNS or just use SQS?

Chris: This is what gives us the loose coupling though, the one-to-many functionality. We could just make this all SQS. We could definitely say, “When a form is submitted, emit an SQS message, put it onto the queue, and then have something else read out of it.” We could use SQS only, but the problem with that is, in that code, you have to know exactly what queue you’re talking to. If you then want to extend it to multiple queues, you’d have to update that code to duplicate the message to each one of the queues that you wanted to receive it.

You start getting this more hard coupling going on in the system. SNS fanout is just really this nice, generic, easy way of just saying like, “I don’t want to hard code my subscribers into my event emitter code. Instead, I’m going to have something in the middle that wires or that glues this stuff up outside of my code.” That’s what we use SNS fanout for.
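The coupling argument can be illustrated with a small in-memory model in Python. This is a toy, not AWS: the `Topic` class, the queue names, and `submit_form` are hypothetical stand-ins for an SNS topic, SQS queues, and the form-handling code. What it shows is that a second queue is wired up outside the publisher, which never changes.

```python
from collections import deque


class Topic:
    """Toy stand-in for an SNS topic: copies each message to every subscribed queue."""

    def __init__(self):
        self._queues = []

    def subscribe(self, queue):
        self._queues.append(queue)

    def publish(self, message):
        for queue in self._queues:
            queue.append(message)
        return len(self._queues)


def submit_form(topic, form_id):
    """Publisher code: it knows the topic and nothing about the subscribers."""
    return topic.publish({"formId": form_id})


topic = Topic()
email_queue = deque()   # hypothetical first follow-up task
audit_queue = deque()   # hypothetical second follow-up task, added later

topic.subscribe(email_queue)
submit_form(topic, "f-1")

# Adding a subscriber is wiring done outside submit_form, which stays untouched.
topic.subscribe(audit_queue)
submit_form(topic, "f-2")
```

With an SQS-only design, `submit_form` would instead hold a list of queues and would need editing every time a follow-up task was added.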

Jon: Let me make sure I understand. The code can send a message to SNS, SNS fanout is what enables there to be multiple subscribers, and then the subscribers are the SQS queues, and essentially, those SQS queues are just going to store the events until somebody’s ready to process them and that’s all they’re there for?

Chris: Correct.

Jon: But if we didn’t really care about storing the events, could we have subscribers subscribe directly to the SNS fanout topics?

Chris: I think there’s like eight or nine different possible types of subscribers that you can add to SNS. We have different ways of doing it. Using SQS is good. Let’s say the worker is busy and it’s not looking to read the message, or maybe you’ve rebooted it for some reason, or it’s just not running; for whatever reason, that message is preserved, and once the worker is ready to process the message, it can. As opposed to a real-time fire hose: if you’re there to get it, you can process it. Otherwise, it gets dropped.

Sometimes you want that kind of behavior and you can definitely do that. It’s just that in our particular case, we want to take a very specific action when this happens and make sure that it happens.
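The difference between the two behaviors can be sketched with another toy model in Python. Both classes are hypothetical and deliberately simplified: the model ignores any retry behavior a real service may offer and only contrasts "delivered while you are listening" with "stored until you are ready".

```python
from collections import deque


class DirectListener:
    """Fire-hose style: messages sent while it is not running are lost."""

    def __init__(self):
        self.running = True
        self.received = []

    def deliver(self, message):
        if self.running:
            self.received.append(message)


class QueuedWorker:
    """Queue-backed: every message is stored and processed once the worker is up."""

    def __init__(self):
        self.running = True
        self.queue = deque()
        self.received = []

    def deliver(self, message):
        self.queue.append(message)  # always stored, whether or not we are running
        self.drain()

    def drain(self):
        while self.running and self.queue:
            self.received.append(self.queue.popleft())


direct, queued = DirectListener(), QueuedWorker()
for subscriber in (direct, queued):
    subscriber.deliver("form-1")
    subscriber.running = False      # e.g. the process was rebooted
    subscriber.deliver("form-2")    # arrives while the subscriber is down
    subscriber.running = True

queued.drain()  # the worker comes back and catches up on what it missed
```

Only the queue-backed worker ends up having seen both forms, which is the guarantee Chris says the follow-up actions need.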

Jon: Just so I can characterize it, SNS is for the Pub/Sub, so they publish and let there be different subscribers. So, it’s for the decoupling and the Pub/Sub part of it. SQS is for making sure we store the events that happen so that they all get processed. Also for making sure that they get processed in the order that they came in, right?

Chris: Yes, and just with a slight caveat, the ordering is not necessarily guaranteed. It’s your choice in AWS. There are queues that guarantee ordering and there are other queues that don’t guarantee you that, so it just depends on what your requirements are.

In a lot of cases, we don’t care. We don’t care if form A was submitted first or form B was submitted. We just care about doing the follow-up actions on each one of them independently.

Jon: Yeah, making sure that they get done and not dropped.

Chris: Yep.

Jon: Right on. Well, that sounds pretty straightforward. Just because I want to make sure that this episode is topical and recent, you might not recall, but I did see that AWS, within the last three or four weeks, added a new feature on either SNS or SQS and it was the ability to attach Lambda functions directly. I think it was on SNS, it was something where everyone was like “Ahh, finally, why wasn’t this a feature from day one,” kind of thing. Do you happen to remember what that one was?

Chris: Yeah, so Lambda can be a subscriber to an SNS topic. We do it this way for some other reasons as well, so you fan out to SQS and you need to then process it; you need some code to do that. It ends up being like you write another piece of software; it’s a background service, like a daemon-type job. You need to write the software, you need to deploy it, maintain it, and make sure it’s up and running. There’s a fair amount of lifting there to get that piece in place; you don’t get it for free. What the feature is, is that Lambda can now subscribe to SQS. It’s not SNS, it’s SQS.

What it means is, instead of having to go write a piece of code and host it yourself to be an SQS worker, if your worker fits inside of a Lambda function and that makes sense for it in that environment, you no longer have to worry about servers and deploying that code and doing the infrastructure for it. Instead, you can just have your Lambda function, set it up to read from the SQS queue that you’re pushing these events to, and then you’re completely serverless.
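A sketch of what such a worker might look like as a Python Lambda function. It assumes the queue is subscribed to the SNS topic with the default (non-raw) delivery, so each SQS record body is an SNS envelope with the published JSON under `Message`. The `process_form_event` function and the `formId` field are hypothetical placeholders for the real follow-up task.

```python
import json


def process_form_event(event_data):
    """Hypothetical follow-up task; here it simply returns the form ID it handled."""
    return event_data["formId"]


def handler(event, context):
    """Lambda entry point for an SQS trigger: one invocation carries a batch of records."""
    handled = []
    for record in event["Records"]:
        envelope = json.loads(record["body"])       # SQS body holds the SNS envelope
        form_event = json.loads(envelope["Message"])  # the JSON the publisher sent
        handled.append(process_form_event(form_event))
    return handled


# Local smoke test with a hand-built event shaped like an SQS batch.
sample_event = {
    "Records": [
        {"body": json.dumps({"Message": json.dumps({"formId": "f-7"})})},
        {"body": json.dumps({"Message": json.dumps({"formId": "f-8"})})},
    ]
}
result = handler(sample_event, None)
```

There is no polling loop and no explicit delete here: the service invokes the function with a batch and removes the messages from the queue when the handler returns without raising.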

Jon: That’s huge, especially for events that might be kind of sparse, where they happen a few times an hour or that kind of stuff.

Chris: Absolutely.

Jon: Very cool. Once again, we’re out of time. I could definitely talk about this for another hour, but I suppose we’ll save it for next week. Talk to you next week.

Chris: Great. Thanks guys.

Rich: Take care.

Jon: Thanks, Rich. Thanks, Chris.

Rich: Well dear listener, you made it to the end. We appreciate your time and invite you to continue the conversation with us online. This episode along with the show notes and other valuable resources is available at mobycast.fm/34. If you have any questions or additional insights, we encourage you to leave us a comment there. Thank you and we’ll see you again next week.
