08. Core AWS Services
Chris Hickman and Jon Christensen of Kelsus and Rich Staats from Secret Stache discuss core Amazon Web Services (AWS) they know and love and that you can use regularly. In this episode, we describe Elastic Container Service (ECS), Elastic Beanstalk, Relational Database Service (RDS), API Gateway, Lambda, Simple Notification Service (SNS), Simple Queue Service (SQS) and Simple Email Service (SES).
Some of the highlights of the show include:
- ECS manages the scheduling, running, and monitoring of containers across a cluster of machines. It’s a foundational service if you’re using Docker and containers.
- Some alternatives to ECS are to deploy your own servers using EC2 and orchestrate with a 3rd party system like Kubernetes, or to use a Platform-as-a-Service (PaaS) such as Heroku, or AWS Elastic Beanstalk.
- Beanstalk is a PaaS which is now about seven years old, and it manages at the VM level, not the container level. Compared to Heroku, Beanstalk allows you to access the underlying VM, which provides more control and enables better troubleshooting. But Beanstalk does not manage your containers. Each EC2 machine deployed by Beanstalk can be used for only one app, so it could be more expensive.
- Heroku’s philosophy is like Apple’s: it’s very opinionated and makes everything super polished and easy to use for non-techies, with just a few knobs to turn. AWS has a different philosophy – they give you 19,000 knobs to control everything.
- Some disadvantages of Beanstalk: It creates a proliferation of security groups with cryptic names that can become a nightmare to manage. If you take the easy route and let Beanstalk provision your DB, then Beanstalk will also delete your DB if you delete your app: beware! Because Beanstalk uses a specialized AMI (Amazon Machine Image), it took about a week to get a patch for the Spectre/Meltdown vulnerability, versus one day for a patch on ECS.
- Jon describes how AWS services use other AWS services as building blocks. It reminds him of ‘recombinant’ foods that you might find in the midwest, where complete foods like canned soups and tater tots are mixed together to invent new dishes. Chris will now always associate Beanstalk with tater tots. 😉
- Amazon Relational Database Service (RDS) manages relational databases. (No surprise here!) It supports Postgres (our favorite), MySQL, Amazon’s own Aurora DB engine, MS SQL Server, and Oracle. RDS manages the servers, makes it easy to do backup & restore, provides fault-tolerance and multi-availability-zone deployment, and automatically applies patches and updates during your defined maintenance windows. You could do all that stuff manually, and save a little bit on your monthly AWS bill, but it’s really hard to justify *not* using RDS. It’s just the responsible thing to do.
- AWS API Gateway is a front-door for API calls. It looks at request URLs and routes to the appropriate service or function you define. It also provides throttling, quotas, and metrics. The routing is similar to how web app frameworks like Express or Sinatra route incoming request URLs to functions within your app. An alternative to API Gateway is to use load balancers, and roll-your-own throttling, quotas, & metrics.
- AWS API Gateway can also serve as your front door for exposing serverless functions, as with AWS Lambda.
- AWS Lambda provides serverless functions. Chris has mixed feelings about the serverless programming paradigm. It requires you to completely rethink the way you structure your software. What we logically think of as a cohesive application devolves into a scattered set of separate functions. Some frameworks, like CloudFormation, are beginning to help manage all your separate functions, but it is still difficult.
- Good use cases for Lambda and serverless include: (1) event-driven actions. (2) middleware to glue together different apps. (3) scheduled jobs where you might otherwise use Cron. (4) Loosely coupled and flexible event-driven functions, for example sending an email when some data is updated in a DB. This could be used with AWS DynamoDB streams, which can be configured easily to trigger a Lambda whenever certain data is updated.
- Some limitations of Lambda: Lambda is difficult to monitor, especially in high-volume applications. It’s also not optimal for apps that need very fast responses. Because Lambda is starting containers behind the scenes, if your container isn’t already running, then the response will be slow.
- Amazon Simple Notification Service (SNS) is a full-featured publish/subscribe system based on topics. Any number of subscribers can listen to a topic and will be notified whenever a message is published, so multiple subscribers can act on a message. A common use case for SNS is mobile push notifications.
- SNS can provide ‘fan out’ by having multiple services like SQS, Lambda and SES as subscribers to a single topic.
- Amazon Simple Queue Service (SQS) provides reliable messaging for a single recipient. It’s useful where a single unit of work needs to be done. Whichever recipient reads the message first can act on it.
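The difference between the SNS and SQS delivery models described in the highlights above can be sketched with a small in-memory simulation. This is only an illustration, not AWS code: the `Topic` and `Queue` classes and the message text are hypothetical, and real SQS adds details (such as visibility timeouts and explicit deletes) that are left out here.

```python
from collections import deque


class Topic:
    """Publish/subscribe: every subscriber receives every message."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        # Fan out: deliver a copy of the message to each subscriber.
        for callback in self.subscribers:
            callback(message)


class Queue:
    """Work queue: each message is handed to exactly one consumer."""

    def __init__(self):
        self.messages = deque()

    def send(self, message):
        self.messages.append(message)

    def receive(self):
        # Whoever asks first gets the message; it then leaves the queue.
        return self.messages.popleft() if self.messages else None


email_log, push_log = [], []
work_queue = Queue()

topic = Topic()
topic.subscribe(email_log.append)   # stands in for an email sender
topic.subscribe(push_log.append)    # stands in for a mobile push sender
topic.subscribe(work_queue.send)    # stands in for a queue subscribed to the topic

topic.publish("order-42 shipped")

first_worker = work_queue.receive()    # claims the message
second_worker = work_queue.receive()   # nothing left to claim
```

All three subscribers see the published message, but once it lands in the queue only the first worker to read it gets to act on it.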
Links and Resources
Mobycast Episode 1 – Virtual Machines vs. Containers
Mobycast Episode 3 – An Introduction to Elastic Container Service (ECS)
Amazon Web Services (AWS)
AWS ECS
Amazon RDS
Docker
Kubernetes
Elastic Beanstalk
Heroku
Django Python
MySQL
Amazon Aurora
AWS Lambda
API Gateway
Amazon DynamoDB
Amazon SNS
Amazon SQS
Amazon SES
Sumo Logic
Ruby on Rails
Sinatra
Kelsus
Secret Stache Media
In episode 8 of Mobycast, Jon and Chris discussed the core AWS services they use regularly. Welcome to Mobycast, a weekly conversation about containerization, Docker, and modern software deployment. Let’s jump right in.
Jon: Welcome Chris and Rich to Mobycast number eight.
Rich: Hey guys!
Jon: Good to have you. This is another fun week of talking about containerization, developing with AWS, and particularly, we’re going to talk today about the services that we like on AWS, and we use, and we find fundamental to doing software engineering in getting our jobs done and in a couple of cases, some ones that we absolutely don’t like and actively move away from.
I think this will be a part of a two-part series where today, we’ll talk about the things we’re using and next week, we’ll be talking about things that we’re evaluating and excited about and comparing those to other options out there. But before we get started, I thought we’d just see what we’re up to this week. Chris, what have you been up to this week?
Chris: I think I’m going to sound like a broken record in that I have been just super busy, heads down with my team, working on a very big and important product milestone across a handful of separate projects that are all intertwined. It’s actually a lot of moving pieces, pretty complicated but some pretty powerful software for a very important client.
We have some ambitious goals and we’ve executed very, very well. I’m really happy about that. It’s been quite a push but we made it alive at the end of the tunnel.
Jon: Nice. I’m excited to hear that. And you’re looking forward to a well deserved vacation coming up.
Chris: I am indeed. I’m leaving the continent.
Jon: How about you Rich? What have you been up to this week?
Rich: This week, we’ve been working on launching the Pro Docker training website, which by the time this episode airs, will be live. I think we’ve been working on this for about two months now, collaboratively. It’ll be nice to see this thing indexed in Google and driving some traffic. It’s version one of hopefully many versions but I think we started pretty nice. It was pretty good.
Jon: I’m really excited about it. I’m excited for people to start hearing our voices and what we have to say. I’m excited to continue on this journey of bringing these thoughts around professional software engineering to more and more people.
I, myself, this week, I’ve also been working a bit on Pro Docker training with Rich and the other thing that I’ve been doing is I’ve been reading this book this week that’s super interesting. It’s called Conspiracy. I can’t remember who it’s by off the top of my head but it would be easy to find because it’s about how Peter Thiel took down Gawker Media.
I expected, because most of the commentary I’ve read about that has been from the point of view of, what an awful thing to do to give other billionaires and people with infinite resources a blueprint for how to take down individuals and smaller companies, which I tend to agree with that sentiment, but this book is written from the point of view of what an amazing thing Peter Thiel did to pull off this intricate conspiracy and get rid of a horrible entity in the world. There’s some truth in that too. Pretty interesting read. I’d definitely recommend it to people who like to read about intricate conspiracies like that.
Let’s get into talking about AWS Services. Today, we’re just going to talk about some of the services that we use and love. We’ve already talked about one of them in depth in episode three and that was Amazon’s Elastic Container Service and we’ll just redefine it here today and give some quick highlights about what it is and why it’s good but definitely go back and listen to that episode three if you’re more interested in the details of ECS. What is that though, Chris? And why do we use it? Why do we love it?
Chris: ECS is definitely the foundational AWS service that we use since we are fans of containerization. We dockerize all of our apps. ECS stands for Elastic Container Service and what it is basically, it’s a container orchestration service for scheduling, running, monitoring your containerized applications across a cluster of nodes inside your Amazon cloud.
This is the go-to service that we’re using on a daily basis. Whenever we’re doing a deploy of our code, we’re using ECS. Whenever we’re launching a new application, we’re using ECS. When we want to see what’s going on, when we have to troubleshoot, we’re looking at ECS as well. It is in the wheelhouse for us, given that we are big fans of containerization.
Jon: Just to contrast that with what we could be using. We’re using ECS as opposed to, say, just spinning up individual machines on EC2, and installing our software there, and managing it there or installing our containers on to those individual machines and using some other orchestrator like Kubernetes. We’re using ECS in place of using a platform as a service like Heroku or a platform as a service like Elastic Beanstalk and in fact, we do not like Elastic Beanstalk.
Let’s talk about Elastic Beanstalk a little bit, what it is, and why we’ve moved away from it. There’s a few applications that we used to have on Elastic Beanstalk and we’ve pulled them out of there. What is it first, Chris and then, what are some things we don’t like about it?
Chris: Elastic Beanstalk in a way, it’s a precursor to ECS but it’s at a higher level of abstraction. It really is going to fall into the PaaS category, the platform as a service. You can think of it as where ECS is dealing with the container level. Things like Beanstalk are dealing with things at the VM level. It’s a much larger unit of granularity.
Definitely, Beanstalk, it’s a pretty established technology. It’s been around now for probably at least six or seven years. It’s been around for quite some time. At that time, it was definitely a wonderful tool. It allowed you to really easily spin up the resources that you need in order to bring up the application and it served a really valuable purpose.
Jon: Chris, I think that one of the things that at that time Elastic Beanstalk offered that its competitors didn’t was the ability to get in touch with the machines that your platform, as a service, was running on. The obvious competitor of Elastic Beanstalk is Heroku. People that have used Heroku know what that is.
You know that you send your code to Heroku, and Heroku just spins it up, and now it’s running, and you can access it in your browser. If you want to look at logs or if you want to do anything, you have to use the specialized Heroku commands to get access to anything and you also have to connect. There’s no Heroku services to see anything. Elastic Beanstalk was a step in the right direction from that world.
It’s really hard to troubleshoot and monitor on Heroku. Can we just get a little more? Can we see these machines? And with Elastic Beanstalk, all of a sudden you could. You can turn them on, you can turn them off, see them, […] to them. You just got a lot more control and it was a wonderful step forward.
Chris: Yeah. It’s actually interesting because this is actually probably another full discussion we’re going to have about just like Amazon’s approach to building products and services as opposed to […]. For me, a really great distinction, it’s like Apple versus Amazon.
Apple is the full up, polished, make it as simple as possible, give you one knob that’s all you have to turn and Amazon, it’s like there is no common UI or UX. Every team, they get to design whatever it is they want and they’re not going to do the best job in the world at it but they’re going to give you just the raw capabilities. It’s more of like, here’s the raw tools, deal with it however you want and we’re going to give you 19,000 knobs. Heroku was absolutely more of like, here is a finished product/service with just a few knobs. For some people that was probably a great benefit and for other people, like what you point out, it’s frustrating because you actually want to get under the covers and do certain other things.
Jon: Right. Another analogy that I like here is this, this is maybe a funny analogy but have you ever heard of Recombinant Cuisine? It’s a midwestern invention.
Chris: No.
Jon: That’s when you have things like hot dishes where inside the hot dish the ingredients are not whole food ingredients but rather created food ingredients. You might have a hot dish that has tater tots in it, or it might have ketchup in it, or it might have just cream of mushroom soup, or just various things that are already their own food put together to create a new food based on another created food.
The analogy then goes, Elastic Beanstalk is a recombinant cuisine of EC2 and all these other Amazon services so that when it’s all stirred together in this final hot dish, you still get access to the underlying services. You can still see these […] instances or you can still see the security groups and everything else that’s built up to create this meal for you.
Chris: I will forever now associate Beanstalk with tater tots.
Jon: There you go.
Chris: Maybe the other thing to add with Beanstalk is, again, it’s basically virtualizing at the VM layer and not at the containers. We’ve had long discussions about VMs versus containers.
Jon: It’s basically, that ones are first steps that now we can […] a self referential. […] I said when we talked about that.
Chris: Right. Go back. Listen to that one because that’s really like the analogy here. ECS is for containers and Beanstalk is for VMs. I will say, internally, we had been building all of our apps using Beanstalk when I first came on board, because we weren’t yet using containers.
Again, it worked well but it’s also pretty wasteful because if you want to have like two instances of your app running for redundancy, that means that you’re spinning up two separate EC2s that are dedicated solely to that app. You can’t really run anything else on those EC2s.
We had many additional EC2s, which meant we had issues to deal with like okay, what happens when we have security patches that are out that we need to get installed and what happens when we look at our monthly bill that we’re paying for, and just other security issues. Security groups, making sure that everything is locked down, and it’s just much more difficult, much more complicated, much more expensive to operate, maintain, and to monitor. Moving from that VM as the unit of deployment to containers was just a huge, huge improvement for us. Beanstalk is just really no longer part of our tool set. We’ve switched over completely to ECS.
Jon: Yeah. I think there’s two other things about Beanstalk that really drove us away from it and towards ECS. One of them is, you mentioned security a little bit. Security is always a rabbit hole but I believe that Beanstalk has a tendency to create a proliferation of security groups, especially if you use the console for creating your Beanstalk applications. It automatically creates security groups for the instances to talk to each other and to talk to the database, and the next thing you know you have all these rather anonymous looking security groups in your AWS console and nobody knows what they’re for and if you have several applications running, then you just end up with legacy security groups.
Chris: And all those security groups are all called launch wizard or something like that.
Jon: Yes. That’s a real drag when it comes to Elastic Beanstalk and I should say, we’re AWS partners at Kelsus and I don’t know if we’re allowed to talk negatively about AWS Services. We might get a cease and desist from saying these bad things about Elastic Beanstalk. But then, the other thing about Elastic Beanstalk that has been a drag is there’s a little secret that happens that, again, if you use the console, which why wouldn’t you use the console to create a new application? Why would you do it a hard way when there’s this easy way that they make available for you? You use the console and you let the console create a database instance for you. You end up in a world of hurt. What is that world of hurt again, Chris?
Chris: Beanstalk makes it very, very simple for you. As you’re spinning up the application you say, “Yeah. I have a database associated with my application. Go ahead and create that for me as well.” It’s super easy to do. Boom, you’re up and running. Everything is working perfectly. Then down the line, you say, “I’m going to build a new version of this application.” Maybe we’re going to port it or maybe we’re doing some refactoring so we want to break it up into two microservices instead of one monolith. We’re breaking up the Django Python app into two Node services. We’re going to delete that Beanstalk application and bring up these two new ones but we want to keep the database. We don’t need to change that. All of our data’s there and whatnot.
If you’ve had Beanstalk create your database for you, when you go and delete that Beanstalk application, what else is it going to delete? Your database. Not a good thing to do. That’s definitely a big lesson learned. The big gotcha is don’t let Beanstalk build your database for you, instead you need to create that separately. Do it yourself in RDS and then just update your application to say like, “Here is my connection string. I’m going to go talk to this RDS instance.” Don’t let Beanstalk touch your database.
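As a rough sketch of the "here is my connection string" idea, the application can read the database location from its environment, so the database lives and dies independently of whatever is hosting the app. The environment variable names, the host name, and the credentials below are all hypothetical placeholders, and Postgres is assumed only because it is the engine mentioned in the episode.

```python
import os
from urllib.parse import quote


def database_url(env=os.environ):
    """Build a Postgres connection URL from environment variables.

    The variable names (DB_HOST, DB_PORT, ...) are hypothetical; use
    whatever names your own deployment already sets.
    """
    host = env["DB_HOST"]               # endpoint of the separately created database
    port = env.get("DB_PORT", "5432")   # default Postgres port
    name = env["DB_NAME"]
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")  # escape special characters
    return f"postgresql://{user}:{password}@{host}:{port}/{name}"


example_env = {
    "DB_HOST": "mydb.example.internal",  # placeholder, not a real endpoint
    "DB_NAME": "app",
    "DB_USER": "app_user",
    "DB_PASSWORD": "change-me",
}
url = database_url(example_env)
```

With this shape, replacing the application (or splitting it into two services) only means giving the new services the same settings; nothing about the app's lifecycle touches the database.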
Jon: Throughout all of this, it seems not only is it that we want to use ECS because it uses containerization but Beanstalk, with all of its great things that it did in order to enable us to see what’s going on inside of our platform as a service, it came with some opinionated UX type stuff, and an opinionated architecture, and opinionated design that ended up biting us. Even if it wasn’t for the fact that containerization offers this great way forward, we’d probably still be a little unhappy with Beanstalk.
Chris: One other thing that just came to mind with Beanstalk is, chalk up another one in the con column, if you will. You guys remember Spectre and Meltdown?
Jon: Yes.
Chris: The […] with that a few months back. All of our OS had to be patched in order to deal with that and plug those security holes. Amazon, they almost immediately had patches for the Amazon Linux AMI. Our ECS cluster, where we had most of our software running, it was literally, all we had to do is just go change the launch configuration inside ECS to say, “Hey, here’s the new AMI I want you to use.” And then we just cycled through our cluster EC2s and terminated them and let the autoscale group bring up new ones with the new AMI. Literally, took minutes of time to go through and patch all these instances.
We have one legacy Beanstalk application and that particular one, it was much more difficult and actually, there was a lot more […] associated with how soon we get that patched because with Beanstalk, it’s not the straight Amazon Linux AMI that it’s using, it’s actually the Beanstalk version of it. They didn’t have a patch for that right out the gate.
There’s two ways of getting patches. You could do it manually so you could log in to the machine and figure it out. You could do it manually through a lot more effort or, like the preferred way is, to let the managed updates feature of Beanstalk handle this. You can define a maintenance window when you want to receive updates and whatnot. The point here is that, I think our Beanstalk application, it took a full week before it got patched for the Meltdown and Spectre vulnerabilities versus our ECS machines were patched immediately.
Jon: That is awesome. That one is a super cool thing that ECS made possible.
Chris: Indeed.
Jon: I know, Rich, that in your business, neither of these are coming up for you that often but I’m curious as we talk about AWS services that we use, that are indispensable. Are you, in your business, using any AWS services? If so, what are you using?
Rich: We are using AWS but not directly. The source is that we use AWS. Our hosting provider, WP Engine, everything is on AWS but we don’t actually interact with it at all.
Jon: You have these managed services or software as a service services that you’re taking advantage of. It would be interesting to know what they’re using and my guess is a lot of them probably still haven’t made the shift to containerization and may be using things like Beanstalk or straight EC2 instances that they manage themselves.
Rich: I’m not even really sure but I know they’re pretty transparent about it. I’m just quickly seeing if there’s anything in here that would tell us that.
Jon: They may be transparent and they may say, “These are the regions that we use, or this is where the stuff is, and this is some of the stuff that we do to make sure that things are done and things don’t fall over.” But I would guess that they probably wouldn’t get that much into the, “Here’s how the sausage is made.”
Rich: Right. And they’re not actually saying anything on their website anymore. We have used S3 for storage directly but that’s rare because we can just […] CDN with WP Engine. I’m not sure what that thing is, it’s like NetDNA or something like that. I played around with AWS in the past when I had to but only S3, I think, creating buckets and stuff like that.
Jon: It does suggest that as much as AWS may want to try to move (I don’t know which direction this is on the market) towards more usable services that aren’t totally engineer-y, they’re not succeeding. They really rely on the likes of companies like WP Engine or Heroku, for example. I’m sure Heroku is probably, actually, using AWS behind the scenes to create services that non-super developer types or software engineers can make use of. In the list of services they have, there are definitely some that are targeted towards companies like yours and you’re not even finding out about them. Interesting.
The next one I have on the list here is RDS. It’s almost even funny to talk about it because we’ve been using it for years, and years, and years, and is there even an alternative. RDS is just so fundamental but maybe we could talk about it a little bit. Do you have anything you’d like to mention about what we like about RDS, Chris?
Chris: We could probably define what it is first. RDS stands for Relational Database Service. It is a managed service providing relational databases in various flavors. Things like Postgres, MySQL, there’s Amazon’s own relational database called Aurora. They also offer Microsoft SQL Server as well as a host at […], and I believe Oracle as well.
A bunch of different flavors of relational databases but the really great thing is it is managed. You don’t have to manage the servers yourself. Amazon is doing that for you. It provides very great support and it’s super easy to do things like backups and restores. It supports availability and fault tolerance features like multi-availability zone configuration just with a mouse click.
If you have any kind of need for a relational database, definitely you have to look at RDS and there’s got to be a super good reason why you wouldn’t use it. There’s so many benefits to it. It’s almost like it’s the responsible thing to do as opposed to running the stuff yourself.
You could certainly do that and you could spin up EC2s, and install Postgres on them, and run it yourself, and it would be cheaper but you’re assuming so much risk and responsibility for reliability by doing so, and it’s just a hassle, why would you do that? RDS is definitely one of those core things that we use with basically every single app.
Jon: Yeah. I’m thinking back to 2001, setting up some databases for some major clients like Lowe’s and Sears. I actually remember doing things like vacuum, which was a command we used to run on databases, and reindexing. I don’t think you have to ever think about that stuff anymore with RDS. Is that right?
Chris: You still absolutely have to deal with the application level stuff. You will define your indexes and you’ll tweak and tune them and look at them from a performance standpoint but you don’t have to worry so much, you definitely don’t have to worry about the hardware.
Jon: That’s not what I meant. Yeah, absolutely setting up your indexes and making sure your queries are performing and all that but there used to be some just maintenance commands that you had to run on databases and one of them was vacuum and another one was reindex. What reindex did was just go through the database and make sure that all your indexes are up to date and what vacuum did was the same thing as defragging a hard drive, re-consolidating all your data and moving it into nice places and getting rid of gaps in your database from deletion and things like that. I don’t know if those have just been replaced by better database engines or those have been replaced by managed database services like RDS.
Chris: I think it’s definitely the former. It’s just like we don’t really have to defrag our drives anymore.
Jon: Just the engines themselves are better.
Chris: Yeah and not only that. There’s frequent updates and patches. These databases are being patched so that is something that you do get with RDS where you don’t have to install the patches yourself. You can define those maintenance windows and RDS will do that for you. That kind of maintenance is definitely provided for you. I think there was a lot more pain 15 years ago with just keeping things up and running, for sure doing those things and there’s less of that to do now though it’s been replaced by other things that we have to do.
Jon: Right. Of course, we’re moving on. The next one we have on the list is API Gateway. What is that?
Chris: It’s a front door service for API calls and you get some common core functionality from that. You get things like, if you want to do throttling and quotas, and just capturing raw metrics, just understanding the raw metrics of how many times your APIs are called. It gives you the flexibility of how you want to route those API calls.
A great example is, and I think we’re going to get more into this later, serverless is definitely a pretty hot topic nowadays and Lambda is one of those serverless technologies with AWS. It’s very much possible for you to write your application completely serverless. You can write it as a suite of Lambda functions but then the problem becomes like, how do you invoke those Lambda functions from the outside world? What if I have a mobile application and I need to invoke these things? How is it able to discover and talk to it? And that’s where something like API Gateway comes in.
It’s that front door, it’s internet-facing, it’s got a DNS name, and your mobile app can go and make its call to it. You define your application endpoints in API Gateway and when these requests come in, it sees them and then you tell how […]. You can say, “Hey, when someone calls the get messages endpoint,” it knows to route it to this particular Lambda function, and invokes it, and then it returns back the results. It’s definitely a key core piece of the serverless puzzle of building out an application that way.
Jon: I don’t know how we’re using it. Is that how we’re using it, where we have serverless Lambda functions and we’re fronting them with API Gateway or are we sending API Gateway requests to ECS?
Chris: Total truth in lending disclosure here. We are not big users of API Gateway. We actually have just started to use it because we have just started to have other applications that we interact with that are being built with some of the serverless technologies like Lambda. We’re consumers of API Gateway.
The alternative to API Gateway is Elastic Load Balancers fronting your applications. You’d assign DNS names to your Elastic Load Balancers and those are the front door into it and then whatever value added services you need on the actual API itself, you would roll your own. We have not had a huge use case scenario for doing things like throttling. We usually implement our own user identity profile stores and whatnot.
Using API Gateway in that way hasn’t been really on our radar but as we go forward and we look at some of these newer technologies and where it makes sense, we’re definitely going to be looking at using API Gateway more if we do start doing more serverless, more Lambda stuff.
Jon: A takeaway for me on what you’re saying is that when we use technologies like the Node.js Express framework or when we use Ruby on Rails, those technologies come with a routing layer already and as long as you can get to that routing layer, the parameters or things after the slashes in the URL, the routing layer takes care of that and then gets it to the right function within the Node application or the Ruby application. But when it comes to API Gateway that’s […] that you don’t have a software routing layer and that you’re wanting to go directly to your function that’s exposed by something like Lambda, right?
Chris: Yeah. That’s a very wonderful way of thinking about it because there really is, the canonical use case here, is that if you’ve split up your app across a collection of functions that are not part of the same application then yeah, how do you route to them? API Gateway becomes your routing layer. It’s looking at the path coming in and it sends it off to the appropriate function versus if you are Rails, Sinatra, Express, you’re doing all that stuff in the app itself. All the requests are coming in to that one application and then the application is then looking at the request and saying, “Okay, what function do I route it to?” And that function lives inside of that application, not as a separate service.
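The routing layer being discussed here boils down to a lookup from a request's method and path to a function. The sketch below shows that idea in plain Python; it is not how Express, Sinatra, or API Gateway are implemented, and the endpoint names, request shape, and handler functions are made up for illustration. In a framework the table lives inside the one application; with a gateway in front of separate functions, a table like this lives outside the code and each entry points at an independently deployed function.

```python
def get_messages(request):
    # Hypothetical handler: return a fixed list of messages.
    return {"status": 200, "body": ["hello", "world"]}


def create_message(request):
    # Hypothetical handler: echo back the message that was posted.
    return {"status": 201, "body": request.get("body")}


# Route table: (HTTP method, path) -> function to invoke.
ROUTES = {
    ("GET", "/messages"): get_messages,
    ("POST", "/messages"): create_message,
}


def route(request):
    """Look up the handler for a request, or answer 404 if none matches."""
    handler = ROUTES.get((request["method"], request["path"]))
    if handler is None:
        return {"status": 404, "body": "not found"}
    return handler(request)


found = route({"method": "GET", "path": "/messages"})
created = route({"method": "POST", "path": "/messages", "body": "hi"})
missing = route({"method": "GET", "path": "/nope"})
```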
Jon: I don’t know. We’ll see if we end up doing more serverless stuff and that actually was the next part of this talk, just what do we like about Lambda? What do we dislike about Lambda? You’ve already defined Lambda. It’s this thing that lets you write a function that lives in the cloud and the function could just be called from anywhere and then it does its job and returns its answer.
If it was a silver bullet, I think we would have jumped right on it and be using it for everything but I think that it comes with a few drawbacks that cause us to still use applications and frameworks instead of doing serverless everything. Maybe talk about some of our experience and some of our hesitation around jumping on the serverless bandwagon.
Chris: Definitely have mixed feelings on Lambda and mostly, from the standpoint of it causes you to change the way that you write your software. You’re basically breaking it up into these chunks that are standalone and they’re deployed individually. You’re breaking up your cohesive application into a series of functions essentially and each one of those are like atomic units.
To me, it feels weird and takes away some of the joy of the craftsmanship of putting together a cohesive application. That said, there are definitely great uses of Lambda, and mostly, for me, it’s a no-brainer to use when I have events that I want to respond to and take some kind of action on, or when I can use functions as middleware to glue together events from one system to another. That’s where I’m super excited about Lambda. I can’t imagine right now saying, “I’m going to go and architect a complete, full-on application using nothing but Lambda.” I just don’t see that happening.
Jon: I think one of the reasons that that doesn’t feel natural is in part because I don’t think Lambda gives you lots of organizational capabilities. An application, a lot of times, has a purpose. There may be many functions that it can perform but those functions are probably related to one another, and the overall application has a purpose in life that it’s trying to accomplish.
If you have multiple applications, each of those might have different purposes, and I don’t know if there’s a way to show that in Lambda like, “Here are my different applications with different purposes, and I can keep them separate, and I can look at them individually.” I don’t think that Lambda supports that kind of breakdown of that ontology of software. I think it’s just a list of functions.
Chris: Yeah. The tools are getting better. There are frameworks now that help manage this stuff and make it more seamless. Even things like CloudFormation can help you with this as well, to hide some of the fact that all of this has to be broken up into individual things.
At the end of the day, a Lambda function is just code, and actually the code can be as big as you want or you could […] eight megabytes of code. You can actually take whatever common libraries you need in order to do what it is that the Lambda function does, package them up, and include them with it. There’s duplication there but, you know, that’s kind of okay. Let the tools and the frameworks handle that for you when they’re doing the deploys and whatnot.
There are ways. The tools are making it easier but I think that fundamental issue definitely still applies. Again, I think there’s definitely really good use cases for it. I’ll give a couple examples.
DynamoDB is one of the NoSQL databases that Amazon offers. It’s a competitor, very similar to something like MongoDB or CouchDB. It has a feature called DynamoDB Streams, and streams are basically just event streams of whenever mutating operations happen on that database. The events stick around for about 24 hours and then they fall off.
Imagine in your application, you want to know whenever one of your database objects is updated, let’s just say you want to send an email or something like that. How would you do that? Normally, you might say, “I’m going to update my application. I’ll actually write code in my application so that the code that handles this update request, in addition to making the update on the database, also does something else to send out this notification that the update happened.”
But in this Amazon ecosystem, what you can do is set up that stream, which is basically just one click on the Amazon console, and you can then have a Lambda function subscribed to that stream. Whenever there’s an event on that stream, your Lambda function gets called, and in that Lambda function you can basically have that logic of, “Hey, I want to do something when this event happens.” It’s just a very simple function handler that is implemented as a Lambda, and now you’ve decoupled that from the rest of the application. You don’t have to redeploy your application to get that functionality.
You can keep that very loose coupling and have the flexibility to do whatever it is that you want. If that changes and you now want to do two types of notifications, it’s very easy to do: you just update your Lambda function. You can have a handful of these Lambda functions in conjunction with your main application and together have a very powerful, flexible system that makes it really easy to add on some pretty sophisticated features.
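Editor’s sketch: a handler of the kind Chris describes might look roughly like the following. It assumes the standard DynamoDB Streams event shape (a `Records` list whose entries carry an `eventName` of `INSERT`, `MODIFY`, or `REMOVE`, plus a `dynamodb.Keys` map). The `notify` callable is a hypothetical stand-in for whatever actually sends the email, so the example runs without any AWS access.

```python
from typing import Callable


def make_handler(notify: Callable[[str], None]):
    """Build a Lambda-style handler that reacts to updates on a DynamoDB stream.

    `notify` is a hypothetical callback (for example, something that sends an
    email); it is injected so the logic can be exercised locally.
    """

    def handler(event: dict, context: object) -> int:
        sent = 0
        for record in event.get("Records", []):
            # Only react to updates; inserts and deletes are ignored here.
            if record.get("eventName") != "MODIFY":
                continue
            keys = record["dynamodb"]["Keys"]
            notify(f"Item updated: {keys}")
            sent += 1
        return sent  # number of notifications sent for this batch

    return handler
```

Adding a second type of notification would mean changing only this function, which is the loose coupling being described: the main application is not redeployed.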
Jon: Yeah. That is a good use case. Another one that comes to mind is cron-type things that happen maybe infrequently. One that we did recently was reading some Medicare data from the Medicare government website once a month. We could have, of course, made a Docker container and put a task into ECS that could have been triggered from CloudWatch, with a trigger that happens once a month, but it’s just way less setup and way less work to create a Lambda function to do that. The result is essentially the same. Something gets called, a function gets called and runs quickly, and then it’s done.
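Editor’s sketch: a monthly job like the one Jon mentions could be shaped roughly as below. The `fetch` and `store` callables are hypothetical placeholders for downloading the data and saving it somewhere; the only AWS-specific assumption is that a scheduled event carries a `time` field with the trigger timestamp.

```python
from typing import Callable


def make_monthly_job(fetch: Callable[[], bytes], store: Callable[[str, bytes], None]):
    """Build a handler intended to be invoked by a once-a-month schedule.

    `fetch` and `store` are hypothetical: one downloads the data, the other
    persists it. Both are injected so the sketch runs without network access.
    """

    def handler(event: dict, context: object) -> dict:
        run_time = event.get("time", "unknown")  # timestamp of the scheduled trigger
        data = fetch()
        store(run_time, data)
        return {"ran_at": run_time, "bytes": len(data)}

    return handler
```

The function gets called, does its work, and is done, with no container image or ECS task definition to maintain.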
Just to share a quick story. One of the things that can be a drawback, I think, with Lambda is that it’s a little bit difficult to monitor. You can’t really see what’s going on, especially if you have a lot of traffic going into your Lambda functions and they’re making up an application that’s ongoing and handling many, many requests per second all the time. It may be difficult to get a lot of insight into what’s happening. We know that under the covers, Lambda is using containerization and it’s spinning up containers and discarding old containers, and trying to shape its capability to the traffic that it expects to see and that it has seen.
At one point, I was at a conference and a person that was giving a talk was talking about how they’d done some reverse engineering to find out what Lambda is doing and how often it’s getting rid of old stale containers and spinning up new ones. When you’re running a high-load application on Lambda, there’s some evidence showing that occasionally there will be slower requests while the containers are starting up and becoming available. They had just done a lot of graphing and mapping to figure out what really is going on under the covers.
At that conference, in the audience was the general manager of Lambda from AWS himself. It was interesting just to see him nodding along and at the same time, knowing that some of the things that were being shared were reverse engineering of the secret sauce of Lambda that he probably is not really allowed to talk about publicly. It was just a really fun interaction to see, but at the end of the day, my takeaway was to be a little bit careful in terms of doing large applications that scale inside of Lambda because you don’t have 100% control over how things happen and your ability to monitor is a little limited.
Chris: Yeah. Definitely, in the past, that spin-up time was a huge factor because it used to be a lot lengthier. If your Lambda hadn’t been invoked for a certain amount of time, then basically it would be deprovisioned, and the next time a call was made to it, there was some startup initialization time, so there could be a lengthy, 10- or 15-second delay before it actually responded to that call, and then after that it’s the normal quick response time. You end up doing things like, “I’m going to set up some other cron job for Lambda, some scheduled task that goes and just hits my service, my API, every minute or every few minutes,” to keep it warm.
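Editor’s sketch: one common way to implement the keep-warm trick is to have the function recognize the scheduled ping and return immediately, so the ping keeps the function provisioned without running real work. The check below assumes the ping arrives as a scheduled event whose `source` is `aws.events`; the response shapes are hypothetical.

```python
def handler(event: dict, context: object) -> dict:
    """Short-circuit on keep-warm pings, do the real work otherwise."""
    # Assumption: the keep-warm ping is a scheduled event with this source.
    if event.get("source") == "aws.events":
        return {"warmed": True}

    # Placeholder for the function's real logic.
    return {"statusCode": 200, "body": "real work done"}
```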
Jon: Yup. There’s so much to talk about in serverless and clearly, we’re interested and excited to talk about it more, but we wanted to keep this conversation at a fairly high level across a lot of different AWS services. I think we have time to touch on just a last few. There are actually four more that we want to talk about, so it’s going to be difficult to get to all four. Maybe we could leave out S3 and instead just talk about SNS, SQS, and SES, just briefly what those are and why we like them, why we use them.
Chris: These are some of the core messaging services that AWS provides. SNS stands for Simple Notification Service, and what it’s for is, basically, you can set up these things called topics, which you can think of as just a channel, and then you can have Pub/Sub to that channel. You can publish a message to that channel. You can have zero to N subscribers to that topic that are then notified when a message is posted to it, and then they can read that message and do whatever it is they need to do.
It’s this multi-reader events channel, if you will, and you can use this for a variety of use cases. You can use it for things like sending push notifications to mobile devices, whether you’re using Apple’s APNs or you’re using Google’s GCM (I believe that’s what it’s called) for push notifications to Android devices. You can set up these messages to go to various different places, whether it be queues, or you can chain them together to trigger an email or something like that. Pretty flexible.
Jon: That’s been the most interesting thing to me about SNS. Really, when we started with it, we thought of it as, “Oh! That’s what we use for push notifications.” But now it really seems more like a full-on Pub/Sub, publish-subscribe box.
Chris: And now, what it is, one of the really great use […] is this concept of fanout. This might be a good segue into SQS. SQS stands for Simple Queue Service. What that is, it’s just a message queue. You push a message onto a queue and then you can have readers that are just pulling messages off of it.
When a message is pulled off, whatever pulled it off is the only one that can see it. It’s not this one-to-many publishing, it’s basically just one-to-one. Whoever gets the message does it. You can think of it as just units of work as opposed to notifying a list of subscribers.
A really great example of how these two things interact is that recently we were using SQS as a trigger: we would send a message to an SQS queue when a certain event happened because we wanted to take a specific action when that event happened. We would basically publish this message to the queue, and we had one subscriber to the queue reading from it. It would pick up these messages and say, “I’m going to go do my background task on it.” It worked very well and did exactly what it needed to do.
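Editor’s sketch: a single reader pulling units of work off a queue could look roughly like this with the boto3 SQS client calls `receive_message` and `delete_message`. The client is passed in as a parameter (in real use it would come from `boto3.client("sqs")`); the queue URL and the `process` callback are hypothetical.

```python
from typing import Callable


def drain_once(sqs, queue_url: str, process: Callable[[str], None]) -> int:
    """Receive one batch of messages, process each, then delete it from the queue.

    `sqs` is expected to behave like a boto3 SQS client. `process` is a
    hypothetical callback that performs the background task.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # largest batch size SQS allows per call
        WaitTimeSeconds=20,      # long polling, to avoid busy-looping on an empty queue
    )
    handled = 0
    # The "Messages" key is absent when the queue had nothing to deliver.
    for message in response.get("Messages", []):
        process(message["Body"])
        # Delete only after successful processing so failed work is retried.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
        handled += 1
    return handled
```

Deleting after processing is a design choice: if `process` raises, the message becomes visible again after its visibility timeout and another reader can pick it up.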
But then, we had the situation where we have this other piece of software that also wants to know when that event happens because it wants to do some other task that’s related to that event. What do we do there? We could definitely go ahead and create another queue, and inside our code that handles the event, we could basically send two messages, one to the first queue and another one to the other queue, but that feels […] and what happens if the publish to the first queue is successful but the one to the second queue fails? What state are you in? That’s when we said, “Let’s go ahead. This is a great use case for SNS.”
Instead of sending two SQS messages, let’s just send one SNS message that says, “Hey, this event just happened and here’s some metadata about it.” Then, we can have multiple subscribers to that SNS topic, and the subscribers can be things like an SQS queue. This concept is called fanout. You basically can fan out your event to multiple subscribers using something like SNS. It’s a very powerful tool to use, and it doesn’t have to be just fanning out to SQS. You can fan out to a Lambda function as well. Actually, in our particular case, we’re fanning out to an SQS queue as well as a Lambda function.
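Editor’s sketch: the fanout setup can be approximated with the boto3 SNS client calls `subscribe` and `publish`. The client is injected (in real use, `boto3.client("sns")`), and every ARN, event type, and metadata field here is hypothetical. Not shown is the permission setup: the queue needs an access policy that lets the topic send to it, and the Lambda function needs permission to be invoked by SNS.

```python
import json
from typing import List


def subscribe_fanout(sns, topic_arn: str, queue_arn: str, lambda_arn: str) -> List[str]:
    """Attach an SQS queue and a Lambda function as subscribers of one topic."""
    arns = []
    for protocol, endpoint in (("sqs", queue_arn), ("lambda", lambda_arn)):
        response = sns.subscribe(TopicArn=topic_arn, Protocol=protocol, Endpoint=endpoint)
        arns.append(response["SubscriptionArn"])
    return arns


def publish_event(sns, topic_arn: str, event_type: str, metadata: dict) -> str:
    """Publish a single message; SNS delivers a copy to every subscriber."""
    message = json.dumps({"type": event_type, "metadata": metadata})
    response = sns.publish(TopicArn=topic_arn, Message=message)
    return response["MessageId"]
```

The publisher makes exactly one call, so the partial-failure question Chris raises (first send succeeds, second fails) no longer sits in the application code.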
Jon: And just to add, those are great reasons to use a message queue or a publish-subscribe system in general. That’s what they’re for. There are a lot of other ones out there, some open source ones, some commercial ones. I think the reason that we like to use SNS and SQS is because we’re already using AWS for other things. We know from experience that setting up those open source ones and keeping them running requires work, and configuration, and thought, and troubleshooting, and we’d like to spend our brain cycles thinking about customer needs and user needs rather than whether or not our queueing system is running on some EC2 instance out there. Going with what Amazon has provided is absolutely adequate, if not world class, and keeps us focused on what our users need.
Chris: Absolutely.
Jon: And also I […] the word we were looking for was duplicative rather than duplicity, but I do like the idea of some duplicity coding. The last one before we wrap up today is SES. What does it stand for and what do we use it for? Why do we like it?
Chris: SES stands for Simple Email Service. This is literally an SMTP server so that you can send email messages to wherever you need to send them. Again, it’s another core messaging service that your apps can use. We use it to trigger emails when certain events happen or when we want to send information to users of the system or whatnot, and you can, again, chain these things together. You can have SNS fan out, and one of those things can be triggering emails through SES when those SNS messages happen.
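Editor’s sketch: besides the SMTP interface Chris mentions, SES can be called through its API. The function below uses the boto3 SES client’s `send_email` call with the client injected (in real use, `boto3.client("ses")`). The addresses are hypothetical, and the sender would need to be a verified identity in SES.

```python
def send_notification(ses, sender: str, recipient: str, subject: str, body: str) -> str:
    """Send a plain-text email through an SES-like client and return its message ID."""
    response = ses.send_email(
        Source=sender,  # assumed to be an address already verified with SES
        Destination={"ToAddresses": [recipient]},
        Message={
            "Subject": {"Data": subject},
            "Body": {"Text": {"Data": body}},
        },
    )
    return response["MessageId"]
```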
Jon: Right. I think that SES, honestly, is a little more difficult to set up and a little harder to work with than one of its competitors, which is SendGrid. We used to use SendGrid as our de facto email sending service and it has more features. It lets you see reports of what emails are bouncing and manage those lists very easily within the system, and SES is a little bit more, “Here’s a system, use it at your own peril.” It also requires a bit more setup. They make you essentially apply with Amazon in order to be able to send messages beyond an initial small whitelist of users, which can be a bit of a pain.
But the one thing that it does have is just nice, easy integration with other Amazon services. I think that’s one of those cases where SES may not be the best solution, may not be the most world-class one, but we got sucked into using it just through Amazon’s total platform capability and tie-in with everything else. It’s just that lock-in that Amazon tends to do […].
Chris: Yeah. In a way, it’s a network effect. Even though it may not be as full-featured, just the integration capabilities definitely tip the tide in its favor.
Rich: How often does that happen, where you move away from a better product because of the integrations that another one has?
Jon: It may have happened one or two other times. One of the things I’m thinking about is Elastic Beanstalk, which I think is another great example. We talked about that earlier today. Heroku is a cleaner, more polished product, and it’s more pleasant to use, but it left a little bit to be desired in terms of monitoring and troubleshooting, so the not-as-good product, Elastic Beanstalk, tipped us into it because those things are so important that we were willing to put up with a worse user experience.
Chris: Actually, the reverse is probably more true. There are definitely cases where the AWS version of something is just not good enough, so we use something else. A good example for us is logging. Amazon has CloudWatch Logs. The logs are rotated, they’re easy to set up, and they’re integrated, but they’re difficult to work with. They’re not too developer friendly. They’re not easily searchable, so we use an external service for our logs.
We use Sumo Logic and that’s all they do. They focus on just logs and it’s great. It has every kind of feature that we need, and it really helps us get our job done. It’s a great example of how you don’t have to drink all the Kool-Aid. You really have to be careful and know what to […] us are and use the right tool for the job.
But there are a lot of great tools that Amazon has, and a lot of times one may not be as full-featured as something else but it still covers your core use cases and it doesn’t limit you. I think that’s the big thing: how much does it limit you? If it does, then you’ll be looking to some other tool, but if the other tool is just 10% better, is that enough to make you lose that power of integration that never […] with staying within the AWS ecosystem?
Rich: In this case, SendGrid isn’t an order of magnitude better, so therefore SES is the right solution for that.
Chris: I would agree with that.
Jon: Yup. Next week, we have a couple more core AWS things that I still want to talk about: S3, and I want to talk more about CloudWatch. Then we’ll talk about several things that we’re evaluating and excited about that we haven’t used much, and some we’ve used a little bit, that we’re rolling into our main toolbelt of AWS options. I’m looking forward to that conversation. Thank you, Rich and Chris. It was such a fun conversation today.
Chris: Yeah. Thanks guys, I’ve enjoyed it. Thanks.
Jon: See you next week.