September 4, 2019

76. An Encryption Deep Dive – Part Four

Chris Hickman and Jon Christensen of Kelsus and Rich Staats of Secret Stache conclude their encryption series by discussing symmetric cryptography and Amazon Web Services (AWS).

Some of the highlights of the show include:

Quality Control of Voice Assistants: Apple claims only your phone, not Apple or Siri, knows unique IDs to delete recordings
Encryption Liability: Apple didn’t encrypt, only anonymized, your recordings with Siri
Can you keep a secret? Symmetric encryption is a single key that encrypts/decrypts data
Key Management Service (KMS): AWS’ core service involves symmetric encryption to generate and manage cryptographic keys
Hardware Security Module (HSM): Stores secrets/master key and provides encryption/decryption services
Pros and Cons of Protocols: Control of authentication and authorization, lower latency, managed device, durable and scalable firmware updates, and unfamiliar to auditors
Customer Master Key (CMK): KMS hierarchy to manage cryptographic key
Create Key Passwords: Allow KMS or AWS to generate password, or provide it yourself
Recommendation: If doing KMS at scale, envelope encryption is a must; primary encryption-decryption involves creating new data key
How to deal with breaches, compromised keys? Enforce waiting period prior to deleting or rotating master key

Links and Resources

There’s a privacy explanation for why Apple doesn’t let you delete Siri recordings

AWS Key Management Service (KMS)

AWS CloudHSM

AWS CloudFront

AWS FIPS 140-2

AWS Customer Master Key (CMK)

AWS Elastic Block Store (EBS)

AWS CloudTrail

AWS SDK for JavaScript in Node.js

PKCS 11

CNG

Advanced Encryption Standard (AES) 256

Mobycast’s toll-free voicemail: 844-818-0993

Mobycast’s Email: ask@mobycast.fm

Mobycast on Twitter

Kelsus

Secret Stache Media

Rich: In Episode 76 of Mobycast, Jon and Chris finish our series on encryption by digging into AWS’ encryption services. Welcome to Mobycast, a weekly conversation about cloud-native development, AWS, and building distributed systems. Let’s jump right in.

Jon: Welcome, Chris and Rich. It’s another episode of Mobycast.

Rich: Hey.

Chris: Hey, guys. It’s good to be back.

Jon: Yeah, good to have you back. Rich, what have you been up to?

Rich: Looks like we’re going to have our first full time senior developer hire. It’s going to be a promotion from within, but this will be the first actual full time salaried employee. I spent the weekend trying to figure out what the job role would look like, what the expectations would be.

Jon: Mobycast listener I assume?

Rich: No, it’s the developer who’s been with me the longest. It’s just ready for him to take that full-time commitment. It’s more or less me giving the commitment to him, as he’s already proven that he’s ready and willing. It was just really hard for me to come up with what that salary should be, commensurate with where he lives, and also what he’s done.

I spent probably 15 hours this weekend thinking through it, but it was one of the best exercises I’ve done because it also forced me to define all of the different rules that were there but never defined in our company. The outcome is that we’ll have a growth plan for anyone who works for us moving forward, which is pretty sick.

Jon: I’ve been really pleased with the WordPress work especially the back-end parts and pieces that your team at Secret Stache has done. I know that Alex, your new person, is a big part of that. So congratulations, that’s great.

Rich: It’s scary because it’s a huge commitment. But I feel like this is the risks you’re supposed to take in entrepreneurship. This is the right move regardless of whether or not […].

Jon: How about you Chris? What have you been up to?

Chris: Recently I transitioned away from some of the day-to-day client work here at Kelsus. I’ve been working on an account basically since day one, for about 2½ years. That took up a lot of my time, in fact most of my time.

Recently, I transitioned that to some other folks on the team, and that’s freed me up to be working on some more strategic stuff for Kelsus. Part of that is having more time to be able to put more effort into Mobycast, and thinking about ways that we can improve it, from both the content perspective as well as from the production perspective. We are looking to continually improve.

Jon: Yes, indeed. We haven’t gotten around to the production stuff yet.

Chris: No. We’re putting together that plan on a deck. Don’t hold us to the bar yet.

Jon: A new cool thing that we’re going to try to do is we just want to hear from you, our listeners, a little bit more. In order to hear from you, a big part of that is actually asking for it. It may seem that we don’t want to hear from you, and that we just blabber on and talk about our own stuff, but we do want to hear from you.

Just today, I set up a new phone number. It’s 844-818-0993; you can just hit rewind as many times as you need to catch it. You can ask us questions there, you can also get in touch with us via #mobycast, and also at ask@mobycast.fm. All three of those were set up today to prove that we really do want to hear from you, and if we get some good questions, which I’m sure we will, then we’ll start to take some time to answer those.

Also today, I asked our designer at Kelsus to put together a t-shirt design and a sticker design for Mobycast. It could be a good reason to ask us some questions: our Mobycast shirts are pretty cool now, and you might want to wear one around.

Chris: It is. I may have to call that number and leave a question.

Jon: All right, what are we going to talk about today Chris?

Chris: Today we’re going to wrap up our multi-part series on encryption. We’ve laid the framework, the foundation for all the ins and outs of encryption and the various types, and gone pretty deeply into symmetric encryption versus public key encryption and how that works in the real world with things like TLS and end-to-end encryption.

Today we’re going to talk about the encryption services available in AWS, since that is definitely one of our favorite cloud platforms, and just talk about how this all rolls into that.

Jon: That sounds great. Before we get there, I was thinking about our last episode last week and something that came up over the weekend. Apple is in some trouble because people are complaining that they’ve been recording the things you say to Siri. You can’t go and delete the things you said to Siri. Did you hear about this Chris?

Chris: Yeah, I did. This feels like it’s a rolling issue. Everyone that has voice assistants, they’re doing this. Whether it’s Google, whether it’s Amazon, they all end up saying, “We want to get better.” It’s all about quality control.

Jon: I guess with Google and Alexa, it’s possible to go and say, “Hey, delete all my recordings.” With Siri, it’s not possible to do that. The reason is surprising. It’s because Apple was so careful to make sure that people were not associated with these recordings, that they made it impossible for themselves to go figure out which recordings are yours.

So, when you fire up Siri, a unique ID gets created, and then that ID is something that only your phone knows and Apple doesn’t know. That is what ties you to the Siri recording that you created. Later if you say, “Well I want to delete those recordings,” Apple is like, “I’m sorry, I don’t know how to find them. We don’t have access to this ID that got created, it’s different for every recording. It’s gone,” but it is not gone, if you know what I mean.

It just really reminds me of our end-to-end encryption conversation yesterday about liability, about companies driving this, and companies wanting to stay hands off. But in this case, the problem is that they didn’t encrypt the recordings. They only anonymized them. So, that’s not really anonymous because you can still tell who a person is if you can hear their voice.

Chris: It’s a good point. Apple leads the way here with privacy and really treating us as first class citizens in everything that they do, and to anonymize the voice calls is a really good step in that direction. But as you point out, your voice is a signature. With technology, you can imagine it being possible: if somehow you got a slew of Siri recordings and ran them through some other system using ML, you could figure out who each one belongs to.

Jon: If you’re a famous actor and you say, “Hey Siri remind me about my meeting with Steve Spielberg later at 3:00,” you might be able to figure out who it is and what they’re doing.

Chris: I really think it’s […] what if it’s an impersonator?

Jon: Let’s get into how AWS works. I just thought that was a fun little side trip.

Chris: Most of the […] we’re going to focus on symmetric cryptography. There are ways to do public key encryption with AWS, but you need to do some heavy lifting yourself to make that happen. Out of the box, KMS, the key management service is one of the core AWS services that involves encryption and that is symmetric encryption.

We’re going to keep symmetric encryption top of mind today. Just a reminder: with symmetric encryption, you have that single key for encrypting and decrypting data. It’s very much an important secret. Both parties that are involved in that communication need to know that secret. It’s very important to protect it.

We’ve talked in previous episodes about how you can protect this by encrypting it. You can use a master key, a master symmetric key, to encrypt a subsequent key and use that one. But at some point you’re going to need a plain text key to exist. You can’t keep encrypting and putting something inside another envelope.

At the end of the day, you need to have a plain text key somewhere. What do you do with that? Where do you put that? It’s a well-known problem, it’s been around for a long time, there are pretty great solutions out there for this, and they’re pretty mature as well. A really common way of solving this is using an HSM, which is an acronym for Hardware Security Module.

What these things are, these are appliances; think of one as a black box, essentially. Really its only purpose is to store these secrets and then provide some encryption services around that. The intent is the key goes in, but the plain text key never comes back out of it. That’s a key tenet. Anything that would come back out of it would be an encrypted key maybe, but never the decrypted, plain text version of the key.

These things are very much locked down and they’re tamper resistant. Some of them are designed so that if they detect any tampering, they just wipe themselves. All the data is gone. It’s like, “This message will self-destruct.” It’s definitely pretty interesting. It just goes to show how serious this stuff is and how seriously we need to treat these secrets.

Jon: Are we going to get into how they actually work and what they do? I can’t imagine a machine where it’s got this plain text key in it and you can use it. If it’s locked away in there, then how can I ever even use it if I can’t get in there?

Chris: There’s two main things that these HSMs are doing. One is they’re keeping track of these keys, and the other thing is they are providing encryption services as well. If you put a key in there and you can never get it back, it’s like a black hole or something like that, then that’s not too terribly useful. What you do is, with that secret that’s in there, you can now do encryption-decryption with it. It will also do those encryption services for you with that particular key.

There’s really not much to these HSMs. It’s literally just an appliance they have. Again, very locked down for the most part. They have pin pads that are used to give access to them, it’s not like it’s a web interface or anything like that. By design they’re very hard to work with, to update their firmware, and whatnot. In a way it feels more like the technology from 20–30 years ago when you had to go and configure a router type thing.

HSMs are definitely a good place to store your secrets, your master key if you will. There’s definitely a few options there. One would be having your own HSM on-prem. This is something that a lot of companies do and definitely this is the historical path folks have gone down. Lots of pros here in that you control the device completely and you have control over all the authentication, authorization.

It’s really familiar to auditors if you’re having security audits done. They’ve seen these things before and they know exactly what it is, they’re like, “Okay yeah, check the box, you got it.” But with that, there’s also obviously a lot of cons, too. All the things that go into making this work in a cloud-native world are now your responsibilities. Dealing with things like latency, availability, durability, scalability, that’s all your responsibility, and then there’s definitely limited integration with all your cloud-managed services and applications.

Jon: I’m hearing you and I’m like, “Okay, everything you say makes sense,” but I still can’t get my head around it. Imagine an on-prem situation: I can’t get my head around the flow of how one of these would be used. Like, “Okay, I’ve got keys on this HSM. It provides encryption services. So what am I doing? How am I using it?” I’ve got software. It’s running on another machine, obviously not running on the HSM, and it needs to encrypt a message. How is it connected to this HSM?

Chris: Say you have an application. Let’s just say it’s our end-to-end. We wanted to do some end-to-end encryption or something like that. Let’s just say, for the sake of simplicity that our client that’s doing the client-side encryption is running on one of our web servers inside AWS. It’s a Python application. Let’s just say it says, “Okay, I need to send a message to this other user so I’m going to go ahead and encrypt it,” and let’s just say we’re just using symmetric encryption again to keep this simple.

I need you to symmetrically encrypt this. We’ll do it the right way with envelope encryption so that means that we need to create a new symmetric key from a master key. Our master key is in that HSM, and let’s just say that this is on-prem somewhere. We have our applications running in the cloud. We’re going to assume that we have the right connectivity between our on-prem and our cloud devices. So we got some […] connect, or VPN, or something like that, some ability to call into it. Let’s say we’ve put some API on top of it so that we can call into it.

Now our app’s going to say, “Well I need to go and I’m going to create this new key, this data key, and then I need to encrypt it. I’m going to encrypt it with the master key. The master key exists in the HSM, so what I will do is create that data key and then I’ll make that API call from the cloud to on-prem to that HSM saying, ‘Here’s the data. Turn this into ciphertext and encrypt it for me.’”

Now my web app can go encrypt the actual contents of the message with the unencrypted version of the data key and then send it on its way. The reverse happens on the other side. It’s making that call, basically to do the encryption of the data key and the decryption of the data key, via the HSM because that’s using the master key. Then the bulk of the encryption again will be done with the data key that’s produced from that.
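
The flow Chris describes can be sketched in Python. This is a minimal illustration under loud assumptions: `ToyHsm`, `send`, and `receive` are hypothetical names, the HSM is simulated in memory rather than reached over a network, and the cipher is a toy HMAC-SHA256 keystream standing in for AES-256 so the sketch needs only the standard library. It is not suitable for protecting real data.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def _xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (NOT AES): XOR data with an HMAC-SHA256 keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        stream += block.digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)  # fresh nonce per message, stored up front
    return nonce + _xor_stream(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    return _xor_stream(key, blob[:16], blob[16:])


class ToyHsm:
    """Hypothetical stand-in for an HSM: the master key goes in and never comes out."""

    def __init__(self) -> None:
        self.__master_key = secrets.token_bytes(32)  # no method returns this value

    def encrypt(self, plaintext: bytes) -> bytes:
        return toy_encrypt(self.__master_key, plaintext)

    def decrypt(self, blob: bytes) -> bytes:
        return toy_decrypt(self.__master_key, blob)


def send(hsm: ToyHsm, message: bytes) -> Tuple[bytes, bytes]:
    data_key = secrets.token_bytes(32)           # the app creates a fresh data key
    wrapped_key = hsm.encrypt(data_key)          # the HSM encrypts it under the master key
    ciphertext = toy_encrypt(data_key, message)  # bulk encryption happens in the app
    return wrapped_key, ciphertext


def receive(hsm: ToyHsm, wrapped_key: bytes, ciphertext: bytes) -> bytes:
    data_key = hsm.decrypt(wrapped_key)          # only the HSM can unwrap the data key
    return toy_decrypt(data_key, ciphertext)


hsm = ToyHsm()
wrapped_key, ciphertext = send(hsm, b"meet at 3:00")
print(receive(hsm, wrapped_key, ciphertext))
```

Both sides only ever call `encrypt` and `decrypt` on the HSM; neither holds the master key. Python's name mangling is just a convention here, whereas a real HSM enforces that boundary in tamper-resistant hardware.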

Jon: The new thing for us in this conversation is this concept of the master key. We’ve talked about how for symmetric keys, they’re so important to keep secret. I guess what you’re saying is a master key is a thing that you can use to create other keys, that you can use to encrypt other keys.

Chris: It’s really similar to what we’ve talked about previously with public key encryption, and using that for the envelope encryption. We have some way to securely share the secret amongst the two parties without them having to share that secret, so we’re going to use public key encryption to do that.

This is taking that same technique, but we’re not going to use public key encryption, we’re going to use symmetric encryption all around. So it’s going to be symmetric encryption for this secret, but the secret is now actually going to be centralized in an HSM instead of doing public key encryption.

Jon: So when I get this data key that you’re talking about, the one I ask that HSM for, are you saying that both sides know the master key? That the HSM uses the master key to encrypt the data key and hand it over the wire, and then the receiving side uses the master key to decrypt the key and use it?

Chris: Both sides have the ability to call the HSM and say encrypt and decrypt.

Jon: They have access to the master key, but they don’t necessarily have the master key […] HSM?

Chris: Yeah. That’s the really important part of this. The HSM, that master key, it goes in and it never comes back out clear text.

Jon: Basically having access to the HSM is a proxy for having access to the master key.

Chris: Yes. You have access to the HSM. Use that key that you have internal to you to do these encrypt and decrypt operations.

Jon: And the encrypt and decrypt operations that you’re asking it to do are typically on keys, like I need a key so make one for me, and send it to me encrypted, and now decrypt it for me. Like make one for me and then I’ll ask you to decrypt it.

Chris: It depends on the use case. There’s just a bunch of different options here and just ways to use them. Some folks use HSMs to offload TLS. All the TLS encryption and decryption is handled basically via the HSM. Some of them are really geared towards high throughput and do things on the order of 300 megabits a second of encryption-decryption.

Jon: That was back in the day and more specialized hardware there probably like, “Oh we’re doing encryption. I’m going to get this special encryption computer,” and now it’s like, “Now the encryption computer is maybe more than just about keeping that secret and less about doing all the encryption for me at a super high throughput in a super scalable way,” yeah?

Chris: They probably still exist and there’s probably still a very good use case. Part of it is we’re spoiled because we predominantly are in the cloud, we’re using AWS, and they provide all this stuff. They’re taking advantage of it, so we don’t need to worry about it. But if you’re on-prem, chances are these become real concerns for you, and you need to solve it. If you just have something like Nginx do it, it’s probably not going to scale. It’s not going to scale nearly as well. It’s not going to be nearly as performant as a dedicated appliance.

Jon: I just have this thought like, “Oh, who knows how CloudFront really works.” AWS CloudFront is their CDN, and they may have a bunch of HSMs helping them out […].

Chris: One option is to just get your own HSM, host it on-prem, and then do basically all the heavy lifting around it. Another option would be to go with a cloud-based HSM. AWS does have a product here. It’s called CloudHSM, where literally, just like you can provision any EC2 VM in the cloud, you can provision an HSM in the cloud.

Some pros here: again you control authentication and authorization, and you definitely have lower latency now to your apps in the cloud because it’s all together in the same network. You get the great benefit that someone else is now managing the device for you at an operational level: things like updating firmware, making sure that it’s available, durable, scalable.

The other pro of this is that, again, it looks familiar to your auditors because these are industry standard devices. There is an industry spec for this called FIPS 140-2, that says this is what you need in order to be able to do the encryption and protect your secrets. This is what’s required for that. If you comply with that, you get this FIPS 140-2 compliance rating. CloudHSM is FIPS 140-2 compliant, which is important.

The cons are, again, that limited integration with other cloud-managed services. CloudHSM, just like other HSMs, is going to be using cryptographic protocols for communicating with it, so it’s not a REST-based protocol per se. It’s going to be a bit more bespoke. It’s very much a cryptographic protocol. Some of those protocols are PKCS 11, which is an industry standard one, and CNG, which is a cryptographic protocol from Microsoft. Basically, it’s encrypt-decrypt, it’s how do you archive keys, and things like that.

Jon: […] might have things like byte order and other not developer-friendly things.

Chris: Probably. It depends on the SDK used to implement that protocol. It’s its own thing. That’s the con to this. You end up having to do a lot of application work on top of it in order to get this to work […] whatever it is you’re doing.

The third option, which we’ll spend the rest of the time talking about, is using an HSM in the cloud that’s a managed service. AWS has KMS, which is its Key Management Service. What this is, again it’s an HSM, it’s in the cloud, but now it’s a completely managed service where you’re not directly communicating with those HSMs; instead you’re going through a translation layer. They’ve built services on top of these HSMs to make it more friendly, to make it more cloud-native, and then to do that deep integration with all the other cloud-managed services that AWS provides.

The pros now: you can still control authentication and authorization, you get low latency, and you don’t have to worry at all about availability, durability, scalability. All that stuff is taken care of by someone else, by Amazon. But now you also get deep integration with all the cloud-managed services. All the important services that you would expect to have encryption capabilities, like S3, or Dynamo, or EBS, all integrate deeply with KMS and it just works. It just makes it so much easier to go and build your application, build architectures, utilizing these kinds of services.

The big con to it though is it’s unfamiliar to auditors. This is no longer just a pure FIPS 140-2 appliance. This is a service that’s built on top of that. It still uses the FIPS 140-2 devices, so at the end of the day, KMS is using HSMs to store the master keys and the secrets. But there’s software that’s built on top of that, to provide all this deep integration, this developer-friendly environment, and just make it more cloud-native.

Jon: At that point there is a certain amount of trust in AWS that has to happen. They wrote some software and that’s not the same as a certified hardware device.

Chris: Maybe just to point out some of the other major differences between CloudHSM and KMS: with CloudHSM you do complete management of all the encryption keys. It is single tenant access, which for some people might be really important. It means no one else, no other customer gets access to that HSM. It is dedicated just to you. It’s also going to reside in your own VPC.

Some really pretty major isolation benefits here. There may be some folks out there that are just running in a really tightly controlled environment, they have strict requirements that they have to comply with, and CloudHSM is maybe the only option for them, given that it’s single tenant and it’s inside their VPC. KMS is an AWS-managed service. So, it’s not inside your VPC, it’s outside of that. Obviously it’s multi-tenant. Your data is going to be with other customers as well inside that system.

Jon: Got it. It seems useful, especially if you’re not operating in a highly regulated environment.

Chris: Let’s just talk about KMS now because, again, unless you have really tight requirements that dictate otherwise, KMS is definitely a go-to service, definitely the easiest to use, the most integrated, with the most benefits, and it’s rock solid. They’re doing billions of requests a day with KMS. It is just one of those fundamental core services of the cloud now. I can’t even imagine what would happen if KMS goes down. It’s just everything would go down. On the order of 10 billion requests a day go through the KMS system. That’s a lot. That’s a lot of requests.

Jon: KMS is typically the requester like, “Just give me those keys,” right? Or are we saying decrypt and encrypt?

Chris: Almost all those operations are encrypt and decrypt. It depends on the use case in the application. Some applications are creating lots of data keys. They’re just burning through them constantly because it’s basically for every session or transaction, they’re just going to create a new key. But even for those, the keys themselves are being created by the application. They’re then making a call to KMS to encrypt it with the master key. They’re probably not creating master keys too terribly often. It’s really just the, “I want to encrypt and decrypt with the master key.”

Quite a few services are using the data keys approach to do that. You can directly encrypt data blocks of up to four kilobytes, nothing bigger than that. It really does make you think about, “Okay, I’m going to really be using this for that envelope encryption style, where I’m going to use my master key to go create data keys and the data keys, the encryption […] one of them is outside of KMS.”

Jon: I guess that’s what I meant; it’s hard to say this. Still, getting the keys out of KMS is actually a decryption operation. Like, “Give me this thing so I can use it to do encryption somewhere else,” is actually telling KMS, “Decrypt this thing for me,” and the thing just happens to be a key.

Chris: What you’re decrypting is the data key. It’s not a key that’s actually managed by the HSM itself. The master keys are the ones that are managed by the HSM itself, and those will never leave that box. You cannot see that clear text coming out of it. You could say, “I want to decrypt master key, give it back to me.” There’s no such operation, it just doesn’t exist. There’s a special way for inserting those master keys, or creating those master keys, and putting them into the HSM. That’s a pretty important process and we’ll get into that a little bit as well.

So KMS, the main functionality is that it allows you to generate and manage cryptographic keys. It is operating this, it’s giving you crypto services, so again the encrypt-decrypt. It is using AES 256, which we talked about in the previous episode as well, when we went into some of the principles […] with the confusion, diffusion, and using that secret as what makes the algorithm work, and not the algorithm itself being the secret.

We talked a lot about customer master keys; CMK, that’s the acronym. Again, that is the building block, that’s the hierarchy, and that’s really what KMS is managing from the cryptographic key standpoint: those master keys. With those master keys, you can encrypt up to four kilobytes at a time. If you need to encrypt data of a larger size, you want to go create that data key, and then you’re going to use that data key to do the encryption-decryption outside of KMS, not have KMS do the actual encryption-decryption.
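
The four-kilobyte rule lends itself to a short Python sketch. The `encrypt` and `generate_data_key` calls and their `KeyId`, `Plaintext`, `KeySpec`, and `CiphertextBlob` fields follow the boto3 KMS client; everything else is an assumption made for illustration: `encrypt_payload`, `_StubKms`, `fake_local_encrypt`, and the `alias/example` key name are hypothetical, and the stub replaces the real client so the sketch runs without AWS credentials.

```python
from typing import Callable

KMS_DIRECT_LIMIT = 4096  # bytes; the four-kilobyte ceiling for direct encryption


def encrypt_payload(kms, key_id: str, data: bytes,
                    local_encrypt: Callable[[bytes, bytes], bytes]) -> dict:
    """Encrypt directly under the master key if small enough, else use a data key."""
    if len(data) <= KMS_DIRECT_LIMIT:
        resp = kms.encrypt(KeyId=key_id, Plaintext=data)
        return {"mode": "direct", "ciphertext": resp["CiphertextBlob"]}

    # Too large for a direct call: ask KMS for a fresh 256-bit data key.
    resp = kms.generate_data_key(KeyId=key_id, KeySpec="AES_256")
    return {
        "mode": "envelope",
        # Bulk encryption happens outside KMS, using the plaintext data key.
        "ciphertext": local_encrypt(resp["Plaintext"], data),
        # The encrypted copy of the data key is stored next to the ciphertext.
        "encrypted_data_key": resp["CiphertextBlob"],
    }


class _StubKms:
    """Hypothetical stand-in for a boto3 KMS client, for a local dry run only."""

    def encrypt(self, KeyId, Plaintext):
        return {"CiphertextBlob": b"kms:" + Plaintext}

    def generate_data_key(self, KeyId, KeySpec):
        return {"Plaintext": b"k" * 32, "CiphertextBlob": b"kms:wrapped-data-key"}


def fake_local_encrypt(key: bytes, data: bytes) -> bytes:
    return data[::-1]  # placeholder only, not real encryption


small = encrypt_payload(_StubKms(), "alias/example", b"x" * 100, fake_local_encrypt)
large = encrypt_payload(_StubKms(), "alias/example", b"x" * 5000, fake_local_encrypt)
print(small["mode"], large["mode"])  # direct envelope
```

In a real application `local_encrypt` would be an authenticated AES-256 mode from a vetted library, and the plaintext data key would be discarded as soon as the payload is encrypted.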

Jon: This may be the right time for this because I was recently playing with this. This is one of the things where I’m getting a whole bunch of context for something that I’ve played with before. I hope that our listeners are, too, because knowing the deep details of this is not a requirement to be able to actually have some success using it, but it sure makes life better when you know what you’re really doing.

This is what I was doing, I’ve got this API key that I got from Slack. The API key is the secret and I don’t want it to go anywhere. Because I don’t want it to go anywhere, I don’t want it to get out, I’m going to put it in AWS Secrets Manager. That seems like a fairly straightforward process, but as part of that process, I was required to create a KMS master key and essentially hook Secrets Manager up with that KMS master key. After that was all done, then Secrets Manager was able to give my code access to that Slack secret API token that I had.

That process was a little murky. As I said, I was like, “Ugh, what is really happening here?” and it was particularly murky when everything didn’t work. The reason that it didn’t work is because some piece of my infrastructure needed direct access to KMS and I was like, “What? I thought that KMS was totally separate and I just set it up for Secrets Manager. Why can’t I just say infrastructure here have access to the secret in Secrets Manager. Why do I have to tell it about KMS?” Now, it’s making sense. I had to tell that piece of my application about KMS or give it access to KMS because that’s what we talked about at the beginning of the episode. That part of my application needs to be able to access essentially the HSM—the key manager—in order to be able to get access to anything that’s inside Secrets Manager. That KMS is the lock and the gate to everything in Secrets Manager.

Chris: Again, it started off with this premise that you can get by with not knowing how a lot of stuff works, the difference between doing symmetric encryption and public key encryption, or how KMS works. But knowing it will save your bacon. It’s important to know how all this stuff works and really what it is doing under the covers.

So Secrets Manager, what is it doing? Of course it’s going to use KMS for how it does encryption and decryption, and securely storing that. It’s absolutely going to go and create a data key to do that, and in order to create a data key, it needs access to a master key. That’s what you need to tell it to do. That’s what you’re doing, you’re saying, “Hey, go use this master key for creating data keys that need to be encrypted and decrypted,” because those now have to be stored somewhere.

A big part of KMS, one of the great things about it, is that you get really fine grained access control over it as well. You don’t want it to be wide open, so that anyone can use the master key to do encrypt-decrypt, because then it’s not really so much of a secret anymore, if anyone can just call into KMS and say, “This is the key I want you to use. Go decrypt.”

Instead, it’s got to be, “No. Who has access to use that master key? Who has access to make calls to KMS and what calls can they make?” You have very fine grained access control to all of these and by default, it’s going to be pretty locked down. So, that’s the other part that you ran into there, too. It’s like, “What permissions need to be enabled in order for this all to happen?”
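
As a rough illustration of “who can use this key, and for which calls,” here is one statement of a KMS key policy built as a Python dictionary. The account ID, role name, and `Sid` are hypothetical, and the action list is just one plausible choice for an application that encrypts and decrypts with data keys. A complete key policy would normally carry at least one more statement granting administrative access to the key, which is omitted here.

```python
import json

# Hypothetical account ID and role name, for illustration only.
APP_ROLE_ARN = "arn:aws:iam::111122223333:role/example-app-role"

key_policy_statement = {
    "Sid": "AllowAppToUseKey",
    "Effect": "Allow",
    "Principal": {"AWS": APP_ROLE_ARN},  # who may use the master key
    "Action": [                          # which KMS calls they may make
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:GenerateDataKey",
    ],
    "Resource": "*",  # in a key policy, "*" means the key the policy is attached to
}

key_policy = {"Version": "2012-10-17", "Statement": [key_policy_statement]}
print(json.dumps(key_policy, indent=2))
```

Anything not listed is denied by default, which matches the “pretty locked down” starting point described above.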

If you know how it works, at least in principle, and how these things are working together, with Secrets Manager working with KMS, what’s happening between a master key versus a data key, where the encryption-decryption is happening, identity and access management and resource policies, and all that stuff, it all ties together. Understanding all that is going to make life a lot better, it’s going to make things smoother, it’s going to make it much more secure, and it’s going to save you from a lot of banging your head against the wall trying to figure out what’s going on.

Jon: That was really nice what you just said, but a part of what I just heard is maybe you sort of saying, “Hmm, that sounds a little fishy what you’re doing.” Maybe this piece of your infrastructure didn’t need access to KMS, because KMS made a data key that could’ve been shared some other way?

Chris: If you’re using Secrets Manager, then whatever app you’re using to store and retrieve that secret from Secrets Manager, it doesn’t need direct access to KMS, but you need to delegate access to KMS to Secrets Manager for the key that you’re using. Because it’s going through that […]. It’s not the application that is going to be talking to KMS in that particular case.

Jon: Great.

Chris: Maybe we can just really quickly talk about these master keys. There’s two main categories of them. One is AWS-managed. These are hands off. You don’t really have to do anything. These master keys are only used by that service. A really good example of this is S3. S3 has an option for doing server-side encryption. One of the options is SSE-KMS and that’s what this is. It’s an AWS-managed CMK. That master key that’s being created by S3 is only used by S3. You don’t have to do anything special. It’s just all managed for you and you really don’t have to think about it. But again it’s per service, it’s not very flexible.

The other main category is customer-managed. This is a bit more involved, and you almost always have to do this if you’re going to do anything with KMS beyond just really basic use cases. So for the customer-managed keys, you basically have three different options for how to create those keys because a big part of this, as we talked about in a previous episode, is that at the end of the day, a key is really just a secret. Think of it as a password, so it’s pretty important to pick a good key, that’s not going to be guessable, that’s not going to be compromised.

When you create a key, you need to provide the cryptographic material for creating it, which is essentially its, for lack of a better term, password. You have three options when you create these keys. One is you just let KMS do it for you. You don’t really think about it; KMS has its own way of generating this stuff, using some of the industry standard random number generator algorithms for generating a sufficiently large random key, so you can do it that way.

You can provide it yourself. When you create a key and you want to provide the actual key cryptographic material, you first create basically an empty key saying, “I’m going to give you the material later,” and then the second step is to say, “Okay, I’m importing the cryptographic material that’s been signed and go ahead and use this.”

Then a third option is, AWS now actually has the ability to integrate CloudHSM with KMS, so you can actually store your keys, have complete control of them in CloudHSM, and tell KMS to use those, which is the best of both worlds for those folks that have really high requirements for control over their secrets. This is a way to leverage some of those great additional benefits of KMS, the layered functionality on top of it.

Jon: Interesting.

Chris: We’ve talked about identity and access management with this. That’s a big part of KMS. That’s a great feature: you can use IAM policies and resource policies to control access to these master keys. You can lock it down based on identities like users, groups, and roles. You can specify the kinds of operations that can happen on them, so you have a really fine level of control over it.

There’s full auditing support for this, too. KMS integrates completely with CloudTrail. Every API call made to KMS is going to be logged into CloudTrail as long as you have CloudTrail enabled for your account, so you know exactly what operations are happening and by which identity.

Something else that’s pretty important: if you’re going to do KMS at scale, envelope encryption is a must. We’ve talked about this. It is a pretty core principle. It’s important to understand. For your primary encryption-decryption, you’re creating a new data key. That data key is what’s used for the encryption-decryption that’s happening outside of KMS. You’re really using KMS for protecting that data key.

You encrypt that data key with your master key, and then the encrypted data key resides alongside the encrypted data. That way, decryption just becomes reading the encrypted data key and making the KMS call to decrypt it to get the plaintext data key. Then you can do the decrypt operation, again outside of KMS, and that’s how you get scale.
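As a rough illustration of that kind of control, an identity policy that lets one role use one master key for two operations might look like the object below. This is only a sketch: the region, account ID, and key ID in the ARN are made-up placeholders, and the choice of actions is an assumption about what such a role would need, not something stated in the episode.

```javascript
// Hypothetical key ARN: the region, account ID, and key ID are placeholders.
const keyArn =
  "arn:aws:kms:us-east-1:111122223333:key/00000000-0000-0000-0000-000000000000";

// Identity-based policy allowing only two KMS operations on that one key.
const policy = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      Action: ["kms:Decrypt", "kms:GenerateDataKey"],
      Resource: keyArn,
    },
  ],
};

// Print the JSON document that would be attached to a user, group, or role.
console.log(JSON.stringify(policy, null, 2));
```

Anything not listed under `Action`, or any key other than the one in `Resource`, stays denied by default.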
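The envelope pattern described above can be sketched locally with Node’s built-in `crypto` module. This is a simulation under stated assumptions, not real KMS usage: the master key lives in a local variable only so the example can run, whereas a real master key never leaves KMS, and the helper names `seal`, `unseal`, `envelopeEncrypt`, and `envelopeDecrypt` are invented for illustration.

```javascript
const crypto = require("node:crypto");

// Stand-in for the master key. In KMS this never leaves the service;
// it is a local variable here only so the sketch can run.
const masterKey = crypto.randomBytes(32);

// AES-256-GCM helper: returns iv (12 bytes) + auth tag (16 bytes) + ciphertext.
function seal(key, plaintext) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

// Reverse of seal(): splits the buffer and verifies the auth tag.
function unseal(key, sealed) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, sealed.subarray(0, 12));
  decipher.setAuthTag(sealed.subarray(12, 28));
  return Buffer.concat([decipher.update(sealed.subarray(28)), decipher.final()]);
}

// Envelope encrypt: a new data key per payload, wrapped by the master key.
function envelopeEncrypt(plaintext) {
  const dataKey = crypto.randomBytes(32);
  return {
    encryptedDataKey: seal(masterKey, dataKey), // the KMS call in real life
    ciphertext: seal(dataKey, Buffer.from(plaintext, "utf8")), // done locally
  };
}

// Envelope decrypt: unwrap the data key first, then decrypt the payload.
function envelopeDecrypt(envelope) {
  const dataKey = unseal(masterKey, envelope.encryptedDataKey); // the KMS call
  return unseal(dataKey, envelope.ciphertext).toString("utf8"); // done locally
}

const envelope = envelopeEncrypt("a large payload that never goes to KMS");
console.log(envelopeDecrypt(envelope));
```

Only the 32-byte data key ever passes through the master key, no matter how large the payload is, which is where the scaling benefit comes from.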

Jon: I guess I’m still a little stuck on the example that I came up with earlier, Chris, because I was like, “You know what? If I’m not getting this, then is our audience getting this?” So I took a look back to this code that I wrote and it’s doing something really simple. Let me see if I can try to explain it, and maybe the listeners can come along with us on this quick journey where you correct my mistakes, and hopefully I’ll understand if there was a mistake here.

There’s some code that I wrote that goes and gets a secret out of Secrets Manager. In order to do that it’s using the AWS Node.js SDK. In the Node SDK, you can create a new client for the Secrets Manager. There’s a Secrets Manager object and you could say, “Hey, I want a new […].” Once you have a client for this Secrets Manager, you tell it what region (in this case […]), and then you can say, “Get secret value,” and you pass it a secret ID. In this case the ID that I had given the secret was Slack OAuth token. It’ll actually go and get that Slack OAuth token and you don’t give it anything else. That’s it, that’s the code.

At least in the way I invoked it, I didn’t pass the data key. I didn’t pass anything that feels like the encryption part of this. I just asked directly for it. That didn’t work at first, and then later, in order to make it work, I had to go to the Lambda function that this thing is based on. There’s a Lambda function asking the Secrets Manager for this secret. I had to tell the Lambda function, “Here’s your AWS KMS key ARN,” and I gave it a specific AWS KMS key ARN, which is […] string that’s got an ID in it. I assume that that ID represents a data key in KMS.
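The call Jon describes might look roughly like the sketch below. It assumes the v2 AWS SDK for JavaScript, where `getSecretValue(params).promise()` resolves to an object with a `SecretString` field. The secret ID `SlackOAuthToken`, the helper name `getSecret`, and the stub client are all hypothetical; the stub exists only so the example runs without AWS credentials.

```javascript
// Fetch a secret by ID. `client` is anything shaped like the SDK v2
// AWS.SecretsManager client, i.e. getSecretValue(params).promise().
async function getSecret(client, secretId) {
  const result = await client.getSecretValue({ SecretId: secretId }).promise();
  return result.SecretString;
}

// Stub client so the sketch runs offline. With the real SDK this would be
// `new AWS.SecretsManager({ region })` from the "aws-sdk" package.
const stubClient = {
  getSecretValue: ({ SecretId }) => ({
    promise: async () => ({ Name: SecretId, SecretString: "xoxb-not-a-real-token" }),
  }),
};

// Note that the caller passes no data key and no KMS key: only the secret ID.
getSecret(stubClient, "SlackOAuthToken").then((token) => console.log(token));
```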

Chris: That’s the master key.

Jon: So basically, I gave Lambda access to the master key, not to a data key inside of KMS?

Chris: You told Secrets Manager, this is the master key to use when creating your data key. Then Secrets Manager has to have access to that, but the only ARNs that you’ll have with KMS are basically […] kits, it’s the master keys. You can’t have an ARN to a data key because of the very definition of what data keys are and how they get created.

Jon: That makes sense because KMS is just managing master keys. That’s what it does. The secret is protected with the master key. Yes?

Chris: No, the secret is protected with the data key, but the data key is protected by the master key and that’s what you’re telling it to do. You’re saying, “I want you to keep the secret,” and you’re going to say, “This is the master key to use.” So, it’s going to say, “Okay, I’m going to create a brand new data key. This is the symmetric key that I’m going to use. I’m now going to make the call to KMS to encrypt that key.” So now, I have an encrypted data key. Secrets Manager will then encrypt that secret with the data key, and then alongside it, store the encrypted data key.

So now, it needs to decrypt it. It says, “Okay, here is the encrypted data key. I need to ask KMS to decrypt it and get back the plain text data key. Then using that, I can now decrypt the data and give it back to you.”

Jon: I guess I’m still wrapped around the axle of, does this seem like I did it in an insecure way by handing Lambda this ARN to a master key?

Chris: No, you have to. That’s just how Secrets Manager works. It’s the same thing with just about any service that you want to do encryption with. You’re going to be telling it what master key to use.

Jon: Now it’s starting to make sense.

Chris: It now has access to that key to do these encrypt and decrypt operations. A good best practice for building this out is that you don’t have a single master key. If you just have one master key for your entire account and you’re giving access to all these services to go use it, well, then your blast radius is quite large. So you’d break that up, have multiple master keys, and break them out into different types of use cases.

Jon: This does make sense. Anybody that needs access to the thing that’s inside the secret needs access to that KMS master key.

Chris: Yes, which is how you lock it down.

Jon: The other thing that’s on my mind is, doesn’t this feel like almost an extra step? Because if KMS can encrypt and decrypt up to four kilobytes, couldn’t I have just had it encrypt the Slack token itself instead of putting that in Secrets Manager?

Chris: Again, it depends on your use case. You can decide just to use KMS directly. In some of our apps, one of the ways that we manage secrets is that those secrets are encrypted using KMS and stored in S3 as JSON objects. For that we’re specifying a master key, and we’re doing the encrypt-decrypt on the actual JSON object itself. There’s no data key that’s being created there. It’s just using the master key to do the encrypt-decrypt.

Jon: I thought on the JavaScript side it looks pretty much the same. You probably used some SDK from AWS as well and you create a client to KMS instead of creating clients to Secrets Manager and you say to KMS, “Decrypt this.”

Chris: Yup. When you encrypt, you have to tell it what CMK to use. But when you decrypt, you don’t, because part of the ciphertext’s header includes the ID of the CMK that was used to encrypt it.
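A toy version of that idea, using Node’s built-in `crypto` module, might look like this. The real KMS ciphertext format is opaque, so the header layout here (one length byte followed by the key ID) is invented purely for illustration, as are the key IDs and the in-memory key store. The size check mirrors the four-kilobyte limit mentioned above.

```javascript
const crypto = require("node:crypto");

// Toy key store standing in for KMS: key ID -> 256-bit master key.
// The key IDs are hypothetical.
const masterKeys = new Map([
  ["key-app-secrets", crypto.randomBytes(32)],
  ["key-billing", crypto.randomBytes(32)],
]);

const MAX_PLAINTEXT_BYTES = 4096; // the four-kilobyte limit for direct encryption

// Encrypt: the caller must name the master key to use.
function encrypt(keyId, plaintext) {
  const data = Buffer.from(plaintext, "utf8");
  if (data.length > MAX_PLAINTEXT_BYTES) {
    throw new Error("plaintext too large for direct encryption");
  }
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", masterKeys.get(keyId), iv);
  const body = Buffer.concat([cipher.update(data), cipher.final()]);
  const id = Buffer.from(keyId, "utf8");
  // Invented header: 1 byte of key-ID length, then the key ID itself.
  return Buffer.concat([Buffer.from([id.length]), id, iv, cipher.getAuthTag(), body]);
}

// Decrypt: no key ID argument; it is read back out of the header.
function decrypt(blob) {
  const idLength = blob[0];
  const keyId = blob.subarray(1, 1 + idLength).toString("utf8");
  const rest = blob.subarray(1 + idLength);
  const key = masterKeys.get(keyId);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, rest.subarray(0, 12));
  decipher.setAuthTag(rest.subarray(12, 28));
  const plain = Buffer.concat([decipher.update(rest.subarray(28)), decipher.final()]);
  return { keyId, plaintext: plain.toString("utf8") };
}

const blob = encrypt("key-app-secrets", JSON.stringify({ slackToken: "example" }));
console.log(decrypt(blob));
```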

Jon: I guess some of the advantages of using Secrets Manager are mostly organizational like, I can associate these things together and I can group them, and a few various things to stay organized as opposed to having a ton of […] data keys and a ton of random things that are encrypted with KMS.

Chris: Use it for what it’s for. If you have some secrets like credentials, passwords, connection strings, passphrases, whatever it may be, these bits and bobs of highly confidential pieces of information, then something like Secrets Manager is a great thing to use because it gives you standardized access to it. It’s giving you an API versus the scheme that I talked about where you just use KMS directly. You have to build the API yourself essentially and then you have libraries in your code to implement that and know what bucket are they being placed into, who has access to it, what’s the format of it.

Jon: This is the thing, though. This is classic AWS in the sense of, if I were building the Secrets Manager, if I was making a company and we were going to make a secrets management product, I wouldn’t make some separate managed HSM thing that somebody has to know about and get access to in order to manage their secrets. I’d make it part of a management-of-secrets API and that would be how it works. I wouldn’t surprise people with, “Oh yeah, you’ve got to know this other thing.”

The way AWS has so forcefully decoupled everything means that sometimes use cases are difficult to get your head around. Just like we had a hard time in this whole conversation coming to terms with what I was even doing, it’s because everything is so decoupled and everything has such good strong barriers between them. You have to know a very sophisticated system, KMS plus Secrets Manager, in order to just use Secrets Manager. Somebody who’s not used to AWS might think, “Oh, Secrets Manager. That should be easy. That’s intuitive.” But it’s not intuitive.

Chris: Yeah. We talked about this quite a bit where it’s getting so […] the cloud is the new operating system, there’s so many services, there’s so much functionality, there’s so much surface area, there’s so many things to think about when it comes to availability, scalability, security, and everything else. Trying to keep track of it all is just getting more and more challenging and that’s not going to change.

It is a good model to have services that basically do one thing and do it well, like the microservices model, but as you get bigger, you start getting more and more integration between them. They’re each becoming consumers of each other and that’s what’s going on here. It makes total sense. If you’re going to roll out Secrets Manager, it’s going to leverage KMS to manage the encrypt-decrypt with, basically, the master key. You get all the great benefits that come with that, like the IAM and resource policies. But the downside is, like you said, you’ve got to know both of those things.

Jon: Totally. We’re past 45 minutes now, Chris. Do you want to wrap up by talking about incident response?

Chris: The other thing that’s interesting here is, how do you deal with breaches? What if your key did get compromised somehow and you want to delete your master key? The thing with encryption and decryption is, if you don’t have the key, you’re not going to be able to decrypt the data. It’s gone forever.

Deleting or rotating a master key is a big deal, and one of the things KMS does is enforce a waiting period before a key is actually deleted. You can say, “I want to delete this master key.” The minimum waiting period is going to be seven days. Basically, you have seven days to figure out, “Oh no, there is something here that is encrypted that I need to be able to decrypt, and it’s now failing because the key is pending deletion.” It gives you a chance to restore that. But after that happens and a key is deleted, it is gone forever. Anything that was encrypted with it will never be readable again.

Jon: Yeah, […] computing.

Chris: Yeah. Something definitely to think about. They do support rotation and basically what’s happening is it’s two keys. It’s the new active one, the existing one goes inactive, but it’s not deleted because it’s used for the decrypt operations. The new one is the one used for the encrypt operations. That allows you to go and re-encrypt all your data using the new key.
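The rotation behavior described above can be sketched as one logical key with several backing versions. This is a local simulation using Node’s `crypto` module, not how KMS is implemented internally; the `versions` array and the `rotate`, `encrypt`, and `decrypt` helpers are assumptions made for illustration.

```javascript
const crypto = require("node:crypto");

// One logical master key with a list of backing key versions.
const versions = [crypto.randomBytes(32)];

// Rotation adds a new version; older versions are kept for decryption.
function rotate() {
  versions.push(crypto.randomBytes(32));
}

// Encrypt always uses the newest version and records which one it used.
function encrypt(plaintext) {
  const version = versions.length - 1;
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", versions[version], iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { version, iv, tag: cipher.getAuthTag(), body };
}

// Decrypt looks up whichever version the ciphertext was made with.
function decrypt({ version, iv, tag, body }) {
  const decipher = crypto.createDecipheriv("aes-256-gcm", versions[version], iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}

const old = encrypt("written before rotation");
rotate();
const fresh = encrypt("written after rotation");
console.log(old.version, fresh.version);
console.log(decrypt(old)); // the old version is still available for decrypt

// Re-encrypting moves old data onto the newest version.
const reencrypted = encrypt(decrypt(old));
console.log(reencrypted.version);
```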

Just keep in mind, there are some tools there to help you, but it’s a big deal. They typically recommend you rotate once a year. Deleting master keys is a very big deal, and you’ve got to be really careful to make sure that you really don’t need that key anymore. The minimum waiting period is seven days, and you can increase that up to a maximum of 30 days.

Jon: Interesting. Why bother deleting? It takes hardly any space. I guess if you really want to make some data impenetrable, then maybe that’s when you delete a key. It’s in the trash forever, and it’s really gone. Even if somebody were to find an old hard drive, they wouldn’t be able to do anything with it, that’s for sure. That’s the same as shredding.

Chris: Yup. You’re dealing with a case where somehow the key has been compromised, someone has it, and you just don’t want it to be useful anymore.
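The waiting period can be pictured with a small state model like the one below. It is a toy, not the KMS API: the function names are invented, and the state names are only loosely modeled on KMS key states. It assumes a 7-to-30-day window, and that a key whose deletion is cancelled comes back disabled and has to be re-enabled before use.

```javascript
const MIN_WAIT_DAYS = 7;
const MAX_WAIT_DAYS = 30;
const DAY_MS = 24 * 60 * 60 * 1000;

// A key starts out usable.
function createKey() {
  return { state: "Enabled", deletionDate: null };
}

// Schedule deletion: the key stops being usable right away, but the
// material is only destroyed once the waiting period has passed.
function scheduleDeletion(key, waitDays, now = new Date()) {
  if (!Number.isInteger(waitDays) || waitDays < MIN_WAIT_DAYS || waitDays > MAX_WAIT_DAYS) {
    throw new RangeError(`waiting period must be ${MIN_WAIT_DAYS} to ${MAX_WAIT_DAYS} days`);
  }
  key.state = "PendingDeletion";
  key.deletionDate = new Date(now.getTime() + waitDays * DAY_MS);
}

// Cancel during the waiting period: the key comes back, but disabled.
function cancelDeletion(key) {
  if (key.state !== "PendingDeletion") throw new Error("key is not pending deletion");
  key.state = "Disabled";
  key.deletionDate = null;
}

// Re-enable a disabled key so it can be used again.
function enableKey(key) {
  if (key.state !== "Disabled") throw new Error("only a disabled key can be enabled");
  key.state = "Enabled";
}

// Guard that encrypt/decrypt paths would call before using the key.
function assertUsable(key) {
  if (key.state !== "Enabled") throw new Error(`key is ${key.state}`);
}

const key = createKey();
scheduleDeletion(key, 7);
console.log(key.state, key.deletionDate.toISOString());
cancelDeletion(key);
enableKey(key);
assertUsable(key);
```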

Jon: Very cool. Thank you for sticking with us, listeners. We’ll talk to you again next week. Next week if you’ve been bored with encryption—which I hope you haven’t been—you have a treat because we won’t be doing encryption anymore next week. Not sure exactly what we’re going to do but not encryption. Talk to you later Rich. Talk to you later Chris.

Chris: All right. Thanks guys.

Rich: See you.

Well dear listener, you made it to the end. We appreciate your time and invite you to continue the conversation with us on and offline. This episode, along with show notes and other valuable resources is available at mobycast.fm/76. If you have a specific question, we encourage you to call into our toll-free voicemail at 844-818-0993, shoot us an email at ask@mobycast.fm, or on twitter with a #mobycast. Thank you and we’ll see you again next week.
