81. Virtual Machines vs. Containers Revisited – Part 1

Summary

Back in January 2018, Jon, Rich and Chris were having lunch together in Denver. The subject of virtualization came up, and Rich said he was confused about the difference between containers and virtual machines. As we answered Rich’s question, we realized that explaining a complicated technical concept in a straightforward manner would make for a great podcast format. And thus the idea of Mobycast was formed.

When we first discussed “Virtual Machines vs. Containers” in episode 1, we got most of it right, but there were some inconsistencies and holes. We didn’t prepare as well as we should have for that first episode, and frankly, it shows. Now, more than 80 episodes later, we have learned a lot and expect more of ourselves. So, it’s time for a “do over” on episode 1.

In this episode of Mobycast, Jon and Chris kick off a four-part series on virtual machines, containers and how they compare. We revisit this important subject to fill in the gaps, and dive a whole LOT deeper this time around.

Show Details

In this episode, we cover the following topics:

  • VMs vs containers – why revisit?
    • Originally talked about this in episode 1
      • Got most of it right, but some inconsistencies/holes
      • Let’s revisit to fill in the gaps, and dive a whole LOT deeper this time around
  • Types of virtualization
    • Full virtualization (“virtual machines”)
      • Simulates enough hardware to allow an unmodified “guest” OS to be run in isolation
      • Resources of computer are partitioned via hypervisor
      • Examples:
        • VMware, Parallels, VirtualBox, Hyper-V
    • Operating-system-level virtualization (“containers”)
      • Resources of computer are partitioned via the kernel
        • “Guest” OSes share same running instance of OS as the host system
      • Based on the virtualization, isolation, and resource management mechanisms provided by the Linux kernel
        • namespaces and cgroups
      • Examples:
        • Docker, LXC, FreeBSD jails
  • Hypervisors
    • Also known as a Virtual Machine Manager (VMM)
    • Creates and runs virtual machines
      • It is a process that separates OS and apps from underlying physical hardware
      • Multiple VMs share virtualized hardware resources
    • When you create a new VM, the following happens:
      • Hypervisor allocates memory and CPU space for the VM’s exclusive use
      • Complete OS is installed onto the VM
      • The VM’s OS communicates with the hypervisor to perform tasks
    • Host OS is able to see all physical hardware, whereas guest OS (VM) can only see hardware to which hypervisor has granted access
    • Two types of hypervisors
      • Type 1 (also called “native” or “bare metal” hypervisors)
        • Run directly on the host’s hardware to control the hardware and manage the guest VMs
          • runs in ring 0
        • Are an OS themselves (simple OS on top of which you run VMs)
          • the physical machine the hypervisor is running on serves only for virtualization purposes
            • Exceptions: Hyper-V, KVM
        • Examples
          • Xen, Microsoft Hyper-V, VMware ESX/ESXi
      • Type 2 (also called “hosted” hypervisors)
        • Run on conventional OS, just like other apps
        • Guest OS runs as a process on the host
        • Hypervisor separates the guest OS from the host OS
        • Examples
          • VirtualBox, Parallels
    • Protection levels (rings)
      • x86 family of CPUs provides a range of protection levels, also known as rings
        • Ring 0 has the highest level privilege (kernel/supervisor)
        • Ring 3 lowest level (applications)
      • Hypervisor occupies ring 0 of CPU
      • Kernels for any guest operating systems running on the system must run in less privileged CPU rings
        • But most OS kernels are written explicitly to run in ring 0
        • Techniques to deal with this:
          • Full virtualization
            • hypervisor provides CPU emulation to handle ring 0 operations made by unmodified guest OS kernels
            • emulation process requires both time and system resources
              • inferior performance
          • Paravirtualization
            • Technique in which hypervisor provides an API and the OS of the guest VM calls that API
            • Requires guest OS to be modified (to make API calls)
              • Replace any privileged operations that will only run in ring 0 of the CPU with calls to the hypervisor (“hypercalls”)
            • Allows tasks to run in host OS (instead of in guest OS where performance would be worse)
          • Hardware virtualization
            • Requires a CPU with hardware virtualization extensions, such as Intel VT or AMD-V
              • Intel virtualization (VT-x)
                • Virtual Machine Extensions
                • Adds ten new instructions
                  • VMPTRLD, VMPTRST, VMCLEAR, VMREAD, VMWRITE, VMCALL, VMLAUNCH, VMRESUME, VMXOFF, and VMXON.
                  • These instructions permit entering and exiting a virtual execution mode where the guest OS perceives itself as running with full privilege (ring 0), but the host OS remains protected.
            • Reduces/eliminates any OS modifications in guest OS
            • Provides an additional privilege mode above ring 0 in which the hypervisor can operate
              • essentially leaving ring 0 available for unmodified guest OSes
            • Better performance than paravirtualization
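
The hardware virtualization notes above say the technique needs a CPU with extensions such as Intel VT or AMD-V. On Linux, one common way to check for them is to look for the `vmx` (Intel) or `svm` (AMD) flag in `/proc/cpuinfo`. The sketch below is only an illustration and is not from the episode; the function name `virtualization_flags` and the sample text are made up for the example.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in /proc/cpuinfo-style text.

    'vmx' indicates Intel VT-x, 'svm' indicates AMD-V.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() != "flags":
            continue
        flags = value.split()
        if "vmx" in flags:
            found.add("vmx")
        if "svm" in flags:
            found.add("svm")
    return found


# Hypothetical sample text, shaped like a fragment of /proc/cpuinfo.
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "flags\t\t: fpu vme de pse msr vmx sse sse2\n"
)
print(sorted(virtualization_flags(SAMPLE_CPUINFO)))

# On a real Linux machine, read the actual file (absent on other systems).
try:
    with open("/proc/cpuinfo") as handle:
        print(sorted(virtualization_flags(handle.read())))
except OSError:
    print("no /proc/cpuinfo on this system")
```

An empty result does not always mean the CPU lacks the extensions: they can be switched off in firmware, and inside a VM they usually only appear when nested virtualization is enabled.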

End Song

Time for Trees – Sad Livin in the (New York) City – (David Last Remix)

 

Voiceover: In January 2018 at the Denver Central Market, Jon, Chris, and Rich were planning a business venture around training for Docker. Jon and Chris were explaining Docker to Rich and he said, “I wish we could have recorded that.” And the idea for Mobycast was born. In episode one, the three of them tried to recreate that conversation on virtual machines versus containers. They got most of it right, but there were some inconsistencies and holes. The Mobycast crew admits, “We didn’t prepare as well as we should have for that first episode and frankly it shows.” More than 80 episodes later, the crew has learned a lot and is ready for a do-over. In this episode of Mobycast, Jon and Chris kick off a four-part series on virtual machines, containers, and how they compare. They correct old mistakes, fill in the gaps and dive a whole lot deeper than before.
Welcome to Mobycast, a show about the techniques and technologies used by the best cloud native software teams. Each week your hosts Jon Christensen and Chris Hickman pick a software concept and dive deep to figure it out.

Jon: Before we get started today, I’m so happy and so proud to be able to announce that we have a sponsor, so Mobycast is no longer ad-free. But our sponsor is one that we really do care about. We use CircleCI and we’ve talked about CircleCI in a previous episode. I’ll link to it in the show notes. So I just have this to say: this episode is brought to you by CircleCI, the continuous integration and delivery service used by companies like Twilio, Intuit, and Tinder. Yeah. CI/CD is so important for keeping teams building. It’s all CircleCI does. They focus on creating powerful, flexible CI/CD pipelines so that you and your team can focus on doing what you do best. Whether you’re a company of five or 500, CircleCI can build, test and deploy your Linux, Windows, and macOS projects from GitHub and Bitbucket in their cloud or installed on your servers. And anyone can sign up and start building for free since CircleCI gives both private and public projects a thousand free build-minutes per month. Sign up and start building for free at circleci.com.
Welcome Chris. It’s another episode of Mobycast.

Chris: Hey Jon, good to be back.

Jon: Yeah, good to have you back. Today we’re going to go back in time and redo episode one of Mobycast. And there’ll probably be more than one episode this time because virtual machines versus containers is not something that you can talk about briefly. It’s so important and such a big topic that I think it’s going to deserve at least three but maybe even four episodes to really get deep into the stuff that’s driving cloud development today. And Chris, maybe you can talk about this a little bit. Part of why we want to do this is because of some feedback we’ve gotten on the show.

Chris: Yeah, absolutely. So, I mean we love feedback. It’s good to hear from folks that are listening to these episodes that we’re making and we want to make sure that they resonate with you and they’re useful and provide value. So we do look at the ratings and reviews and one … So it’s nice. We have quite a few nice reviews out there.

Jon: Yes, thank you for those.

Chris: Yeah. Don’t get me wrong. Those are fun to get about like, “Wow, you guys are just awesome. Right? You’re the bee’s knees.” But we did find one, I think it was in, it’s actually in the Canadian store.

Jon: Canada.

Chris: Yeah. For iTunes.

Jon: Being real with us.

Chris: Yeah, the truth from it. Had a review there where the reviewer essentially was saying, “Look, listened to a bunch of your episodes and the first episode it’s kind of obvious you didn’t do your homework, right? You actually had a few things inconsistent. There were some holes there. And in later episodes it’s obvious you guys did do homework and got better.” So not really great feedback to hear in a way. But also at the end of the day, the dude is right. He was absolutely 100% correct. And it actually is great feedback. And I think part of that is when we first sat down to do the first episode of Mobycast, we really didn’t have any idea where this was going. And I think if you had told me, back when we first started doing this with Episode One, that here it is today, where we’re recording Episode 81, and we’d be doing that many episodes, I don’t know if I would believe you.

Jon: I would have gotten the anxiety if you had said that to me.

Chris: Yeah. Right. So let’s just start with one. And it was a learning process for us too as well. And also very busy with just holding the fort down with our day jobs if you will. And it was perhaps a little bit too off the cuff, so …

Jon: Right. I did have one other point to make about the Canadian review. He gave us three stars and he said we weren’t very good, but he also outed himself for having listened to the whole thing. Would you do that? If I think something’s not that good, I stop. Keep going.

Chris: Yeah. And there’s a little bit of a contradiction there, right? To say, “Hey, this is not so great of a podcast,” yet I’ve listened to obviously multiple episodes, right? Many episodes. And yeah, I mean, I think most people might give a podcast like five minutes before they’re like, “You know what, this is not working for me.” So to sit through and listen to hours’ worth of content, I mean, something’s going right there, right?

Jon: Must be doing a lot of driving or something, or just out of other tech podcasts. But we’re just giving him a hard time. We really appreciate the feedback and if you ever want to meet us in person, I’ll buy you a Horton’s Coffee or whatever they drink up there in Canada.

Chris: Indeed. Yeah. I have a Tim Horton story though I will save for some other day. Right? We have a lot, we have a lot to cover here with VMs and containers and breaking them all down again. So maybe we should continue on with that.

Jon: Yeah, let’s get into it. So today we’ll mostly be talking about VMs. Less about containers. Let’s just really understand what VMs are all about. They are what drive the cloud. There is no cloud without VMs. So maybe Chris, you can kick us off there.

Chris: Yeah, I mean I think I just can’t stress that point enough, right? VMs really led to the creation of the cloud, and you just can’t do it without them. And it’s really one of those key core technologies that we live with every day. And we throw around this term VM every day, but really understanding what that is, how it works, the different flavors, the different technologies that are around it? A lot of folks are a little sketchy on the details there. So we want to dive in and really understand that, and I think it’ll be helpful for all of us, right? As we work with cloud native software and spin up VMs, it will add a lot of value and understanding. Okay, what is the difference between VMs versus containers? And spoiler, there’s this new thing called micro VMs, which is a combination of VMs and containers. And just where the technology is going as well. So really understanding VMs at a fundamental level is crucial foundation knowledge. And so it’s worth spending some time on.

Jon: Right. And just like in the series that we did on encryption, where the first episode kind of laid a lot of foundations and then we built on those foundations in the following episodes, today we’ll be laying some foundation.

Chris: Absolutely. So let’s get started with that. So first let’s talk about the types of virtualization. And really we can talk about two broad types. So one is full virtualization. And this is typically what we’re referring to when we talk about virtual machines. So VMs, this is full virtualization, which is a mouthful. This is simulating hardware to allow another operating system to run on it in isolation, right? But unmodified, so it’s virtualizing everything. So we talked a lot about this in episode one where it’s virtualizing the entire machine, right? It’s virtualizing the hardware and then the software, right? So that running guest OS, it doesn’t know the difference. It doesn’t know that it’s actually running inside software, inside of this virtualization.

Jon: Yeah. You can think of back in the day, if you had a Windows operating system that literally came in a box that you could buy at Best Buy, you could take that thing and install it. And it has certain expectations of what the hardware is going to be and those would all be met by full virtualization.

Chris: Mm-hmm (affirmative). Indeed. Yeah. In fact, back to the device driver. Device driver hell.

Jon: Right. But I think that’s a good … It’s a good way to think about it because if you could take the disc out of the box and install it, even though you’re not installing it directly on the machine you are installing into the virtual machine, that’s a really helpful way of thinking about it. Then the software doesn’t know any better. It doesn’t know the difference.

Chris: Oh, exactly. So one of the key fundamental enabling technologies for full virtualization is-

Jon: Say it in Spanish.

Chris: Yeah. Okay.

Jon: Let’s hear it in all the different languages, full virtualization. And it’s hard for me to say too.

Chris: I know. Can we just say that V-word? So anyway, yeah. So a big component of this is going to be the hypervisor. And the hypervisor is kind of what allows this full virtualization to happen. And we’ll talk more about that as we get further into this episode. But just keep in mind the hypervisor is going to be really important for full virtualization. And some examples of full virtualization technologies are things like VMware, Parallels, VirtualBox, Microsoft Hyper-V, KVM. There’s just a bunch of these things. What we talk about with virtual machines, that’s full virtualization.
The other type of virtualization is operating system level virtualization. And this is actually what containers are. So with containers, basically the virtualization is happening at the operating system level. So here the resources of the computer are being partitioned via the host OS kernel and the guest OS is … And by the way, we’re going to talk a lot about host versus guest. The host is the native operating system running unvirtualized on a machine. And the guest is the virtualized OS running in whatever technology it’s using. So host and guest. So in operating system level virtualization those guest OSes are going to share the same running instance of the OS as the host. So the operating system kernel is shared. It’s the constant across these things. Versus in a full virtualization environment, that’s not the case. There’s-

Jon: Mm-hmm (affirmative). That makes sense.

Chris: They are separate kernels, right? So that’s operating system level virtualization, and there are some core technologies that enable it. Things like namespaces and cgroups. And when we talk about containers, we’re really going to dive deep into just how that stuff works. But some examples-

Jon: I’m so looking forward to that because I think that’s where maybe we were a little weak in Episode One and where, still to this day, I’m a little weak. So I’m going to be depending on you, Chris, to teach me.

Chris: Yeah. We’ll make you strong.

Jon: Good, good, good.

Chris: Yeah. And so some examples there for operating system level virtualization implementations obviously Docker, Linux containers, LXC and the original technology here is FreeBSD jails. Which came on the scene in 2000, I believe it was. So almost 20 years old now, but it didn’t take off as much as [crosstalk 00:13:15].
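
The conversation names namespaces and cgroups as the kernel mechanisms behind containers. On a Linux host, any process can see which cgroups and namespaces it belongs to through `/proc/self/cgroup` and `/proc/self/ns`. The sketch below is an illustration under that assumption and is not from the episode; the helper names `parse_cgroup` and `list_namespaces` and the sample line are made up for the example.

```python
import os


def parse_cgroup(text):
    """Parse /proc/<pid>/cgroup text, one 'hierarchy-id:controllers:path' entry per line."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append({
            "hierarchy": int(hierarchy_id),
            "controllers": controllers.split(",") if controllers else [],
            "path": path,
        })
    return entries


def list_namespaces(ns_dir="/proc/self/ns"):
    """Map each namespace name to its identifier; empty if the directory is unavailable."""
    result = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result
    for name in names:
        try:
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            pass  # entry vanished or is not readable; skip it
    return result


# Hypothetical sample line in the cgroup v1 style.
print(parse_cgroup("4:cpu,cpuacct:/docker/0123abcd\n"))

# On Linux this prints this process's namespaces; elsewhere it prints {}.
print(list_namespaces())
```

Two processes that report the same identifier for a namespace share that namespace, and a containerized process typically reports different identifiers from a process on the host, even though both run on the same kernel.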

Jon: … so it came on the scene in 2000, just before virtual machines got popular.

Chris: Mm-hmm (affirmative).

Jon: Not that virtual machines weren’t already around but they weren’t really popular in 2000. People weren’t really saying much … VMware I don’t think was really a big company at that point. So yeah, it’s interesting.

Chris: Yeah. Well I mean, and that’s a whole other part of the history: why VMs even were created. What were some of the things, the conditions that caused them to explode? So I know for me personally, back in the late ’90s and early 2000s, really the reason for using VMs was for developer reasons, where you have to be able to test different versions of an operating system. You’re writing some code, you want to make sure that it runs on this version of the OS. I was at Microsoft, we were writing client code for Windows and needed to make sure it ran on both Windows 95 and Windows 98 [crosstalk 00:14:19].

Jon: But for us at server FORMAT we had one we … it also wasn’t cloud stuff. It was like, “Here sales team. This is a VM. We’re installing it on your laptop and you can start it up and it’s going to be right where you want for your demo to be running.” And it was like magic for them.

Chris: Before that we had to dual-boot.

Jon: Mm-hmm (affirmative), yeah.

Chris: Just horrible. Right? So there, instead of having virtual machines, you just sliced up your hard drive into partitions and had different OSes installed on each one of those partitions, but you had to choose at startup which one you wanted. So I’m so glad that we now have virtual machines.

Jon: Exactly.

Chris: Then later it became all about how you more effectively take advantage of your resources. How do you have big beefy servers but run multiple virtual servers on them to get the ideal density and efficiency of using the CPU and memory and whatnot, right? That’s what then really drove the cloud. From there, it was off to the races.

Jon: Right, right.

Chris: Right. So, let’s focus now on the full virtualization. So this is again the virtual machines. And we talked about hypervisor as being kind of core to this. And so let’s kind of peel that back a little bit. So hypervisors are also known as a virtual machine manager, VMM. And really what the hypervisor does is it’s responsible for creating and running virtual machines. It’s a process. It separates the OS and applications from the underlying physical hardware, right? So it’s that layer between them and multiple virtual machines running on the same physical machine. They’re going to share virtualized hardware resources. So again, it’s the thing that is responsible for allowing for that virtualization of the machine itself, of the hardware. So that those guests don’t know the difference.

Jon: So I get that idea and I think we’re going to answer my questions coming up, so I’m going to save them, but just know that I have questions about this. I kind of get this idea that the hypervisor is managing my machines, my virtual machines. But I’m like, “Oh, I’ve got some questions.” So let’s keep going and let me just see if my questions get answered and if not I’ll ask them.

Chris: Sure. Yeah. So I mean, this might be kind of helpful. Broadly there are two types of hypervisors. There’s Type 1 and Type 2, right? Really creative with the names, but it kind of makes it easy to keep track of them as well. So let’s talk about Type 2 first, because for us Type 2 is actually not as interesting. With the Type 2 hypervisor, this is running on top of the host operating system. And it’s running like any other app actually, right? So you have your host machine with the host OS and then you have this Type 2 hypervisor that’s running basically like an application.

Jon: Yeah, like some program, some process, yeah.

Chris: Right? And then you have your guest OSes running on top of that. Right?

Jon: That’s so weird, right? That’s so weird. Because the operating system is installed directly onto the hardware. It has direct access to the hardware. It runs a program that then lets other operating systems get installed into it, and that see the hardware virtually?

Chris: Mm-hmm (affirmative).

Jon: Oh, that’s weird. That’s weird. And that was one of my questions. That was exactly one of my questions. Maybe you knew that was going to be one of them. It’s: where’s this hypervisor? Is it in the chip? Is it something that gets installed somehow in RAM? Where is this thing? So yeah, the Type 2 hypervisor, you were saying, is just like any old software that runs on the operating system.

Chris: Yeah. Obviously with some special-

Jon: Special capabilities right.

Chris: … calls, right? And everything like that. But the important thing is that it is running on top of the host OS as a separate program that you install. So some examples of this are VirtualBox and Parallels. You can think of these hosted hypervisors, these Type 2 hypervisors, as really more for personal use. So this would be something, again, you install on your MacBook or install on your Windows machine. You go get VirtualBox and now you can run different operating systems and whatnot. But you’re going through this hosted hypervisor, which doesn’t have nearly the kind of capabilities and access to the hardware, or the ability to use the resources of the system as effectively.
So with these Type 2 hypervisors, you have to specify when you create a guest OS exactly how much memory it’s going to get and how much disk space it’s going to get, and that is just going to be carved off. So if you have 16 gigs of RAM on your machine and you create three guest OSes and give them each four gigs of RAM, you’re only going to have four gigs left over for your actual host. And then if you created another guest with four gigs of RAM, then you’re out of memory.
Versus with Type 1, it has much more flexibility. It’s more dynamic in its resource allocation, so you can actually do the same kind of scenario but it’s only going to use memory as it needs it. So you’re not pre-allocating those resources. So there’s just more flexibility with Type 1, more efficient use of the hardware. And that’s why Type 1 hypervisors are what we’ll see running on servers in the cloud and the enterprise.
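
The memory arithmetic in this comparison can be worked through in a few lines. The sketch below is a simplified model of what Chris describes, not a description of how any particular hypervisor manages memory. The function names are made up for the example, and the 1.5 GB per-guest usage figure in the on-demand case is an invented number chosen only for illustration.

```python
def free_after_static(total_gb, guest_sizes_gb):
    """Fixed carve-out: each guest's full allocation is reserved when it is created."""
    reserved = sum(guest_sizes_gb)
    if reserved > total_gb:
        raise ValueError("guests reserve more memory than the machine has")
    return total_gb - reserved


def free_after_dynamic(total_gb, guest_usage_gb):
    """On-demand: only the memory the guests are currently using is taken."""
    used = sum(guest_usage_gb)
    if used > total_gb:
        raise ValueError("guests are using more memory than the machine has")
    return total_gb - used


# 16 GB machine, three guests carved off at 4 GB each: 4 GB left for the host.
print(free_after_static(16, [4, 4, 4]))

# A fourth 4 GB guest takes the rest: nothing left for the host.
print(free_after_static(16, [4, 4, 4, 4]))

# Same four guests, but each is only using 1.5 GB (an invented figure).
print(free_after_dynamic(16, [1.5, 1.5, 1.5, 1.5]))
```

With the fixed carve-out the fourth guest leaves the host with no memory at all, whereas the on-demand model leaves 10 GB free in this invented scenario, which is the density and efficiency argument made in the conversation.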

Jon: Well, you’ve said now that Type 1 seems to give me some more flexibility. Better for running in the cloud, but you haven’t really told me what they are yet.

Chris: No. So let’s dive into that. So these Type 1 hypervisors, they’re also called native or bare-metal hypervisors. And these, for the most part, run directly on the host hardware to control that hardware and to manage any guest VMs.

Jon: Okay.

Chris: Something to wrap your head around is that they are an OS themselves, right? These hypervisors. So it’s a very simple OS, right? You can think of it as a simple operating system that’s running on that hardware, that machine, and it really only exists for you to spawn VMs. Okay. So again, you can imagine you boot up a machine and instead of going into a Windows OS or Linux OS, this is the hypervisor OS. And it says, “What do you want to do?”
“Well, I want to go create a VM or whatever like that.” Or, “Here’s how you interface to me and talk to me and whatnot.” For the most part, again, with these Type 1 hypervisors, the physical machine that the hypervisor is running on serves only for having VMs on it.

Jon: Right? So it knows how to talk to the hardware, but it’s not going to worry about things like users and other things like that, that operating systems worry about.

Chris: Right. I mean it’s purely … Think of it as an OS for VMs, and so whatever it needs to do with VMs, like back them up and create them, delete, destroy them, those kinds of things. That’s what it’s focused on. Some examples here of Type 1 hypervisors are the Xen hypervisor, which is an open source project that’s been around for quite some time and is quite popular. Microsoft has its Hyper-V hypervisor, and then VMware is a big player in this space as well.

Jon: Their hypervisor looks like you have it here. It’s called the ESX or ESXi.

Chris: Yeah, they have quite a few different products and flavors. I mean, VMware has definitely been in this space for almost 20 years now. And so there are lots of different flavors, but ESX is one of them.

Jon: I’m actually trying to picture it: you’re setting up a data center, you get a machine and you put it on the rack, you give it access to a boot disk. That boot disk might already have this hypervisor installed on it and bam! It comes up. And then here’s a bunch of VMs, on some other disks, that you can have access to and start running. Those would be AMIs, right? You’ll grab an AMI.

Chris: Yeah. We’ll talk a bit about, okay, what is AWS doing in this space? And how are EC2s implemented? And what are the hypervisors that they’re using there? And what do images look like? And how is an AMI different from another VM image, and whatnot.

Jon: So we’ll get to that in a future episode. But yeah, I guess that was the thing that occurred to me: okay, so these things run on the bare metal. Who installs them? Where do they come from? Does every computer have them now? Is that just what we’re running now? Even my MacBook that I’m sitting in front of right now, is that running a bare-metal hypervisor these days or not? I don’t know.

Chris: Mm-hmm (affirmative). Yeah. And no, for the most part your MacBook’s not. It does have a hypervisor in it. I kind of said, okay, there’s Type 1 and Type 2. And so with Type 1, the hypervisor is running directly on the … It’s bare metal, right? It’s running directly on the hardware. You can think of it as an operating system in and of itself. And they really only exist to allow your machine to run virtual machines. So in that particular case you would know if you had a Type 1 hypervisor, because whenever you started your computer, you’d get a kind of very boring screen saying, “Okay, what do you want to do? What VM do you want to run?” And then with Type 2, I said, okay, these are kind of applications. You go and install them and you’d launch them and then you’d say, “Okay, what VM do I want to run?” Again, it’s pretty prescriptive. There has been a bit of cross-pollination here. An example for this would be …
So really an easy one to talk about here is Hyper-V from Microsoft. So Hyper-V is the Microsoft Type 1 hypervisor. It comes with Windows Server. And by default it’s not enabled, but you can go in and enable it by adding a role in Windows where you can … There’s various different ways to enable it, right? But so Hyper-V you can install or enable in Microsoft Windows, but it still has the host OS of Microsoft Windows, right? So it’s a little bit different, right? It’s not this pure Type 1 like what we talked about, where there’s just this VM operating system, right? Called the hypervisor. And that’s all that’s on there. And then you can have whatever VMs you want to run. If that were the case with Hyper-V, you would just have a computer, you’d install Hyper-V, and then after that if you wanted that computer to run Windows Server, you’d then install that as a VM on the machine. Right? And that’s not how Hyper-V works.

Jon: Okay. I guess what I had been picturing, when I was wondering are these things on my machine already, is that what just runs these days, is that installing Microsoft Windows might be doing something like installing a hypervisor and then installing Windows and then just kind of scripting, “This is what happens.” The hypervisor starts and then it starts the Windows VM and you don’t get a choice. You don’t get that screen you’re talking about because it’s just automated. And that’s what I had been kind of wondering, if that was happening. Because I swear I’ve seen the word hypervisor in the startup script of my OS. Literally in the OS of the computer that I’m sitting next to right now. But maybe I haven’t, I don’t know. Just that word feels familiar to me from the boot sequence.

Chris: Yeah. I mean it’s there by default in modern versions of Windows now. So part of this is, okay, what is a hypervisor and what do you need it for? Right? Its whole purpose is to virtualize the hardware so that you can run virtual machines. And so you want that there when you have the need for running virtual machines. Right? But if you don’t have a need for running virtual machines then you don’t need it. So this is one of the reasons why Hyper-V is implemented the way it is. For the most part, most people on a laptop or desktop computer don’t have a need for running a VM. So like me, I have a MacBook. For the most part I don’t need a VM until I run Docker.
And then when I run Docker I do need a VM. And so the Docker implementation on my MacBook then takes advantage of a hypervisor to launch a VM in which Docker can run. And we’ll talk about that more when we talk more about containers. But I guess think of it like this: for the most part a hypervisor is going to be a pretty explicit choice, whether or not it’s actually enabled and you’re running virtualized software through it. Does that make sense?

Jon: It does. And it kind of answers the thing that was also going on in my head, which was, if VMs are super, super efficient at this point, you can turn around what you just said, “I don’t have a need to run a VM,” to be, “I don’t have a need not to run a VM.” Why not? Why not just always be running a VM? Because maybe if everything was always VMs, maybe the abstraction and dealing with VMs would be more the same, right? You wouldn’t need to have a difference between … You would just always be in a VM, right? And so everything is a VM. There’s no such thing as a pure operating system that talks directly to the host hardware anymore. But it sounds like we’re not there yet. Or maybe we’ll never go there. But I was imagining in my head, “Oh, we could be there. Why not go there? Why not have everything always be a VM?” Even my MacBook Pro OS X software. But it sounds like we’re not there.

Chris: No. And if we were there, that would probably be for the implementation benefits of people writing software and hardware, and less for users.

Jon: Exactly. Yeah, exactly. Because then everything looks the same. Everything’s a VM to the people that are writing programs.

Chris: Yeah. Again, because you have something sitting between it, there’s always going to be some performance penalty. Right? So if you don’t need to be running, if you don’t need virtualization then you don’t want it. And so that’s again, why like for the most part, your laptop and your desktop computers, you’re not running a virtualized guest OS. But in the cloud, this is all that those servers run. Right? There is no host OS on those machines. Right? It is just that hypervisor and the only reason why those machines exist is to run VMs. So totally different kind of use case and how they’re used.

Jon: Cool.

Chris: And like I said, it gets a little blurry with, do you have a host OS with Type 1 hypervisors? We talked about Hyper-V, how that’s a bit of a mishmash. KVM, which is for Linux, is basically kernel-based virtual machine extensions. And this is also a Type 1 hypervisor, but it requires Linux, right? So Linux is a host OS, so it’s blurring the line there a little bit as well. But again, for the most part, Type 1 is the bare-metal, native hypervisor. And that’s really what we’ll be talking about going forward, since that is the predominant one in the space that we’re in when we’re running cloud software.

Jon: Great, makes sense.

Chris: Yeah. So I wanted to talk a little bit about protection levels and rings. One of the things that we talked a little about in Episode One, where we were talking about virtual machines, was: how do you virtualize the hardware? There are these things in the processors, I know that they have some capabilities, and sometimes in the past you had to enable that in the BIOS, and, you know, what’s going on there? And so it’s kind of helpful to understand, again, this virtualization of the hardware and how it’s done, by talking about these protection levels, the rings. So this applies to the Intel family of CPUs, the x86 family of CPUs. The CPUs themselves have this range of protection levels, and they’re known as rings.
So there’s essentially four levels of rings. So there is Ring 0, which has the highest level of privilege, going down to Ring 3, which is the lowest level. And so Ring 0 having the highest level of privilege, that’s where your kernel or your supervisor is running. Right? And by the way, another name for kernel is supervisor.
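
Editor's sketch: the ring model Chris describes can be illustrated with a toy Python model. This is only a simplified illustration, not how a real x86 CPU is implemented. The instruction names and the `execute` helper are hypothetical, chosen for readability.

```python
from enum import IntEnum


class Ring(IntEnum):
    """x86 protection levels: a lower number means more privilege."""
    KERNEL = 0   # kernel / supervisor
    RING_1 = 1   # intermediate level
    RING_2 = 2   # intermediate level
    USER = 3     # ordinary applications


class ProtectionFault(Exception):
    """Toy stand-in for the fault a CPU raises on a privilege violation."""


# Hypothetical instruction names, used only for illustration.
PRIVILEGED = {"load_page_table", "disable_interrupts", "halt"}


def execute(instruction: str, current_ring: Ring) -> str:
    """Run an instruction, refusing privileged ones outside Ring 0."""
    if instruction in PRIVILEGED and current_ring != Ring.KERNEL:
        raise ProtectionFault(
            f"{instruction} not allowed in ring {int(current_ring)}"
        )
    return f"{instruction} executed in ring {int(current_ring)}"
```

In this model, `execute("halt", Ring.KERNEL)` succeeds, while `execute("halt", Ring.USER)` raises `ProtectionFault`. That is the situation a guest kernel ends up in once something else occupies Ring 0.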

Jon: Okay.

Chris: And this is where the term hypervisor comes from, right? Because the hypervisor is really a supervisor of supervisors. Right? So hypervisor. So Ring 0, that’s where the kernel is running. That’s the highest level of privilege, with direct access to the CPU. And then moving on out from those rings, there’s less and less privilege, to where basically the applications that you install and run on your OS are running in Ring 3. Right? So with virtualization, with VMs, the hypervisor is going to be occupying Ring 0 of the CPU. Right? Which again makes sense. It has to, right? It’s the highest level of privilege. But this now gives rise to a bit of a problem, because your guest operating systems, those kernels, are written explicitly to run in Ring 0. Right? But they can’t, because the hypervisor is running in Ring 0.
So they can’t be in that. So we need some way of dealing with this, right? And this again is one of the key things of how the virtualization happens: what the hypervisor is doing, and the interaction between the guest OS and the hypervisor. So we have to deal with these guest OS kernels that were designed to run in Ring 0 but can’t. How do we bridge that gap?
And so there’s basically three broad ways of dealing with this. So one is full virtualization, and with this, the hypervisor is providing CPU emulation to handle those Ring 0 operations made by the guest OS kernels. So this is complete emulation of the CPU, right? So that requires both time and resources. It’s kind of the easiest in the sense that the guest OS kernels don’t have to know anything different. But because it’s complete emulation of the CPU, the performance isn’t great. Right? This is the worst performance you can have, right? Because now you’re just completely emulating the CPU; it’s not hardware. Right? So that’s full virtualization, and that’s how things first started off.
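
Editor's sketch: a minimal Python illustration of the emulation idea, assuming a toy hypervisor that keeps a software copy of each guest's privileged CPU state and applies Ring 0 operations to that copy instead of the real hardware. The class names, fields and operation names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualCPU:
    """Software copy of the privileged state a guest kernel believes it owns."""
    interrupts_enabled: bool = True
    page_table_base: int = 0
    halted: bool = False


@dataclass
class EmulatingHypervisor:
    """Toy hypervisor that emulates guest Ring 0 operations in software."""
    vcpus: dict = field(default_factory=dict)
    emulated_ops: int = 0

    def add_guest(self, name: str) -> None:
        self.vcpus[name] = VirtualCPU()

    def emulate(self, guest: str, op: str, operand: int = 0) -> None:
        """Apply a privileged operation to the guest's virtual CPU only."""
        vcpu = self.vcpus[guest]
        self.emulated_ops += 1  # every emulated operation is extra software work
        if op == "disable_interrupts":
            vcpu.interrupts_enabled = False
        elif op == "load_page_table":
            vcpu.page_table_base = operand
        elif op == "halt":
            vcpu.halted = True
        else:
            raise ValueError(f"unknown privileged operation: {op}")
```

The guest kernel is unchanged in this picture. Each privileged operation takes a detour through `emulate`, which is where the performance cost mentioned above comes from.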
So then we need some techniques. Okay, we need better performance than that. How are we going to deal with this? And so this leads to the other two techniques: one’s called para-virtualization and the other one is hardware virtualization. And you may have seen this sometimes if you’re launching EC2s. You’ll see options there for PV versus hardware virtualization on your EC2, right? You may have seen this just mentioned in passing and said, “What is that?” So this is what it exists for, right? It’s that problem of you have a virtual machine: the guest OS needs to be running in Ring 0, but it can’t. So how do you do that? And so para-virtualization, what it is, is the hypervisor provides an API that allows the guest OS to make calls on that API when it needs to run Ring 0 operations.
Okay. So the good thing, right? Performance-wise, it is no longer doing the full emulation of the CPU; the operations that need to be running at Ring 0 do run at Ring 0 through these hypercalls to the hypervisor. But the downside is that the guest OS has to be modified, right? It has to be aware of this. So the guest OS has to make changes to its code to say, instead of making these direct Ring 0 calls, I instead have to make these hypervisor API calls.
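
Editor's sketch: the hypercall idea can be illustrated in Python as a guest kernel that has been modified to call a hypervisor API rather than issue a privileged instruction itself. The class and method names here are hypothetical and do not correspond to any real hypervisor's interface.

```python
class Hypervisor:
    """Toy hypervisor exposing a hypercall API (all names are hypothetical)."""

    def __init__(self) -> None:
        self.page_tables = {}  # guest name -> page table base address
        self.log = []          # record of hypercalls received

    def hypercall_set_page_table(self, guest: str, base: int) -> None:
        """Perform the Ring 0 operation on the guest's behalf."""
        self.page_tables[guest] = base
        self.log.append((guest, "set_page_table", base))


class ParavirtualizedGuestKernel:
    """Guest kernel modified to call the hypervisor instead of Ring 0 instructions."""

    def __init__(self, name: str, hypervisor: Hypervisor) -> None:
        self.name = name
        self.hypervisor = hypervisor

    def switch_address_space(self, base: int) -> None:
        # An unmodified kernel would execute a privileged instruction here.
        # The paravirtualized kernel asks the hypervisor to do it instead.
        self.hypervisor.hypercall_set_page_table(self.name, base)
```

The key point is in `switch_address_space`: the change lives in the guest's own code, which is why the guest OS has to be aware of the hypervisor.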

Jon: Oh, interesting. Okay.

Chris: Right. So that’s para-virtualization. So good performance, but the bad thing is you got to go change code. Right? So there’s limited compatibility there, right? You have to have a guest OS.

Jon: So that would be the guest OS like Microsoft Windows or Ubuntu or whatever has to be capable of doing this.

Chris: Mm-hmm (affirmative). Yup.

Jon: Okay.

Chris: Yup. Absolutely. It has to be aware of that hypervisor and know that it’s running in that environment. And again, when it would’ve normally done those Ring 0 calls, instead it’s making these hypervisor API calls.

Jon: Okay. Got it.

Chris: So that’s para-virtualization. And then the last one is hardware virtualization. So what this is, is the CPU itself now has hardware virtualization extensions built directly into the chip. And so this started happening in 2005, 2006. Both Intel and AMD added this to their CPU chips, and basically added a bunch of new instructions that allow these guest OSs to run what normally would have been Ring 0 operations using these new instructions. Right? So the actual virtualization support is coming from the chip itself. So these new instructions permit the entering and exiting of a virtual execution mode, where the guest OS thinks it’s running with full privilege at Ring 0, but it’s not, right? It’s carved out and the host OS is still protected. So the true Ring 0 is protected, right? By running these new instructions.
So the good thing with this: your guest OS doesn’t have to have any modifications, right? So it doesn’t know any better, but the performance is great because you’re not doing emulation, you’re running directly on the CPU.
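
Editor's sketch: on Linux, one way to see whether a CPU advertises these extensions is to look for the `vmx` flag (Intel VT-x) or the `svm` flag (AMD-V) in `/proc/cpuinfo`. The helper name below is hypothetical, and the check assumes a Linux system on x86. It says nothing about whether the feature is enabled in firmware.

```python
def virtualization_extensions(cpuinfo_text: str) -> set:
    """Return which hardware virtualization flags appear in cpuinfo text.

    'vmx' is the flag Linux reports for Intel VT-x, 'svm' for AMD-V.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, flag_list = line.partition(":")
            found |= {"vmx", "svm"} & set(flag_list.split())
    return found


if __name__ == "__main__":
    # /proc/cpuinfo exists on Linux only; other systems need a different check.
    try:
        with open("/proc/cpuinfo") as f:
            print(virtualization_extensions(f.read()) or "no vmx/svm flag found")
    except OSError:
        print("/proc/cpuinfo not available on this system")
```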

Jon: So are you saying the hypervisor is like, “Okay chip, going to hand off instructions now for a bit to VM A,” let’s call it, and the chip is like, “All right, go for it. I’m ready.” And then VM A starts making calls to the chip as it normally would, and the chip is like, “Yeah, you’re who I was expecting to be making these calls. Go ahead. I’m doing these operations for you.” And then the hypervisor is like, “Okay, I want it back. I want my access back, no more virtual mode, and I’m going to do my own stuff again.” Is that kind of how it works, do you think?

Chris: Yeah, I mean, you probably can think of this as just, it’s actually creating a new ring above Ring 0 for the hypervisor. And so that’s essentially what this is enabling. So the important thing here is you’re carving out that protected ring of core privilege to only the hypervisor and the guest OSs have some amount of protection so that they can’t pollute that. And so these instructions are essentially creating a ring above zero for the hypervisor to run in. A lot of time on that.

Jon: Yeah, it is a lot of time talking. We could go on and on, because then I’d start to have questions about what exactly is getting protected. And I kind of want to think of it in terms of examples a little bit. I don’t know if you would be able to give me an example of a sequence of events that might happen. But if you could, that’d be awesome. If not, we’ll save it for next episode.

Chris: Mm-hmm (affirmative). And I know a big part of this doesn’t necessarily have to do with CPU instructions, but it’s also address space too, right? So carving out that virtual address space so that these VMs have their own isolated address spaces. Right? So they-

Jon: But that’s confusing in and of itself too, right? Because it’s like, well, if the chip is the thing that’s supporting the virtualization, I guess there are registers, which are very limited, and maybe that’s the address space we were talking about. But then there’s also the memory, which you can put in and pull out. And there can be four gigs, eight gigs, 16 gigs, 64 gigs, a terabyte of memory. And that also has address space. And I wouldn’t think that the chip would know anything about that.

Chris: No. Yeah. So the CPU does know about that. Right? And that’s kind of what we’re talking about here. Is this the-

Jon: It doesn’t know about that. I thought that was the operating system’s job, to kind of know about that stuff.

Chris: The CPUs have direct support for that, and then the OSs have to allocate it, right? But something has to control the access to it, right? Something has to talk to the memory chip. And so it’s going through the CPU instructions to do that. So there are CPU-specific capabilities dealing just with this, which make virtual machines safer and protect them, right? So Intel chips have something called Extended Page Tables (EPT). So this is a part of that page table virtualization. So this is dealing with the memory that we think of, right, when we say, “My application needs a gigabyte of RAM.” That memory is expressed as pages of addresses. And so the chip itself has support, right? For having these virtual address spaces that are isolated for each one of these VMs, right?
So there’s that going on. There are actual CPU operations that are added to kind of give us that ring above zero for the hypervisor, so that we’re making room in Ring 0 for the guest OSs, but again, without them overstepping their bounds and causing a blue screen of death. Right?
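
Editor's sketch: the page table virtualization Chris mentions can be pictured as a two-stage address translation. The guest OS maps guest virtual addresses to what it believes are physical addresses, and a second, hypervisor-managed table maps those guest physical addresses to real host memory. The Python below is a toy model with flat dictionaries standing in for page tables. Real page tables are multi-level structures walked by hardware, and the function names here are hypothetical.

```python
PAGE_SIZE = 4096  # 4 KiB pages, a common size on x86


class PageFault(Exception):
    """Raised when an address has no mapping in a table."""


def translate(page_table: dict, address: int) -> int:
    """Map an address through one page table (page number -> frame number)."""
    page, offset = divmod(address, PAGE_SIZE)
    if page not in page_table:
        raise PageFault(f"no mapping for page {page}")
    return page_table[page] * PAGE_SIZE + offset


def guest_to_host(guest_table: dict, extended_table: dict, guest_virtual: int) -> int:
    """Two-stage translation: guest virtual -> guest physical -> host physical."""
    guest_physical = translate(guest_table, guest_virtual)  # managed by the guest OS
    return translate(extended_table, guest_physical)        # managed by the hypervisor
```

Two VMs can use identical guest tables and still land in different host memory, because each VM gets its own second-stage table. A guest address with no entry in that table faults rather than reaching memory that belongs to another VM.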

Jon: Right, yeah.

Chris: So it means that you can blue screen of death in the guest OS, which is running in what it thinks is Ring 0, but it doesn’t bring down the hypervisor. Right? That’s the kind of isolation that we’re talking about.

Jon: Yeah. Yeah. Super fascinating.

Chris: Yeah. So I think that’s a good foundation for, okay, virtual machines, full virtualization, hypervisors, the two broad types of them, kind of going deep into the Type 1 hypervisors since that’s really what’s powering the cloud. And then talking a little bit about, okay, how does the guest OS actually access the hardware? And how some of that virtualization takes place through these rings of privilege, and the ways to get around this problem of the hypervisor needing to run in Ring 0. How do the guests get access to Ring 0? And that’s where we talked about emulation and para-virtualization and then the actual hardware support for it.
Maybe that’s a good place to leave off today. And next time we can kind of dive in a little bit with some actual implementations of hypervisors. So I’d like to talk about Hyper-V and its architecture because it’s really kind of interesting. And I think it’ll actually help bring to light some of the stuff that we’ve talked about here. Talk a little bit about KVM, which is Linux’s kernel-based virtual machine extensions, which is very important. And this is actually where a lot of the technology is going today, especially in AWS. And then we can talk more about just AWS and its history of hypervisors and what it does. So when you’re spinning up an EC2, what are the options there for hypervisor? And how have they been addressing performance? So they’ve actually been going down the route of making custom ASICs to deal with some of this stuff and to make it faster. So we can talk about that as well.

Jon: Cool. I can’t wait. Yeah. And already I’m feeling better about this. I’m feeling like I have a better foundation than I did in episode one. So thank you very much, Chris.

Chris: Awesome.

Jon: All right, so I’ll see you next week.

Chris: All right, we’ll see you.

Voiceover: Nobody listens to podcast outros. Why are you still here? Oh, that’s right. It’s the outro song. Come talk to us at mobycast.fm or on Reddit at r/mobycast.
