
A Deeper Look Into How Containers Differ From Virtual Machines

Virtual Machines vs. Containers

When asked to explain the difference between virtual machines (VMs) and containers, CIOs often struggle to provide a clear and adequate response. Whether we’re talking with junior developers or senior engineers (even those who have implemented containers in the past), the response is very much the same.

Conversations often begin by suggesting that containers are similar to VMs, but lighter weight. While this is true, it doesn’t fully answer the question. A better explanation covers why containers are lighter weight, what “lighter weight” actually means in practice, and the other benefits of containerizing your applications.

Does Knowing the Difference Matter?

With so much confusion surrounding the difference between VMs and containers, an obvious question arises: does knowing the difference really matter? And, like most technical questions, the answer is: it depends.

It’s very possible that decision makers are comfortable transitioning legacy applications to new technology based on hype and industry attention, but in our experience, that’s the exception, not the rule.

Transitioning to Docker requires a fundamental change to how you build software. This transition comes with a learning curve and plenty of early frustration. It would be irresponsible to make a technology decision without understanding why a temporary dip in development velocity is worth the investment. Furthermore, if you’re selling this decision upstream to executives and a board of directors, “lighter weight” can’t be your headline argument.

The truth is, if you don’t have a fundamental understanding of the differences between containers and virtual machines, you can’t take full advantage of the many features that make products like Docker a significantly better solution. You need to ask yourself: What gains are we getting? What hurdles are we overcoming? How well do they apply to our business?

So let’s just agree it’s worth exploring a bit.

A (Somewhat Simple) Definition of Virtual Machines

A virtual machine carves (partitions) a single physical computer into what appears to be multiple computers. This partitioning is called “virtualization,” and its job is to take a single computer and run several virtual computers within it.

For example, a physical machine’s 16GB of RAM could appear to be four 4GB allotments shared equally across four environments (three of which are virtualized). And it isn’t just memory: all of the machine’s hardware is virtualized, everything from the processor to the memory, disk, and network.

When you instantiate a new virtual machine, it depends on the system’s hypervisor. An “operating system of operating systems,” the hypervisor regulates access to the computer’s resources, effectively allowing a VM to virtualize and make use of system resources without conflicting with the memory, registers, or I/O of other VMs or the computer itself.

When you create a virtual machine, the following happens:

  1. The hypervisor carves out processing and memory space for the VM’s exclusive use.
  2. A configuration file determines what the VM needs.
  3. A complete operating system is installed onto the VM.
  4. The VM’s operating system communicates with the hypervisor to perform tasks.
  5. The hypervisor ensures each VM has access to necessary resources, according to the rules that were defined for it in step 2.
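
None of this is Docker-specific, but the shape of the flow is worth internalizing. Below is a minimal sketch of those five steps in Go. To be clear, `Hypervisor`, `VMConfig`, and every method here are hypothetical names used purely for illustration, not a real virtualization API; KVM, ESXi, and Hyper-V each expose very different interfaces.

```go
package main

// Step 2: a configuration file determines what the VM needs.
// All of these types and methods are hypothetical.
type VMConfig struct {
	Name       string
	VCPUs      int
	MemoryGiB  int
	DiskPath   string // virtual disk image
	InstallISO string // OS installer image (step 3)
}

type Hypervisor struct{}

// CreateVM mirrors the numbered steps above.
func (h *Hypervisor) CreateVM(cfg VMConfig) {
	h.reserveResources(cfg)     // step 1: carve out CPU and memory for exclusive use
	h.installOS(cfg)            // step 3: a complete OS goes onto the virtual disk
	h.attachVirtualDevices(cfg) // step 4: the guest OS reaches hardware only via the hypervisor
	h.enforceLimits(cfg)        // step 5: grant access according to the rules in cfg
}

func (h *Hypervisor) reserveResources(VMConfig)     {}
func (h *Hypervisor) installOS(VMConfig)            {}
func (h *Hypervisor) attachVirtualDevices(VMConfig) {}
func (h *Hypervisor) enforceLimits(VMConfig)        {}

func main() {
	(&Hypervisor{}).CreateVM(VMConfig{
		Name: "web-1", VCPUs: 2, MemoryGiB: 4,
		DiskPath: "/vms/web-1.img", InstallISO: "/isos/ubuntu.iso",
	})
}
```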

A Look at Computers From a Lower Level Perspective

The base operating system — the software that’s running after your computer finishes booting — is loaded from disk and communicates with the hardware on the machine. It communicates with the CPU, keyboard, monitor, memory, and disk via architecture-specific machine code and hardware-specific device drivers.

For example, this machine code knows how an Intel chip or an ARM chip expects to receive instructions. The device drivers understand the hardware all the way down to the metal, where streams of bits travel back and forth between the CPU and the hardware. Drivers are built to the specifications of the hardware, and there are a lot of them in modern machines.

Just as the main operating system (OS) of a machine — the one that loads when it boots — talks to all the hardware through this architecture-specific machine code and these hardware-specific drivers, so do virtual machines. They are in almost every way the same as the main operating system. The difference is that the main OS can see anything and everything about the physical hardware, while the OSs of the virtual machines can only see the parts of the hardware to which the hypervisor has granted access.

Virtual machines create huge advantages because a single physical machine can be partitioned to do multiple different things, and each VM’s resources are completely independent of the other VMs’.

Virtual machines are a step in the right direction for maximizing hardware utilization and, as a result, provide better cost efficiency, faster implementation, and better overall performance than spinning up multiple physical servers.

Why?

For example, if a single server costs $10K and runs at only 40 percent utilization for the work it was intended to do, you can double the value of your infrastructure investment by adding a VM to it — not to mention saving on rack space, cabling, and so on. Servers sitting at 40 percent utilization used to be common because running two server applications on one machine was considered dangerous, for fear they’d interfere with one another. Carving the machine up into virtual machines makes operations leaders comfortable with dual-purposing the machine.
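
To make that back-of-the-napkin math explicit, here is a trivial sketch. The $10K price and the utilization figures are just the illustration above, not benchmarks:

```go
package main

import "fmt"

func main() {
	const serverCost = 10_000.0 // illustrative price per physical server (USD)

	// One workload leaves the machine at 40% utilization.
	fmt.Printf("One workload:  $%.0f per point of utilization\n", serverCost/40)

	// A second, similar workload in its own VM pushes it to 80%,
	// halving the cost per unit of useful work: the same hardware
	// investment now delivers twice the value.
	fmt.Printf("Two workloads: $%.0f per point of utilization\n", serverCost/80)
}
```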

An Evolution into Containers

The main reason it’s tough to craft a simple narrative explaining the difference between VMs and containers is that they behave very similarly. In fact, they are both technically “virtual machines.” Both allow you to run multiple instances of “something” virtually on a single piece of hardware. The difference is the level at which the virtualization takes place.

VMs virtualize the entire computer, while containers go up a level and virtualize everything within the operating system.

For example, an OS might have 4GB of memory to work with, and it can carve that into hundreds of small sandboxes, each logically separated from the others and guaranteed not to conflict. This sandboxing is called OS-level virtualization. Often, OS-level virtualization is used simply to isolate applications from one another — for example, making sure that MS Word doesn’t step on memory that belongs to Google Chrome.

OS-level virtualization can also be used to create a kind of virtual machine called a “container.” Containers depend on their host OS to do all the communication and interaction with the physical machine.

Let’s look at this more practically:

Think of a container as just another program — like MS Word or Chrome — whose job is to behave like a whole computer. But containers don’t have drivers that talk directly to hardware. They simply assume the host operating system will take care of that part — and it does.
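
You can see this on a stock Linux box with just a few lines. The sketch below, in the spirit of the classic “container from scratch” demos, starts a shell in its own hostname and process-ID namespaces: OS-level virtualization with no hypervisor involved. It is Linux-only and needs root, and real runtimes like Docker layer on far more (mount namespaces, cgroups, image management).

```go
// A minimal taste of OS-level virtualization: launch /bin/sh so that it
// believes it is PID 1 on its own little machine. The shell is still just
// an ordinary process; the host kernel handles all hardware as usual.
package main

import (
	"os"
	"os/exec"
	"syscall"
)

func main() {
	cmd := exec.Command("/bin/sh")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		// CLONE_NEWUTS: the child gets its own hostname.
		// CLONE_NEWPID: the child sees itself as PID 1.
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID,
	}
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```

Inside that shell, `echo $$` prints 1 and changing the hostname leaves the host untouched, yet `ps` on the host still shows the shell as a plain process. That is exactly the point: the “machine” is virtual, but the program is ordinary.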

Container Benefits

Sandboxing

One of the modern features of operating systems that makes containers particularly robust and secure is the idea of sandboxing. Sandboxing is the idea that each program running on a computer should run independently of every other program. It should have its own memory, its own identity when accessing files, and its own place on the machine to store things (like preferences and user information).

Each program is isolated from the other programs. Because of sandboxing, each container can be trusted: it will not share anything with any other container — and therefore can be treated like a completely independent machine.
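
The “its own place on the machine to store things” part can also be sketched in a few lines. Continuing the earlier example, this variant gives the child process a private mount namespace and its own root directory, so nothing it writes can collide with another program’s files. The path /tmp/sandbox-root is hypothetical; you would first populate it with a minimal filesystem, including a shell at bin/sh inside it (Linux-only, run as root):

```go
// Filesystem sandboxing: the child is chrooted into its own directory
// tree and gets a private mount namespace, so its files are isolated
// from every other program on the machine.
package main

import (
	"os"
	"os/exec"
	"syscall"
)

func main() {
	cmd := exec.Command("/bin/sh") // resolved inside the new root
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Chroot:     "/tmp/sandbox-root", // hypothetical prepared root directory
		Cloneflags: syscall.CLONE_NEWNS, // private mount namespace
	}
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```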

Even Better Hardware Utilization and Efficiency (“Lighter Weight”)

Since containers depend on their host operating system to talk to hardware, there’s a good amount of machine code (drivers and the like) that doesn’t need to be copied and loaded when a container is run. This makes containers wildly more efficient to start and stop than VMs: everything a container needs to talk to hardware is already running before the container starts.

They’re also significantly smaller to ship around. A VM image might require several gigabytes of storage, while a comparable container image might be only a few hundred megabytes. It’s not uncommon for containers to be an order of magnitude smaller.

Containers Can Run Within a VM or on Bare Metal

Since a container depends only on its host operating system, it doesn’t matter whether that host operating system is the main operating system of the computer or a VM. Another way of saying this is that containers can run on a VM or on bare metal. When run on bare metal, a container’s work can perform as well as the work of any other program running on bare metal — remember, the container is just a program. There might not be an appreciable difference between the performance of a task inside the container and the performance of the same task run directly on the host operating system.

Basically, the software running inside the container is calling the same core OS APIs it would be calling if it were running on the host operating system. The technical details might warrant a whole new blog entry, but the bottom line is that containers are not necessarily slower than VMs or bare metal.
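
A quick way to convince yourself of that last claim is to time something syscall-heavy and run the same binary both on the host and inside a container on the same machine. The numbers typically land in the same ballpark, because each call goes straight to the host kernel either way. This is an illustration, not a rigorous benchmark:

```go
// Times a syscall-heavy loop. Whether this runs on bare metal or inside
// a container, every os.Getpid() call is serviced directly by the host
// kernel, with no hypervisor in the middle.
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	const iterations = 1_000_000
	start := time.Now()
	for i := 0; i < iterations; i++ {
		os.Getpid() // a real kernel round-trip on each iteration
	}
	elapsed := time.Since(start)
	fmt.Printf("%d getpid calls in %v (%.0f ns/call)\n",
		iterations, elapsed, float64(elapsed.Nanoseconds())/iterations)
}
```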
