07. How to Create Docker Containers (Part 2)
In Part 2 of a technical series about the creation of containers, Chris Hickman and Jon Christensen of Kelsus and Rich Staats from Secret Stache dive deep into Docker networking, including how containers talk to each other and best practices for setting them up.
Some of the highlights of the show include:
- The Docker daemon acts as a network proxy to the host machine. A container’s inbound & outbound network traffic is proxied through Docker.
- Outbound network traffic typically just works automatically: the container makes a DNS request, and Docker’s networking allows that outbound connection.
- Because multiple containers can run on the same host machine, inbound requests may require port mapping to avoid collisions. For example, you can’t have 2 containers both listening on port 80 on the same host.
- The Docker daemon acts like an internet gateway on the host machine. It provides an isolated network space for containers that is independent of the host machine’s network. Docker sets up its own subnets & routing tables for containers, and has its own internal DNS that allows different containers on the same host to address each other. For containers on different host machines, you’ll rely on public DNS.
- Docker creates routing tables to make individual containers accessible from the internet, if needed. The host machine network doesn’t know about Docker networking. The Docker daemon brokers traffic to/from individual containers on the host.
- AWS does have technology that makes individual containers directly addressable with virtual network adapters. This is beyond the scope of this episode, though.
- On a single development machine, how can you get multiple containers (e.g. microservices) to communicate with each other? For example, you have a RESTful API in one container, and a database in another container. Use Docker Compose to define the two containers, each with its own name. Docker Compose creates DNS aliases using those container names, and places the containers on the same network. The two containers can now talk to each other (on the same host) using those DNS names.
- If you have multiple containers, you can also tell Docker Compose to use a specific named network instead of creating a new network for each container. This works well if you have a handful of containers that you need to run on a single developer machine.
- With more than a handful of microservices, it can be difficult to manage them all on one development machine. In this case it may be best to have dedicated environments (e.g. in the cloud) that run the latest version of each microservice, rather than trying to run all of them on your own machine.
- For a company just starting with Docker and transitioning a monolithic application to Docker, you probably don’t need to worry about this complexity initially. If you re-architect your system to have many microservices, then you’ll need to give it some thought.
- In cloud deployments, such as AWS ECS, a public DNS entry typically refers to a load balancer, and the load balancer knows which hosts are running the container with the requested service. In this environment, your containers are running on a cluster of hosts, and you don’t directly control where your containers get deployed – it’s managed by your orchestration service (e.g. ECS or Kubernetes).
- Docker Swarm allows you to create a cluster of nodes, but this is above & beyond Docker itself. ECS & Kubernetes have their own mechanisms for networking across many host machines.
Links and Resources:
Docker
Docker Compose
Docker Swarm
AWS ECS
Kubernetes
Kelsus
Secret Stache Media
In Episode 7 of Mobycast, we continue with part two of a technical series around the creation of Containers. Specifically, Jon and Chris get super deep into Docker networking. Welcome to Mobycast, a weekly conversation about Containerization, Docker, and modern software deployment. Let’s jump right in.
Jon: Welcome Chris and Rich. Good to see you here for Mobycast number 7. Last time, we got a little technical and we talked about what you do inside a Container when you’re making a Dockerfile: what you do to make it useful, and the different ways you might configure it for development and for production. This time, I think we’re going to continue along that path and stay a little technical.
But before we get started, let’s just talk about what we’ve been up to for, really we haven’t talked for a couple of weeks, but what have you been up to this week, Chris?
Chris: Celebrating my son’s birthday. He’s 17, which is kind of a frightful number. I can’t believe that I’m a parent of a 17-year-old. That’s going on. Also, Jon and I got together. Jon was out here in Seattle to speak at an event. We got to spend some time together, which was really nice. I enjoyed that, hanging out with Jon. And just definitely trying to keep out of trouble. Very, very busy at work, launching lots of new software and features for an exciting client.
Jon: It was fun to see you. I really liked seeing your home there and spending some time going out to good restaurants in Seattle. I almost hesitated to mention this because anybody listening might want us to start talking about something entirely different.
The talk that I gave was about Blockchain and IoT. It was so fun because I snuck onto this panel as somebody who knows about Blockchain and IoT because I had done some panels before for this SIIA Conference. I did not tell them beforehand that I don’t really think that there’s a good use case for Blockchain and IoT. Then, we had a lively conversation, where I was the naysayer. That was fun. But yes, this is not a Blockchain podcast and it’s not going to be.
Rich, what have you been up to this week?
Rich: Last couple of weeks, I’ve been struggling with trying to build business automations. We started an outbound sales campaign in October or November of 2017. Things are starting to pick up, which is a good problem to have. There are just so many little tiny steps that need to be taken when I onboard a new agency or a new project, and so I’ve been trying to figure out how I can use software. In this case specifically, a tool called Zapier, which provides a GUI for integrating applications.
A few things happen when an agency is onboarded in Pipedrive. It fires off a folder in Google Drive, copies over an SOW template and a contract template that fill in all the information like a mail merge, and adds the client into QuickBooks. All of those little tiny steps that by themselves only take a couple of minutes, but you always forget to do them, or you name them differently and then they’re hard to find. I’ve been painfully trying to go through all those manual processes and build them into a system that just does it for me so I don’t have to hire an admin.
Jon: That sounds great. Rich, you’ve always been impressive to me in terms of how hard you work to automate things in your business and how many tools you try out and evaluate to see which ones actually work to get the job done. You’re a constant source of information for me when it comes to good business automation tools.
Rich: Sometimes, it’s good to not know how to do it. You have to find the tool to do it, or you have to really determine if it’s worth doing. I think one of the problems with building software is that if you can do it, a lot of times, you just do it without taking the time to see if you should do it. Sometimes, not having that engineering background has actually been a benefit for me.
Jon: Makes sense. Getting into today, last week, as we were talking about Docker, Container and setting them up, the two things that we talked about were making your life easier by punching a hole and do this Container file system through something called […]. Those are a little dangerous outside of development so you want to have […] when you’re developing. You definitely don’t want to have them when you’re deploying, at least not for the purposes of hot updates. There maybe other reasons you want […] for staging your production, but not for the purposes of doing hot updates via for your development code.
The other thing we talked about was: what’s the right balance for your base image? What things should be included in it? Should you build your own base image and store it somewhere? Can you base a base image on something else like GitHub? […] can we talk about all that? I think we came to a pretty good place when it comes to how you set up an individual Container. But a lot of applications, in fact, almost any sophisticated system, will involve more than one application, and those applications need to talk to each other.
Recently, everybody loves calling that microservices, especially if each application only has a small, little thing that it’s responsible for doing. There are some things about having Containers talk to each other that are worth exploring. That’s what we’ll talk about today. How do you get Containers to talk to each other and what kinds of set ups are best practices for doing that?
The three things we’ll talk about today are Container networking, dealing with timeouts when you have a chain of microservices talking to each other, and then just dealing with authentication across different Containers. Let’s start with networking.
In order to talk about networking across multiple Containers, I think we first need to just talk a little bit about how networking works for a single Container. We’ve never talked about that from a technical perspective. Maybe we can just dive right in with that. Chris, do you think you can give us an explanation of how Container networking works?
Chris: Let’s see if we can keep it simple because it’s definitely a big topic to talk about in a comprehensive way. With a single Container, I guess we can think of networking as, at the end of the day, being pretty straightforward and simple: it needs to make connections outbound to known things, and it’s reaching out to the […]. Likewise, it’s able to receive requests as well that are forwarded into it. All of this is proxied through the Docker networking services, since the Containers, like we’ve talked about before, […] a Container is a virtual environment abstracting all the hardware, network adapters, and whatnot.
Jon: Chris, I’ll stop you there. I just wanted to summarize what you said so far by saying the Container is like any other computer in a network. It can talk outbound, it can receive inbound communications, but even though it’s like an independent machine, it depends on its host operating system to be able to achieve that. That’s where things get a little tricky. That’s what I think you’re about to talk about.
Chris: Yeah. For the single Container, I guess the important thing there is just knowing that it acts like anything else that you would have. You really don’t have to do anything special, other than you may need to do some setup to tell Docker itself what connections it’s listening to versus the outside. That’s port mapping.
Let’s say you want to set up a web server and it’s being hosted inside of a Docker container. You want your browser on your host computer to go access that, and you may want to access it at port 80, or maybe it’s port 8080, but inside the Container, maybe the code itself is designed to listen on port 80. You can set up that mapping.
For inbound networking, there’s definitely some setup there because you’re doing that translation, versus the outbound stuff, which usually just works out of the box because Docker’s handling that […] for you. With a single Container, it’s pretty straightforward.
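[For illustration, here’s a minimal sketch of the kind of mapping Chris describes. The nginx image and port numbers are just example choices, not from the episode.]

```bash
# Run a web server that listens on port 80 inside the container,
# but publish it on port 8080 of the host.
docker run -d --name web -p 8080:80 nginx

# Docker forwards traffic from the host's port 8080 to the
# container's port 80.
curl http://localhost:8080/
```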
Jon: I guess I’m still a little confused. If a Container is like any computer on any network, and say that web server is running on that Container, I don’t really understand why I would need to map ports. Couldn’t I just say, “Here’s the IP address of the Container, I want to hit it at port 80”? I could be on the host operating system or I could be on some other computer somewhere else, and can’t I just talk to that Container as though it is just out there, available as any old computer on the network?
Chris: Where the difference is, is that Containers themselves are supposed to be this isolated thing. They can’t see anything else in the world, but at the end of the day, they’re running on the same machine. If you had a bunch of Containers that were all just independent applications and they all were set up to say, “Oh, I’m going to listen on port 80,” and if that ended up being the actual port that they were listening to on the host itself, then there would be collisions. It’s not going to work.
Jon: […] Do Containers not have their own independent IP address?
Chris: They do inside their Container and on the Docker network.
Jon: Okay. They have a private IP address. […]?
Chris: There are networks that are being spun up by Docker, and so the Containers have their own internal networking inside their Container space. To the Containers themselves, they look like they’re a computer on a network, so they do have an IP address that is a different IP address than the actual host IP address.
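[As a quick sketch of what that looks like in practice; the container name and image are arbitrary examples.]

```bash
# Start a container, then ask Docker which IP it was assigned on
# Docker's internal network (not the host's network).
docker run -d --name web nginx
docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' web
# On the default bridge network this typically prints something
# like 172.17.0.2.
```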
Jon: Let me ask you another question. I’m just trying to get my head around this. A container cannot be on the same subnet as its host machine?
Chris: No, they will not be. Again, Docker itself is instantiating its own internal networks for these containers to run on.
Jon: A host machine may also be on an internal network and it might be on like 192.168.1 maybe its IP address is 5. Then, the container says there’s three of them on the host machine, each one of those might be like 10.0.0.1, 10.0.0.2, 10.0.0.3, the host machine almost acts like a, what’s the word for that when a machine has two IP addresses, one to talk to in internal network and one to talk to the outside network, a gateway.
A host machine is almost like a gateway to the Docker network. It’s got the ability to talk to the Docker network containers, and it’s got the ability to talk to whatever network it’s also sitting on, right?
Chris: Right. The gateway is basically Docker itself, the Docker daemon, that’s providing that gateway, that proxy, if you will.
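[You can see the subnet and gateway Docker sets up for itself; the output values below are typical examples and will vary by machine.]

```bash
# Inspect Docker's default bridge network. Docker creates this
# subnet and gateway itself, independent of the host's network.
docker network inspect bridge --format '{{json .IPAM.Config}}'
# Typically prints something like:
# [{"Subnet":"172.17.0.0/16","Gateway":"172.17.0.1"}]
```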
Jon: Okay, cool. Alright, I’m getting my head around this. Makes sense. Rich, that was just a little bit of a deep dive on some networking stuff. Working in your area, I know you do a lot of DNS work but I’m not sure you’ve had to do too much with subnets and things like that. Do you have any questions already or were you able to follow any of that?
Rich: I’m able to follow it, but not much more than that. I think I’m a little bit underwater, so I can’t even ask you a question right now.
Jon: No, that’s okay. I think for those people listening, we can’t really dive all the way in and explain how networking works from the ground up, but I think the thing that comes out of this is that you probably will be a little confused and a little underwater with Docker and using it if you don’t have a fairly solid base understanding of how subnets work and how IP in general works. That would be your take-home homework: to go read a little bit about that.
Chris, now, you had said, “We need to do port mapping.” That’s where I lost you. I was like, “Why do we need to do port mapping? Can’t I just send messages directly to these Docker containers?” You were like, “Yes, you can.” But now I understand, since they’re hidden in their own subnet. A thing for you, Rich: machines in one private subnet cannot talk to machines inside another private subnet.
If my IP address is 192.168.1.5, I cannot see a machine whose IP address is 10.0.0.5. I just can’t see it. It’s not available to me. I can’t send a message to it. I can send a message to another machine in my own subnet, and then, if that machine has access to that private subnet, it can forward my message along to the machines in that private subnet.
You might need to do some port mapping because it might just spray... actually, now I’m a little confused again. A message might end up having to go to any and all of the containers inside that subnet, is that right?
Chris: I think there are two distinct topics here. One is the port mapping that we talked about, and the other one is basically subnets and routing. When we talk about public versus private subnets, it’s usually in the context of cloud networking. Basically, public subnets are subnets that are reachable via the internet. They have an internet gateway, or they have a route in a routing table that allows them to receive traffic from the outside internet as well as to send traffic out to the outside internet.
Jon: Right.
Chris: It’s really just there’s a route there that allows for that transmission to happen.
Jon: Right. Good point. I just want to jump in and say that the mistake I was making in the public/private subnet terminology was that I was saying private subnets mean a network of computers that have private IP addresses. But Chris is much more correct in saying that pretty much all subnets have private IP addresses; it’s just a matter of whether those subnets are reachable from the outside world via a gateway that has a public IP address. Continue from there, Chris.
Chris: Private subnets are basically subnets where there’s not a route in the routing table that allows them to directly connect to the internet. Instead, usually, they’ll have a route to a NAT gateway, which is a network address translation server that does the proxying on their behalf. This allows those machines on a private subnet to talk to the internet, but not directly. They’re going through a […]. It’s definitely a security mechanism.
Docker itself, when you’re instantiating your containers, is creating its own subnets, if you will. It actually has DNS built into it as well. It has routing tables that it’s setting up accordingly. This stuff happens as part of that whole Docker networking service. The analogy of private and public subnets definitely carries over into the whole Docker networking ecosystem. I guess the important thing there is that in order for anything to talk to anything, there needs to be a route to it in a routing table.
Port mapping, where that comes in is, again, it’s all a matter of whether you have multiple apps running inside containers that are all doing the standard thing of listening on well-known ports. Again, most web servers are going to listen on port 80 for regular HTTP traffic and port 443 for TLS-encrypted traffic. But if you want to run, say, four different web servers on your machine, something has to give, right? You can’t hit all of them on port 80 outside the container because there would be these collisions. Port mapping comes into play out of necessity, where you have to change that.
Jon: That’s a piece that I was […]. You said there would be these collisions. It’s because outside that subnet, that little network of containers, you can’t see them individually, so you can’t say this request is for container A and this request is for container B. Because you can’t see their individual IP addresses, you cannot address a message to them individually, right? That’s what I’m trying to get my head around. They’re all hidden. They cannot be seen individually or addressed individually from outside of the network that they’re in.
Chris: Kind of. It’s definitely a bit more nuanced than that. This is where ports really do come into play. If you’re just running all this stuff locally on your own machine, you spin up a container and, again, it’s a web server. You now want to hit it from your browser. You’re launching Chrome on your host machine. What address do you go to? With the latest versions of Docker, you can just go to localhost, the loopback address; that’s what Docker is set up to bind to. You can just say, “Hey, go hit localhost.” If you don’t give it a port, your web browser is going to assume port 80.
If you launched a container and didn’t do anything with port mapping, you’re basically just saying, “Hey, can I use the defaults? I’m just going to let this listen on port 80.” Then that’s all going to work for you. You don’t have to do anything. Docker knows. It just says, “Hey, I’ve got these containers. Here are the ports they’re bound to, so that when I get these outside requests, I’m going to route them accordingly.”
That’s where the port mapping really comes into play, because if you now have another container that you want to spin up, web server B, you just need to expose a different port for it. That way, when you request it from your browser, you’re saying, “I don’t want to go to just localhost, I want to go to localhost: whatever port it’s listening on.” That’s the information that then allows Docker to say, “Oh, I’m going to route this request to that thing.”
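[A minimal sketch of the collision and the port mapping that avoids it; container names, image, and ports are illustrative only.]

```bash
# Both containers listen on port 80 internally; publishing them on
# different host ports is what avoids the collision.
docker run -d --name web-a -p 8080:80 nginx
docker run -d --name web-b -p 8081:80 nginx

curl http://localhost:8080/   # Docker routes this to web-a's port 80
curl http://localhost:8081/   # Docker routes this to web-b's port 80
```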
Jon: I think I finally had my aha moment, because here I was thinking that the containers have their own IP addresses, and you just said you would go to localhost. But that’s the IP address of the host machine, so why am I not going to some other IP address like 10.0.0.4 to hit my container?
Chris: This is a fun talk today because this is super technical.
Jon: I know. Yeah, it is.
Chris: We’re going super deep because there are definitely a lot of layers going on here. You have the networking at your host computer level. It can’t see the networking that’s going on inside these container spaces, these private subnet spaces and networks that Docker’s creating. Your host machine doesn’t really know anything about them. You could probably manually change that and set all that up, but it would be a really big pain, and it’s very dynamic; it’s changing all the time.
Jon: I thought that the host machine was ending up as the NAT gateway between the Docker network and its own network and it sounds like it’s not unless you actually set it up to do that.
Chris: Correct. Basically, you’re talking to the Docker daemon. The Docker daemon is doing that brokering for you. Again, it’s going off things like okay, what address am I binding to on the host machine, and what port am I binding to, and what protocol am I using?
Jon: This is pretty fundamental, because we had this big conversation about what’s the difference between VMs and containers. We didn’t really get into networking that much, but we said, “Well, one of the big things about containers is that they depend on their host operating system for everything that they do: for talking to hardware, talking to the network, talking to this and that, and talking to the CPU itself.”
You’re really seeing that, because from a networking point of view, you talk to the containers as though they are localhost. […] You’re saying, “Localhost, do this HTTP request for me.” You’re not saying, “Container IP address, service this HTTP request for me.” That really is a pretty big dependence on the host operating system. It’s really saying, “Yeah, these things are computers. They’re their own machine, but their networking is really fully the network of the host, in a way.” That’s the aha I’m having here. It’s like, “Oh my goodness, we’re not hitting these things as though they’re individual nodes on a network, but rather, we’re just depending on the network of the host operating system.”
Chris: Absolutely. You are always going through that abstraction layer. It comes back again to the Docker daemon itself, the Docker services. They’re providing all of this different subsystem functionality, whether it be networking or storage or input/output. Docker is providing the actual Docker service itself as a proxy between its containers and its host, whether its host is […] or a VM itself.
When you’re outside the container, you’re going through that proxy. You’re not talking directly to the containers themselves. Again, we could talk for hours on this because there are technologies happening where you can actually directly connect to containers. Basically, they can have their own virtual network adapters that are individually addressable, and so folks like Amazon and AWS are doing this to light up some really great security features, performance, and whatnot.
For right now, you can think of it like this: there’s a virtual network interface on the host that all the communication goes through. That’s provided by the Docker services itself, and that then connects to the proper container.
Jon: Now, I completely understand this collision thing and this need for port mapping. I think I can explain it in a way that can help Rich out a little bit. Rich, if you’re at a computer and you wanted to run a web server on it, you would run that web server, it would start up, and it would be listening on port 80. Then let’s say you want to run another web server. Say the first one you ran is NGINX and the next one you want to run is Microsoft IIS. If you start up IIS, IIS will complain, “Hey, there’s somebody else already on port 80. I can’t run here. This is already taken. I’m not even going to start.” That’s called a collision, and it doesn’t work. And so you would have to run IIS […] on some other port.
It sounds to me like containers have the exact same problem. If you have two containers and they’re both going to run NGINX, you got to make sure they’re running on different ports because they’re using the port of the host operating system.
Chris: […] is that you only need to worry about that if you want them to be hit from the outside world, from the host. Sometimes, you could very well spin up two containers that both listen on port 80, and they may not expose.
Jon: That’s so interesting.
Chris: It just depends. They may not expose themselves to the outside world. You can just have one container talk to another one via the Docker networking.
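[A minimal sketch of that idea; the network and container names are arbitrary, and note there’s no -p flag, so nothing is published to the host.]

```bash
# Containers on the same user-defined network can reach each other
# by name via Docker's embedded DNS, with no ports published.
docker network create appnet
docker run -d --name web --network appnet nginx

# A second container on the same network can hit "web" on port 80,
# even though the host itself can't.
docker run --rm --network appnet curlimages/curl -s http://web/
```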
Jon: Now, I think we understand this and why you need to have port mapping so that you can avoid these collisions. Is there anything else we need to know about how networking works for a single container in order to be able to start talking about, “Okay, now that we have multiple containers that have to talk to each other, what are we going to do?” Is there anything else we need to understand, or can we move on to the next […] of this conversation?
Chris: I think we’ve gone deep enough. We’ve definitely talked about some pretty hairy topics. I guess this is a good segue, because we’ve mostly been talking about the networking ramifications […] to containers from outside of them, going from the host to the container space. That’s a good segue into now, let’s focus on the problem of how you get these containers to talk to each other, because, again, this is one of the core principles of Docker: these things are isolated. They’re running in their own bubbles. How do you actually get these things to talk to each other?
Jon: Alright. It sounds like there may be more than one approach to this. What are our approaches? What can we do?
Chris: Sure. To set the framework here: again, if you just have one container, there are no other containers to talk to, and it’s pretty simple. You’re just making requests to the outside world via DNS, […]. It’s going to google.com or something like that. It’s doing a DNS lookup to find out what the IP address is and then going and making that connection.
What happens when you now have two containers? A good example might be: you have a web server, or let’s say you have a RESTful API server that has a database associated with it. It’s using […] as its storage engine for data, and so, as requests come in for doing creates or updates on its data, it needs to go store them in […]. It needs to make a call to that. Let’s assume that’s running in a separate container.
How do these things talk to each other? How do you know how to refer to them? There are a couple of different techniques there. One of the important things to realize here is that Docker itself has its own built-in DNS system, and it does a lot for you right out of the box. In one of the common use case scenarios, let’s assume we’re building a microservice. It’s an API […] and it has a container for the API server as well as a container for the database. You’re going to use something like Docker Compose to define these two containers in a Docker Compose file. You’re going to give these containers names, and then you can use Docker Compose to instantiate these two containers.
When you do that, Docker Compose does some really nice things for you. That is, it sets up its internal DNS to create aliases for those containers by their container names. If I’ve named my one container API and the other one DB, then when I’m running inside a container, there are now DNS entries for them. I can do a DNS lookup on DB, and that will actually resolve to the IP address assigned to that DB container. Likewise, the same thing for API.
These two containers can now talk to each other via […] Docker DNS system. It added those entries and made them available. Also, it created a subnet, a network, for these things and put both of those containers on that same network. That’s the important thing.
Jon: That’s so interesting. Once you’re inside the container and inside this world of containers, the containers look more like normal networking to you than when you’re outside of it and on the host operating system and trying to talk to them. All of a sudden, the world starts to look like what we’re used to with just computers on a network again.
Chris: That’s exactly what it is. It ends up being pretty nice and pretty straightforward when you have a single repo that you’re working on and you have a single Docker Compose file. You don’t really have to do anything too terribly special. It’s pretty simple. There’s the common network for these things to be running on, and the aliases get set up via Docker’s DNS, so the containers can talk to each other. That’s the simple, straightforward case.
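[For illustration, a minimal sketch of the kind of Compose file Chris describes. The service names, images, and ports are hypothetical; the episode doesn’t name a specific database.]

```yaml
# docker-compose.yml
version: "3.8"
services:
  api:
    build: .
    ports:
      - "8080:80"       # published so a browser on the host can reach the API
    environment:
      DB_HOST: db       # inside the Compose network, "db" resolves via Docker's DNS
  db:
    image: postgres:15  # placeholder storage engine
    # no ports published; only the api container needs to reach it
```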
The more challenging case is what happens when we’ve really adopted the microservices pattern, and now we have multiple microservices, each one hosted in its own repo, each with its own database, and maybe you have 5 of these things, or maybe even 10. Each one of these may have two or three containers that it instantiates inside its own Docker Compose file. Now it gets much more interesting and challenging, because if I don’t do anything different, if I do a docker-compose up on microservice A and then a docker-compose up on microservice B, by default, those will be created on separate networks, and there’s really no way for those things to easily talk to each other, to discover each other, or to actually make network calls to each other. We have to do something different.
There are various techniques to make that work. One of the most straightforward is to tell Docker Compose, instead of instantiating a new network when you start it up, to use an existing one, a specified one. Now you have two different Docker Compose files. In those Docker Compose files, you’ll tell each one, “Don’t create a new network. Instead, go use this network,” some specifically named network that we’re going to create.
What ends up happening is that these containers get spun up on that common network, and now they can talk to each other, and we can do some further steps inside there to say, “Here are the DNS names that we want you to create,” so we can create these aliases for each one of these containers. Now it’s no different than DNS. You’re just saying, “Microservice A, its name is going to be, like I just said, microservice A.” Then, when microservice B spins up and it needs to talk to microservice A, it just knows. It can just refer to it as microservice A, and all of that’s just going to work, and vice versa.
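[A minimal sketch of that technique, assuming the shared network is created ahead of time; all names here are hypothetical.]

```bash
# One-time setup: create the network every Compose project will join.
docker network create shared
```

```yaml
# In each microservice's docker-compose.yml, point the default network
# at the pre-created one instead of letting Compose create a new,
# isolated network per project.
version: "3.8"
services:
  microservice-a:
    build: .
networks:
  default:
    name: shared
    external: true
```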
You can extend that pattern for as many different services as you want that need to talk to each other. That works well when you have a couple of these separate microservices. But if you’ve really adopted the microservices pattern and you have 5, 10, who knows, maybe even more of these things, then it gets a little complicated running locally. You really have like 10 different Docker Compose files with 2 to 3 containers each.
Let’s just say you’re working on microservice A and you added an endpoint. Do you really have to do a docker-compose up on 10 different things and run all these containers just to get it to work? That becomes an issue. Then you start asking yourself, maybe I need to have a dedicated environment where the latest versions of these things can run, that I can dev against, that I can test against.
That’s where the cloud comes into play. It makes sense if you’ve got this full-on microservices architecture with many dependencies; then you have to balance that out. Does it make sense to actually instantiate them all locally on a machine, or do I just instantiate the one that I’m working on and trust that those dependencies are up to date and running in this well-known location?
Jon: Is that what we do at […]? We have a dev environment on AWS and we just plan on that being there for us?
Chris: We’re not quite there yet because the need is not quite there. We don’t have enough microservices to reach that pain point where we have that problem. We’re in the architecture design phase, but this will become much more of an issue. We’ve been going through and looking at the various services that we have. There is a bit of duplication in them. They are a bit more independent, but there is that duplication, and that’s causing our velocity to slow down. So we’ve been going through and re-architecting to say, “Okay, what are the common core services that we have here? Let’s break that stuff out, put it into independent microservice units, and then build on top of that.”
We’re moving a little bit more to this hub-and-spoke approach, and that will increase some of the complexity and start forcing us to change the way we do things. But for right now, many of our services are pretty much independent.
Jon: That’s interesting because that gives a timeline of how things work. We’ve been at this for about a year now, moving from platform-as-a-service, monolithic applications to using Docker and starting to break out some microservices. We’ve been going at it for about a year, and just now some of this stuff has started to become relevant. For companies that are starting to make this journey, this won’t be relevant to them right away. This sounds sophisticated and difficult, but you’ll have a lot of opportunity to learn before you have to tackle this stuff.
Chris: Indeed. […].
Jon: One thing that sticks out to me that’s pretty wild, at the risk of going down the rabbit hole even further, is that the containers might even be running on different machines, and those machines might even be running on different networks or subnets that can’t even see each other. But somehow, it seems like Docker is able to just overcome that by creating a system where, if the host machines can talk to each other, then it can figure out a way to route to the containers within the host machines. Even if one container way over here on subnet A needs to talk to some other container way over there on subnet B. Is that right?
Chris: Yeah. For the most part, you are going through the Docker daemon as the proxy that’s running on the host. If you’re on one host and you need to talk to a container running on another host, you’re basically going through that other host.
Jon: But from the point of view of the container, it’s just […] a DNS name. It doesn’t even realize that it’s going through another host. You said microservice A is the DNS name for one container, and microservice B might be the DNS name for another container. Those are created by the Docker networking system. As long as host A, which has microservice A, can talk to host B, which has microservice B, it doesn’t really matter anything about the network architecture between host A and host B. Microservice A can talk to microservice B just with the DNS name. That’s what I’m gathering.
Chris: All of that networking has to happen on the same host. That Docker network has to be on the same host for those things to be […]. You can’t spread that network across multiple hosts, because those are two separate Docker daemons now, and they’re not aware of each other.
Jon: That just pops my bubble. That’s like, “Wait a minute. ECS has this cool feature that lets me just add new hosts whenever I start running out of room on my existing hosts. Why can’t all this just talk to each other like one big happy family: all of the containers, and all of the hosts, and my ECS cluster?”
Chris: By the way, the way we typically do that in something like ECS is via DNS entries pointing to load balancers. The load balancers are the ones that manage which hosts have the containers of that service currently running, and then they’re able to address them.
Jon: Okay.
Chris: There are other technologies and ways of making this more sophisticated. Docker itself has Swarm, which allows you to create a cluster of nodes; those are able to see each other and talk to each other. That’s above and beyond Docker itself. It’s something different. Same thing with ECS and Kubernetes. Amazon is also coming out with a different way of looking at networking for Docker, with network interfaces, virtual NICs, that type of thing. This is a very evolving topic. It can go very, very deep. It’s very complicated, and there are many different layers to this cake.
Jon: Okay. But for the purposes of what we just talked about, just to make sure I get it right: if we have an API that’s exposed to the outside world, the outside world is going to make an API call that’s really going to hit that load balancer. The load balancer is going to send it to a particular host. That host might have a microservice A container on it.
If microservice A depends on microservice B, then it might use that DNS name, microservice B, to ask microservice B for something and then ask microservice C for something. But that chain of dependency is all going to happen within the host that was chosen initially by that load balancer.
Chris: This gets a little bit more complicated, just because when you’re running inside something like AWS and you’re using […] like ECS, you don’t have control over where your containers are being placed, and you usually have multiple hosts. In that space, you typically aren’t using Docker networking and Docker’s DNS system for resolution. Instead, you’re actually using […] DNS proper. The resolution is at the host level.
That’s again why just pure DNS and load balancers end up being a way of doing service discovery and address resolution when you’re running inside the cloud, when you’re running across a cluster of machines.
A lot of what we talked about before was basically under the assumption that you have a single host. You have a single node, and you can have multiple networks on that node inside the Docker space, and that’s how those things talk to each other. It is really different when you’re developing and using Docker locally, like on a MacBook or desktop computer, versus running in a production environment, in the cloud, using an orchestrator, where I have a cluster of machines. That changes the game significantly.
Jon: Got it. In your configuration for dev, your URL to talk to microservice B might be configured to be the DNS name of microservice B provided by Docker. But in staging and production, when you’re in the cloud, the URL for microservice B will be whatever the load balancer’s URL is.
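[A hedged sketch of that idea: the code stays the same and only per-environment configuration changes. The variable and host names here are hypothetical.]

```bash
# .env.development - on the shared Docker network, Docker's DNS
# resolves the Compose service name directly.
MICROSERVICE_B_URL=http://microservice-b

# .env.production - in the cloud, the same variable points at the
# load balancer's DNS name instead.
MICROSERVICE_B_URL=https://microservice-b.example.com
```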
Chris: Right. You’re using something like Route 53.
Jon: Okay. Got it. Cool. Super helpful. I think we’ve done networking to death. Is there anything else for you to talk about on networking?
Chris: Did we lose Rich?
Rich: I am still here, physically. I can say pretty confidently that you’re explaining it really well. I think this is one of those episodes that will encourage some of the less experienced listeners to go back and listen to it a few times because I certainly will.
Jon: Right on. There were two other things we wanted to talk about, which were timeout and authentication. I guess we still have time if we feel like timeouts and authentication are small enough topics to fit in a couple of minutes. Otherwise, we can save them for another episode and I can wrap up with what I’m feeling is the conclusion that’s coming to my mind. What do you think?
Chris: I think timeouts is probably a whole episode, because it’s in the context of: hey, I’ve got some platform architecture, I’ve adopted a microservices approach where microservices have dependencies and they call other microservices. What happens when […] microservice A makes a call to B, which makes a call to C? The call from A to B might take 10 seconds. The call from B to C might take 50 seconds, and that causes an ELB 500 timeout for the original caller. How do you deal with that and do error reporting accordingly, like who’s the responsible party, that type of thing? It’s a pretty complicated topic.
Jon: Yup. I agree. We’ll save that for another one. Hopefully, maybe next week, we’ll try to have a little less technical one and come back to that sometime when we’re ready to go technical again. And then authentication feels like it can be part of that same conversation, because that’s about more software-level things. They’re really not so much Docker configuration. They’re more about how you design your software.
Chris: Absolutely. They’re very much microservices-type problems. In this case, how do you secure your endpoints between these microservices? How do you trust them, and what are the techniques you can use when there’s not necessarily a specific user, no user context? It’s just service A wanting to call service B. But you don’t want that to be unauthenticated. It needs to be verified that they should indeed be allowed to make that call.
Jon: Right. Here’s the conclusion that’s coming to my mind from today’s conversation. Remember, a while back, we had the conversation of why you should do this. What’s the value here? One of the big things that came out was, well, it’s sort of like doing software development the hard way. It forces you to learn certain things that you maybe glossed over or didn’t have to really understand to accomplish your job.
Did you notice how we couldn’t even have a high-level conversation about networking? We just had to get deep. I just couldn’t make sense of it without getting into the weeds and having you really explain to me how networking works and how that’s different with Docker. That just happens over and over again.
Now, I’ll walk away from today just more knowledgeable about computer networking in general. Having that happen to the rest of your software development team is just such a great thing.
Chris: Absolutely. Wait till we start talking about storage engines and […] versus XFS or something else. There’s a lot there to cover, and it’s one of those things where, once you start peeling back that onion, it makes things come together so much, and it gives you that core foundational knowledge that just makes you a better engineer, a better developer.
Jon: Right on. Well thanks everyone for joining.
Chris: Thanks guys.
Rich: Yeah. Thank you.
Well dear listener, you made it to the end. We appreciate your time and invite you to continue the conversation with us online. This episode, along with show notes and other valuable resources is available at mobycast.fm/07. If you have any questions or additional insights, we encourage you to leave us a comment there. Thank you and we’ll see you again next week.