
43. The Birth of NoSQL and DynamoDB – Part 5

Jon Christensen and Chris Hickman of Kelsus and Rich Staats of Secret Stache conclude their series on the birth of NoSQL and DynamoDB. They compare Leviathan, the NoSQL database created by Chris’s startup in the late 1990s, to today’s DynamoDB. A lot of things haven’t changed, even though technology has evolved. It’s cyclical. There are patterns and problems that continue to dominate.

Some of the highlights of the show include:

  • Reason for Creation of NoSQL Database: How to scale a database for Internet-scale applications and provide a virtual pool of effectively infinite storage that can be scaled out
  • Main Architecture Components of Leviathan:
    1. API client
    2. Update distributor (UD)
    3. Base server (storage node)
    4. Shepherd (housekeeping management system)
  • Additional core components included smart IP and storage abstraction layer (SAL)
  • Leviathan was written mostly in C-style C++, with Java used only for the Java SDK and API
  • Big difference between DynamoDB and Leviathan is request router and partition metadata system living on the server vs. living on the edge
  • Leviathan was a closed system with an instance for every network or data center; not designed to run as a software as a service, like DynamoDB
  • Leviathan was strongly consistent, unlike DynamoDB’s eventually consistent model
  • Definition and Different Types of Transactions
  • Shepherd was used to identify and address consistency, synchronization, and timing issues
  • Leviathan’s first iteration used relational databases as its storage backend; a later iteration repurposed the technology for file systems

Links and Resources

DynamoDB

Microsoft SQL Server

Oracle DB

AWS IoT Greengrass

Kelsus

Secret Stache Media

Rich: In episode 43 of Mobycast, we conclude our series on the birth of NoSQL and DynamoDB. In particular, we take a deeper look at Leviathan, the NoSQL database created by Chris’ startup in the late 90s and we compare it to DynamoDB today. Welcome to Mobycast, a weekly conversation about containerization, Docker, and modern software deployment. Let’s jump right in.

Jon: All right. Welcome, Chris and Rich. Another episode of Mobycast. Hey, Chris, you know this is the second time we’ve gotten together now after the holidays and I never asked you what you did. Anything interesting over the holidays?

Chris: Yeah. It was really nice and relaxing. Got to steal a few days down on the Oregon Coast with the family.

Jon: Ah, beautiful.

Chris: Very beautiful area and very relaxing.

Jon: Excellent. How about you, Rich? What have you been up to? Did you do anything interesting or fun during the holidays?

Rich: Yeah. I went back to New Jersey and spent time with my extended family. I’m pretty sure I didn’t get out of sweatpants the entire time, which is usually the goal, so I was good.

Jon: Cool. We do Mobycast from our sweatpants as it is, so there we go.

Rich: Speak for yourself. I dress up for it.

Jon: As for my holidays, I stuck around Eagle, Colorado because this is the place to be during the holidays. We’re close to the Eagle Airport. You can see the planes coming in as they go towards the Eagle Airport from our front window, and around the holidays we just see that traffic pick up from a plane an hour to a plane every two minutes. People were streaming in, getting ready to get their Christmas in the mountains. It was a good thing to do. It’s a white Christmas here in the mountains.

Over the last four episodes, we’ve been talking about DynamoDB and we are not done. We got into some real detail of the physical and logical architecture of DynamoDB, I guess really the logical architecture of DynamoDB: how the sharding works, what a storage node consists of, the request router, and the partition metadata system. We didn’t really talk much about auto admin, except for towards the end.

This week, we’re going to keep all that architecture in mind and go back and revisit where this all started, which is with the company that Chris founded called Viathan, several years before DynamoDB was even a twinkle in Werner Vogels’ eye. Tell us maybe, Chris, if you could just help me do a little bit better job at the recap, just in a couple of minutes, of where we were with the DynamoDB architecture? Go ahead.

Chris: Sure, yes. With DynamoDB, we can call out the high-level components. The first was the request router, which is the front end to the system. All clients using DynamoDB hit that front end, and the request router is responsible for figuring out where the data being requested, either to be read or written, lives: which storage node partition it is on.

One of the other major components we talked about was the storage nodes, and these are the home partitions for all the data that’s in DynamoDB. They’re composed of three components: they have the storage node leader, and then there are two secondaries which represent the replicas.

We also talked a little bit about the fact that the default behavior for DynamoDB is to be eventually consistent. It doesn’t wait for all secondaries to be updated before it returns back a response, though you can configure it to say that you do want it to be strongly consistent. But there are trade-offs for that kind of decision.
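
To make that concrete, here is a minimal sketch in Python using boto3; the table name and key are invented, but ConsistentRead is the actual DynamoDB parameter that flips a read from the eventually consistent default to a strongly consistent one.

```python
import boto3

# Hypothetical table and key, just to illustrate the flag.
dynamodb = boto3.resource("dynamodb", region_name="us-west-2")
table = dynamodb.Table("mobycast-example")

# Default read: eventually consistent (may briefly miss the latest write).
eventual = table.get_item(Key={"pk": "user#42"})

# Strongly consistent read: the authoritative answer, at the cost of
# higher latency and twice the read capacity consumption.
strong = table.get_item(Key={"pk": "user#42"}, ConsistentRead=True)
```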

We also talked about the partition metadata system, kind of like the routing tables, if you will, for DynamoDB: given a particular piece of data, where does it live? And then you have auto admin, which is their kind of integrated monitoring, healing—

Jon: Repartitioning.

Chris: Yeah, basically all that kind of housekeeping that goes on with running a system like this. It’s so dynamic that you need to reshuffle things periodically. So they have the auto admin component.

Those were the four primary components that we talked about for DynamoDB. Keeping those in mind will be useful as we go into this next discussion about what came before that, which was Leviathan.

Jon: Great. All four of those things sound fairly complex when you just listen to their names, but it’s really pretty easy. It’s like you’ve got some data, you want to stick it somewhere. You have a lot of data, it’s in a lot of different places. You have a request router to get you to where your data is. You have a partition metadata system to kind of remember where everything is.

For some reason, I’m thinking of that whole idea of having a castle in your mind, where it has different rooms so you can remember things, like, “Oh, I put this memory in this part of the castle.” That’s your partition metadata system.

Let’s move away from DynamoDB and roll back the clock a little bit and talk about the Leviathan architecture.

Chris: Sure, you bet. Maybe just a quick recap of what we’ve talked about in the previous episodes. If you haven’t listened to the previous episodes in this series, definitely go back and do so. But before DynamoDB, in the late 90s I was working at a startup and we had the same kind of problems that DynamoDB had. The reason DynamoDB came about was the same: how do you scale your database for internet-scale applications and have this virtual pool of effectively infinite storage that can be scaled out. That was what that company was trying to do. The NoSQL database product we were building had the code name Leviathan.

Jon: Just have to interrupt you. Remember, it’s part of the whole Moby Dick theme of everything we ever talked about, here at Kelsus and […].

Chris: Indeed. Yes, and maybe digging into that a little bit. Again, it was really interesting for me whenever I see the discussions about DynamoDB, about what were the pain points, why it came about, what are the high-level components. Then when I sat in on this deep dive during re:Invent of what the architecture looks like for DynamoDB, it’s super interesting because it is really similar to the work that we did on Leviathan.

Just like with DynamoDB, where we talked about the four main architecture components, we can talk about four main architecture components for Leviathan as well. A little bit different but pretty close.

The first one would be the API client itself. Looking back at it, in the Leviathan architecture the API client itself is very much a key piece of the distributed system.

Jon: This is the front door to Leviathan?

Chris: It is. I mean, it was the API implementation. This is late 90s, so things like RESTful APIs didn’t really exist. The ecosystem wasn’t there for the same kind of API-driven development that we have now. Typically, what you did was you built an API. It was a custom API and it might have its own wire protocol. You might go HTTP or you might not. Then you typically delivered that functionality via an SDK. That SDK was basically client-side code, libraries. Someone that wanted to use this API would link in those libraries and get that code to use it.

That ended up being the way that we delivered our functionality, the way that folks could consume it over the internet and hook into this storage system. Because it wasn’t just an API but it was also code, we could then put additional value-added functionality into it.

Jon: That’s the only part that doesn’t happen anymore. Even with HTTP APIs, I’m yet to meet a developer that isn’t like, “Ah, do you have an SDK? Is there any way I can not have to write all that HTTP and checking logic, and just use the SDK, please?” But the part that we don’t see anymore is SDKs doing much beyond just the communications.

Chris: Exactly, yes. In our particular case, we didn’t really have a choice here. So yes, that was one of the key components. Another one we named the update distributor, or UD for short. This was the piece of the system that was responsible for handling write requests. A third component was what we called the base server, for lack of a better name, and this was really our storage node. This is basically where the data was stored, and it dealt with everything around that. It’s very, very similar to the storage node concept in DynamoDB.

The fourth component is called shepherd. Shepherd was our management system, the way of doing the housekeeping, the way of knowing when to do partition splits. It was responsible for migrating data when it had to from one partition to another one. Basically, everything that goes into managing state. It also kept track of the cluster and partition maps in our system. When we talked about DynamoDB, they broke those out: their partition metadata system is separate from auto admin. In our system, it was all combined into one component, which we called shepherd.

Jon: Interesting.

Chris: So those are the four key components. There are also some additional core components that may come up during the rest of this conversation. One we called smart IP. This you don’t need anymore, because now we have things like elastic load balancers. But back then we didn’t have that, so we came up with this term, smart IP, which was essentially a virtual IP: a single way of addressing a cluster of resources, with load balancing done across that set, and the ability to dynamically change the set of nodes that are behind that particular IP address. We had to build that ourselves, so we had that.
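
A minimal sketch of the smart IP idea, assuming a toy in-process resolver that round-robins across a mutable set of node addresses; the real thing lived much lower in the network stack, and the names here are invented.

```python
import itertools
import threading

class SmartIP:
    """A toy 'virtual IP': one stable handle that load-balances across
    a set of backend nodes which can be changed at runtime."""

    def __init__(self, nodes):
        self._lock = threading.Lock()
        self._cycle = itertools.cycle(list(nodes))

    def resolve(self):
        # Round-robin across whatever nodes are currently registered.
        with self._lock:
            return next(self._cycle)

    def set_nodes(self, nodes):
        # Dynamically swap the set of nodes behind the "IP".
        with self._lock:
            self._cycle = itertools.cycle(list(nodes))

# Callers always address the cluster through the one handle.
cluster = SmartIP(["10.0.0.11", "10.0.0.12"])
print(cluster.resolve())                                    # 10.0.0.11
cluster.set_nodes(["10.0.0.11", "10.0.0.12", "10.0.0.13"])
print(cluster.resolve())                                    # one of the new set
```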

Another core system component we had was the storage abstraction layer, or SAL. It was a layer used by our base server, the thing providing that storage node capability, to interface with whatever was actually being used as the persistent store. We had this concept of basically abstracting that away.

Jon: I’m a little confused by this. This is the one piece so far that like, “Huh? What are you talking about?”

Chris: This system and this technology went through many iterations. In the first iteration, the storage nodes were using relational databases as the storage […]. What the storage abstraction layer would do is make it so the base server could speak a common language, and you could plug in maybe Microsoft SQL Server or Oracle DB behind your storage node. That’s what that was. Eventually, it grew to become even more capable, where it’s not just talking to relational databases but file systems as well.

There was this recognition that the base server had some common logical features and functionality it served, regardless of how it was persisting things to disk or whatnot. Things like caching and the query engine, all that stuff was at the base server level. Then, when we had to actually persist […], we went through that storage abstraction layer.
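
A rough sketch of that storage abstraction layer idea, with invented names: the base server codes against one interface, and a relational or file-backed store can be plugged in behind it.

```python
from abc import ABC, abstractmethod
from pathlib import Path

class StorageBackend(ABC):
    """The contract the base server sees, regardless of what persists the bytes."""

    @abstractmethod
    def put(self, key: str, value: bytes): ...

    @abstractmethod
    def get(self, key: str): ...

class InMemoryBackend(StorageBackend):
    """Stand-in for the relational backends (SQL Server, Oracle, ...)."""
    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value        # real version: a parameterized upsert

    def get(self, key):
        return self._rows.get(key)

class FileBackend(StorageBackend):
    """Stand-in for the later file-system-backed iteration."""
    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, key, value):
        (self.root / key).write_bytes(value)

    def get(self, key):
        path = self.root / key
        return path.read_bytes() if path.exists() else None

# The base server only ever talks to a StorageBackend.
def base_server_write(store: StorageBackend, key, value):
    store.put(key, value)
```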

Jon: I just can’t help myself. The moment you’re building that part of your system is when you could have also been like, “Hmm, this company is now spending a great deal of money on a piece of the system…” What’s the best way to put this in startup terms? You’re building something and you don’t have anybody using it. In a startup, if you can’t get people to use that steel thread that goes all the way through, like, “Hey, here’s the opinionated version of this,” and they’re not willing to use the opinionated version, then making each little piece really super configurable probably isn’t the solution. But I digress. As soon as I heard that, I was like, “Oh, my God. That sounds like a lot of engineering for a startup that probably didn’t have any customers at that point.”

Chris: Yeah. I mean, maybe kind of paradoxically, it was actually probably cleaner to do it this way because it forced us to componentize the way we were thinking about things. We could have had just hard-coded Microsoft SQL queries littered throughout the base server. We had a few people on the team that were SQL experts; not everyone was, so there were definitely benefits to doing that. It wasn’t so much about the ability to go and support anything out of the gate. A lot of it was more for just architectural reasons.

Jon: Okay. I don’t want to let you off the hook. I’m just kidding. Anyway, I’ve got to get off of that. This is a constant thing for me, the sort of over-engineering that startups do. That’s why I’m grabbing onto that.

All right, let’s just keep talking about what were some of the aha moments that were happening for you as you’re thinking about your old architecture and thinking about DynamoDB.

Chris: Maybe let’s just dive in a little bit deeper into what those four primary components do. Again, the API client, in a way, you can think of as the equivalent of parts of the request router in the DynamoDB system. We didn’t have a separate request router. Because we had code running on the clients, we could include them as part of the actual system itself, so things like the equivalent of the partition metadata were pushed into the clients.

What this gave rise to is that when they were doing read operations, they wouldn’t have to go through an intermediary service. They could actually go talk directly. They would have all the information that they needed to go straight to the right base server. There are definitely pros and cons to it, but you are reducing work; that’s one less hop that you’re going through on the read path.
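
To make the read path concrete, here is a toy sketch of client-side routing, assuming a hash-partitioned key space and a partition map cached inside the SDK; the map format, hash choice, and addresses are all invented for illustration.

```python
import bisect
import hashlib

# Hypothetical partition map pushed down into the client:
# (upper bound of hash range, smart IP of the owning base server).
PARTITION_MAP = [
    (0x3FFFFFFF, "10.0.1.10"),
    (0x7FFFFFFF, "10.0.1.20"),
    (0xBFFFFFFF, "10.0.1.30"),
    (0xFFFFFFFF, "10.0.1.40"),
]

def route(key: str) -> str:
    """Return the base server that owns this key, with no intermediary hop."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")
    idx = bisect.bisect_left([upper for upper, _ in PARTITION_MAP], h)
    return PARTITION_MAP[idx][1]

print(route("customer#1234"))   # the client talks straight to this node
```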

Jon: Was there a situation with that where different clients could get out of sync with each other and that would be a problem?

Chris: There are lots of consistency problems and synchronization issues, like, what happens if there’s some state change and how do you get it distributed to all the players in the system in a way that’s safe and still meaningful. Absolutely, there are lots of pretty challenging problems throughout this whole system. But in general, without diving deep into all those various techniques, I can say that for the most part, the simple case of, “Hey, I’ve got to go read some data,” having the information necessary to go get it, that was doable, and there were techniques and parts of the system that handled what happens when it does get out of date or whatnot.

Jon: I imagine that a connected client is always saying, “Hey, let me know if this sort of map of where the data lives has changed,” and if it does, then it grabs a new map of where all the data is. And then if a client is not connected for a while and then connects, it probably has to go ask for that before it’s allowed to go anywhere.

Chris: Yeah. There’s versioning and there are a lot of event-driven components in this system. I mean, this is essentially distributed caching. Not an easy thing to do, but it’s definitely solvable, and that’s really kind of what it came down to. A lot of times, you can’t have 100% consistent state across everything. It’s just impossible. There are just timing issues. So, how do you detect when you’re out of date? How do you do the right thing? What happens when you have a cache conflict type thing?

All that stuff has to be thought through and dealt with. A lot of that functionality went into these API clients. Again, given that they were written in either C code or Java code, the code to handle that was in there.
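
And a rough sketch of the “what if the client’s cached map is stale” part, building on the client-side routing sketch above; everything here is hypothetical, but it shows the shape of the versioned-map, detect-and-refresh handling Chris is describing.

```python
class StaleMapError(Exception):
    """What a base server signals when the caller's partition-map version is behind."""

class ToyClient:
    """Hypothetical stand-in for the SDK's cached partition state."""
    def __init__(self):
        self.map_version = 7                 # version of the locally cached map

    def refresh_partition_map(self):
        # Real system: event-driven push or an explicit fetch from shepherd.
        self.map_version += 1

    def read(self, key, server_version=8):
        if self.map_version < server_version:
            raise StaleMapError              # server rejects the out-of-date map
        return f"value-for-{key}"

def read_with_retry(client, key, max_attempts=3):
    for _ in range(max_attempts):
        try:
            return client.read(key)
        except StaleMapError:
            client.refresh_partition_map()   # pull the latest map, then retry
    raise RuntimeError("partition map kept changing; giving up")

print(read_with_retry(ToyClient(), "customer#1234"))
```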

Jon: That was going to be one of my questions. Was Leviathan mostly C, mostly Java, mostly C++?

Chris: Yeah, it was all C++, and really just using C++ as a better C, so not tremendously object-oriented; we used classes where they made sense. But for the most part, it was C-style code, and the only Java code was for our Java-based SDK and the Java API. We knew we had to support that out of the gate.

Jon: Yeah, for sure. No question. 2001 Java, 1999 Java.

Chris: Yeah. It was either Java or Windows. That’s what people were running. We knew we had to support those two environments. Java was our answer for the Unix- and Linux-type things as well.

Jon: Right, that’s pretty interesting. I guess I’m just thinking about the fact that the big difference between DynamoDB and Leviathan that we’ve talked about so far is the request router and partition metadata system living on the server versus living out on the edge. It’s just fascinating to me because that sort of back-and-forth happens over and over: one year it’s better to push stuff out to the edge because you get better scalability and less work for the servers to do. A speedier little distributed system that you’ve got going.

Then the next year it’s like, “No, but then it’s hard to do updates and you can have out-of-date clients, so let’s do everything on the server.” Then the next thing is like, “Oh, well, you could push out code from the servers, so let’s do it that way again.” We can never really decide as a group, as a community, as software developers, where the balance is best, and we keep swinging back and forth from one side to the other.

Chris: Yeah, and just like technology in general, there’s constant iteration and different techniques. There are technological advances and things like bandwidth changing, CPU power changing, cloud versus on-prem changing things, and whatnot. Every time you go design a system, there’s not necessarily one right way of doing it. Maybe this year pushing work out and having it done on the edge makes sense for your particular application versus some other application, because of some change in the cloud or some new service that you have available to you. Maybe there’s some constraint in your system where it makes sense that you should do stuff on the server side and keep it central.

Jon: It would not surprise me whatsoever if AWS did something like, “Oh, we’ve created a new feature, DynamoDB Smart Edge Processing,” where they essentially push the request router and partition metadata system out to Greengrass edge-type stuff. It could be even faster for systems that are doing IoT or whatever. I wouldn’t be surprised at all, especially because, can’t you already run DynamoDB locally? Is that a thing?

Chris: Yeah, you can run DynamoDB locally. That’s mostly for testing and just development. But with IoT and what AWS is doing with Greengrass, and with Snowball Edge adding edge computing capabilities, the things that can be done at the edge will be done at the edge, and it almost always makes sense to do so. You just have to balance out what the overhead of doing that is.
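
Since this is a show about containers, it is worth noting that the local version Chris mentions ships as a Docker image (amazon/dynamodb-local), and pointing an SDK at it is just an endpoint override. A quick sketch, with a made-up table name:

```python
# First, run the local emulator in a container:
#   docker run -p 8000:8000 amazon/dynamodb-local
import boto3

# Point the SDK at the local endpoint instead of the real AWS service.
local = boto3.resource(
    "dynamodb",
    endpoint_url="http://localhost:8000",
    region_name="us-west-2",
    aws_access_key_id="fake",            # DynamoDB Local accepts any credentials
    aws_secret_access_key="fake",
)

table = local.create_table(
    TableName="mobycast-local-test",
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    BillingMode="PAY_PER_REQUEST",
)
table.wait_until_exists()
table.put_item(Item={"pk": "hello", "msg": "stored locally"})
```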

If you can make it so there is very little state shared between these things that are out on the edge and kind of phoning back home to a central data center, then you can scale out very easily. You do want to push that computing to the edge and not have to do that computing inside your cloud, inside your data center. When it’s more of a bidirectional communication path, then you start running into a whole bunch of problems.

Now as you scale up, it becomes exponentially more difficult to handle and more complicated. It just really depends on your situation and what it is you’re trying to do, what the constraints are, what the access patterns are, to decide what’s the best thing. This is a good point because for us, Leviathan was very much more of a closed system.

Again, these are heavy clients. They’re going into application code for services that are consuming this storage service. This is a closed system and you would have an instance for every network that you had or every data center that you may have had. This wasn’t designed to be run as a software-as-a-service-type thing. There were ways to do namespace partitioning, to have multiple applications, but for the most part each customer would have their own installation. It wasn’t one installation for millions of clients out there.

With these thick clients that were active members of the distributed system, that kept the number of them finite. It wasn’t really an issue. We knew that there were going to be tens of these things, hundreds of these things. It wasn’t like with DynamoDB, where DynamoDB is offered as a service and you can have tens of thousands or hundreds of thousands of clients. Pushing state out into clients like that would be more problematic.

Jon: Right, and another thing you mentioned that you were able to do because it is a closed system is being strongly consistent versus DynamoDB’s eventually consistent model?

Chris: This gets into that update distributor component. We’ve talked about the read path: because we have these thick clients and they have access to the metadata in their system, they can figure out which smart IP address, which storage server, they need to go talk to for a read. For the write path, they went through this component called the update distributor, and again, every mutation, every create, update, or delete request, would go through it, and its responsibility was basically to do the replication. It was essentially doing a two-phase commit across multiple replicas.

That’s why our system was strongly consistent. This was just the design approach we took from the get-go: basically, we would have this synchronous two-phase commit style replication of these write commands so that they would be replicated to at least two base servers. But it would do all that synchronously and then return back to the caller once that write was complete. That’s why we were strongly consistent.

The difference with DynamoDB is that they’re doing those writes to the secondaries via true replication. I don’t know exactly the details, whether their storage node leader has code where it’s doing the actual write to the secondaries itself, or whether they’re writing to logs and there’s a replication process that’s reading from the logs and then making those updates on the secondaries.

Jon: I guess the latter but yeah.

Chris: Yeah, but that was the approach that they took there.
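
A toy sketch of that difference, with invented names: the Leviathan-style path does not acknowledge until every replica has the write, while the Dynamo-style default acknowledges once the leader (in reality, a quorum) has it and lets the remaining replicas catch up.

```python
import threading

def write_to_replica(replica, key, value):
    # Stand-in for a network call to one base server / storage node.
    replica[key] = value

def strongly_consistent_write(replicas, key, value):
    """Leviathan-style: the caller's write returns only after every replica has it."""
    for r in replicas:
        write_to_replica(r, key, value)
    return "ack"        # every copy is now identical

def eventually_consistent_write(leader, followers, key, value):
    """Dynamo-style default: ack early, replicate to the rest in the background."""
    write_to_replica(leader, key, value)
    for f in followers:
        threading.Thread(target=write_to_replica, args=(f, key, value)).start()
    return "ack"        # a read from a lagging follower may still miss this write

# Plain dicts stand in for storage nodes.
a, b, c = {}, {}, {}
print(strongly_consistent_write([a, b, c], "color", "blue"))
print(eventually_consistent_write(a, [b, c], "color", "green"))
```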

Jon: I want to interrupt you for just a second because you’re talking about two-phase commits and I’m just thinking about our audience. One of the things I’ve noticed in the past 20 years is that there’s less and less talk of transactions among developers for whatever reason. I just remembered it being a constant theme of daily design and architecture conversations around the office from 2000 to 2005 or so, and then it’s like the whole idea just disappeared.

I think it’s because it all got subsumed into libraries and tools that people used in a way that is good enough, that if you have to think about transactions, now you can count yourself among the senior developers. But yeah, can you just tell us what a transaction is and tell us what a two-phase commit transaction is?

Chris: Yeah. Two-phase commit is basically just a way to implement a transaction in a distributed system with multiple participants. A transaction just means you’re wrapping a sequence of one or more commands and you want it all to run together as one unit. So either all of it succeeds, or if any one of those things doesn’t succeed, then the entire thing fails and you don’t change any state whatsoever.

You can have a series of commands, like five different commands that you’re dealing with. You could be updating five different pieces of data, and maybe it’s even across two systems. You want all of that to happen as one atomic unit, and if it doesn’t, then nothing should be changed. You have to have a way of reverting, if you will. Again, if you have five commands, the first three commands succeed, and then the fourth fails, then you have to roll back. You have to undo what was done on those first three.

Two-phase commit is a technique where basically it’s just saying the first phase is going and doing this update across all of the components and then the second phase is getting back the response that yup, that happened correctly, and then go ahead and committing it, if you will, to say that this should be available now in the system and it’s now committed to the system.

Jon: So phase one is like send out the work so that it all gets done somewhere else. Phase two is get all the acknowledgments back that the work was done and collect them all. Once you’ve got them all collected, then go ahead and lock it down.

Chris: Yeah, exactly. That’s what our update distributor basically did, and we actually had the concept where you didn’t have to have just two replicas. You could have as many as you wanted. This allowed you to scale out for read operations. We were very much interested in read performance. That was by far the dominant traffic coming through the system.

By having multiple replicas of your base server and allowing any one of those to be read from, the more you had, the more read throughput you had. If your write throughput was a small fraction of your read throughput, then the overhead of the two-phase commit going across N resources was definitely a good trade-off for having the multiple replicas for your increased read performance. That was the update distributor’s job.
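
A compact sketch of the two-phase commit flow the update distributor ran, with invented participant objects standing in for the replicas: phase one stages the write on everyone, phase two either commits it everywhere or rolls it back everywhere.

```python
class Participant:
    """One replica / resource manager in the toy transaction."""
    def __init__(self, name):
        self.name = name
        self.committed = {}
        self.staged = None

    def prepare(self, key, value):
        # Phase 1: stage the work and vote on whether it can be committed.
        self.staged = (key, value)
        return True                          # "yes, I can commit this"

    def commit(self):
        # Phase 2 (commit): make the staged write visible.
        key, value = self.staged
        self.committed[key] = value
        self.staged = None

    def rollback(self):
        # Phase 2 (abort): throw the staged write away, change nothing.
        self.staged = None

def two_phase_commit(participants, key, value):
    votes = [p.prepare(key, value) for p in participants]   # phase 1
    if all(votes):
        for p in participants:
            p.commit()                                       # phase 2: commit
        return "committed"
    for p in participants:
        p.rollback()                                         # phase 2: abort
    return "rolled back"

nodes = [Participant("base-1"), Participant("base-2"), Participant("base-3")]
print(two_phase_commit(nodes, "order#9", "paid"))   # committed on all or none
```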

Jon: Cool.

Chris: I guess the last thing to talk about in a little bit of detail would be that shepherd component. This was one of the ones where the really hard discussions happened. This is where the hard problems were: how do you deal with these consistency issues, how do you deal with these timing and synchronization issues, how do you know who is what, who’s the manager, who should be the leader, who’s the master as opposed to the replicas, and how do you deal with failures and whatnot.

It was a central service, but then we had agents. Part of it was an agent that would go on each one of the machines that was part of the system. There was bidirectional communication between them. We had a separate management protocol, basically a control plane, if you will, for this system, and that was implemented through shepherd and its agent-based architecture.

Shepherd would give you an overall view of the system. It would allow you to add new nodes to the system or remove nodes. It would detect failures and be responsible for health checks. It handled all the metadata for the partitions, the clusters, and everything else associated with them, and then it did data migration, which was a pretty big topic for us to deal with as well.

When you figure that these splits happened, a partition got too hot, you needed to split it, and now you have to move data from one base server to another one. How do you do that in a way that’s consistent and safe? Shepherd had functionality for that as well.
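
A toy sketch of the kind of housekeeping decision shepherd made; the threshold and data shapes are made up, but it shows the idea: watch a partition’s load, and when it gets too hot, split its key range, plan a migration of half the data to another base server, and then publish the updated partition map.

```python
HOT_THRESHOLD = 1_000        # made-up requests/sec that marks a partition "hot"

def maybe_split(partition):
    """partition: dict with 'range' (lo, hi), 'node', and observed 'load'."""
    if partition["load"] <= HOT_THRESHOLD:
        return None                          # nothing to do
    lo, hi = partition["range"]
    mid = (lo + hi) // 2
    # Plan: keep the lower half where it is, migrate the upper half to a
    # new base server, then publish an updated partition map to the clients.
    return {
        "split_at": mid,
        "stays": {"range": (lo, mid), "node": partition["node"]},
        "moves": {"range": (mid + 1, hi), "node": "<new base server>"},
    }

print(maybe_split({"range": (0, 0xFFFF), "node": "base-3", "load": 2_400}))
```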

Jon: Yup, right on. While it’s a shame that so much really good engineering didn’t get much use, it’s cool to see that intellectual property that you created and registered with the patent office got revisited and got essentially listed as the foundations of DynamoDB. All good.

Chris: Yeah, it would be interesting someday to find out if there was any cross-pollination there or if it was just serendipitous, like same problem space and similar solutions being developed type things.

Jon: It could be. On the one hand, it’s like, “Well, this is a really specific system and it sure is interesting to see how closely the two are aligned.” But on the other hand, it’s a very specific problem, like how do you store data and access it across many, many nodes efficiently. It does lend itself to a certain way of thinking about things.

I have a question for you, though. Do you know when the last time was that Leviathan got booted up and run on a machine or multiple machines?

Chris: Oh, that was a long time ago.

Jon: I was wondering if you tried to give it a […]?

Chris: No, it would be so challenging to do so. This was all built on Windows 2K.

Jon: It was all Windows. Yeah, that makes it harder, I think. If it had been Linux stuff, it might be easier. The rate of change in the Linux world is sort of slower and backwards compatibility is better.

Chris: Yeah. It was very low-level system code. We actually used some of the kernel APIs, I think, as well with NT, some of the undocumented kernel APIs. It would be very challenging but not impossible. I mean, the source code is still there. I’m sure if you go into AWS right now, there’s an AMI for Windows 2K Server.

Jon: Oh yeah, definitely.

Chris: And then, at that point, you’d need Visual Studio for the compiler, the Microsoft C++ compiler. If you could get that, you could get going and fire it up. I think the last time this was run was probably 2001.

Again, at Viathan, we went through some iterations and some different domain changes. The first iteration of this technology used this adaptive partitioning architecture with relational databases as the storage mechanism, really abstracting at the database level. We then took the same technology and repurposed it for file systems. That became much more of the focus. That actually continued to live on a bit beyond the company as well.

We did that just because the friction associated with adopting the technology was so much less with the file system, because there were no code changes that needed to happen. Basically, our clients acted like CIFS or NFS drivers. By doing so, it just became very easy for folks to adopt this technology.

This ended up with, again like I said, almost the exact same technology architecture and whatnot. It’s just a different base server, really, that we were dealing with, and basically you had a full-blown virtualized file system with infinite scale. That was the next iteration of this technology that our company—

Jon: How interesting. You built more or less EFS out of your Leviathan.

Chris: Yeah. It was more like S3.

Jon: More like S3?

Chris: Yeah. You could store files. But also, yeah. I mean, kind of similar. I guess you’re right. It was actually closer to EFS because the way that we hooked into it was through the file system driver. We didn’t want to require people to write code. That made it more like, okay, this is just a big virtual file system.

Jon: It’s your C drive or whatever.

Chris: Yeah. The actual technology and the primitives are actually closer to what S3 does but the way that it is adopted kind of felt more like EFS.

Jon: Super interesting. I guess we should probably wrap it up but this has been fascinating to talk about Leviathan, your history, DynamoDB, its future, and how 20 years on, they are still getting around to some of the features that you built 20 years ago.

Chris: It’s been super interesting and fun for me to look back and again just be amazed by how much is just the same, like the same problems, and how a lot of things haven’t changed even though technology has changed so fast. We talked about this before in previous episodes. It is cyclical and there are certain patterns and techniques that do dominate. I think this is an example where that’s indeed the case.

Jon: Right. I’m reminded of a conversation I had with you a few years ago, I think, where I was telling you about this company I did some contracting with earlier, called Taz, and I was telling you how they had one developer that decided to just do everything in C. This was long after anybody was building any kind of web backends in C, but for some reason, he was. Not only was he doing that, but he had decided to build his own document database despite the existence of Mongo at the time. I was telling you how he had done sharding and had done some replica stuff, and I was like, “Why is he building all of this? This all exists,” and I just kind of remember you being like, “Mm-hmm, yeah.”

Chris: And again, it’s one thing to do the base stuff, to handle the first 80%. You can just do those main pieces, things like sharding, some replication, and the document portion of the database, but it’s the last 20% that is so hard. That’s the stuff that prevents you from actually deploying and from actually running in some kind of available fashion.

Jon: All right. Let’s leave it there for next week. Next week we’re going to have a new topic. We haven’t decided yet what it’s going to be, but we’re looking forward to talking to you again. Hope everybody enjoyed.

Chris: Sounds good.

Rich: Well, dear listener, you made it to the end. We appreciate your time and invite you to continue the conversation with us online. This episode, along with show notes and other valuable resources, is available at mobycast.fm/43. If you have any question or additional insights, we encourage you to leave us a comment there. Thank you and we’ll see you again next week.
