In my last post, we looked at the first phase of change, driven by web applications, that began to crack the design of relational databases. In this post we will explore the dam breaking completely, driven by the infusion of digital DNA into every aspect of our lives.
Hard as it is now to imagine that anyone questioned the efficacy of the Internet as a true platform for the masses, much nearer to our collective memory is the outright skepticism the world shared about the rise of mobile devices as the keepers of our personal digital world. Even the iconic iPad was initially showered with scorn by many. And before that, recall that cell phones were getting smaller and simpler right up until 2007, when the iPhone was released.
The result of this personal digital explosion reached an important milestone in 2013 when, for the first time in history, online shopping overtook brick-and-mortar shopping for the holiday season in the United States.
Internet companies like Amazon and Google were eating the world, moving into traditional business after business after business. Today, that trend shows no signs of slowing. Just four years ago, I suggested it was a real question which types of companies would actually need compute power that relational databases could not handle. In 2015 the answer is: any company that wants to be relevant in the next five years. (And I do not say that for dramatic effect.)
Digital DNA is disrupting just about everything. As if our smartphones and tablets weren’t enough, the final digital tsunami comes in the form of smart devices and sensors, otherwise referred to as the Internet of Things (IoT), which is driving estimates of connected devices to 26 billion by 2020. That’s in addition to the 2 billion smartphones estimated by that same year. Finally, by 2020 we’re also looking at 44 zettabytes of data, at a rate that doubles every two years after that.
Facing our present circumstances, with an eye toward that future, technical teams have realized that the nature of the problem is so fundamentally different from anything imagined when relational databases were designed (45 years ago) that an equally fundamental shift in solution is needed. Otherwise, you are left with architectures so complicated that you can spend all of your time and money working around design anachronisms that never envisioned today’s digital scenario.
To say something is not up to the task is one thing, but to propose something better is quite another. How do we know what better means for this new world? The answer can be found by asking another question: What, specifically, are the problems you are trying to solve? By asking that question, you will realize that the nature of our problems has changed.
In 2007, here is what one technical team wrote:
“Reliability at massive scale is one of the biggest challenges we face … even the slightest outage has significant financial consequences and impacts customer trust.”
The authors who wrote that knew a little something about the new world: it was the team at Amazon, and those are the opening words of their landmark white paper that fundamentally shifted how we think about databases in the modern era. They realized that “always-on” was the most important aspect of running a business in the digital age. And “always-on” means exactly that, even in the face of catastrophic failures like losing entire datacenters, regions, or even countries.

In our modern world, “always-on” also means “extremely performant.” Anything beyond a couple hundred milliseconds of response time and your system is heading for trouble, because: (a) users will get frustrated and abandon the “unresponsive” application for easily accessible alternatives; and/or (b) the system fails to keep up with throughput and must be stopped in order to catch up.

Sounds crazy, I know, but all you need to do is watch a young person try to buy something online when the system slows even a little bit. They bail, and why not? They can buy it somewhere else for almost exactly the same price anyway. That’s the scary part for traditional companies: they cannot rely on the same kind of customer loyalty that many have enjoyed for years or decades. They can leverage that loyalty in the modern era if they evolve, but they cannot rely on it exclusively while moving slowly.
Exciting as the new era of technology is, as the old saying goes, “there is no free lunch.” Designing a system that meets today’s standards comes at a cost, which is re-thinking how we design applications. For new applications (e.g. Web, mobile, IoT) the first-order design requirements are to solve for: (a) low local latency AND (b) always-on availability. Low local latency means very fast response times for users hitting your system no matter where they happen to be geographically located. Given the limits of known physics, the best way to do that is to distribute the data closer to its intended endpoints. That kind of solution requires a truly distributed database, not a database built on a master/slave (or “primary”/“secondary”) configuration that hampers your “always-on” requirement. In fact, any database (relational or otherwise) whose architecture depends on a single master will encounter extreme difficulty, in the form of complexity and/or degraded performance, in scaling geographically to large sizes.
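To make the latency point concrete, here is a toy Python sketch. The region names, latency numbers, and functions are all made up for illustration (they are not any real database’s API); the sketch simply contrasts a design where every read must travel to one primary with a masterless design where the nearest replica can answer:

```python
# Hypothetical one-way network latencies (ms) between a user's region and
# each replica region. Illustrative numbers only, not measurements.
LATENCY_MS = {
    ("us-east", "us-east"): 5,   ("us-east", "eu-west"): 80,
    ("us-east", "ap-south"): 190,
    ("eu-west", "us-east"): 80,  ("eu-west", "eu-west"): 5,
    ("eu-west", "ap-south"): 120,
    ("ap-south", "us-east"): 190, ("ap-south", "eu-west"): 120,
    ("ap-south", "ap-south"): 5,
}
REPLICAS = ["us-east", "eu-west", "ap-south"]
PRIMARY = "us-east"  # the single master in the primary/secondary design

def primary_read(user_region):
    """Every read must round-trip to the one primary, wherever the user is."""
    return 2 * LATENCY_MS[(user_region, PRIMARY)]

def masterless_read(user_region):
    """Any replica can serve the read, so route to the closest one."""
    return 2 * min(LATENCY_MS[(user_region, r)] for r in REPLICAS)

for region in REPLICAS:
    print(f"{region}: primary={primary_read(region)}ms "
          f"masterless={masterless_read(region)}ms")
```

A user near the primary sees no difference, but a user in the hypothetical “ap-south” region pays a 380 ms round trip in the single-primary design versus 10 ms when a local replica can answer, which is exactly why distributing data toward the endpoints matters.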
What the Amazon Dynamo paper brought into focus is a way to think about the database as native to this new world of radically connected and fully distributed endpoints, whether those endpoints be users, devices, sensors, or whatever.
Now that we understand how we got here, next I will close the loop by looking at where we are today in light of our past, as well as projecting a bit into a possible future.
See the inset box in the upper-left quadrant of the infographic at: http://www.datastax.com/resources/infographic/5-years-of-explosive-growth-set-to-continue-infographic