How we created an illustrated guide to help you find your way through the data landscape.
Designing Data-Intensive Applications, the book I’ve been working on for four years, is finally finished, and should be available in your favorite bookstore in the next week or two. An incomplete beta (Early Release) edition has been available for the last two and a half years while I continued working on the final chapters.
Throughout that process, we have been quietly working on a surprise. Something that has not been part of any of the Early Releases of the book. In fact, something that I have never seen in any tech book. And today we are excited to share it with you.
In Designing Data-Intensive Applications, each of the 12 chapters is accompanied by a map. The map is a kind of graphical table of contents of the chapter, showing how some of the main ideas in the chapter relate to each other.
Here is an example, from Chapter 3 (on storage engines):
Don’t take it too seriously—some of it is a little tongue-in-cheek, we have taken some artistic license, and the things included on the map are not exhaustive.
But it does reflect the structure of the chapter: political or geographic regions represent ways of doing something, and cities represent particular implementations of those approaches. Similar things are more likely to be close together, and roads or rivers represent concepts that connect different implementations or regions.
Most computing books describe one particular piece of software and discuss all the aspects of how it works. This book is structured differently: it starts with the concepts—discussing the high-level approaches of how you might solve some problem, and comparing the pros and cons of each—and then points out which pieces of software use which approach. The maps use the same structure: the region in which a city is located tells you what approach it uses.
For example, in the map above, you can see a high-level subdivision into two countries: transaction processing and analytics. Within transaction processing, there are two regions: log-structured storage and B-trees, which are two ways of implementing OLTP storage engines. Within the B-tree region, you see databases like MySQL and PostgreSQL, while within the log-structured region you see databases like Cassandra and HBase. On the analytics side, you can see that the mountain range representing column storage reaches into both the data warehousing and the Hadoop regions, since the approach applies to both.
The maps are in black and white, both because of the practicalities of printing and because I was looking for a Tolkien-esque style. You are, of course, welcome to color them in yourself. In fact, by coloring them in, you would be following a fine tradition: for over three centuries, maps were printed in black and white from an engraved copper plate, and then colored in by hand.
Each of the chapters has a map like that, focusing on the particular aspects discussed in that chapter. This means that some cities appear on multiple islands—the data landscape is multidimensional, so a city may lie in more than one (conceptual) realm. For example, the map below is for Chapter 5 (on the topic of replication):
Cities representing Cassandra, MongoDB, MySQL, and others appear on this map, on the Chapter 3 map above, and on some other maps, too.
Shipping routes connect some of the ports shown in the maps, in cases where there is a noteworthy link between chapters. Most of the maps are of islands, but there are some exceptions. (I won’t give away too much, but I just want to say...beware of the Kraken.)
I am incredibly delighted that O’Reilly was willing to take on this crazy idea of creating maps.