Some of the fundamental differences between Data Modeling for NoSQL and for relational databases stem from the way the two technologies operate. Relational databases are typically centralized, whereas NoSQL data stores are typically distributed. Most relational options are founded on the principles of Atomicity, Consistency, Isolation, and Durability (ACID). A number of NoSQL systems, especially the earlier ones, are instead Basically Available, Soft state, and Eventually consistent (BASE), though some options now offer viable ACID transactional consistency. In BASE systems, multiple copies of the data exist simultaneously; these copies are in a soft state, meaning they may temporarily differ while updates propagate, and they converge over time toward eventual consistency. Hsieh mentioned that:
“The data can be stored in nodes not just in a local data center, but across multiple data centers. This is done by a mechanism called replication; it is part of the inherent capability of NoSQL systems.”
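To make the idea of replication and eventual consistency concrete, here is a minimal toy sketch in Python. It is not how any particular NoSQL product implements replication. The `Replica` class, the `replicate` function, the data center names, and the last-writer-wins version rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data, e.g. a node in one data center (hypothetical)."""
    name: str
    store: dict = field(default_factory=dict)  # key -> (version, value)

    def write(self, key, value, version):
        # Assumed conflict rule: last writer wins, judged by version number.
        current = self.store.get(key)
        if current is None or version > current[0]:
            self.store[key] = (version, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def replicate(source, targets):
    """Push every entry from source to each target replica."""
    for key, (version, value) in source.store.items():
        for target in targets:
            target.write(key, value, version)


local = Replica("dc-east")
remote = Replica("dc-west")

local.write("cart:42", ["book"], version=1)
stale_read = remote.read("cart:42")   # replication has not run yet
replicate(local, [remote])
fresh_read = remote.read("cart:42")   # the replicas have now converged
```

Between the write and the call to `replicate`, the two copies disagree, which is the soft state described above. After replication they hold the same value, which is the eventual consistency.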
Enterprise Data Architecture (EDA) is the alignment of an organization’s information assets (both hardware and software) with its business objectives. EDA functions as a foundational pillar for data objectives pertaining to Data Governance, lifecycle management, data quality, and security, and it offers an integrated view of information across the enterprise. NoSQL can significantly impact an organization’s EDA (and its Enterprise Data Modeling) by accelerating offline analytics and ETL on data from online OLTP transactions, processes that previously took days, if not weeks. NoSQL options can hasten both the offline and the online loads while providing a critical medium between the two for near real-time dashboards and reporting.
“This is where we start to introduce the concept of near-line processing, where we have what’s called ‘fresh analytics’ and ‘fast analytics’ comes into play. To handle that kind of workload and functionality, this is where NoSQL technology comes into play, to fill the gaps.”
Beyond its effect on EDA, the most substantial difference between relational and NoSQL Data Modeling is the role played by developers. Developers tend to “sometimes be overzealous, and sometimes be informal,” yet are naturally empowered with NoSQL Modeling by “the capability of flexible schema and with the ability for having a semiautonomous management capability of the database.” Consequently, the need for full-time modelers and DBAs is greatly reduced, if not eliminated outright. Instead, developers collaborate with the business owners of the data and with the product managers who issue practical requirements for the models. The developers take ownership of the data model design, which is greatly informed by business logic.
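As a rough illustration of what a flexible schema means for developers, the sketch below keeps two differently shaped records in the same collection, modeled here as plain Python dictionaries. The `customers` data, its field names, and the `contact_points` helper are hypothetical and not tied to any specific NoSQL product.

```python
import json

# Two records in the same hypothetical "customers" collection.
# No shared schema is enforced: each document carries its own fields.
customers = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Lin", "phones": ["555-0100"], "loyalty": {"tier": "gold"}},
]


def contact_points(doc):
    """Application code, not the database, decides how to treat missing fields."""
    points = []
    if "email" in doc:
        points.append(doc["email"])
    points.extend(doc.get("phones", []))
    return points


# Documents are typically stored in a serialized form such as JSON.
serialized = [json.dumps(doc, sort_keys=True) for doc in customers]
contacts = {doc["id"]: contact_points(doc) for doc in customers}
```

Because the structure lives in the application code rather than in a database-enforced schema, the developer who writes this code is effectively also the one designing the data model.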
Whereas most Data Modeling in the relational world consists of conceptual, logical, and physical modeling, the modeling responsibilities of developers in NoSQL databases require a slightly different paradigm.
The process continues with input from product managers and business owners regarding requirements such as storage, peak, rewrite, service level agreements, and other business needs. From there, developers create domain models as well as query access pattern and application models. Domain models require approaches that are database agnostic (such as Domain-Driven Design), while query access pattern and application models frequently involve UML.
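One way to picture a query access pattern model is to start from the query the application must answer and shape the stored record around it. The sketch below assumes a hypothetical requirement, "fetch one customer's order with all of its line items in a single lookup," and reshapes normalized rows into one document per order. The data, the key format, and the `build_order_documents` function are illustrative assumptions.

```python
from collections import defaultdict

# Normalized, relational-style rows (hypothetical sample data).
orders = [{"order_id": 100, "customer_id": 7, "placed": "2024-01-05"}]
items = [
    {"order_id": 100, "sku": "A1", "qty": 2},
    {"order_id": 100, "sku": "B9", "qty": 1},
]


def build_order_documents(orders, items):
    """Embed line items in their order so one key lookup answers the query."""
    items_by_order = defaultdict(list)
    for item in items:
        items_by_order[item["order_id"]].append(
            {"sku": item["sku"], "qty": item["qty"]}
        )

    docs = {}
    for order in orders:
        # Assumed key format, chosen to match the access pattern.
        key = f"customer:{order['customer_id']}:order:{order['order_id']}"
        docs[key] = {**order, "items": items_by_order[order["order_id"]]}
    return docs


documents = build_order_documents(orders, items)
order_doc = documents["customer:7:order:100"]  # no join needed at read time
```

The trade-off in this style of design is that the document is tuned for one access pattern. A different query, such as "all orders containing a given SKU," would usually call for its own model or index.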
The three groups that primarily collaborate on the design process are a business team, a team of operations and DBA personnel (largely relegated to a support role), and a team of developers. Nonetheless, it is important for developers to understand much more of the business logic and the value of the data when modeling for NoSQL than others do when modeling in relational environments. “If you want to really understand the business details of your underlying data model, you have to look at the source code,” Hsieh remarked. Developers can augment this approach by looking at objects mapped to the database. More importantly, the new role of developers in NoSQL Data Modeling can also call for additional training for these professionals, so that they can better understand the business logic of the model and the requirements issued by the business.
Once developers understand the business logic, the process ends with the critical documentation of the source code. Such documentation ensures that there is continuity in the NoSQL model if the specific developer who designed the model leaves the organization.
The need to engage in NoSQL Modeling and to account for its numerous differences from relational database modeling should only increase as enterprises become more accustomed to time-sensitive Big Data. The most critical differences involve a cohesive EDA; the roles of developers, data modelers, and architects; and the design process of the model itself. Where NoSQL options are available, relational ones are less feasible for time-sensitive data. The true value of these options lies in their synthesis, which can provide online, offline, and near-line processing at speeds conducive to utilizing Big Data in a timely fashion.