Why NoSQL Databases?
With the growing use of social platforms such as Twitter and Facebook, people are producing more and more unstructured data, including audio, video, and more. As the shape of data changed, it became clear that relational databases are not the best fit for storing this highly unstructured data. The need for other database models arose, and NoSQL databases were the result.
With the rapid developments in the NoSQL space, its growing popularity, and the many solutions and technologies to choose from, anyone new to the NoSQL world faces many choices when selecting the right technology. That’s why this section tries to clarify the categorization of NoSQL databases (aggregate stores in this section, graph databases / Neo4j in the next section) and focuses on the applicability of each category.
Aggregate stores (a term coined by Martin Fowler) include three types of NoSQL databases (excluding graph databases): key-value stores, column stores, and document-oriented databases.
As the name suggests, a key-value store uses a storage paradigm in which values are stored against keys, so the data takes the form of key-value pairs. This design is commonly referred to as a hash table or dictionary. The values are usually simple data such as text or binary content, although some more recent key-value stores support a limited set of complex data types (for example, Redis supports lists and hashes, which are maps, as values). Key-value stores provide simple, high-volume concurrent data access and are the simplest type of NoSQL database.
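The hash-table idea can be sketched in a few lines of Python. This is only a toy, in-memory illustration; the class name `KeyValueStore` and its `put`/`get`/`delete` methods are assumptions made for this example, not the API of any particular product.

```python
from typing import Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store: opaque byte values stored against string keys."""

    def __init__(self) -> None:
        # A plain dict plays the role of the hash table.
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Writing to an existing key simply overwrites the old value.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Lookups go through the key only; the store never inspects the value.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        return self._data.pop(key, None) is not None


kv = KeyValueStore()
kv.put("session:42", b'{"user": "alice"}')
found = kv.get("session:42")
missing = kv.get("session:99")
deleted = kv.delete("session:42")
```

Note that the store treats the value as an opaque blob: any structure inside it (here, a JSON string) is the application's concern.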
Key-value stores handle size well and are good at processing a constant stream of read/write operations with low latency, which makes them a good fit for:
- Session management at high scale
- Simple domains that need fast read access
- Building a shopping cart
- User preference and profile stores
- Product recommendations; latest items viewed on a retailer website drive future customer product recommendations
- Ad serving; customer shopping habits result in customized ads, coupons, etc. for each customer in real time
- Caching heavily accessed but rarely updated data
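Session management and caching, two of the uses listed above, usually rely on entries that expire after a while. The sketch below shows one possible way to layer a time-to-live (TTL) on top of a key-value map. The `ExpiringCache` class and its `clock` parameter are hypothetical names introduced for this example; real key-value stores expose expiry through their own commands.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ExpiringCache:
    """Toy key-value cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the example can simulate time passing
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        # Store the value together with its absolute expiry time.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Lazily evict the stale entry on read.
            del self._entries[key]
            return None
        return value


# Simulated clock: a one-element list that the lambda reads from.
now = [0.0]
sessions = ExpiringCache(ttl_seconds=30.0, clock=lambda: now[0])
sessions.set("session:42", {"user": "alice", "cart": ["book-1"]})
fresh = sessions.get("session:42")    # still within the 30-second TTL
now[0] = 31.0                         # pretend 31 seconds have passed
expired = sessions.get("session:42")  # entry has expired by now
```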
Key-value stores come in different types: some store data in memory (RAM), others use SSDs (solid-state drives) or rotating disks, and some support ordering of keys.
Key-value – RAM
Key-value – SSD or Rotating Disk
- Apache Ignite
- RocksDB (fork of LevelDB)
- MemcacheDB (using Berkeley DB or LMDB)
Key-value – Eventually Consistent
Key-value – Ordered
- Berkeley DB – open source; its developer, Sleepycat Software, was acquired by Oracle in February 2006
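To illustrate what key ordering adds, here is a minimal sketch of an ordered key-value store that supports range scans. It keeps keys in a sorted list using Python's standard `bisect` module; the `OrderedKeyValueStore` class and its `range` method are assumptions made for this example and do not mirror the API of any of the products listed above.

```python
import bisect
from typing import Dict, Iterator, List, Tuple


class OrderedKeyValueStore:
    """Toy key-value store that keeps keys sorted so ranges can be scanned."""

    def __init__(self) -> None:
        self._keys: List[str] = []         # always kept in sorted order
        self._values: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)  # insert while preserving order
        self._values[key] = value

    def range(self, start: str, end: str) -> Iterator[Tuple[str, str]]:
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


ordered = OrderedKeyValueStore()
ordered.put("user:003", "carol")
ordered.put("order:17", "book")
ordered.put("user:001", "alice")
ordered.put("user:002", "bob")

# ';' sorts immediately after ':', so this range covers every "user:" key.
users = list(ordered.range("user:", "user;"))
```

An unordered hash table would have to examine every key to answer the same query, which is why ordering is listed as a distinct capability.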
The distributed key-value model scaled very well, but there was a need for some sort of data structure within that model. This is how the column-family store category came onto the NoSQL scene.
The idea was to group similar values (or columns) together by keeping them in the same column family (for example, user data or information about books). With this approach, what was a single value in a key-value store evolved into a set of related values. (You can view data in a column-family store as a map of maps, or as a key-value store where each value is another map.) The values in a column family are stored together on disk, enabling better read and write performance on related data. The main goal of this approach was high performance and high availability when working with big data, so it’s no surprise that the leading technologies in this space are:
- Google’s BigTable
- Cassandra, originally developed by Facebook – column-based data store based on ideas from BigTable and Amazon’s Dynamo.
- HBase – data store for Apache Hadoop, modeled after Google’s BigTable.
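The "map of maps" view of a column-family store can be sketched as nested dictionaries: row key, then column family, then column name. The code below is a toy model only; `ColumnFamilyTable`, `put`, and `get_family` are hypothetical names, and the sketch ignores everything that makes the real systems interesting (distribution, on-disk layout, timestamps).

```python
from typing import Dict, Iterable


class ColumnFamilyTable:
    """Toy column-family table: row key -> column family -> column -> value."""

    def __init__(self, families: Iterable[str]) -> None:
        # Column families are fixed up front; columns inside them are not.
        self._families = frozenset(families)
        self._rows: Dict[str, Dict[str, Dict[str, str]]] = {}

    def put(self, row_key: str, family: str, column: str, value: str) -> None:
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        row = self._rows.setdefault(row_key, {})
        row.setdefault(family, {})[column] = value

    def get_family(self, row_key: str, family: str) -> Dict[str, str]:
        # Reads all related columns of one family for a row in a single lookup.
        return dict(self._rows.get(row_key, {}).get(family, {}))


table = ColumnFamilyTable(["profile", "books"])
table.put("user:1", "profile", "name", "Alice")
table.put("user:1", "profile", "email", "alice@example.com")
table.put("user:1", "books", "isbn-123", "Some Title")

profile = table.get_family("user:1", "profile")
```

Where a plain key-value store would return one opaque value for `user:1`, this model returns a group of related columns, and different rows are free to carry different columns within the same family.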
Common Use Cases:
Many real problems (such as content management systems, user registration data, and CRM data) require a data structure that looks like a document. Document-oriented databases provide just such a place to store simple, yet efficient and schemaless, document data. The data structure used in this document model enables you to add self-contained documents and associative relationships to the document data.
You can think of document-oriented databases as key-value stores where the value is a document. This makes it easier to model the data for common software problems, but it comes at the expense of slightly lower performance and scalability compared to key-value and column-family stores.
- Couchbase – JSON-based, Memcached-compatible document-based data store.
Common Use Cases:
- When the domain model is a document by nature
- Highly scalable systems (although to a lesser degree than key-value and column-family stores)
- Nested information: document-based data stores allow you to work with deeply nested, complex data structures
- One of the most critical features of document-based data stores is the way they interface with applications: using JavaScript-friendly JSON.
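The description of a document store as a key-value store whose values are documents, together with the points about nested data and JSON, can be sketched as follows. This is a toy illustration under stated assumptions: the `DocumentStore` class, its methods, and the dotted-path query syntax are invented for this example and are not the query language of any real product.

```python
import json
from typing import Any, Dict, List, Optional

_MISSING = object()  # sentinel for "path not present in this document"


class DocumentStore:
    """Toy document store: document ID -> JSON document, queryable by nested field."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}  # documents are kept as serialized JSON

    def save(self, doc_id: str, document: Dict[str, Any]) -> None:
        self._docs[doc_id] = json.dumps(document)

    def load(self, doc_id: str) -> Optional[Dict[str, Any]]:
        raw = self._docs.get(doc_id)
        return None if raw is None else json.loads(raw)

    def find(self, path: str, expected: Any) -> List[str]:
        """Return IDs of documents whose value at a dotted path equals `expected`."""
        matches = []
        for doc_id, raw in self._docs.items():
            node: Any = json.loads(raw)
            for part in path.split("."):
                if not isinstance(node, dict) or part not in node:
                    node = _MISSING
                    break
                node = node[part]
            if node is not _MISSING and node == expected:
                matches.append(doc_id)
        return matches


docs = DocumentStore()
docs.save("customer:1", {"name": "Alice",
                         "address": {"city": "Oslo", "geo": {"lat": 59.9}}})
docs.save("customer:2", {"name": "Bob", "address": {"city": "Lima"}})

in_oslo = docs.find("address.city", "Oslo")
```

Unlike a plain key-value store, the store here looks inside the value, which is what makes queries on nested fields possible. The two documents also do not share an identical shape, reflecting the schemaless nature of the model.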
In the next section, we’ll discuss the fourth type of NoSQL database, the graph database (Neo4j).