Key-Value Database – NoSQL Key Value, Application and Examples

A key-value database is essentially a big hash table of keys and values, distributed across a cluster of commodity servers. Key-value databases typically guarantee Availability and Partition Tolerance.

A key-value database trades off data consistency in order to improve write time.

Key-Value Database

The key in a key-value database can be synthetic or auto-generated, and it uniquely identifies a single record in the database. The values can be strings, JSON documents, BLOBs, etc.

Among the most popular key-value databases are Amazon DynamoDB, Oracle NoSQL Database, Riak, Berkeley DB, Aerospike, Project Voldemort and IBM Informix C-ISAM.
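The idea of an auto-generated key pointing at an opaque value can be sketched in a few lines of Python. This is only an illustration: a plain dict stands in for the distributed hash table, and the helper names `put` and `get` are assumptions made for this example, not the API of any of the products named in this post.

```python
import json
import uuid
from typing import Dict, Optional

# Assumption: an in-memory dict stands in for the distributed hash table.
store: Dict[str, bytes] = {}


def put(value: bytes) -> str:
    """Store a value under an auto-generated key and return that key."""
    key = uuid.uuid4().hex  # synthetic key, unique per record
    store[key] = value
    return key


def get(key: str) -> Optional[bytes]:
    """Return the value stored under key, or None if the key is unknown."""
    return store.get(key)


# The store does not interpret values: a string, a JSON document and a
# binary blob are all just bytes to it.
text_key = put("hello".encode("utf-8"))
json_key = put(json.dumps({"name": "Alice", "age": 30}).encode("utf-8"))
blob_key = put(bytes([0x89, 0x50, 0x4E, 0x47]))
```

Because the store never looks inside a value, it is up to the application to remember how each value was encoded.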

Application of Key-Value Database – NoSQL Key Value

Let us take some real-life examples where a key-value database is used and look at the benefits it provides.

Managing Web Advertisements

Key-value databases are widely used by web advertisement companies.

A user's activity on the web is tracked, along with the user's language and location. On the basis of this online activity, web advertisement companies decide which advertisement to show to the user.

It is also important that the advertisement is served fast enough.

It is important to target the right advertisement at the right customer in order to receive more clicks and hence maximize profits.

The combination of factors that determine what a user is interested in, such as the user's tracked online activity, language and location, forms the key, while all the other information needed to serve the advertisement well is kept as the value in the key-value database.
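As a rough sketch of that layout, the key can be built by joining the targeting factors, and the value can hold whatever is needed to render the advertisement. The function names `ad_key` and `pick_ad`, the `:` separator and the fields in the value are all hypothetical choices made for this example.

```python
from typing import Dict, Optional


def ad_key(interest: str, language: str, location: str) -> str:
    """Build one composite key from the factors that describe the user."""
    parts = (interest, language, location)
    # Normalise each part so that "EN" and "en" map to the same key.
    return ":".join(part.strip().lower() for part in parts)


# Assumption: a dict stands in for the key-value database.
ads: Dict[str, dict] = {
    ad_key("running-shoes", "en", "us"): {
        "ad_id": "ad-001",  # hypothetical identifier
        "headline": "Lightweight running shoes",
    },
}


def pick_ad(interest: str, language: str, location: str) -> Optional[dict]:
    """Fetch the advertisement for this user with a single key lookup."""
    return ads.get(ad_key(interest, language, location))
```

Serving an advertisement then costs one lookup, which is what keeps the response fast.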

User’s session data retrieval

Your website needs to be efficient and fast to give a user the best service.

No matter how efficient your database is, if your website runs slowly then from the user's perspective your entire service is slow.

Websites often become slow because user sessions are handled poorly. If every request requires opening a new session instead of using cached information, the website will be slow.

User interactions with the website are tracked by the website cookies.

A cookie is a small file that holds a unique ID, which can act as a key in a key-value database. The server uses cookies to distinguish returning users from new ones.

The server needs to fetch the data quickly by doing a lookup on the cookie's ID. That lookup yields information such as which pages the user visits, what information they are looking for and the user's profile.

Key-value stores are, therefore, ideal for storing and retrieving session data at high speed. The unique ID carried by the cookie acts as the key, while the other information, such as the user profile, acts as the value.
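A minimal sketch of such a session store is shown below. It assumes the session ID is generated on the server and handed to the browser in a cookie; the dict, the function names and the shape of the stored value are illustrative assumptions, not a real framework's API.

```python
import secrets
from typing import Dict, Optional

# Assumption: a dict stands in for the key-value store; a real deployment
# would use a networked store so that every web server sees the same data.
sessions: Dict[str, dict] = {}


def start_session(profile: dict) -> str:
    """Create a session and return the ID to place in the user's cookie."""
    session_id = secrets.token_hex(16)  # 32 hex characters, hard to guess
    sessions[session_id] = {"profile": profile, "pages_visited": []}
    return session_id


def record_visit(session_id: str, page: str) -> bool:
    """Append a page to the session; return False for an unknown ID."""
    session = sessions.get(session_id)
    if session is None:
        return False
    session["pages_visited"].append(page)
    return True


def load_session(session_id: str) -> Optional[dict]:
    """Look up the whole session by the ID taken from the cookie."""
    return sessions.get(session_id)
```

A request from a returning user carries the cookie, so the server can call `load_session` with that ID instead of rebuilding the session from scratch.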

CAP Theorem – Brewer’s Theorem

In this post, we will look at the CAP theorem, also known as Brewer’s theorem. The theorem was proposed by Eric Brewer of the University of California, Berkeley.

CAP Theorem or Brewer’s Theorem

The CAP theorem, also known as Brewer’s theorem, states that it is impossible for a distributed computing system to simultaneously provide all three of the following guarantees: Consistency, Availability and Partition Tolerance.

Therefore, at any point in time, a distributed system can provide only two of consistency, availability and partition tolerance.

Availability

Even if one of the nodes goes down, we can still access the data.

Consistency

Every read returns the most recent data.

Partition Tolerance

The system should tolerate a network outage between the nodes.

These three guarantees are shown as the three vertices of a triangle, and we are free to choose any side of the triangle.

Therefore, we can choose (Availability and Consistency) or (Availability and Partition Tolerance) or (Consistency and Partition Tolerance).

Please refer to the figure below:

CAP Theorem

Relational databases such as Oracle and MySQL choose Availability and Consistency; databases such as Cassandra, CouchDB and DynamoDB choose Availability and Partition Tolerance; and databases such as HBase and MongoDB choose Consistency and Partition Tolerance.

CAP Theorem Example 1:  Consistency and Partition Tolerance

Let us take an example to understand one of these combinations: Consistency and Partition Tolerance.

These databases usually hold sharded or distributed data, and they tend to have a master or primary node through which they handle write requests. A good example is MongoDB.

What happens when the master goes down?

In this case, another master will usually get elected, and until then data can’t be read from the other nodes because it may not be consistent. Therefore, availability is sacrificed.

However, if the write operation succeeded and there is then a network outage between the nodes, there is no problem because a secondary node can serve the data. Therefore, partition tolerance is achieved.
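The behaviour described in this example can be modelled with a toy cluster. The sketch follows the description in this post, not the actual behaviour of MongoDB or any other product: the class `CPCluster`, the exception `Unavailable` and the rule "pick the first remaining node by name" are all assumptions made to keep the example short.

```python
class Unavailable(Exception):
    """Raised when the cluster refuses a request to protect consistency."""


class CPCluster:
    """Toy model: one primary takes writes; no primary means no service."""

    def __init__(self, node_names):
        self.data = {name: {} for name in node_names}
        self.primary = node_names[0]

    def write(self, key, value):
        if self.primary is None:
            raise Unavailable("no primary elected")
        # Simplification: the write is copied to every node before returning,
        # so all nodes hold the same data after a successful write.
        for replica in self.data.values():
            replica[key] = value

    def read(self, key, node):
        if self.primary is None:
            # Availability is given up until a new primary exists.
            raise Unavailable("no primary elected")
        return self.data[node].get(key)

    def primary_fails(self):
        del self.data[self.primary]
        self.primary = None

    def elect_primary(self):
        # Hypothetical election rule: first remaining node in name order.
        self.primary = sorted(self.data)[0]
```

Between `primary_fails()` and `elect_primary()` every read raises `Unavailable`; once a new primary is elected, the surviving nodes serve the previously written data again.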

CAP Theorem Example 2: Availability and Partition Tolerance

Let us try to understand an example for Availability and Partition Tolerance.

These databases are also sharded and distributed in nature, and they are usually master-less. This means every node is equal. Cassandra is a good example of this kind of database.

Suppose we have an overnight batch job that writes data from a mainframe to a Cassandra database, and the same database is read throughout the day. If we read the data while it is still being written, we might get stale data, and hence consistency is sacrificed.

Since this is a read-heavy, write-once use case, I don’t care about reading the data immediately. I only care that, once the write has happened, we can read from any of the nodes.

But availability is an important parameter here, because if one of the nodes goes down we are still able to read the data from another node. The system as a whole remains available.

Partition tolerance helps us during any network outage between the nodes. If any of the nodes becomes unreachable due to a network issue, another node can take over.
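The same trade-off can be modelled with a toy master-less cluster. This is a sketch under stated assumptions, not how Cassandra works internally: the class `APCluster`, its explicit `replicate` step and the lack of any conflict resolution are simplifications chosen for this example.

```python
class APCluster:
    """Toy model: every node accepts reads and writes; copies sync later."""

    def __init__(self, node_names):
        self.data = {name: {} for name in node_names}
        self.up = set(node_names)

    def write(self, key, value, node):
        """Accept the write on one live node without waiting for the others."""
        if node not in self.up:
            raise ConnectionError(f"node {node} is down")
        self.data[node][key] = value

    def read(self, key, node):
        """Answer from the local copy, which may be stale."""
        if node not in self.up:
            raise ConnectionError(f"node {node} is down")
        return self.data[node].get(key)

    def replicate(self):
        """Copy all keys to all live nodes (no conflict resolution here)."""
        merged = {}
        for name in sorted(self.up):
            merged.update(self.data[name])
        for name in self.up:
            self.data[name].update(merged)

    def node_down(self, node):
        self.up.discard(node)
```

A read from another node before `replicate()` runs returns `None`, which is the stale read described above. After replication, any live node can answer, even if the node that took the write has gone down.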