Exploring NoSQL using MongoDB


What is NoSQL?

Data is everywhere. It is generated in ever greater volumes as the number of systems and software applications grows, and it grows further with the number of customers and the expansion of businesses. More data needs to be easily created, retrieved, and updated without affecting systems or users, and it must be available 24x7. Even a temporary glitch can cost a business a significant amount. Data sources are also multiplying and arriving in different formats, so the data can no longer always be stored in the tables of an RDBMS. We need more than that to add business value to companies.

Figure 1: Data in today’s world

“NoSQL”, as the name implies, is a database approach that relaxes RDBMS constraints according to the use case. It is not a complete replacement for an RDBMS but a complement to it, useful when you want to store data in a flat structure with all the related fields in one place, called a “Document”. Each “Document” lives by itself as a separate entity and can be created, retrieved, updated, and deleted as a unit. In a way, it is like a denormalized database table where all the relevant data stays in one row and is repeated across rows for the same set of keys.
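
To make the idea of a “Document” concrete, here is a minimal sketch in plain Python, with no database involved. The customer and order tables and all field names are made up for illustration; the point is only to show related rows from two tables being folded into one self-contained record.

```python
import json

# Hypothetical normalized tables, as an RDBMS might hold them.
customers = [{"customer_id": 1, "name": "Asha"}]
orders = [
    {"order_id": 101, "customer_id": 1, "item": "keyboard", "qty": 1},
    {"order_id": 102, "customer_id": 1, "item": "monitor", "qty": 2},
]


def to_document(customer, all_orders):
    """Fold a customer row and its order rows into one self-contained document."""
    return {
        "_id": customer["customer_id"],
        "name": customer["name"],
        "orders": [
            {"order_id": o["order_id"], "item": o["item"], "qty": o["qty"]}
            for o in all_orders
            if o["customer_id"] == customer["customer_id"]
        ],
    }


customer_doc = to_document(customers[0], orders)
print(json.dumps(customer_doc, indent=2))
```

Reading this customer together with its orders now takes a single lookup and no JOIN, at the cost of keeping the order data inside the customer record.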

Why do we need NoSQL?

To answer this question, we first need to know what kinds of challenges we typically face with relational databases.

  • Relational databases are usually designed up to third normal form (3NF), so a typical database has many tables that store the data as records.
  • To retrieve data, we have to query multiple tables and combine the related records using JOINs. With few tables and few records, joining the tables and retrieving results is trivial and needs no special attention.
  • But as the data grows, query performance deteriorates. We then have to add non-clustered indexes on the columns that are used to search for matching records.
  • Once proper indexes are in place, queries may perform well, but as the number of records and the number of hits to the database grow, performance drops and response times gradually increase. This causes customer dissatisfaction on business websites and query timeouts in applications.
  • Scaling up the database server helps, but it brings many limitations of its own.
  • All of the above holds when the data can be organized into tables. But when data comes from different sources, such as log files and other systems that produce formats like JSON and CSV, it is difficult to process it and store it in a relational database as rows and columns. Data from different systems does not follow a single column structure: each record may contain more or fewer fields than the previous one in the same dataset, and we cannot easily create new columns on the fly as new fields appear.
  • Finally, in the age of big data, data needs to be processed faster for analytics and other purposes. Querying tables is a poor fit for the unstructured data produced by different systems.
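
The varying-structure problem from the list above can be sketched in plain Python. The log lines and field names below are invented for illustration; the sketch only shows what happens when records with different fields are forced into one fixed set of columns.

```python
import json

# Hypothetical JSON log lines from different systems; each has its own fields.
raw_lines = [
    '{"source": "web", "user": "u1", "url": "/home"}',
    '{"source": "payments", "user": "u2", "amount": 25.5, "currency": "USD"}',
    '{"source": "web", "user": "u3", "url": "/cart", "referrer": "/home"}',
]
records = [json.loads(line) for line in raw_lines]

# A fixed table needs the union of all fields as columns, known up front.
columns = sorted({key for record in records for key in record})

# Every row must then be padded with None (NULL) for the fields it lacks.
rows = [[record.get(col) for col in columns] for record in records]
null_cells = sum(cell is None for row in rows for cell in row)

print(columns)
print(f"{null_cells} of {len(rows) * len(columns)} cells are empty")
```

Each new field that shows up in a later record would widen the table again, while the list of parsed records needs no change at all.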

The answer to all the above challenges is to move towards NoSQL databases like MongoDB and CouchDB, augmented with search services like Azure Search, Elasticsearch, and Apache Solr.

Advantages of going to NoSQL databases:

  • They store data as documents in flexible, self-describing formats such as JSON, so data arriving as JSON or CSV can be loaded without a fixed table schema.
  • Data is commonly organized into databases and collections. Databases are like the databases we see in an RDBMS, and collections are like tables that store the documents.
  • Each document can follow its own structure: documents need not all have the same fields (columns), and NoSQL databases easily accommodate such changes.
  • Each document has a key, and documents can be searched by creating indexes on their fields.
  • Web applications with customizable fields are easy to build.
  • Most of the use cases where I have seen NoSQL used are as a caching layer over an existing database for faster search and retrieval, and in analytics, where the read data is kept separate from the write data and the two are synced periodically.
  • Many of them support JSON and can be queried with JavaScript, so requests and responses can be handled directly from a JavaScript front end.
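
As a rough illustration of collections, keys, and indexes, the sketch below builds a toy in-memory collection in plain Python. It is not how MongoDB or any real database implements indexes, and the documents and the `build_index` helper are hypothetical; it only shows why an index lets a lookup skip a full scan.

```python
from collections import defaultdict

# A toy "collection": documents need not share the same fields.
collection = [
    {"_id": 1, "name": "Asha", "city": "Pune"},
    {"_id": 2, "name": "Ben", "city": "Leeds", "phone": "555-0100"},
    {"_id": 3, "name": "Chen", "city": "Pune", "tags": ["vip"]},
]


def build_index(docs, field):
    """Map each value of `field` to the _id values of the documents holding it."""
    index = defaultdict(list)
    for doc in docs:
        if field in doc:  # documents lacking the field are simply skipped
            index[doc[field]].append(doc["_id"])
    return index


by_id = {doc["_id"]: doc for doc in collection}  # lookup by document key
city_index = build_index(collection, "city")     # secondary index on "city"

# The index gives the matching keys directly, without scanning every document.
pune_docs = [by_id[doc_id] for doc_id in city_index["Pune"]]
print([doc["name"] for doc in pune_docs])
```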

MongoDB-specific features:

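  • It stores documents in BSON, a binary encoding of JSON-like documents that adds types JSON lacks, such as dates and binary data.
  • Each document is a JSON-like object of field and value pairs.
  • Documents are easy to retrieve, and indexes are easy to create.
  • It scales horizontally with ease.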
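
One practical reason for a format that goes beyond JSON is that plain JSON has no date or binary type. The sketch below uses only Python's standard `json` module, not a MongoDB driver, to show that gap; the document and its field names are made up for illustration.

```python
import json
from datetime import datetime, timezone

# Hypothetical document holding a date and raw bytes, two types BSON can represent.
doc = {
    "name": "sensor-7",
    "created": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "payload": b"\x00\x01",
}

try:
    json.dumps(doc)
    json_ok = True
except TypeError:
    # The standard JSON encoder has no representation for datetime or bytes.
    json_ok = False

# With JSON alone, such values have to be flattened to strings by hand.
as_json = json.dumps({
    "name": doc["name"],
    "created": doc["created"].isoformat(),
    "payload": doc["payload"].hex(),
})
print(json_ok, as_json)
```

After the manual conversion, a reader of the JSON can no longer tell a date from an ordinary string, which is the kind of information a typed format keeps.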

I recently spent some time learning MongoDB, and I will present my understanding of it as a NoSQL database. Follow along with me and share your suggestions, if you have any, so that we can all learn together.
