Introduction To MongoDB

The name MongoDB is derived from the word "humongous". MongoDB is a scalable, high-performance, open-source database designed for document-oriented storage. It is written in C++ and is a stand-alone product. Development started in 2007 and the first release followed in 2009. Version 1.4 was launched in March 2010. At the time of writing, the latest version of MongoDB is 2.6, launched on April 8, 2014. The topics covered in this chapter are:

  • What MongoDB is
  • What a NoSQL Database is
  • What Big Data is
  • The Need for a Document-Oriented Database
  • Differences between an RDBMS and MongoDB
  • How to install MongoDB in Windows

What Is MongoDB?

MongoDB is an open-source database. (Open-source software is developed so that its source code can be freely shared and redistributed by others; the Open Source Initiative (OSI) maintains the definition of the term. At the time of writing, MongoDB is available for free under the GNU Affero General Public License, and the language drivers are available under the Apache License.) The database uses a document-oriented data model. MongoDB was first developed by 10gen (now MongoDB Inc.). MongoDB is built on an architecture of collections and documents. A document comprises a set of key-value pairs and is the basic unit of data in MongoDB. A collection contains a set of documents and functions as the equivalent of a relational database table.

[Image: MongoDB definition]

MongoDB supports a dynamic schema design, allowing the documents in a collection to have different fields and structures. (Schema, pronounced SKEE-mah, is the organization or structure of a database.) The database uses document storage and the BSON format (a binary representation of JSON). MongoDB can distribute collections across multiple systems for horizontal scalability as data volume increases.
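The idea of a dynamic schema can be sketched in a few lines of Python. This is only an illustration: the list `people` stands in for a collection and each dict for a document, the names and sample values are made up, and text JSON is used in place of BSON because it is available in the standard library.

```python
import json

# A collection is modelled here as a plain list and each document as a dict
# of key-value pairs. These are stand-ins, not MongoDB's own data structures.
people = [
    {"_id": 1, "name": "Asha", "email": "asha@example.com"},
    # The second document omits "email" and adds a nested "address" field.
    {"_id": 2, "name": "Ravi", "address": {"city": "Pune", "zip": "411001"}},
]


def field_names(collection):
    """Return the set of top-level field names used by each document."""
    return [set(document) for document in collection]


# Text JSON only shows the shape of a document; MongoDB itself stores BSON.
as_json = [json.dumps(document, sort_keys=True) for document in people]
```

Both documents live in the same collection even though `field_names(people)` reports a different set of fields for each of them.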

[Image: MongoDB architecture]

In simple words, MongoDB can be summarized as follows:

  • An open-source database that uses a document-oriented data model
  • A NoSQL database
  • Follows an architecture of collections and documents instead of the tables and rows of an RDBMS
  • A document contains a set of key-value pairs and is the basic unit of data in MongoDB
  • A collection contains a set of documents and functions as the equivalent of a relational database table

NoSQL Databases

A NoSQL database is also called a "Not Only SQL" database. NoSQL is an approach to database management and database design that is useful for very large sets of distributed data. NoSQL databases are non-relational, although the name does not rule out the use of SQL (Structured Query Language). They avoid selected relational functionality, such as fixed table schemas and join operations, and offer an alternative to relational databases that emphasizes scalability and fault tolerance. The data model is flexible and schema-less, the databases scale horizontally, and they use a distributed architecture. (NoSQL databases are sometimes referred to as cloud databases, Big Data databases, or non-relational databases, and are used to store and analyze both user-generated and machine-generated data.)

Types of NoSQL Databases

There are four main types of NoSQL database, each with its own specific attributes:

  • Key-Value Store
  • Column Store
  • Document Database
  • Graph Database

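Key-Value Store

These databases store data in a schema-less way: every item consists of an indexed key and a value. Examples include DynamoDB, Azure Table Storage (ATS), Riak, and Berkeley DB. Cassandra is sometimes listed here as well, although it is more commonly classified as a wide-column store.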
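A minimal sketch of the key-value idea, assuming an in-memory Python dict as the storage engine. The class name `KeyValueStore` and the sample keys are hypothetical and do not correspond to the API of any of the products named above.

```python
class KeyValueStore:
    """Toy in-memory key-value store; every value is opaque to the store."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # The store does not inspect the value, so no schema is enforced.
        self._items[key] = value

    def get(self, key, default=None):
        return self._items.get(key, default)

    def delete(self, key):
        """Remove the key and report whether it was present."""
        if key in self._items:
            del self._items[key]
            return True
        return False


store = KeyValueStore()
store.put("session:42", {"user": "asha", "cart": ["keyboard"]})
store.put("session:43", "any value type is accepted")
```

The only way to reach a value is through its key, which is what keeps this model simple and easy to distribute.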

Column Store

These databases are also known as wide-column stores. This kind of database is designed to store data tables as sections of columns of data, rather than as rows of data. Wide-column stores offer very high performance and highly scalable tables. Some databases that use this architecture are HBase, Bigtable, and Hypertable.
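The difference between row-oriented and column-oriented layouts can be illustrated with a small Python sketch. It assumes every row has the same keys and uses made-up sample data; real wide-column stores add column families, compression, and distribution on top of this idea.

```python
# Row-oriented layout: one record per entry, all fields kept together.
rows = [
    {"id": 1, "name": "Asha", "age": 31},
    {"id": 2, "name": "Ravi", "age": 45},
    {"id": 3, "name": "Mei", "age": 28},
]


def to_columns(rows):
    """Regroup the same data so that each column is stored contiguously."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column touches only that column's values.
average_age = sum(columns["age"]) / len(columns["age"])
```

Computing `average_age` reads a single list instead of every full record, which is the kind of access pattern column stores are built for.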

Document Database

This type of database works on key-value pairs in which the value is a document containing complex data, and each document is assigned a unique key that is used to retrieve it. The main features of this type of database are storing, retrieving, and managing documents. The documents are often described as semi-structured data. Examples include MongoDB and CouchDB.
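The storing, retrieving, and managing described above can be sketched as follows. `DocumentStore`, its methods, and the sample books are hypothetical names chosen for this illustration; they are not the MongoDB or CouchDB API.

```python
import uuid


class DocumentStore:
    """Toy document store: each document is a dict kept under a unique key."""

    def __init__(self):
        self._documents = {}

    def insert(self, document):
        # Use the caller's "_id" if one is given, otherwise generate a key.
        doc_id = document["_id"] if "_id" in document else uuid.uuid4().hex
        self._documents[doc_id] = {**document, "_id": doc_id}
        return doc_id

    def get(self, doc_id):
        """Retrieve one document by its unique key."""
        return self._documents.get(doc_id)

    def find(self, **criteria):
        """Return the documents whose fields equal all the given values."""
        return [
            document
            for document in self._documents.values()
            if all(document.get(f) == v for f, v in criteria.items())
        ]


store = DocumentStore()
book_id = store.insert({"title": "Dune", "author": "Frank Herbert", "tags": ["sci-fi"]})
store.insert({"_id": "b2", "title": "Emma", "author": "Jane Austen"})
```

Unlike a pure key-value store, the store can look inside the value, which is why `find(author="Jane Austen")` is possible here.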

Graph Database

This type of database uses graph theory. It works by building a graph from the data and the relationships between the data. The data and relationships are interconnected, with an undetermined number of relationships between them. A well-known example is Neo4j.
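As a rough sketch of what working with data and relationships means, the following Python snippet stores a tiny graph as an adjacency list and finds a shortest chain of relationships with a breadth-first search. The graph contents and the function name `shortest_path` are invented for the example; a graph database would provide its own query language for this.

```python
from collections import deque

# Each key is a node; its list holds the nodes it has a relationship to.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": [],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a shortest path as a list, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

For example, `shortest_path(graph, "alice", "dave")` follows two relationships, while a search from "dave" back to "alice" finds nothing because the edges are directed.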

Big Data

Big Data is a term for a voluminous amount of structured, semi-structured, and unstructured data that can be mined for information. It does not refer to any specific quantity of data. Big Data is one of the key reasons NoSQL became popular. When the supply of data is practically limitless, we are dealing with Big Data. The following characteristics complete the definition.

Velocity: A huge amount of data arrives from different locations, and it arrives very quickly.

Variety: The data may be structured, semi-structured, or unstructured.

Volume: The data arrives in the database in huge volumes, possibly terabytes or petabytes in size.

Data Complexity: The data may be stored and replicated in different locations or different databases.

[Image: Big Data]

The Need for a Document-Oriented Database

Some of the reasons for choosing MongoDB over any RDBMS are the following:

  • Document-Oriented Storage
  • Continuous Data Availability
  • Real Location Independence
  • Flexible Data Models
  • Full Index Support
  • Replication and High Availability
  • Auto-Sharding

Document-Oriented Storage

Document-oriented storage follows the paradigm of a document-oriented database, a newer breed of database designed for storing, retrieving, and managing document-oriented information. The main objective of this kind of database is to store data in some standard format or encoding. Encodings in use include XML, YAML, JSON, and BSON, as well as binary forms like PDF and Microsoft Office documents (Microsoft Word and Excel).

Continuous Data Availability

Any database can suffer downtime or hardware failure, and either one can paralyze a website or application. In those scenarios a NoSQL database plays a vital role: if one database server or node goes down, another server or node can keep the website or web application running. This is an area where a distributed NoSQL database can help more than a single-server RDBMS.

Real Location Independence

Location independence means the ability to read and write to a database regardless of where the I/O operation physically occurs, and to have each write propagated out from that location so that it is available to users and machines at other sites. That kind of functionality is difficult to achieve with a traditional RDBMS.

MongoDB can maintain copies of our database on separate servers in different geographical regions to improve access times. For users near a copy, the response is as good as that of a local database.

Flexible Data Models

An RDBMS is based on defined relationships between tables with fixed columns, and its schema is strict and uniform. NoSQL has no such requirement. A NoSQL data model is schema-less: it can accept structured, semi-structured, or unstructured data and relate them to one another easily. The data in MongoDB has a flexible schema. This flexibility means we can map a document to an entity or an object, and each document can match the data fields of the entity it represents; the document then follows the structure of that entity.

Full Index Support

As we know, indexes provide high performance for fetching data. MongoDB supports indexes on any field of a document, including fields inside embedded documents and arrays.
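Why an index speeds up fetching can be shown with a small Python sketch that compares a full scan with a lookup through a prebuilt index. It is a simplification: the index here is a hash map from field value to document positions, not the data structure MongoDB actually uses, and `scan`, `build_index`, and `lookup` are hypothetical helper names.

```python
from collections import defaultdict

docs = [
    {"_id": 1, "city": "Pune"},
    {"_id": 2, "city": "Delhi"},
    {"_id": 3, "city": "Pune"},
    {"_id": 4},  # documents without the field are simply not indexed
]


def scan(collection, field, value):
    """Without an index, every document has to be examined."""
    return [doc for doc in collection if doc.get(field) == value]


def build_index(collection, field):
    """Map each value of the field to the positions of matching documents."""
    index = defaultdict(list)
    for position, doc in enumerate(collection):
        if field in doc:
            index[doc[field]].append(position)
    return index


def lookup(collection, index, value):
    """With an index, only the matching positions are visited."""
    return [collection[position] for position in index.get(value, [])]


city_index = build_index(docs, "city")
```

Both `scan(docs, "city", "Pune")` and `lookup(docs, city_index, "Pune")` return the same documents, but the lookup does not touch the documents that do not match.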

Replication and High Availability

A replica (an exact copy or model of something) set in MongoDB is a group of MongoDB processes that maintain the same data set. A replica set provides us with redundancy and high availability.
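The redundancy and availability a replica set provides can be imitated with a toy simulation. Everything here is an assumption made for illustration: the classes `Node` and `ReplicaSet`, the rule that the first healthy node becomes primary, and the immediate copying of each write to every healthy node. A real replica set uses a more involved replication and election protocol.

```python
class Node:
    """One process holding a full copy of the data set."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicaSet:
    def __init__(self, names):
        self.nodes = [Node(name) for name in names]
        self.primary = self.nodes[0]

    def elect(self):
        """Simplified election: promote the first node that is still alive."""
        for node in self.nodes:
            if node.alive:
                self.primary = node
                return node
        raise RuntimeError("no healthy node available")

    def write(self, key, value):
        if not self.primary.alive:
            self.elect()
        # Copy the write to every healthy member so each keeps the same data.
        for node in self.nodes:
            if node.alive:
                node.data[key] = value

    def read(self, key):
        if not self.primary.alive:
            self.elect()
        return self.primary.data.get(key)
```

If the primary is marked as failed after a write, the next `read` promotes another member and still returns the value, which is the behaviour high availability refers to.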

Auto-Sharding

Sharding is the process of storing data across multiple machines. As the data grows, the load is balanced by spreading the data across several machines; when combined with replication, this also helps keep the data safe from hardware damage or a natural disaster at a single site. Sharding solves the scaling problem horizontally: we can add more machines to support data growth and the demands of read and write operations.
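One simple way to spread documents across machines is to hash a chosen key and use the result to pick a shard. The sketch below assumes that approach; `shard_for`, `distribute`, and the `user_id` key are hypothetical, and MongoDB's own auto-sharding manages and rebalances the placement of data rather than using a fixed modulo like this.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard number for a key using a stable hash of the key."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(documents, shard_key, shard_count):
    """Place each document on the shard chosen by its shard key value."""
    shards = [[] for _ in range(shard_count)]
    for document in documents:
        shards[shard_for(document[shard_key], shard_count)].append(document)
    return shards


users = [{"user_id": n, "name": f"user{n}"} for n in range(100)]
shards = distribute(users, "user_id", 3)
```

Because the hash is stable, a read for a given `user_id` can be routed straight to the shard that holds it. A drawback of the plain modulo used here is that changing `shard_count` moves most keys to a different shard.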

Differences between an RDBMS and MongoDB

The following table lists some of the differences. They give a clear view of how document databases and relational databases differ from each other.

RDBMS | MongoDB
Good for structured data. | Good for unstructured data that is written once and read many times.
Tightly structured with a schema. Performs faster for small amounts of data, but becomes slower (in other words, higher latency) as the data grows very large. | Faster than an RDBMS for growing data on a cluster or cloud, in the TB or PB range.
Supports transactions. | Does not support multi-document transactions (as of version 2.6).
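The structural difference behind the comparison above can be sketched in Python by holding the same data in both styles. The tables, field names, and helper functions are invented for the example: the relational style keeps customers and orders apart and joins them on `customer_id`, while the document style embeds the orders inside the customer document.

```python
# Relational style: two "tables" linked by the customer_id column.
customers = [{"customer_id": 1, "name": "Asha"}]
orders = [
    {"order_id": 10, "customer_id": 1, "item": "keyboard"},
    {"order_id": 11, "customer_id": 1, "item": "mouse"},
]


def join_orders(customers, orders):
    """Combine each order with its customer's name, as a join would."""
    names = {c["customer_id"]: c["name"] for c in customers}
    return [
        {"order_id": o["order_id"], "item": o["item"], "name": names[o["customer_id"]]}
        for o in orders
    ]


def embed_orders(customers, orders):
    """Document style: each customer document carries its own orders."""
    documents = []
    for customer in customers:
        embedded = [
            {"order_id": o["order_id"], "item": o["item"]}
            for o in orders
            if o["customer_id"] == customer["customer_id"]
        ]
        documents.append(
            {"_id": customer["customer_id"], "name": customer["name"], "orders": embedded}
        )
    return documents


customer_documents = embed_orders(customers, orders)
```

Reading one entry of `customer_documents` gives the customer and all of the orders in a single step, whereas the relational layout needs `join_orders` to bring them together.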

How to install MongoDB in Windows?

Installation of MongoDB is a three-step process, as shown in the following image.

[Image: setup]

Step 1. Download the installer from the MongoDB website, choosing the 32-bit or 64-bit file to match the operating system, as the image shows. The website offers builds for several operating systems, including Windows, Linux, Mac OS, and Solaris. After downloading, install MongoDB by double-clicking the MSI file.

[Image: select OS]

Step 2. If we downloaded the Zip file, extract it. Otherwise we have a simple setup file; double-click it to start the setup.

[Image: Setup Wizard 1]

Step 3. This is the final step: click the Next button to start the installation process, as in the following image.

[Image: Setup Wizard 2]

This is the End User License Agreement window, where we need to accept MongoDB's terms and conditions. After selecting the option to accept the terms, click the Next button as shown.

[Image: Setup Wizard 3]

At the end of the installation process, a directory named MongoDB has been created in Program Files. After installation, move this folder to c:\mongodb. If you have any trouble moving the installed folder, use the move command from the command prompt, for example: [move c:\example1 C:\example2]. Then create a folder for the database files inside the MongoDB folder.

[Image: MongoDB file location]

How To Set Up The MongoDB Environment?

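MongoDB requires a data directory to store all its data. The default data directory path is "\data\db" on the drive from which mongod is started. We need to create that folder, or create a different one and pass it to mongod with the --dbpath option, for example: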

c:\mongodb\bin\mongod.exe --dbpath d:\testing\mongodb\data
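For readers who prefer to script this step, the same preparation can be sketched in Python: create the data directory if it does not exist yet, then build the mongod command line with the --dbpath option shown above. The function name `prepare_mongod_command` is hypothetical, and the paths are the ones used in this article; adjust them to your own installation.

```python
from pathlib import Path


def prepare_mongod_command(mongod_exe, data_dir):
    """Create the data directory if needed and return the command to run."""
    data_path = Path(data_dir)
    # parents=True also creates missing parent folders;
    # exist_ok=True makes the call safe to repeat.
    data_path.mkdir(parents=True, exist_ok=True)
    return [str(mongod_exe), "--dbpath", str(data_path)]
```

A call such as `prepare_mongod_command(r"c:\mongodb\bin\mongod.exe", r"d:\testing\mongodb\data")` returns a list that can be handed to `subprocess.run` to start the server.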

[Image: dbpath]

Summary

In this article, we learned the basics of MongoDB, NoSQL databases, and Big Data, and why there is a need for document-oriented databases. We also covered the differences between an RDBMS and MongoDB and the basic process of installing MongoDB on the Windows operating system.

