Getting Started With MongoDB For Beginners

History of MongoDB

 
In autumn 2007, Kevin Ryan, Dwight Merriman and Eliot Horowitz, all successful entrepreneurs, founded the company 10gen with the aim of offering a Platform as a Service (PaaS) product, similar to Heroku, AWS Elastic Beanstalk or Google App Engine, but based on open source components.
 
Their experience on web projects such as DoubleClick and ShopWiki had taught them that an application that becomes popular will run into scalability issues at the database level. In their search for a database to integrate into their PaaS product, no open source solution met their needs for scalability and compatibility with a cloud architecture.
 
This is why the 10gen team developed a new document-oriented NoSQL database technology in-house. They named it MongoDB, inspired by the word "humongous", meaning "gigantic", like the data it is meant to host.
 

Why MongoDB?

 
MongoDB was built for speed.
 

Speed

 
MongoDB stores data as BSON documents; BSON is short for Binary JSON. The binary encoding lets MongoDB traverse documents quickly when searching for data. To make queries even more efficient, MongoDB encourages denormalizing the data in its documents. Where good practice in SQL is to keep separate tables linked by foreign keys and resolved through joins, MongoDB encourages duplicating data wherever it is read. MongoDB does offer reference mechanisms, but they must be used sparingly in order to keep the performance benefits of a MongoDB database.
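As a rough illustration of this trade-off, the sketch below contrasts a referenced (normalized) layout with an embedded (denormalized) one using plain JavaScript objects. The field names and sample values are invented for the example; only the shape of the documents matters.

```javascript
// Hypothetical sample data: all names and values are illustrative only.

// Normalized layout: the publication refers to its publisher by id,
// so reading the publisher name needs a second lookup (the "join").
const publishers = [{ _id: 1, name: "Example Press", city: "Paris" }];
const publicationWithReference = { title: "A Sample Title", year: 2007, publisherId: 1 };

function publisherNameByReference(publication) {
    const publisher = publishers.find((p) => p._id === publication.publisherId);
    return publisher ? publisher.name : null;
}

// Denormalized layout: the publisher data is duplicated inside the
// publication, so reading a single document is enough.
const publicationWithEmbedding = {
    title: "A Sample Title",
    year: 2007,
    publisher: { name: "Example Press", city: "Paris" }
};

function publisherNameEmbedded(publication) {
    return publication.publisher.name;
}

console.log(publisherNameByReference(publicationWithReference));
console.log(publisherNameEmbedded(publicationWithEmbedding));
```

The price of the embedded layout is that a change to the publisher has to be applied to every document that carries a copy of it.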
 

Flexibility

 
Unlike tables in SQL databases, a MongoDB collection can hold completely heterogeneous documents. This is known as being schemaless. The advantage of not having a strict data structure is that the structure can be changed quickly. This flexibility is greatly appreciated in projects at the prototype stage, which are still discovering how their data should be structured. However, the schemaless approach has its drawbacks: it becomes more difficult to analyze the data if the documents do not all follow the same structure. This is why it is also possible to impose a schema on a collection.
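One way to impose such a schema is a $jsonSchema validator supplied when the collection is created. The following is a minimal sketch, assuming a hypothetical collection named publis_checked and assuming that every publication should carry a title and a year; the fields would need to be adapted to your own data.

```javascript
// Validation rules for a hypothetical "publis_checked" collection.
// The required fields and types below are assumptions made for this example.
const publisValidator = {
    $jsonSchema: {
        bsonType: "object",
        required: ["title", "year"],
        properties: {
            title: { bsonType: "string", description: "title must be a string" },
            year: { bsonType: ["int", "long", "double"], description: "year must be a number" }
        }
    }
};

// In the mongo shell, where `db` is defined, the rules would be applied with:
//   db.createCollection("publis_checked", { validator: publisValidator });
// Outside the shell this script only builds and prints the rules.
console.log(JSON.stringify(publisValidator, null, 2));
```

Fields that are not listed under properties stay unconstrained, so documents can still differ from one another on everything else.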
 

Cloud and distributed infrastructure

 
To ensure stability, a key concept in MongoDB is to always keep more than one copy of the database available, so that it remains quickly accessible even if a host machine fails. This ability to replicate the database easily across multiple machines in multiple locations also helps the database scale horizontally.
 

Install MongoDB on Windows 10

 
Step 1 - Download the MongoDB MSI Installer Package
 
Download the latest version of MongoDB from the official website. Make sure you select MSI as the package you want to download.
 
Step 2 - Install MongoDB with the Installation Wizard
 
Log in with Admin privileges and double-click the .msi package you just downloaded. This will launch the installation wizard.
 
Step 3
 
Accept the licence agreement, then click Next.
 
Step 4
 
Select the Complete setup type.
 
Step 5
 
Select “Run service as Network Service user” and make a note of the data directory; you will need it later.
 
According to the documentation:
 
Run the service as Network Service user (default): This is a user account that is built in to Windows.
Run the service as a local or domain user:
  • For an existing local user account, specify a period (i.e. .) for the Account Domain, and specify the Account Name and the Account Password for the user.
  • For an existing domain user, specify the Account Domain, the Account Name and the Account Password for that user.
 
By default, the MongoDB data directory is C:\data\db. Create this folder manually or from the Command Prompt:
  C:\>md data
  C:\>md data\db
Then point mongod.exe at the created directory with the --dbpath option:
  C:\Users\hp>cd C:\Program Files\MongoDB\Server\4.2\bin
  C:\Program Files\MongoDB\Server\4.2\bin>mongod.exe --dbpath "C:\data\db"
On Windows, the configuration file is located at <install directory>/bin/mongod.cfg. Open the mongod.cfg file and check the dbPath option.
 
Step 6
 
You don’t need MongoDB Compass, so deselect it and click Next.
 
Step 7
 
Click Install to launch the installation.
 
Step 8
 
Click Finish to complete the installation.
 
Step 9 - Verify that Setup Was Successful
 
To check the MongoDB version, run the mongod command with the --version option. On Windows, you have to use the full path to mongod.exe and mongo.exe unless the MongoDB bin directory has been added to your PATH. Once the PATH is set, you can simply use the mongod and mongo commands.
 

Working with MongoDB

 
Step 1
 
To start MongoDB, open the Command Prompt, navigate to your MongoDB bin folder and run the mongod command. This starts the main MongoDB process, and the "waiting for connections" message appears in the console.
 
mongod is the "Mongo Daemon", the host process for the database. When you start mongod, you are basically saying "start the MongoDB process and run it in the background".
 
mongo is the command-line shell that connects to a specific instance of mongod.
 
Step 2
 
Download and unzip the dblp.json.zip file.
 
Step 3
 
Import the data from the .json file.
 
The mongoimport command imports data into a collection, for example from an export produced by the mongoexport command. In the command below, --db gives the name of the database (DBLP), --collection gives the name of the collection to import into (publis), and --jsonArray indicates that the input file is a JSON array. The --type option is optional and defaults to JSON.
  mongoimport --host localhost:27017 --db DBLP --collection publis --jsonArray < C:\Users\hp\Desktop\dblp.json\dblp.json
 
In the mongo console, check that the data has been inserted:
  db.publis.count()
Step 4
 
Find the list of all publications published in 2007:
  db.publis.find({year: 2007})
 
The find() method returns a cursor to the results. In the mongo shell, if the returned cursor is not assigned to a variable using the var keyword, the cursor is automatically iterated to access up to the first 20 documents that match the query.
 
Step 5
 
List all articles (“Article” type).
 
Using SQL:
  SELECT * FROM publis
  WHERE [type] LIKE '%Article%'
Using MongoDB:
  db.publis.find({"type" : "Article"})
Step 6
 
Find the list of all publishers (the "publisher" field).
 
Using SQL:
  SELECT DISTINCT publisher FROM publis
Using MongoDB:
  db.publis.distinct("publisher")
 
Step 7
 
Find the list of publications by author "David Gelbart".
 
Using SQL:
  SELECT * FROM publis
  WHERE authors LIKE '%David Gelbart%'
Using MongoDB:
  db.publis.find({"authors" : "David Gelbart"})
Step 8
 
Sort "David Gelbart" publications by title and year.
 
Using SQL:
  SELECT * FROM publis
  WHERE authors LIKE '%David Gelbart%'
  ORDER BY title, [Year]
To sort documents in MongoDB, use the sort() method. The method accepts a document containing a list of fields along with their sorting order: 1 for ascending order and -1 for descending order.
 
Using MongoDB:
  db.publis.find({
      authors: "David Gelbart"
  }).sort({
      title: 1,
      year: 1
  })
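To make the meaning of the sort document concrete, here is a plain JavaScript sketch that orders an in-memory array in the same way: fields are compared in the order they are listed, and 1 or -1 selects the direction. It is an illustration only, not how MongoDB implements sorting, and the helper name sortBySpec and the sample documents are invented for the example.

```javascript
// Illustrative helper (not a MongoDB API): orders documents according to a
// sort specification such as { title: 1, year: -1 }.
function sortBySpec(docs, spec) {
    const fields = Object.keys(spec);
    return [...docs].sort((a, b) => {
        for (const field of fields) {
            if (a[field] < b[field]) return -spec[field];
            if (a[field] > b[field]) return spec[field];
        }
        return 0; // equal on every listed field
    });
}

// Invented sample documents.
const sampleDocs = [
    { title: "B", year: 2005 },
    { title: "A", year: 2009 },
    { title: "A", year: 2007 }
];

// Title ascending, then year ascending.
const ascending = sortBySpec(sampleDocs, { title: 1, year: 1 });
// Title ascending, then year descending (newest first within a title).
const newestFirst = sortBySpec(sampleDocs, { title: 1, year: -1 });

console.log(ascending.map((d) => d.title + " " + d.year));
console.log(newestFirst.map((d) => d.title + " " + d.year));
```

The second field only matters when two documents tie on the first one, which is why the two "A" titles swap places between the two calls while "B" stays last.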
Step 9
 
Sort "David Gelbart" publications by end page:
 
Using SQL:
  SELECT * FROM publis
  WHERE authors LIKE '%David Gelbart%'
  ORDER BY endpage
Using MongoDB:
  db.publis.aggregate([{
      $match: {
          authors: "David Gelbart"
      }
  }, {
      $sort: {
          "pages.end": 1
      }
  }])
Or, equivalently:
  db.publis.find({
      authors: "David Gelbart"
  }).sort({
      "pages.end": 1
  })
Step 10
 
Project the result onto the title of the publication and its type:
  db.publis.aggregate([{
      $match: {
          authors: "David Gelbart"
      }
  }, {
      $sort: {
          "pages.end": 1
      }
  }, {
      $project: {
          title: 1,
          type: 1
      }
  }]);
Step 11
 
Count the number of his publications:
  db.publis.aggregate([{
      $match: {
          authors: "David Gelbart"
      }
  }, {
      $group: {
          _id: null,
          total: {
              $sum: 1
          }
      }
  }]);
Step 12
 
Count the number of publications since 2007:
  db.publis.aggregate([{
      $match: {
          year: {
              $gte: 2007
          }
      }
  }, {
      $group: {
          _id: null,
          total: {
              $sum: 1
          }
      }
  }]);
Step 13
 
Count the number of publications since 2007, by type:
  db.publis.aggregate([{
      $match: {
          year: {
              $gte: 2007
          }
      }
  }, {
      $group: {
          _id: "$type",
          total: {
              $sum: 1
          }
      }
  }]);
Step 14
 
Count the number of publications by author and sort the result in decreasing order:
  db.publis.aggregate([{
      $unwind: "$authors"
  }, {
      $group: {
          _id: "$authors",
          number: {
              $sum: 1
          }
      }
  }, {
      $sort: {
          number: -1
      }
  }]);
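 
The pipeline above can be read as three successive transformations. The plain JavaScript sketch below reproduces them on a small in-memory array to show what each stage contributes; it is an illustration with invented sample data, not MongoDB's implementation.

```javascript
// Invented sample documents; only the "authors" array matters here.
const publications = [
    { title: "P1", authors: ["Ann", "Bob"] },
    { title: "P2", authors: ["Ann"] },
    { title: "P3", authors: ["Cid", "Ann", "Bob"] }
];

// $unwind: "$authors" -> one copy of the document per array element.
const unwound = publications.flatMap((doc) =>
    doc.authors.map((author) => ({ ...doc, authors: author }))
);

// $group: { _id: "$authors", number: { $sum: 1 } } -> one counter per author.
const counts = new Map();
for (const doc of unwound) {
    counts.set(doc.authors, (counts.get(doc.authors) || 0) + 1);
}
const grouped = [...counts].map(([author, number]) => ({ _id: author, number }));

// $sort: { number: -1 } -> largest count first.
const sorted = grouped.sort((a, b) => b.number - a.number);

console.log(sorted);
```

Without the $unwind stage, the grouping key would be the whole authors array, so publications would be counted per combination of co-authors rather than per author.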

Map Reduce with Mongo

 
Step 1
 
For each document of type "Book", return the document keyed by its title.
  var mapFunction = function() {
      if (this.type == "Book") emit(this.title, this);
  };
  var reduceFunction = function(key, values) {
      return {
          articles: values
      };
  };
  db.publis.mapReduce(mapFunction, reduceFunction, {
      out: "result"
  });
  db.result.find();
Or, filtering with a query instead of inside the map function:
  var mapFunction = function() {
      emit(this.title, this);
  };
  var reduceFunction = function(key, values) {
      return {
          articles: values
      };
  };
  var queryParam = {
      query: {
          type: "Book"
      },
      out: "result_set"
  };
  db.publis.mapReduce(mapFunction, reduceFunction, queryParam);
  db.result_set.find();
Step 2
 
For each book, give the number of its authors.
  var mapFunction = function() {
      if (this.type == "Book") emit(this.title, this.authors.length);
  };
  // reduce must return the same type that map emits (a number);
  // it is only called when several books share the same title
  var reduceFunction = function(key, values) {
      return Array.sum(values);
  };
  var queryParam = {
      query: {},
      out: "result_set"
  };
Step 3
 
For each document having a "booktitle" (chapter) published by Springer, return the number of its chapters.
  var mapFunction = function() {
      if (this.publisher == "Springer" && this.booktitle) emit(this.booktitle, 1);
  };
  var reduceFunction = function(key, values) {
      return Array.sum(values);
  };
  var queryParam = {
      query: {},
      out: "result_set"
  };
  db.publis.mapReduce(mapFunction, reduceFunction, queryParam);
  db.result_set.find({
      value: {
          $gte: 2
      }
  });
Step 4
 
For the publisher "Springer", return the number of publications per year.
  var mapFunction = function() {
      if (this.publisher == "Springer") emit(this.year, 1);
  };
  var reduceFunction = function(key, values) {
      return Array.sum(values);
  };
Step 5
 
For each “publisher & year” pair (publisher must be present), return the number of publications.
  var mapFunction = function() {
      if (this.publisher) emit({
          publisher: this.publisher,
          year: this.year
      }, 1);
  };
  var reduceFunction = function(key, values) {
      return Array.sum(values);
  };
Step 6
 
For the author "Toru Ishida", return the number of publications per year.
  var mapFunction = function() {
      if (Array.contains(this.authors, "Toru Ishida")) emit(this.year, 1);
  };
  var reduceFunction = function(key, values) {
      return Array.sum(values);
  };
  var queryParam = {
      query: {},
      out: "result_set"
  };
Or, filtering with a query:
  var mapFunction = function() {
      emit(this.year, 1);
  };
  var reduceFunction = function(key, values) {
      return Array.sum(values);
  };
  var queryParam = {
      query: {
          authors: "Toru Ishida"
      },
      out: "result_set"
  };
Step 7
 
For the author "Toru Ishida", return the average number of pages of his articles ("Article" type).
  var mapFunction = function() {
      emit(null, this.pages.end - this.pages.start);
  };
  var reduceFunction = function(key, values) {
      return Array.avg(values);
  };
  var queryParam = {
      query: {
          authors: "Toru Ishida",
          type: "Article"
      },
      out: "result_set"
  };
Step 8
 
For each author, list the titles of their publications:
  var mapFunction = function() {
      for (var i = 0; i < this.authors.length; i++) emit(this.authors[i], this.title);
  };
  var reduceFunction = function(key, values) {
      return {
          titles: values
      };
  };
Step 9
 
For each author, list the number of publications associated with each year:
  var mapFunction = function() {
      for (var i = 0; i < this.authors.length; i++) emit({
          author: this.authors[i],
          year: this.year
      }, 1);
  };
  var reduceFunction = function(key, values) {
      return Array.sum(values);
  };
Step 10
 
For the publisher "Springer", give the number of authors per year:
  var mapFunction = function() {
      // keep only Springer publications, then emit one (year, author) pair per author
      if (this.publisher == "Springer")
          for (var i = 0; i < this.authors.length; i++) emit(this.year, this.authors[i]);
  };
  var reduceFunction = function(key, values) {
      // count each author only once per year
      var distinct = 0;
      var authors = new Array();
      for (var i = 0; i < values.length; i++)
          if (!Array.contains(authors, values[i])) {
              distinct++;
              authors[authors.length] = values[i];
          }
      return distinct;
  };
Step 11
 
Count the publications with more than 3 authors.
  var mapFunction = function() {
      // a single key (null) gathers every matching publication into one counter
      if (this.authors && this.authors.length > 3) emit(null, 1);
  };
  var reduceFunction = function(key, values) {
      return Array.sum(values);
  };
  var queryParam = {
      query: {},
      out: "result_set"
  };
Step 12
 
For each publisher, give the average number of pages per publication:
  var mapFunction = function() {
      if (this.pages && this.pages.end) emit(this.publisher, this.pages.end - this.pages.start);
  };
  var reduceFunction = function(key, values) {
      return Array.avg(values);
  };
  var queryParam = {
      query: {},
      out: "result_set"
  };
Step 13
 
For each author, give the first and last years with publications, as well as the total number of publications:
  var mapFunction = function() {
      for (var i = 0; i < this.authors.length; i++) emit(this.authors[i], {
          min: this.year,
          max: this.year,
          number: 1
      });
  };
  var reduceFunction = function(key, values) {
      var v_min = 1000000;
      var v_max = 0;
      var v_number = 0;
      for (var i = 0; i < values.length; i++) {
          if (values[i].min < v_min) v_min = values[i].min;
          if (values[i].max > v_max) v_max = values[i].max;
          // add the partial counts: a value may itself be the result of an earlier reduce
          v_number += values[i].number;
      }
      return {
          min: v_min,
          max: v_max,
          number: v_number
      };
  };
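 
MongoDB may call the reduce function several times for the same key, feeding the result of an earlier call back in as one of the values, so a reduce function has to return the same shape that map emits and must combine partial results correctly. The sketch below is a small plain JavaScript stand-in for mapReduce that applies a min/max/number reducer to partial results. The helper runMapReduce, its forced two-stage reduce and the sample data are all invented for this illustration, and, unlike in the mongo shell, emit is passed to the map function as a parameter.

```javascript
// Illustrative stand-in for mapReduce (not MongoDB's implementation).
// It deliberately reduces each key in two stages to mimic a re-reduce.
function runMapReduce(docs, map, reduce) {
    const emitted = new Map();
    const emit = (key, value) => {
        if (!emitted.has(key)) emitted.set(key, []);
        emitted.get(key).push(value);
    };
    docs.forEach((doc) => map.call(doc, emit));

    const results = {};
    for (const [key, values] of emitted) {
        if (values.length === 1) {
            results[key] = values[0]; // reduce is skipped for a single value
        } else {
            // Reduce the first two values, then feed that partial result
            // back in together with the remaining values.
            const partial = reduce(key, values.slice(0, 2));
            results[key] = values.length === 2
                ? partial
                : reduce(key, [partial, ...values.slice(2)]);
        }
    }
    return results;
}

const mapYears = function (emit) {
    for (const author of this.authors) {
        emit(author, { min: this.year, max: this.year, number: 1 });
    }
};

const reduceYears = function (key, values) {
    const result = { min: Infinity, max: -Infinity, number: 0 };
    for (const v of values) {
        result.min = Math.min(result.min, v.min);
        result.max = Math.max(result.max, v.max);
        result.number += v.number; // add partial counts, do not count entries
    }
    return result;
};

// Invented sample documents.
const sample = [
    { authors: ["Ann"], year: 2001 },
    { authors: ["Ann", "Bob"], year: 2007 },
    { authors: ["Ann"], year: 2004 }
];

const stats = runMapReduce(sample, mapYears, reduceYears);
console.log(stats);
```

A reducer that simply counted the entries it receives would under-count here, because the partial result that is fed back in already stands for two publications.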
Summary
 
In this tutorial, we learned the basics of MongoDB and how to work with it.