
MongoDB: An Introduction

Pradyumn Sharma

June 6, 2017

In my blog post titled "An Introduction to NoSQL", I talked about some of the key characteristics of NoSQL databases. In this article, we'll look at some key concepts and features of MongoDB, one of the most prominent NoSQL database solutions.

MongoDB is a leading document-oriented NoSQL database solution that is good at storing very large volumes of data. In fact, its name is derived from the word "huMONGOus". You can configure a MongoDB cluster with hundreds or thousands of servers, so that capacity scales out roughly linearly as you add servers.

Documents and Collections

When you want to store some data in MongoDB, you provide it as a JSON-like document (MongoDB stores it internally in a binary form called BSON). Here are examples of such documents:

{ _id:1, name: 'Ahmad', gender: 'M', dept: 'Fin'} 
{ _id:2, name: 'Bajrang', gender: 'M', dept: 'Sales'} 
{ _id:3, name: 'Catherine', gender: 'F', dept: 'HR'} 
{ _id:4, name: 'Dostoyevski', gender: 'M', dept: 'Prod'} 

A document in MongoDB is conceptually similar to a record in a typical relational database like MySQL or Oracle. Documents in MongoDB are stored in a collection (equivalent to a table in a relational database).

Documents can have sub-documents, as well as arrays of values or sub-documents. For example, the following is a valid document to store in a MongoDB collection (and a pretty natural thing to do):

{
	name: {first: 'Harish', last: 'Chandra'},
	gender: 'M',
	yearOfBirth: 1962,
	livesIn: 'Mumbai',
	countriesVisited: ['India', 'Singapore', 'Thailand', 'United Kingdom', 'Spain', 'Denmark',
					   'United States of America'],
	languages: [
		{name: 'Hindi', proficiency: 'Fluent'},
		{name: 'English', proficiency: 'Fluent'},
		{name: 'Sanskrit', proficiency: 'Intermediate'} ]
}

You don't need to define a structure for a MongoDB collection. This means that different documents in a collection may have different fields, and MongoDB will ingest them all as a part of the same collection! For example, you can insert the following two documents in the same collection:

{
	name: 'Narayan Subramanian',
	gender: 'M',
	currentCity: 'Jaipur'
}

{
	name: 'Pushpa Maheshwari',
	gender: 'F',
	email: 'pushpa@example.com',
	worksAt: 'Indian Railways'
}
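
To make the flexible-schema idea concrete, here is a minimal sketch that inserts the two documents above into one collection and then asks only for documents that have an "email" field. The collection name "people" is an assumption made for this illustration, and the database calls run only inside the MongoDB shell, where the `db` object exists; elsewhere the script simply builds the objects.

```javascript
// The two differently shaped documents from the example above.
const people = [
  { name: 'Narayan Subramanian', gender: 'M', currentCity: 'Jaipur' },
  { name: 'Pushpa Maheshwari', gender: 'F', email: 'pushpa@example.com', worksAt: 'Indian Railways' }
];

// Filter: match only documents that actually contain an "email" field.
const hasEmail = { email: { $exists: true } };

// `db` is defined only inside the MongoDB shell; outside it this step is skipped.
// The "people" collection name is hypothetical.
if (typeof db !== 'undefined') {
  db.people.insertMany(people);
  printjson(db.people.find(hasEmail).toArray());
}
```

Because a field may be absent from some documents, an operator such as $exists is a handy way to separate the documents that carry the field from those that don't.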

Rich Query Language

MongoDB provides a rich query language for database operations. Its query language is built around the JSON format and is different from traditional SQL in terms of syntax. Here are some examples of the CRUD (Create, Read, Update, Delete) operations:

Inserting a document in a collection:

db.books.insert (
	{
		title: 'To Kill a Mockingbird',
		author: 'Harper Lee'
	}
)

In the "persons" collection, find all males born before 1970 and all females born before 1980, sorted by first name and then last name:

db.persons.find (
	{$or: [
		{gender: 'M', yearOfBirth: {$lt: 1970} },
		{gender: 'F', yearOfBirth: {$lt: 1980} }
	] }
).sort ( {'name.first': 1, 'name.last': 1} )

MongoDB's search operations are quite powerful. You can perform searches inside sub-documents as well as arrays within documents.
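
As a rough illustration of searching inside sub-documents and arrays, the sketch below builds three filters against the shape of the 'Harish Chandra' document shown earlier. It assumes that such documents live in the "persons" collection; the variable names are made up for this example. The database calls run only inside the MongoDB shell, where `db` exists.

```javascript
// Dot notation reaches into the embedded "name" sub-document.
const byLastName = { 'name.last': 'Chandra' };

// A plain value matches when the "countriesVisited" array contains it.
const visitedSpain = { countriesVisited: 'Spain' };

// $elemMatch requires a single element of the "languages" array
// to satisfy both conditions at once.
const fluentHindi = {
  languages: { $elemMatch: { name: 'Hindi', proficiency: 'Fluent' } }
};

// `db` is defined only inside the MongoDB shell; outside it this step is skipped.
if (typeof db !== 'undefined') {
  [byLastName, visitedSpain, fluentHindi].forEach(function (filter) {
    printjson(db.persons.find(filter).toArray());
  });
}
```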

In the "persons" collection, update the document for 'Merilyn Holmes', setting gender to 'F', yearOfBirth to 1997, and married to 'N':

db.persons.update (
	{name: {first: 'Merilyn', last: 'Holmes'} },
	{$set: {gender: 'F', yearOfBirth: 1997, married: 'N'} }
)

You can also update individual elements within arrays embedded within documents.
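
One way to update a single array element is the positional operator "$". The sketch below, which assumes the 'Harish Chandra' document shown earlier is stored in the "persons" collection, changes the proficiency of only the 'Sanskrit' entry in the "languages" array. The variable names are illustrative, and the database call runs only inside the MongoDB shell.

```javascript
// Select the document and, within it, the "languages" element named 'Sanskrit'.
const sanskritEntry = {
  'name.first': 'Harish',
  'name.last': 'Chandra',
  'languages.name': 'Sanskrit'
};

// The positional operator "$" stands for the first array element
// matched by the filter above.
const raiseProficiency = { $set: { 'languages.$.proficiency': 'Fluent' } };

// `db` is defined only inside the MongoDB shell; outside it this step is skipped.
if (typeof db !== 'undefined') {
  printjson(db.persons.updateOne(sanskritEntry, raiseProficiency));
}
```

The array field has to appear in the filter for the positional operator to know which element to touch; the other entries in the array are left as they are.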

Delete all males from the "persons" collection:

db.persons.remove ( {gender: 'M'} )

Some Other Features of MongoDB

  • Indexes are supported for fast retrieval of documents, just as in relational databases.
  • Text indexing is supported to perform full text queries on documents.
  • You can have geospatial information in a collection, and perform powerful search operations (such as "find all restaurants within a 1000-meter radius of a given position").
  • Read-only views are a recent addition to MongoDB.
  • You can write JavaScript code, both for client-side and server-side operations.
  • An aggregation framework provides more flexible and powerful query operations on collections. The framework works like a pipeline of operations such as filter, transform, sort, group, and left outer join with another collection.
  • Drivers are available for all the mainstream programming languages.
  • For high availability, you can configure automatic replication, with up to 49 secondary servers keeping copies of the data from a primary server.
  • For high performance, you can partition your data across multiple "shards" (each shard being a single server or a replica set), so that workload is distributed across multiple partitions.
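
A few of the features above can be sketched in shell syntax. Everything specific here is an assumption for illustration: the "bio" field, the "restaurants" collection with a GeoJSON "location" field, and the sample coordinates are not from any real data set. The database calls run only inside the MongoDB shell, where `db` exists.

```javascript
// Index specifications: a compound index on the name fields, and a text
// index on a hypothetical "bio" field for full-text queries.
const nameIndex = { 'name.last': 1, 'name.first': 1 };
const textIndex = { bio: 'text' };

// Geospatial filter: restaurants within 1000 meters of a point.
// GeoJSON coordinates are [longitude, latitude]; the values are illustrative.
const nearby = {
  location: {
    $near: {
      $geometry: { type: 'Point', coordinates: [72.8777, 19.076] },
      $maxDistance: 1000
    }
  }
};

// Aggregation pipeline: filter, group by city, then sort by count (largest first).
const personsPerCity = [
  { $match: { livesIn: { $exists: true } } },
  { $group: { _id: '$livesIn', count: { $sum: 1 } } },
  { $sort: { count: -1 } }
];

// `db` is defined only inside the MongoDB shell; outside it this step is skipped.
if (typeof db !== 'undefined') {
  db.persons.createIndex(nameIndex);
  db.persons.createIndex(textIndex);
  // $near needs a geospatial index on the queried field.
  db.restaurants.createIndex({ location: '2dsphere' });
  printjson(db.restaurants.find(nearby).toArray());
  printjson(db.persons.aggregate(personsPerCity).toArray());
}
```

Each pipeline stage receives the output of the stage before it, which is why the order of $match, $group, and $sort matters.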

Closing Thoughts

All in all, MongoDB reduces some of the restrictions that traditional relational databases impose, enabling us to design a database and write queries that better serve the demands of today's applications (such as mobile services, querying large volumes of data, data pipelines, etc.).

MongoDB is not a replacement for relational databases. Instead, it is a welcome addition to the field of database technology, providing solutions to problems that were hard or inefficient to solve using the traditional approach. That it provides great horizontal scalability and performance is an added benefit.