Mongo Shell Commands - Mongo Document Queries

This is a step-by-step guide to mongo shell commands and basic document queries

Pavan Kulkarni

10 minute read

This post introduces the mongo shell and the basic query operations that can be performed in it, with examples.

Mongo Shell

The mongo shell is an interactive JavaScript interface to MongoDB. You can use the mongo shell to query and update data as well as perform administrative operations. The mongo shell is a component of the MongoDB distributions. Once you have installed and have started MongoDB, connect the mongo shell to your running MongoDB instance.

MongoDB Key Features

Aggregation Pipeline : The aggregation pipeline is a framework for data aggregation modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into aggregated results. MapReduce can be used for batch processing. More on this topic can be found here
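To make the pipeline idea concrete, here is a minimal sketch in plain JavaScript that mimics two stages, an unwind followed by a group-and-count, over a small in-memory array. The sample documents and the helper names (unwindCourses, countByCourse) are assumptions made for illustration; the documents are shaped like the students collection imported later in this post. The trailing comment shows roughly how the same pipeline might be written in the mongo shell.

```javascript
// Hypothetical sample documents, shaped like the students collection used later.
const students = [
  { name: "Tom Riddle", courses_registered: [{ CID: "CS001" }, { CID: "CS002" }, { CID: "CS001" }] },
  { name: "Peter Parker", courses_registered: [{ CID: "CS001" }, { CID: "CS002" }, { CID: "CS005" }] }
];

// Stage 1 (like $unwind): emit one record per element of courses_registered.
function unwindCourses(docs) {
  return docs.reduce(function (out, doc) {
    doc.courses_registered.forEach(function (course) {
      out.push({ name: doc.name, CID: course.CID });
    });
    return out;
  }, []);
}

// Stage 2 (like $group with $sum: 1): count records per course id.
function countByCourse(records) {
  return records.reduce(function (counts, rec) {
    counts[rec.CID] = (counts[rec.CID] || 0) + 1;
    return counts;
  }, {});
}

// Each stage consumes the output of the previous one.
const registrationsPerCourse = countByCourse(unwindCourses(students));
console.log(registrationsPerCourse); // { CS001: 3, CS002: 2, CS005: 1 }

// Rough mongo shell equivalent (needs a running server, not executed here):
// db.students.aggregate([
//   { $unwind: "$courses_registered" },
//   { $group: { _id: "$courses_registered.CID", count: { $sum: 1 } } }
// ])
```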

BSON format : BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The BSON specification is located at bsonspec.org. The $type aggregation operator returns the type of an operator expression using one of the listed BSON type strings.

ObjectId : ObjectIds are small, likely unique, fast to generate, and ordered. ObjectId values consist of 12 bytes, where the first four bytes are a timestamp that reflects the ObjectId’s creation time. In MongoDB, each document stored in a collection requires a unique _id field that acts as a primary key. If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field.

Per the Mongo docs, these 12 bytes consist of :

  • a 4-byte value representing the seconds since the Unix epoch,
  • a 3-byte machine identifier,
  • a 2-byte process id, and
  • a 3-byte counter, starting with a random value.
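Because the leading four bytes are a Unix timestamp, the creation time can be read straight from the hex form of an ObjectId. The sketch below is plain JavaScript; objectIdTimestamp is a hypothetical helper written for this post, and the id is the one belonging to the first document shown in the query output further down.

```javascript
// Extract the creation time from the first 4 bytes (8 hex characters) of an ObjectId.
// objectIdTimestamp is a hypothetical helper, not part of the mongo shell.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // seconds since the Unix epoch
  return new Date(seconds * 1000);
}

// ObjectId of the first document shown later in this post.
const created = objectIdTimestamp("5afc9b45ef01bb656bfe5fd3");
console.log(created.toISOString()); // 2018-05-16T20:57:41.000Z
```

The decoded time, 20:57:41 UTC, lines up with the 16:57:41 -0400 timestamp in the mongoimport log later in this post. Inside the mongo shell, ObjectId("5afc9b45ef01bb656bfe5fd3").getTimestamp() should give the same information.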

MongoDB Sharding : Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations. More on this topic can be found here. There are 2 ways to address system growth, and sharding is MongoDB's implementation of the second one :

  • Vertical Scaling : involves increasing the capacity of a single server, such as using a more powerful CPU, adding more RAM, or increasing the amount of storage space.
  • Horizontal Scaling : involves dividing the system dataset and load over multiple servers, adding additional servers to increase capacity as required. MongoDB supports horizontal scaling through sharding.

MongoDB has flexible schema : Data in MongoDB has a flexible schema. Unlike SQL databases, where you must determine and declare a table’s schema before inserting data, MongoDB’s collections do not enforce document structure. This flexibility facilitates the mapping of documents to an entity or an object. Each document can match the data fields of the represented entity, even if the data has substantial variation. In practice, however, the documents in a collection share a similar structure. More on this topic can be found here

GridFS : GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16 MB. Instead of storing a file in a single document, GridFS divides the file into parts, or chunks [1], and stores each chunk as a separate document. By default, GridFS uses a chunk size of 255 kB; that is, GridFS divides a file into chunks of 255 kB with the exception of the last chunk. The last chunk is only as large as necessary. Similarly, files that are no larger than the chunk size only have a final chunk, using only as much space as needed plus some additional metadata. More on this topic can be found here
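The chunking rule is simple arithmetic, sketched below in plain JavaScript. chunkLayout is a hypothetical helper that only does the math and never talks to MongoDB, and the chunk size is assumed to be 255 * 1024 bytes.

```javascript
// Default GridFS chunk size: 255 kB, taken here as 255 * 1024 bytes.
const CHUNK_SIZE = 255 * 1024;

// chunkLayout is a hypothetical helper that only does the arithmetic;
// it does not talk to MongoDB.
function chunkLayout(fileSizeBytes, chunkSize) {
  const size = chunkSize || CHUNK_SIZE;
  const chunks = Math.ceil(fileSizeBytes / size);
  // The last chunk holds whatever is left over after the full chunks.
  const lastChunkBytes = chunks === 0 ? 0 : fileSizeBytes - (chunks - 1) * size;
  return { chunks: chunks, lastChunkBytes: lastChunkBytes };
}

// A 16 MB file (16 * 1024 * 1024 bytes): 64 full chunks plus a smaller final one.
const layout = chunkLayout(16 * 1024 * 1024);
console.log(layout); // { chunks: 65, lastChunkBytes: 65536 }

// A 100 kB file fits in a single, partially filled chunk.
const small = chunkLayout(100 * 1024);
console.log(small); // { chunks: 1, lastChunkBytes: 102400 }
```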

Let’s talk Code

Here we will see some basic Mongo Shell operations.

  1. Start Mongo daemon

    Pavans-MacBook-Pro:~ pavanpkulkarni$ mongod
    2018-05-15T14:46:15.957-0400 I CONTROL  [initandlisten] MongoDB starting : pid=40209 port=27017 dbpath=/data/db 64-bit host=Pavans-MacBook-Pro.local
    2018-05-15T14:46:15.958-0400 I CONTROL  [initandlisten] db version v3.6.0
    2018-05-15T14:46:15.958-0400 I CONTROL  [initandlisten] git version: a57d8e71e6998a2d0afde7edc11bd23e5661c915
    2018-05-15T14:46:15.958-0400 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2o  27 Mar 2018
    2018-05-15T14:46:15.958-0400 I CONTROL  [initandlisten] allocator: system
    2018-05-15T14:46:15.958-0400 I CONTROL  [initandlisten] modules: none
    2018-05-15T14:46:15.958-0400 I CONTROL  [initandlisten] build environment:
    2018-05-15T14:46:15.958-0400 I CONTROL  [initandlisten] distarch: x86_64
    2018-05-15T14:46:15.958-0400 I CONTROL  [initandlisten] target_arch: x86_64
    2018-05-15T14:46:15.958-0400 I CONTROL  [initandlisten] options: {}
    2018-05-15T14:46:15.958-0400 W -        [initandlisten] Detected unclean shutdown - /data/db/mongod.lock is not empty.
    2018-05-15T14:46:15.960-0400 I -        [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
    
    .
    .
    .
    .
    .
    
    2018-05-15T14:46:16.414-0400 I CONTROL  [initandlisten] **          server with --bind_ip 127.0.0.1 to disable this warning.
    2018-05-15T14:46:16.414-0400 I CONTROL  [initandlisten]
    2018-05-15T14:46:16.437-0400 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
    2018-05-15T14:46:16.438-0400 I NETWORK  [initandlisten] waiting for connections on port 27017
    2018-05-15T14:46:17.007-0400 I FTDC     [ftdc] Unclean full-time diagnostic data capture shutdown detected, found interim file, some metrics may have been lost. OK
    
    
  2. Open a new terminal and start the mongo shell

    Pavans-MacBook-Pro:~ pavanpkulkarni$ mongo
    MongoDB shell version v3.6.0
    connecting to: mongodb://127.0.0.1:27017
    MongoDB server version: 3.6.0
    Server has startup warnings:
    2018-05-15T14:46:16.413-0400 I CONTROL  [initandlisten]
    2018-05-15T14:46:16.413-0400 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
    2018-05-15T14:46:16.413-0400 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
    2018-05-15T14:46:16.413-0400 I CONTROL  [initandlisten]
    2018-05-15T14:46:16.413-0400 I CONTROL  [initandlisten] ** WARNING: This server is bound to localhost.
    2018-05-15T14:46:16.413-0400 I CONTROL  [initandlisten] **          Remote systems will be unable to connect to this server.
    2018-05-15T14:46:16.413-0400 I CONTROL  [initandlisten] **          Start the server with --bind_ip <address> to specify which IP
    2018-05-15T14:46:16.413-0400 I CONTROL  [initandlisten] **          addresses it should serve responses from, or with --bind_ip_all to
    2018-05-15T14:46:16.413-0400 I CONTROL  [initandlisten] **          bind to all interfaces. If this behavior is desired, start the
    2018-05-15T14:46:16.414-0400 I CONTROL  [initandlisten] **          server with --bind_ip 127.0.0.1 to disable this warning.
    2018-05-15T14:46:16.414-0400 I CONTROL  [initandlisten]
    > 
    
    

    Let’s check the default dbs

> show dbs
admin     0.000GB
config    0.000GB
local     0.000GB

Now, we will go ahead and create our own db. Type the following:

> use super_hero_db
switched to db super_hero_db
> show dbs
admin     0.000GB
config    0.000GB
local     0.000GB

This will only switch to our super_hero_db. But at this point, the db itself is not created. To create a db, we need to insert a record.

> db.myCollection.insertOne( { x: 1 } );

Now we can see our db in the listing

> show dbs
admin          0.000GB
config         0.000GB
local          0.000GB
super_hero_db  0.000GB

Also, typing db will give us the current db

> db
super_hero_db

Let’s now create a collection. A collection is analogous to a table in an RDBMS and may store any number of documents. The documents in a collection do not need to share the same structure, because MongoDB does not enforce a schema on collections.

> db.createCollection("students")
{ "ok" : 1 }
> show collections
myCollection
students

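To insert data in MongoDB, we can either insert documents manually or import them from a file. Let’s see how to import data from a JSON file. I have the file uploaded on my Gist. Note that mongoimport is run from a regular terminal, not from inside the mongo shell.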

Pavans-MacBook-Pro:~ pavanpkulkarni$ mongoimport --db super_hero_db --collection students --type json --file ~/Documents/data.json --jsonArray
2018-05-16T16:57:41.804-0400	connected to: localhost
2018-05-16T16:57:41.807-0400	imported 11 documents

To view data simply type,

> db.students.find()
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd3"), "id" : 1, "name" : "Tom Riddle", "courses_registered" : [ { "CID" : "CS001", "sem" : "Spring_2001" }, { "CID" : "CS002", "sem" : "Summer_2001" }, { "CID" : "CS001", "sem" : "Fall_2001" } ], "year_graduated" : "2001" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd4"), "id" : 3, "name" : "Haan Solo", "courses_registered" : [ { "CID" : "CS003", "sem" : "Spring_2002" }, { "CID" : "CS004", "sem" : "Summer_2002" }, { "CID" : "CS005", "sem" : "Fall_2002" } ], "year_graduated" : "2002" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd5"), "id" : 5, "name" : "Sheldon Cooper", "courses_registered" : [ { "CID" : "CS004", "sem" : "Spring_2004" }, { "CID" : "CS005", "sem" : "Summer_2004" }, { "CID" : "CS003", "sem" : "Fall_2004" } ], "year_graduated" : "2004" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd6"), "id" : 6, "name" : "Tony Stark", "courses_registered" : [ { "CID" : "CS009", "sem" : "Spring_2005" }, { "CID" : "CS006", "sem" : "Summer_2005" }, { "CID" : "CS004", "sem" : "Fall_2005" } ], "year_graduated" : "2005" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd7"), "id" : 7, "name" : "Stephan Hawkings", "courses_registered" : [ { "CID" : "CS004", "sem" : "Spring_2006" }, { "CID" : "CS005", "sem" : "Summer_2006" }, { "CID" : "CS003", "sem" : "Fall_2006" } ], "year_graduated" : "2006" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd8"), "id" : 8, "name" : "Cerci Lannister", "courses_registered" : [ { "CID" : "CS001", "sem" : "Spring_2007" }, { "CID" : "CS003", "sem" : "Summer_2007" }, { "CID" : "CS009", "sem" : "Fall_2007" } ], "year_graduated" : "2007" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd9"), "id" : 9, "name" : "Wonder Woman", "courses_registered" : [ { "CID" : "CS006", "sem" : "Spring_2008" }, { "CID" : "CS007", "sem" : "Summer_2008" }, { "CID" : "CS009", "sem" : "Fall_2008" } ], "year_graduated" : "2008" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fda"), "id" : 4, "name" : "Frodo Baggins", "courses_registered" : [ { "CID" : "CS009", "sem" : "Spring_2003" }, { "CID" : "CS010", "sem" : "Summer_2003" }, { "CID" : "CS004", "sem" : "Fall_2003" } ], "year_graduated" : "2003" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fdb"), "id" : 11, "name" : "Peter Parker", "courses_registered" : [ { "CID" : "CS001", "sem" : "Spring_2010" }, { "CID" : "CS002", "sem" : "Summer_2010" }, { "CID" : "CS005", "sem" : "Fall_2010" } ], "year_graduated" : "2010" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fdc"), "id" : 10, "name" : "Hermione Granger", "courses_registered" : [ { "CID" : "CS010", "sem" : "Spring_2009" }, { "CID" : "CS002", "sem" : "Summer_2009" }, { "CID" : "CS007", "sem" : "Fall_2009" } ], "year_graduated" : "2009" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fdd"), "id" : 2, "name" : "Ned Stark", "courses_registered" : [ { "CID" : "CS003", "sem" : "Spring_2002" }, { "CID" : "CS004", "sem" : "Summer_2002" }, { "CID" : "CS005", "sem" : "Fall_2002" } ], "year_graduated" : "2002" }

To get neatly indented output, append pretty() to the query, for example db.students.find().pretty().

We can also get just one record from the students collection as

> db.students.findOne()
{
	"_id" : ObjectId("5afc9b45ef01bb656bfe5fd3"),
	"id" : 1,
	"name" : "Tom Riddle",
	"courses_registered" : [
		{
			"CID" : "CS001",
			"sem" : "Spring_2001"
		},
		{
			"CID" : "CS002",
			"sem" : "Summer_2001"
		},
		{
			"CID" : "CS001",
			"sem" : "Fall_2001"
		}
	],
	"year_graduated" : "2001"
}

We can filter on any field

> db.students.find({name: "Tony Stark"})
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd6"), "id" : 6, "name" : "Tony Stark", "courses_registered" : [ { "CID" : "CS009", "sem" : "Spring_2005" }, { "CID" : "CS006", "sem" : "Summer_2005" }, { "CID" : "CS004", "sem" : "Fall_2005" } ], "year_graduated" : "2005" }

To find all students who have registered for course CS001, match on the elements of the courses_registered array:

> db.students.find({"courses_registered": {$elemMatch : {CID : "CS001"}}})
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd3"), "id" : 1, "name" : "Tom Riddle", "courses_registered" : [ { "CID" : "CS001", "sem" : "Spring_2001" }, { "CID" : "CS002", "sem" : "Summer_2001" }, { "CID" : "CS001", "sem" : "Fall_2001" } ], "year_graduated" : "2001" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd8"), "id" : 8, "name" : "Cerci Lannister", "courses_registered" : [ { "CID" : "CS001", "sem" : "Spring_2007" }, { "CID" : "CS003", "sem" : "Summer_2007" }, { "CID" : "CS009", "sem" : "Fall_2007" } ], "year_graduated" : "2007" }
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fdb"), "id" : 11, "name" : "Peter Parker", "courses_registered" : [ { "CID" : "CS001", "sem" : "Spring_2010" }, { "CID" : "CS002", "sem" : "Summer_2010" }, { "CID" : "CS005", "sem" : "Fall_2010" } ], "year_graduated" : "2010" }
>
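To show what $elemMatch is checking, here is a plain-JavaScript analogue restricted to simple equality conditions. The elemMatch function and the trimmed sample array are assumptions made for illustration; the real operator supports far more than equality.

```javascript
// Three documents trimmed from the find() output above.
const students = [
  { name: "Tom Riddle", courses_registered: [{ CID: "CS001", sem: "Spring_2001" }, { CID: "CS002", sem: "Summer_2001" }] },
  { name: "Tony Stark", courses_registered: [{ CID: "CS009", sem: "Spring_2005" }, { CID: "CS006", sem: "Summer_2005" }] },
  { name: "Peter Parker", courses_registered: [{ CID: "CS001", sem: "Spring_2010" }, { CID: "CS005", sem: "Fall_2010" }] }
];

// Plain-JavaScript analogue of { field: { $elemMatch: criteria } } for equality
// criteria only: keep a document when at least one array element satisfies
// every condition.
function elemMatch(docs, field, criteria) {
  return docs.filter(function (doc) {
    return (doc[field] || []).some(function (element) {
      return Object.keys(criteria).every(function (key) {
        return element[key] === criteria[key];
      });
    });
  });
}

const cs001 = elemMatch(students, "courses_registered", { CID: "CS001" });
console.log(cs001.map(function (doc) { return doc.name; })); // [ 'Tom Riddle', 'Peter Parker' ]

// With two conditions, both must hold on the same array element.
const spring2010 = elemMatch(students, "courses_registered", { CID: "CS001", sem: "Spring_2010" });
console.log(spring2010.map(function (doc) { return doc.name; })); // [ 'Peter Parker' ]
```

With a single condition, dot notation such as db.students.find({"courses_registered.CID": "CS001"}) matches the same documents. $elemMatch becomes important when several conditions must be satisfied by the same array element.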

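Updating in MongoDB is straightforward. Let me go ahead and update the document with name = “Tony Stark”. Because the second argument below contains no update operators, it replaces the entire matched document except for _id. Notice that the _id field is the same for the original document (Tony Stark) and the new doc (Iron Man). We can also pass upsert: true as an option, which inserts the document when the filter matches nothing and updates the existing one otherwise.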

Syntax :
db.students.update(query, update, options)

> db.students.update({name:"Tony Stark"}, {"id" : 6, "name" : "Iron Man", "courses_registered" : [ { "CID" : "CS009", "sem" : "Spring_2005" }, { "CID" : "CS006", "sem" : "Summer_2005" }, { "CID" : "CS004", "sem" : "Fall_2005" } ], "year_graduated" : "2005"})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.students.find({name: "Tony Stark"})
> db.students.find({name: "Iron Man"})
{ "_id" : ObjectId("5afc9b45ef01bb656bfe5fd6"), "id" : 6, "name" : "Iron Man", "courses_registered" : [ { "CID" : "CS009", "sem" : "Spring_2005" }, { "CID" : "CS006", "sem" : "Summer_2005" }, { "CID" : "CS004", "sem" : "Fall_2005" } ], "year_graduated" : "2005" }
>
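The update above swaps in a whole new document. When only one field needs to change, an operator such as $set is usually the safer choice. The sketch below models the difference in plain JavaScript; replaceDocument and setFields are hypothetical helpers written for this post, not MongoDB APIs, and the sample document is a shortened copy of the Tony Stark record.

```javascript
const original = { _id: "5afc9b45ef01bb656bfe5fd6", id: 6, name: "Tony Stark", year_graduated: "2005" };

// Model of a replacement-style update: the new document takes over entirely,
// only _id is carried across.
function replaceDocument(doc, replacement) {
  return Object.assign({ _id: doc._id }, replacement);
}

// Model of a $set update: only the named fields change, the rest is kept.
function setFields(doc, fields) {
  return Object.assign({}, doc, fields);
}

const replaced = replaceDocument(original, { name: "Iron Man" });
const patched = setFields(original, { name: "Iron Man" });

console.log(replaced); // { _id: '5afc9b45ef01bb656bfe5fd6', name: 'Iron Man' }
console.log(patched.id, patched.year_graduated); // 6 2005

// Rough mongo shell equivalent of the $set form (not executed here):
// db.students.updateOne({ name: "Tony Stark" }, { $set: { name: "Iron Man" } })
```

In the replacement model, id and year_graduated are gone because they were not repeated in the new document, which is why the update in this post lists every field again.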

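Since Mongo has no built-in feature to print the schema, we can get a high-level view of a document's fields and the JavaScript types of their values as follows. Note that typeof must be applied to the value, myschema[key], rather than to the key name, which is always a string.

> var myschema = db.students.findOne()
> for (var key in myschema) { print (key, typeof myschema[key])}
_id object
id number
name string
courses_registered object
year_graduated string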
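A plain-JavaScript variant of the same idea, runnable outside the shell, looks up each value with doc[key] and also tells arrays apart from other objects, since typeof reports "object" for both. The sample object is a hypothetical stand-in for the result of db.students.findOne(), and describeFields is a helper name invented for this sketch.

```javascript
// Hypothetical stand-in for the result of db.students.findOne().
const sample = {
  id: 1,
  name: "Tom Riddle",
  courses_registered: [{ CID: "CS001", sem: "Spring_2001" }],
  year_graduated: "2001"
};

// describeFields maps each field name to the type of its value, reporting
// "array" and "null" separately because typeof returns "object" for both.
function describeFields(doc) {
  const types = {};
  Object.keys(doc).forEach(function (key) {
    const value = doc[key];
    if (Array.isArray(value)) {
      types[key] = "array";
    } else if (value === null) {
      types[key] = "null";
    } else {
      types[key] = typeof value;
    }
  });
  return types;
}

const fieldTypes = describeFields(sample);
// id is reported as "number", courses_registered as "array",
// name and year_graduated as "string".
console.log(fieldTypes);
```

This only inspects a single document. Because collections do not enforce a schema, other documents in the same collection may have different fields or types.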

Also, take a quick look at db.collection.updateOne() and db.collection.updateMany() here

Some Useful Commands:

  1. db.collection.explain()
  2. db.collection.updateOne()
  3. db.collection.updateMany()
  4. db.collection.drop()

For all our developer friends, this post is a good read on building a standalone application in Scala to process data from MongoDB. You can get as creative as you want here :)

References:

  1. https://en.wikipedia.org/wiki/MongoDB
  2. https://www.mongodb.com/mongodb-architecture
  3. https://docs.mongodb.com/manual/mongo/
  4. https://docs.mongodb.com/manual/tutorial/update-documents/