MongoDB (from "humongous") is an open source document-oriented NoSQL database system.
Instead of storing data in tables, as a "classical" relational database does, MongoDB stores structured data as JSON-like documents with dynamic schemas (MongoDB calls the format BSON), which makes integrating data in certain types of applications easier and faster.
10gen began developing MongoDB in October 2007. It is now a mature and feature-rich database ready for production use.[citation needed] It is used, for example, by MTV Networks[1], Craigslist[2], Foursquare[3] and UIDAI's Aadhaar, India's unique identification project.
Binaries are available for Windows, Linux, OS X, and Solaris.
History
Development of MongoDB began at 10gen in 2007, when the company was building a Platform as a Service similar to Google App Engine[4]. In 2009 MongoDB was open sourced as a stand-alone product [5] with an AGPL license.
Since version 1.4, released in March 2010, MongoDB has been considered production ready[6].
The latest stable version, 2.0.5, was released in May 2012.
Licensing and support
MongoDB is available for free under the GNU Affero General Public License [5]. The language drivers are available under an Apache License. In addition, 10gen offers commercial licenses for MongoDB.[7]
Main features
The following is a brief summary of some of the main features:
- Ad hoc queries
- MongoDB supports search by field, range queries and regular expression searches. Queries can return specific fields of documents and can also include user-defined JavaScript functions.
- Indexing
- Any field in a MongoDB document can be indexed (indexes in MongoDB are conceptually similar to those in RDBMSes). Secondary indexes are also available.
- Replication
- MongoDB supports master-slave replication. A master can perform reads and writes. A slave copies data from the master and can only be used for reads or backup (not writes). The slaves have the ability to elect a new master if the current one goes down.
- Load balancing
- MongoDB scales horizontally using sharding[8]. The developer chooses a shard key, which determines how the data in a collection will be distributed. The data is split into ranges (based on the shard key) and distributed across multiple shards. (A shard is a master with one or more slaves.)
- MongoDB can run over multiple servers, balancing the load and/or duplicating data to keep the system up and running in case of hardware failure. Automatic configuration is easy to deploy and new machines can be added to a running database.
- File storage
- MongoDB can be used as a file system, taking advantage of its load balancing and data replication features to store files across multiple machines.
- This function, called GridFS[9], is included with the MongoDB drivers and is readily available from the supported development languages (see "Language support" for a list). MongoDB exposes functions for manipulating files and their content to developers. GridFS is used, for example, in plugins for NGINX[10] and lighttpd[11].
- In a multi-machine MongoDB system, files can be distributed and copied multiple times between machines transparently, effectively creating a load-balanced and fault-tolerant system.
- Aggregation
- MapReduce can be used for batch processing of data and for aggregation operations. The aggregation framework enables users to obtain the kind of results for which the SQL GROUP BY clause is used.
- Server-side JavaScript execution
- JavaScript can be used in queries and in aggregation functions (such as MapReduce), which are sent directly to the database to be executed.
- Capped collections
- MongoDB supports fixed-size collections called capped collections. This type of collection maintains insertion order and, once the specified size has been reached, behaves like a circular queue.
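The circular-queue behaviour of a capped collection can be illustrated with a small in-memory model. This is only a sketch, not MongoDB's implementation: the class name CappedCollection and its methods are hypothetical, and the limit here is a number of documents, whereas a real capped collection is bounded by a storage size.

```python
from collections import deque


class CappedCollection:
    """Toy model of a capped collection: fixed capacity, insertion order kept."""

    def __init__(self, max_docs):
        # deque(maxlen=...) drops the oldest item automatically once it is full
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc):
        self._docs.append(doc)

    def find(self):
        # Documents come back in insertion order, oldest first
        return list(self._docs)


log = CappedCollection(max_docs=3)
for n in range(5):
    log.insert({"event": n})

print(log.find())  # [{'event': 2}, {'event': 3}, {'event': 4}]
```

After five inserts into a collection capped at three documents, the two oldest events have been overwritten and the rest remain in insertion order.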
For further information on the points listed, see the MongoDB Developer Manual.
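As an illustration of the range-based distribution described under load balancing, the following sketch maps a shard key value to a shard by looking up the range it falls into. The split points, the shard names and the function shard_for are assumptions made for this example; in a real deployment MongoDB chooses and moves the range boundaries itself.

```python
from bisect import bisect_right

# Hypothetical split points on the shard key (here, a user name).
# Keys below "h" go to shard0, keys from "h" up to (but not including) "p"
# go to shard1, and all remaining keys go to shard2.
SPLIT_POINTS = ["h", "p"]
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(shard_key_value):
    """Return the shard whose key range contains the given shard key value."""
    return SHARDS[bisect_right(SPLIT_POINTS, shard_key_value)]


docs = [{"user": "alice"}, {"user": "henry"}, {"user": "zoe"}]
placement = {doc["user"]: shard_for(doc["user"]) for doc in docs}
print(placement)  # {'alice': 'shard0', 'henry': 'shard1', 'zoe': 'shard2'}
```

The choice of shard key matters because it alone decides the placement: documents whose keys fall in the same range always land on the same shard.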
Use cases & production deployments
According to the "Use Cases" article on the product's website[12], MongoDB is well suited for the following cases:
- Archiving and event logging
- Document and Content Management Systems. As a document-oriented (JSON) database, MongoDB has flexible schemas that are a good fit for this.
- ECommerce. Several sites are using MongoDB as the core of their ecommerce infrastructure (often in combination with an RDBMS for the final order processing and accounting).
- Gaming. High performance small read/writes are a good fit for MongoDB; also for certain games geospatial indexes can be helpful.
- High volume problems. Problems where a traditional DBMS might be too expensive for the data in question. In many such cases developers would traditionally write custom code against a filesystem instead, using flat files or other methodologies.
- Mobile. Specifically, the server-side infrastructure of mobile systems. Geospatial indexing is key here.
- Operational data store of a web site. MongoDB is very good at real-time inserts, updates, and queries. Scalability and replication are provided which are necessary functions for large web sites' real-time data stores. Specific web use case examples:
  - content management
  - comment storage, management, voting
  - user registration, profile, session data
- Projects using iterative/agile development methodologies. Mongo's BSON data format makes it very easy to store and retrieve data in a document-style / "schemaless" format. Addition of new properties to existing objects is easy and does not generally require blocking "ALTER TABLE" style operations.
- Real-time stats/analytics
Enterprises who use MongoDB
For a list of enterprises using MongoDB, with references on each particular use case, see the article "Production Deployments" on MongoDB's website[13].
Data manipulation: collections and documents
MongoDB stores structured data as JSON-like documents (in a format called BSON) with dynamic schemas; no schema is predefined.
The element of data is called a document, and documents are stored in collections. One collection may have any number of documents.
Compared to relational databases, collections are like tables and documents are like records. But there is one big difference: every record in a table has the same fields, while documents in a collection can have completely different fields. The only schema requirement MongoDB places on documents (aside from size limits) is that they must contain an '_id' field with a unique, non-array value.
A SQL table could be represented as:
| Last Name | First Name | Age |
|-----------|------------|-----|
| DUMONT    | Jean       | 49  |
| PELLERIN  | Franck     | 29  |
| MATTHIEU  | Nicolas    | 51  |
- Every record in a SQL table has the same fields
However a MongoDB collection could be described as
{
    "_id": ObjectId("4efa8d2b7d284dad101e4bc9"),
    "Last Name": "DUMONT",
    "First Name": "Jean",
    "Date of Birth": "01-22-1963"
},
{
    "_id": ObjectId("4efa8d2b7d284dad101e4bc7"),
    "Last Name": "PELLERIN",
    "First Name": "Franck",
    "Date of Birth": "09-19-1983",
    "Address": "1 chemin des Loges",
    "City": "VERSAILLES"
}
- Documents in a MongoDB collection can have different fields. (Note: the "_id" field is mandatory and is created automatically by MongoDB if it is not supplied; it is covered by a unique index and identifies the document. Using the default ObjectId type is not mandatory: the user may specify any non-array value for _id as long as the value is unique.)
In a document, new fields can be added and existing ones removed, modified or renamed at any moment. There is no predefined schema. A document's structure is simple: it is composed of key-value pairs, like associative arrays in programming languages (following the JSON format). The key is the field name and the value is its content. The two are separated by ":", as in the example shown.
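The key-value nature of a document can be mimicked with an ordinary Python dictionary. The following is a minimal sketch of adding, renaming and removing fields on a single document held in memory; it does not talk to a database, the "Town" field name is made up for the example, and a plain string stands in for the ObjectId value.

```python
# A document modelled as a dictionary (the _id is a plain string here,
# standing in for an ObjectId)
doc = {
    "_id": "4efa8d2b7d284dad101e4bc7",
    "Last Name": "PELLERIN",
    "First Name": "Franck",
}

# Add a new field: no schema change is needed beforehand
doc["City"] = "VERSAILLES"

# Rename a field by moving its value to a new key
doc["Town"] = doc.pop("City")

# Remove a field
del doc["First Name"]

print(sorted(doc))  # ['Last Name', 'Town', '_id']
```

In MongoDB itself, comparable changes to stored documents are expressed with update operators such as $set, $rename and $unset.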
A value can be a number, a string, binary data such as an image, or another set of key-value pairs, as in the following example:
{
    "_id": ObjectId("4efa8d2b7d284dad101e4bc7"),
    "Last Name": "PELLERIN",
    "First Name": "Franck",
    "Date of Birth": "09-19-1983",
    "Address": {
        "Street": "1 chemin des Loges",
        "City": "VERSAILLES"
    }
}
Here we can see that the field "Address" contains another document with two fields "Street" and "City".
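Fields of an embedded document are commonly addressed with a dotted path such as "Address.City". The sketch below resolves such a path against a nested dictionary; the helper get_path is hypothetical and written for this illustration only, not part of any MongoDB driver.

```python
def get_path(doc, dotted_path):
    """Follow a dotted path such as "Address.City" through nested dicts."""
    value = doc
    for key in dotted_path.split("."):
        value = value[key]
    return value


person = {
    "Last Name": "PELLERIN",
    "First Name": "Franck",
    "Address": {"Street": "1 chemin des Loges", "City": "VERSAILLES"},
}

print(get_path(person, "Address.City"))  # VERSAILLES
```

A path with a single component, such as "Last Name", simply returns the top-level field.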
Language support
MongoDB has official drivers for:
The web programming language Opa also has built-in support for MongoDB, which is tightly integrated into the language and offers a type-safety layer on top of MongoDB.
There are also a large number of unofficial drivers for ColdFusion, Delphi, Lua, Ruby, Smalltalk and many others.
Management and graphical front-ends
MongoDB tools
In a MongoDB installation the following commands are available:
- mongo
- MongoDB offers an interactive shell called mongo[14], which lets developers view, insert, remove, and update data in their databases, as well as get replication information, set up sharding, shut down servers, execute JavaScript, and more.
- Administrative information can also be accessed through a web interface[15], a simple webpage that serves information about the current server status. By default, this interface listens on the port 1000 above the database port (28017 for the default database port 27017).
- mongostat
- mongostat[16] is a command-line tool that displays a summary list of status statistics for a currently running MongoDB instance: how many inserts, updates, removes, queries, and commands were performed, as well as what percentage of the time the database was locked and how much memory it is using. This tool is similar to the UNIX/Linux vmstat utility.
- mongotop
- mongotop[17] is a command-line tool providing a method to track the amount of time a MongoDB instance spends reading and writing data. mongotop provides statistics on the per-collection level. By default, mongotop returns values every second. This tool is similar to the UNIX/Linux top utility.
- mongosniff
- mongosniff[18] is a command-line tool providing a low-level tracing/sniffing view into database activity by monitoring (or "sniffing") network traffic going to and from MongoDB. mongosniff requires the libpcap network library and is only available for Unix-like systems. A cross-platform alternative is the open source Wireshark packet analyzer, which has full support for the MongoDB wire protocol.
- mongoimport, mongoexport
- mongoimport[19] is a command-line utility to import content from a JSON, CSV, or TSV export created by mongoexport[20] or potentially other third-party data exports. Usage information can be found in the MongoDB Manual's section on Importing and Exporting MongoDB Data.
- mongodump, mongorestore
- mongodump[21] is a command-line utility for creating a binary export of the contents of a Mongo database; mongorestore[22] can be used to reload a database dump. Data backup strategies and considerations are detailed in the MongoDB Manual's section on Backup and Restoration Strategies.
Monitoring plugins
There are MongoDB monitoring plugins available for the following network tools:
More monitoring and diagnostic tools for MongoDB are listed on MongoDB Admin Zone: Monitoring and Diagnostics
Cloud-based monitoring services
Web & desktop application GUIs
Several GUIs have been created by MongoDB's developer community to help visualize their data. Some popular ones are:
Open source tools
- RockMongo: PHP-based MongoDB administration GUI tool
- phpMoAdmin: another PHP GUI that runs entirely from a single 95kb self-configuring file
- JMongoBrowser: a desktop application for all platforms.
- Mongo3: a Ruby-based interface.
- Meclipse: Eclipse plugin for interacting with MongoDB
Proprietary tools
More client tools for MongoDB are listed in the MongoDB Administrator Manual.
Business Intelligence Tools & Solutions
- Jaspersoft BI: Java based Report Designer and Report Server that supports MongoDB
- Nucleon BI Studio: MS Windows based Business Intelligence Solution (Reporting, Charts, Dashboards, Scripts...) for MongoDB and RDBMS.
- RJMetrics: A hosted Business Intelligence Solution that supports MongoDB.
References
Bibliography
- Banker, Kyle (March 28, 2011), MongoDB in Action (1st ed.), Manning, pp. 375, ISBN 978-1-935182-87-0
- Chodorow, Kristina; Dirolf, Michael (September 23, 2010), MongoDB: The Definitive Guide (1st ed.), O'Reilly Media, pp. 216, ISBN 978-1-4493-8156-1
- Pirtle, Mitch (March 3, 2011), MongoDB for Web Development (1st ed.), Addison-Wesley Professional, pp. 360, ISBN 978-0-321-70533-4
- Hawkins, Tim; Plugge, Eelco; Membrey, Peter (September 26, 2010), The Definitive Guide to MongoDB: The NoSQL Database for Cloud and Desktop Computing (1st ed.), Apress, pp. 350, ISBN 978-1-4302-3051-9