Databases are at the heart of almost all internet and enterprise applications. The demand for scale, speed, and fast application development has brought on a new breed of databases broadly termed NoSQL databases. MongoDB is one of the most popular and fastest-growing of these.
Developers want a database that is easy to use. Relational databases save data in tables and rows, but a typical application hardly ever works that way: an application object is rarely related to a single table. This misalignment between application-layer objects and tables and rows is called an impedance mismatch.
In the example above, the object Foo has an integer x and a bunch of tags. To map this object to a relational database you would probably need three tables: one for the main object, another for the tags, and a third to tie them together. As an application developer, this forces you to develop a mapping layer or use an ORM (Object Relational Mapper) to translate between the object in memory and what is saved in the database. Most often, objects don’t translate cleanly to tables and rows. On top of that, we may use polymorphism or inheritance, and mapping these to a relational database can add a lot of overhead.
In MongoDB, there is no schema to define. There are no tables and no relationships between collections of objects. Every document you save can be as flat and simple or as complex as your application requires. This makes development much easier and your application code much simpler and cleaner. Also, two documents from the same collection may be different from each other, as there is no schema governing the collection.
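To make the mismatch concrete, here is a minimal sketch in Python using only the standard library. The table names (`foo`, `tag`, `foo_tag`), the helper functions, and the sample values are assumptions made for illustration; SQLite stands in for a relational engine, and a JSON string stands in for a stored document. It is not MongoDB driver code.

```python
import json
import sqlite3

# Hypothetical application object from the text: an integer x plus some tags.
foo = {"x": 1, "tags": ["abc", "def"]}

# Relational mapping: three tables (object, tag, and a join table).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY, x INTEGER NOT NULL);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE foo_tag (
        foo_id INTEGER NOT NULL REFERENCES foo(id),
        tag_id INTEGER NOT NULL REFERENCES tag(id),
        PRIMARY KEY (foo_id, tag_id)
    );
""")


def save_relational(conn, obj):
    """Hand-written mapping layer: split one object across three tables."""
    foo_id = conn.execute("INSERT INTO foo (x) VALUES (?)", (obj["x"],)).lastrowid
    for name in obj["tags"]:
        # Reuse an existing tag row if the name is already known.
        conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (name,))
        tag_id = conn.execute(
            "SELECT id FROM tag WHERE name = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO foo_tag (foo_id, tag_id) VALUES (?, ?)", (foo_id, tag_id)
        )
    conn.commit()
    return foo_id


def load_relational(conn, foo_id):
    """Reassemble the object from the three tables with a join."""
    x = conn.execute("SELECT x FROM foo WHERE id = ?", (foo_id,)).fetchone()[0]
    rows = conn.execute(
        "SELECT tag.name FROM tag "
        "JOIN foo_tag ON tag.id = foo_tag.tag_id "
        "WHERE foo_tag.foo_id = ? ORDER BY tag.id",
        (foo_id,),
    ).fetchall()
    return {"x": x, "tags": [row[0] for row in rows]}


foo_id = save_relational(conn, foo)
restored = load_relational(conn, foo_id)

# Document mapping: the object already has the shape that gets stored.
document = json.dumps(foo)
```

The relational path needs a schema plus two mapping functions to round-trip one small object, while the document path serializes the object as it is. That difference is the overhead the paragraph above describes.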
The relational model itself requires a relational database engine to manage writes in a particular way. To assure consistency and atomicity, it must lock rows and tables and allow only one writer access at a time.
Protecting referential integrity across tables and rows increases the time the lock has to be in effect. Longer lock times mean fewer writes and updates per second, which leads to higher transaction latency and thus a slower application. Scaling out by replicating or sharding data to other servers can make things even worse. If relational engines try to enforce consistency and extend these locks across the network, lock times become longer and transaction latency becomes higher, which makes applications slower still.
Relational databases address these issues by denormalizing tables or by relaxing consistency, for example by allowing dirty reads. This defeats much of the purpose of a relational database. Production systems often use these methods for reporting or decision-support databases, where there is tolerance for a certain margin of error.
So how does MongoDB approach these issues?
- First, there is no schema: no tables, no columns, no rows, and certainly no relationships between tables.
- MongoDB has a single-document write scope: a document lives within a collection, and updates are applied to one document at a time. This means there are no relationships or constraints to lock and enforce across tables, and no schema to protect.
- In replication scenarios, MongoDB lets you choose the consistency you need. It does not let you lock across several MongoDB servers. A replica set consists of a single primary server that accepts all writes and several secondary servers that are replicated to, but there are no locks from the primary to the secondaries. MongoDB makes it easy to configure your application for lower latency with weaker consistency, or vice versa.
- MongoDB also has what is called a capped collection. A capped collection has a fixed size and automatically overwrites its oldest documents. Because its size is fixed, it does not have to spend any time allocating space for new documents.
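The overwrite behavior of a capped collection can be pictured with a small sketch. This is a stand-in built on Python's `collections.deque`, not MongoDB code: the limit of three documents and the `seq`/`msg` fields are assumptions made for illustration, and a real capped collection is sized in bytes rather than by a simple item count.

```python
from collections import deque

# Stand-in for a capped collection that holds at most 3 documents.
# (Assumption for illustration: a count limit instead of a byte size.)
capped = deque(maxlen=3)

# Insert 5 documents; once the collection is full, each new insert
# pushes out the oldest document automatically.
for i in range(5):
    capped.append({"seq": i, "msg": f"event {i}"})

# Only the 3 most recent documents remain, in insertion order.
remaining = [doc["seq"] for doc in capped]
print(remaining)  # [2, 3, 4]
```

The storage never grows past its fixed bound, which is why this kind of collection suits logs and other data where only the most recent entries matter.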
In the next article I will try to address how MongoDB deals with data consistency.
But for now, the images below should give you an at-a-glance view of what MongoDB is.
Please stay tuned for better coverage of the finer details of how MongoDB addresses the issues with traditional relational databases. NoSQL has its uses in certain scenarios, but it's not a catch-all solution and is certainly not advisable for every application database you have to design.
Next up: MongoDB data consistency