In this post I will try to give a rough overview of wtf a NoSQL database is, and why and when to use one.
Well, we all know how cool and easy it is, even when programming a simple application, to have the application's model data stored in a relational SQL database like MySQL or PostgreSQL. It's easy to access with the well-known SQL language, the vast majority of developers are familiar with the principles of building such a database, like the SQL normal forms, and there are plenty of mature open-source libraries with pretty good documentation and community, etc.
So why invent or even try something new when we have this established relational database ecosystem? Because in the last few years it has turned out that relational databases are hard to scale, due to the complicated relations between the data they handle. So what was considered a great advantage of relational databases in their decades of fame has become a disadvantage. Today's huge internet applications like Facebook, Google or Twitter weren't able to handle their bazillion pieces of data in real time. At first this was partially solved by denormalizing their relational database models, but it wasn't as effective as they had expected. The database infrastructure necessary to satisfy their needs became very expensive, without the desired impact on the speed of the application. So with a NoSQL database you have to give up some relational database properties, like ACID or consistency guarantees, but on the other hand you get built-in partitioning, load balancing, transparent replication and great scaling possibilities, with the ability to add capacity without any impact on the applications running against the database.
I don't want to say that SQL databases are bad or generally unusable. Not at all. I think that SQL-related technology is great and will do its job perfectly for most of your projects. I just want to say that if you are planning to be really (really) BIG, you should consider using a technology other than SQL.
So here comes NoSQL
First of all, NoSQL is a big paradigm shift in modeling data and it can be pretty confusing at the beginning. Forget all those JOINs, GROUP BYs, foreign keys, rigid schemas and consistency guarantees. The basic concept behind many NoSQL databases is to store key-value pairs. Tables and rigid designs are replaced by document-oriented storage (see the picture below). Those of you familiar with JSON should get the hang of it very quickly. The principle of NoSQL is to keep all pieces of related information together so that it's easy to fetch them. So as a developer you have to model such a database with the "queries" your application will need already in mind. In other words, you have to know what kind of operations on the dataset you will need. Yes, this approach places higher demands on the developer's abilities, and yes, it's not easy to get familiar with this concept, but give it time.
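To make the "keep related information together" idea a bit more concrete, here is a minimal sketch in plain Python. It does not use any real database. The users, their posts, the `store` dictionary and the `user:<id>` key format are all made up for illustration. It only contrasts a relational-style layout, where you have to join, with a document stored under a single key.

```python
import json

# Relational style: data is split across "tables" and linked by keys.
users = [{"id": 1, "name": "Alice"}]
posts = [
    {"id": 10, "user_id": 1, "title": "Hello NoSQL"},
    {"id": 11, "user_id": 1, "title": "Why denormalize?"},
]

def titles_relational(user_id):
    """Emulate a JOIN: look the user up, then scan posts for matching rows."""
    user = next(u for u in users if u["id"] == user_id)
    return user["name"], [p["title"] for p in posts if p["user_id"] == user_id]

# Document style: everything the "user page" query needs lives in one
# document, stored under a single key (hypothetical key format "user:<id>").
store = {
    "user:1": {
        "name": "Alice",
        "posts": [
            {"title": "Hello NoSQL"},
            {"title": "Why denormalize?"},
        ],
    }
}

def titles_document(user_id):
    """One key lookup returns the whole document; no join is needed."""
    doc = store[f"user:{user_id}"]
    return doc["name"], [p["title"] for p in doc["posts"]]

if __name__ == "__main__":
    # The document is essentially JSON, which is why JSON users feel at home.
    print(json.dumps(store["user:1"], indent=2))
    assert titles_relational(1) == titles_document(1)
```

Notice that the document was shaped around one particular query ("show a user with their posts"). A different query, say "all posts across all users", would probably call for a different document layout, and that is exactly the modeling trade-off described above.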
Examples of NoSQL databases
NoSQL databases are being developed to handle very large amounts of data effectively. Their development is kinda fresh, which means they are optimized for modern concurrency-driven environments and clouds. As examples we can mention MongoDB and CouchDB. A nice example of the connection between the database world and the cloud world is the support that the Apache Cassandra project (formerly proprietary Facebook code) gets from Rackspace, a company running cloud server hosting services.
I'm planning some more in-depth looks into the world of NoSQL in the form of examples and tutorials. So if you are interested in this topic, stay tuned.