Opt for an SQL or NoSQL database? This dichotomy represents one of the biggest IT “battles” of these years and the near future.
Let’s start by saying that there is no single solution to this question and, as always in engineering, the most correct answer is definitely “it depends.” This word infuriates non-technical people precisely because being non-technical they cannot understand in detail the various nuances that lead to the understanding that “it depends” is indeed the correct answer.
Let us try to clarify.
SQL or NoSQL, the characteristics
First of all, SQL (Structured query language) database refers to the most widely used family of databases in the world. They are the relational ones, with tables, rows, keys and, indeed, relationships between tables. To describe the citizens of a municipality and the cars they own we will need two tables:
– Citizens (with the various fields such as first name, last name and tax code – the latter, being unique by definition will be in cases of well-designed databases, the key to the table)
– Cars (with the license plate, description and a reference to the owner, contained in the other table, “citizens” such as the unique tax code and in multiple keys)
It is easy to query a relational database, so to find out how many cars “Mario Rossi” has, I only need a few words command (SELECT …. FROM … WHERE …). Even inserting, deleting or editing existing rows is very easy in a relational database.
In short, dated technology but with countless advantages.
SQL or NoSQL, the limitations of SQL DBs.
Let us now imagine, however, that we have to receive weather data from a large number of different data sources and then, at a later time, organize the data into a standardized format.
Specifically, each weather source will have different information such as:
– Perceived temperature
– Actual temperature
– Wind
– Visibility meters
– General conditions
– Humidity
– Reference day and time
Each source will have only part of this information and perhaps described with a different number of fields, e.g., for someone today’s general condition will be “sunny,” for another it will be “sunny”-“level 3.” These values need to be made consistent so that they can be used together in the same way. There are two solutions for this:
– As I read the source I transform it and save it directly to a standardized database (SQL is fine here).
– I want to keep the original data and then save the information in a NoSQL database that allows flexibility in handling unstructured information
When the sources are only 2 with 2 SQL tables I get by, if however there are 20 and very different in structure and information then in that case saving to a NoSQL database should be considered.
SQL or NoSQL, conclusions
So, as you may have guessed, NoSQL databases allow you to save information without specifying a data model a priori and in addition, in general, they suffer less from large size than their SQL cousins. In short, if the DB is or will become large, better than a NoSQL. There are tricks such as tables in SQL partitions to get around slow problems but NoSQL databases are born to handle “Big” data.
There is a potential problem with NoSQL databases.
– It is not as easy to query them
– It is not so easy to modify the data
Nothing impossible you mean, but you have to query the DB in a different way all the time or change data often, maybe you should consider a SQL database anyway.
Maybe even a hybrid solution, where you use a NoSql to save the unstructured data and then a SQL to save the standard data to query and manipulate.
As you have seen there are many aspects to consider:
– SQL DBs are simpler to use but require a major constraint which is to accept data in “packed” structures
– NoSQL DBs are more complex, but they allow you to handle data with very heterogeneous structures that are not known in advance and in general perform well for large volumes of data
Depending on the specific needs of the company, it will be necessary to evaluate the pros and cons as always, considering that the perfect solution does not exist, but it is necessary to find the best compromise.