Data modelling is the process of analysing the different types of data that a company produces and the relationships between them. A data model is an abstract representation of the data structures of a database, that is, the objects in the database and the rules that govern operations on the data. The act of creating such a model is called data modelling. In practice, data modelling results in diagrams such as data flow diagrams, a functional tool for identifying the input and output flows of data and the relationships between those flows, and more generally a guide to understanding the entire database architecture.
Data modelling, the usefulness of the activity
Data modelling is also very useful for defining the characteristics of data formats and database management functions. It thus provides a standardised method for building and formatting database contents, enabling different sources and applications to access and work on the same data efficiently.
Data modelling, the three most frequent models
Among the various data models in use, the three most frequent are relational, dimensional and Entity-Relationship (E-R). The model chosen determines how data is organised, stored and retrieved.
- The relational model is an older approach, but it remains the most common. It stores data in a fixed format, in tables made up of rows and columns. The values in those tables can be read as measures and dimensions: measures are numerical values used in arithmetic calculations, while dimensions can be numeric or text and are used to hold descriptions or locations. A relational database has its own terminology and structural requirements, but the important factor is the relationships defined within this structure: common data elements, known as keys, connect tables and data sets.
- The dimensional approach, on the other hand, is less rigid and structured than the previous one. This is why it tends to be used mainly in business contexts for online queries and data warehousing tools. Fundamental data, for instance a transaction quantity, are called ‘facts’ and are accompanied by reference information called ‘dimensions’. The fact table is the primary table in a dimensional model. Retrieval can be quick and efficient because the data for a given type of activity are stored together, but the lack of relational links can make the data more complicated to use. In fact, since the data structure is tied to the business function that produces and uses the data, combining data produced by different systems can be problematic.
- Finally, an Entity-Relationship (E-R) model represents a business data structure in graphical form, with boxes of various shapes representing activities, functions or ‘entities’ and lines representing associations, dependencies or ‘relationships’. The E-R model is then used to create a relational database in which each row represents an entity and the fields in that row contain its attributes. As in all relational databases, key data elements are used to link tables together.
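To make the vocabulary above more concrete, here is a minimal sketch in Python using the standard `sqlite3` module. It assumes a hypothetical shop with one dimension table (`product_dim`) and one fact table (`sales_fact`); the table names, column names and sample rows are invented for illustration and do not come from any real system. It shows a key linking two tables, a measure (`quantity`) being summed, and a dimension (`category`) being used to describe the result.

```python
import sqlite3

# In-memory database; every table, column and value here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: descriptive reference information.
    CREATE TABLE product_dim (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        category   TEXT NOT NULL
    );
    -- Fact table: one row per sale, holding the numeric measure.
    -- The REFERENCES clause declares the key link (SQLite only enforces
    -- it when PRAGMA foreign_keys = ON is set).
    CREATE TABLE sales_fact (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES product_dim(product_id),
        quantity   INTEGER NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO product_dim VALUES (?, ?, ?)",
    [(1, "Espresso", "Coffee"), (2, "Green tea", "Tea")],
)
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [(1, 1, 3), (2, 1, 2), (3, 2, 5)],
)


def quantity_by_category(connection):
    """Sum the quantity measure per category by joining on the shared key."""
    rows = connection.execute("""
        SELECT d.category, SUM(f.quantity)
        FROM sales_fact AS f
        JOIN product_dim AS d ON d.product_id = f.product_id
        GROUP BY d.category
        ORDER BY d.category
    """).fetchall()
    return dict(rows)


totals = quantity_by_category(conn)
print(totals)  # {'Coffee': 5, 'Tea': 5}
```

Each row of `product_dim` plays the role of one entity with its attributes, while `product_id` is the key data element that links the two tables.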
Data modelling, the types of data modelling
Three main types of model can be produced during data modelling, each representing a different level of abstraction:
- Conceptual data model: this defines the structure and content of the data at a macro level, without going into detail. It generally serves as the starting point from which the logical and physical models are developed.
- Logical data model: describes the data flow and content of the database. The logical model adds detail to the overall structure of the conceptual model, but leaves out specifics tied to a particular database, so that it can be applied to various technologies and database products.
- Physical data model: describes the specifications of how the logical model is to be realised. It must contain enough detail to allow engineers to create the actual database structure in hardware and software to support the applications that will use it.
Thus, as the characteristics of the models show, data modelling tends to take a top-down approach: starting with the conceptual model to define the overall vision, moving on to the logical model that defines the flows and content of the database, and then moving on to the physical model that contains the fundamental technical details.
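The top-down progression can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Customer` and `Order` entities, the `Attribute` class, the `table_name` and `to_sqlite_ddl` helpers and the choice of SQLite as the target product are all hypothetical, not part of any standard tool. The conceptual level lists entities and relationships, the logical level adds attributes and keys without naming a database product, and the physical level turns that into SQLite-specific table definitions.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

# Conceptual level: which entities exist and how they relate (no attributes).
CONCEPTUAL = {
    "entities": ["Customer", "Order"],
    "relationships": [("Customer", "places", "Order")],
}


# Logical level: attributes, generic types and keys, with no product details.
@dataclass(frozen=True)
class Attribute:
    name: str
    kind: str                         # generic type: "integer" or "text"
    is_key: bool = False
    references: Optional[str] = None  # name of the entity this attribute points to


LOGICAL = {
    "Customer": [
        Attribute("customer_id", "integer", is_key=True),
        Attribute("name", "text"),
    ],
    "Order": [
        Attribute("order_id", "integer", is_key=True),
        Attribute("customer_id", "integer", references="Customer"),
        Attribute("total", "integer"),
    ],
}

# Physical level: decisions specific to one product (SQLite is assumed here).
SQLITE_TYPES = {"integer": "INTEGER", "text": "TEXT"}


def table_name(entity):
    """Hypothetical naming rule: lower-case plural of the entity name."""
    return entity.lower() + "s"


def to_sqlite_ddl(model, entity):
    """Translate one logical entity into a SQLite CREATE TABLE statement."""
    columns = []
    for attr in model[entity]:
        column = f"{attr.name} {SQLITE_TYPES[attr.kind]}"
        column += " PRIMARY KEY" if attr.is_key else " NOT NULL"
        if attr.references is not None:
            # Point at the key attribute of the referenced entity.
            target_key = next(a.name for a in model[attr.references] if a.is_key)
            column += f" REFERENCES {table_name(attr.references)}({target_key})"
        columns.append(column)
    return f"CREATE TABLE {table_name(entity)} ({', '.join(columns)})"


conn = sqlite3.connect(":memory:")
for entity in CONCEPTUAL["entities"]:
    conn.execute(to_sqlite_ddl(LOGICAL, entity))

created = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(created)  # ['customers', 'orders']
```

Keeping the logical description free of SQLite details means the same `LOGICAL` dictionary could, in principle, be translated for another database product by swapping only the physical-level mapping.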
Data modelling, the benefits for the company
At the corporate level, data modelling enables collaboration between the IT department and the various business teams, reduces the possibility of errors, improves data integrity and increases the speed and performance of data storage and retrieval.
In addition, a robust data model results in optimised analysis performance, no matter how large and complex the data assets are. When data are clearly defined and collected, data analysis becomes much simpler and margins of error are greatly reduced.
Correct data modelling is precisely the basis of the data acquisition, enrichment and analysis work carried out by our experts at the start of a project to support companies in their pricing decisions.