A company can use a variety of pricing strategies to set the prices of its products and services. Whatever the strategy, it will need well-structured information and a data-driven approach to make the right decisions.
Many companies have historical data, but this data is only sometimes properly organized. Making this information usable is essential to establishing correct pricing.
The number “15” is only a number: it could be, for example, a house number if I do not write “€” right after it. The value “15€”, instead, tells me that we are talking about a price. Having a properly filled-in column named “price” in a database table named “product price” gives me certainty. That number “15” then becomes usable content that carries functional information.
Once the input data that make up the set have been identified, the next step is data preparation and standardization.
Some examples follow:
- “John Doe” / “john doe”: they certainly represent the same information, but if we do not bring these variants into a single standard format, we will wrongly end up dealing with two “John Doe” entries instead of one. Errors like this happen often, and the risk is wasting a lot of time looking for the original mistake.
- 13/11/2021, 11/13/2021, 2021/11/13 and 2021-11-13 all represent the same date, but in different formats (day/month/year as used in Italy, month/day/year as used in the United States, and year-first forms with “/” or “-”), and they must be standardized before proceeding with any analysis.
- 45°28′38″ – 9°10′53″ and 45°27′51″ – 9°11′22″ are both geographic coordinates in the Milan area. If my aim is to identify the city of Milan with a single coordinate, I will have to choose only one of the two to avoid data conflicts.
- Lunedì and Monday: they represent the same day of the week in two different languages, so if I do not unify them the risk is analysing a week with 14 days.
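The examples above can be sketched in a few lines of Python. This is only a minimal illustration, not a complete cleaning pipeline: the function names and the `WEEKDAY_ALIASES` table are hypothetical, and the sketch assumes that the date format of each data source is known in advance, since a string such as 11/12/2021 cannot be disambiguated on its own.

```python
from datetime import datetime

# Hypothetical lookup table: weekday names in any language -> one English label.
WEEKDAY_ALIASES = {"lunedì": "Monday", "monday": "Monday"}


def normalize_name(raw: str) -> str:
    """Collapse extra whitespace and apply one capitalisation style."""
    return " ".join(raw.split()).title()


def normalize_date(raw: str, source_format: str) -> str:
    """Parse a date written in a known source format and return it as YYYY-MM-DD."""
    return datetime.strptime(raw, source_format).date().isoformat()


def normalize_weekday(raw: str) -> str:
    """Map a weekday name to its single standard label (KeyError if unknown)."""
    return WEEKDAY_ALIASES[raw.strip().lower()]


# The four spellings of the same date, each paired with its assumed source format.
dates = [
    ("13/11/2021", "%d/%m/%Y"),
    ("11/13/2021", "%m/%d/%Y"),
    ("2021/11/13", "%Y/%m/%d"),
    ("2021-11-13", "%Y-%m-%d"),
]
unique_dates = {normalize_date(raw, fmt) for raw, fmt in dates}
unique_names = {normalize_name(n) for n in ["John Doe", "john doe"]}
unique_days = {normalize_weekday(d) for d in ["Lunedì", "Monday"]}
```

After normalization each of the three sets contains a single element, which is exactly the point: one customer, one date and one weekday instead of several apparent ones.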
In an ideal world, a database would already contain uniform and well-structured data, but because many companies are still at a low level of digitisation, this is not yet the case.
Once the inconsistencies in the existing database and in the flows that save new information have been corrected, the definitive information set is obtained. Before proceeding with the next steps, it is necessary to check that this information is correct: an algorithm fed with incorrect data inevitably produces incorrect results.
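A check of this kind can be as simple as a function that lists what is wrong with each record. The sketch below is one possible approach under stated assumptions: the field names `price`, `quantity` and `sold_at` are hypothetical, and the rules (positive price, non-negative whole quantity, ISO-formatted timestamp) are illustrative rather than exhaustive.

```python
from datetime import datetime


def find_problems(record: dict) -> list[str]:
    """Return a list of human-readable problems found in one sales record."""
    problems = []

    price = record.get("price")
    if not isinstance(price, (int, float)) or price <= 0:
        problems.append("price must be a positive number")

    quantity = record.get("quantity")
    if not isinstance(quantity, int) or quantity < 0:
        problems.append("quantity must be a non-negative whole number")

    try:
        # Assumes timestamps were already standardized to ISO format.
        datetime.fromisoformat(str(record.get("sold_at")))
    except ValueError:
        problems.append("sold_at must be an ISO date or date-time")

    return problems


good = {"price": 15.0, "quantity": 3, "sold_at": "2021-11-13T17:30:00"}
bad = {"price": "15€", "quantity": -1, "sold_at": "13/11/2021"}
```

Here `find_problems(good)` returns an empty list, while `find_problems(bad)` reports all three problems, so faulty rows can be fixed or set aside before any algorithm sees them.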
Data-driven pricing
Returning to the initial goal, which is pricing products, the information needed is whatever helps to estimate product demand:
- Sales reports, detailing the quantity sold at each price for each product, with date and time;
- Information about any promotions applied to each product, past and present;
- Information on customers or groups of customers, useful to understand, for example, how demand changes with nationality;
- Information about competitors:
  - Present and past prices of products similar to ours;
  - Present and past promotions applied by competitors;
  - Market information, such as the market share of each competitor.
- Other information related to the market and useful for estimating demand, for example:
  - in transport, the delivery method chosen by customers could be influenced by a concurrent event (e.g. demand for delivery during an outbreak);
  - if we are managing a video-on-demand service, it may be useful to keep track of weather information;
  - …
It must be said that not all of this information is essential for correct pricing: transaction and promotion data may be enough to get close to a correct result, although that result may be only approximate without the support of the other data.
The importance of the data enrichment process
Collecting more information, for example about competitors or weather trends, and feeding it to artificial intelligence algorithms could reveal relationships that would otherwise stay hidden. For example, by analysing e-commerce data we might find that the number of laptops purchased grows when the weather is bad. With this information available, an algorithm would be able to optimize prices and increase the company’s revenues.
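Before any artificial intelligence is involved, the enrichment step itself is just a join between two data sets on a shared key. The sketch below joins daily sales with daily weather by date and compares average sales per weather condition. All figures are invented purely for illustration, and the names `daily_laptop_sales`, `daily_weather` and `average_sales_by_weather` are hypothetical.

```python
from statistics import mean

# Invented illustrative data: laptops sold per day, and the weather on that day.
daily_laptop_sales = {
    "2021-11-10": 12,
    "2021-11-11": 19,
    "2021-11-12": 11,
    "2021-11-13": 21,
}
daily_weather = {
    "2021-11-10": "sunny",
    "2021-11-11": "rain",
    "2021-11-12": "sunny",
    "2021-11-13": "rain",
}


def average_sales_by_weather(sales: dict, weather: dict) -> dict:
    """Join sales and weather on the date and average units sold per condition."""
    grouped = {}
    for day, units in sales.items():
        condition = weather.get(day)
        if condition is None:
            continue  # no weather record for this day: skip it rather than guess
        grouped.setdefault(condition, []).append(units)
    return {condition: mean(units) for condition, units in grouped.items()}


averages = average_sales_by_weather(daily_laptop_sales, daily_weather)
```

Note that the join only works because both sources use the same standardized date format, which is why the preparation step comes first. A difference between the averages is a hint worth investigating, not proof of a causal link.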
In addition to the quality and variety of the information collected, its quantity also matters. With a database containing only 100 transactions it is not possible to apply artificial intelligence algorithms effectively. In that case it will be enough, at least initially, to use simpler approaches such as preset rules with periodic evaluation (for example “lower the price of bread by 40% after 5 pm”), or a mathematical interpolation function that estimates, for example, the demand and the corresponding price that maximizes turnover. These methods are relatively simple to build and can be a good start when you have little historical sales data.
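Both simple methods can be sketched briefly. The first function applies the bread rule quoted above. The second assumes, purely for illustration, that demand falls linearly with price, fits a straight line to a few invented observations by least squares, and returns the price that maximizes turnover for that line. The function names and the data points are hypothetical, and a linear demand curve is a simplifying assumption rather than a recommendation.

```python
from datetime import time


def bread_price(base_price: float, sale_time: time) -> float:
    """Preset rule: 40% off from 5 pm onwards, full price before."""
    if sale_time >= time(17, 0):
        return round(base_price * (1 - 0.40), 2)
    return base_price


def fit_linear_demand(observations: list[tuple[float, float]]) -> tuple[float, float]:
    """Least-squares fit of quantity = a + b * price; returns (a, b)."""
    n = len(observations)
    mean_p = sum(p for p, _ in observations) / n
    mean_q = sum(q for _, q in observations) / n
    s_pq = sum((p - mean_p) * (q - mean_q) for p, q in observations)
    s_pp = sum((p - mean_p) ** 2 for p, _ in observations)
    b = s_pq / s_pp
    a = mean_q - b * mean_p
    return a, b


def turnover_maximizing_price(a: float, b: float) -> float:
    """Turnover p * (a + b * p) peaks at p = -a / (2 * b) when b is negative."""
    if b >= 0:
        raise ValueError("demand does not fall with price; no finite optimum")
    return -a / (2 * b)


# Invented observations: (price, quantity sold at that price).
observations = [(10.0, 100.0), (15.0, 75.0), (20.0, 50.0)]
a, b = fit_linear_demand(observations)
best_price = turnover_maximizing_price(a, b)
```

With these invented points the fitted line is quantity = 150 - 5 × price, so turnover peaks at a price of 15. With real data the points will not lie on a line, and the estimate should be re-evaluated periodically as new sales arrive.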