A data warehouse is a place to store data. In short, it is a central storage area that gathers data from many different sources, such as databases, CRM systems and diverse flat files. These data are used for analysis and form the basis of business intelligence solutions, reports and decisions.
Before extracting data from the source systems, it is important to define which data are needed for further analysis. After that, the data are validated to keep mistakes from being transmitted to the final reports. The term ETL (Extract, Transform Load) is often used in this context. A data warehouse is designed differently than a standard database. In a standard database, a large amount of data is needed, whereas in a data warehouse, there is a greater demand for data that can be consolidated and combined over time. In contrast to many other systems, a data warehouse also manages data history, which is essential when looking for trends and developing prognostic data.
«A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision-making process. »
– Bill Inmon, American computer scientist and father of the data warehouse
Figur 1 Datavarehus – http://anantics.com/dw.html
Collecting data from all sources into the data warehouse makes it possible to have an overview of the entire operation. By using a data warehouse, it is easier to find the information bottlenecks in the organization. These can often occur when not enough people are given access to the data. There are many questions that can be asked in the process of gaining insights. What department delivers the best numbers? Do the salespeople know that there is a new product in the portfolio? What campaign was the most effective? Naturally, these questions are different in an operation that sells knowledge or information rather than tangible goods, but the negative consequences of not having an overview are just as serious. Data warehouses and business intelligence solutions are the simplest way to a complete overview.
Advantages and disadvantages
From an IT and business perspective, there are many advantages of having a data warehouse. Generally, the quality of data will be better, and at the same time, the queries run on the data will be answered much quicker, something that is expected by users in this age of increasingly faster technology. The data warehouse saves the user a lot of time because the user does not have to go through the difficult process of retrieving data from different sources.
Much has been written about the many advantages of using a data warehouse, but there are also some disadvantages. A data warehouse cannot handle unstructured and complex amounts of data. Other disadvantages include the level of maintenance required and unclear ownership of the data. Therefore, it is important to consider data security. A data warehouse project lasts over long periods which can be demanding for smaller companies. Nevertheless, a data warehouse is an important tool within business intelligence.
Data warehouses in the cloud
Gradually, there have been more and more companies who have chosen to move their data warehouses to the cloud. By doing this, one does not have to think about management, it is easier to scale, and in many cases, analytical queries are handled faster. There are several options for cloud solutions for data warehouses, including Microsoft Azure.
A data warehouse is an important part of a company’s business intelligence solutions. Experienced consultants who are dedicated to finding good solutions are a fundamental part of a successful data warehouse project. Furthermore, it is important to be aware of which data to analyze. The advantages are many: business intelligence solutions have easier access to data from different sources, it is faster to retrieve data and the quality of the data is improved. Last but not least, performing analysis over different time intervals becomes effortless.