Data Warehouse : Definition,Architecture and Tools
What is Data WareHouse?
- A data warehouse is a large collection of business data used to help an organisation make decisions.
- DWH is a relational database that is designed for analysis and it contains multiple heterogeneous types of data from various sources.
- Data WareHousing is integrated generally at the organisation level, by combining data from different databases.
- It is also called associate enterprise data warehouse (EDW) as It generally used by enterprises as the data stored by these warehouses is of large size.
- Warehouse stores data retrieved from historical transactions; however, it also contains data from various other sources.
- Extraction, transformation and loading (ETL) tools are available for processing in data warehousing. Apart from these, Online Analytical Processing (OLAP) engine, analytical tools for clients are also present to manage the data gathering and delivering processes.
- Data generated by organisations worldwide is increasing constantly, which is the major driving force behind the growth of data warehousing market
Data Warehouse Architecture
Data warehouse architecture is determined by the organisation’s specific requirement . This has been the pillar of corporate data world.
Currently , there are more ways available for storing, analysing, and indexing data, but we can not ignore the importance of data warehousing
When designing a data warehouse, there are three main types of data architecture to consider.
- The objective of a single layer is to minimise the amount of data stored and eliminate redundancies.
- This architecture is not suitable for businesses with complex data requirements .
- Two-tier structure uses a system and a database server.
- There is a direct communication between client and data source server,
- In this architecture, Application is directly connected to data source layer without any intermediate application.
- It is most commonly used in small organisations.
- This architecture is not scalable.
Example of the two-tier architecture would be storing student related data into the database and retrieving student information when required.
This is the most widely used Architecture of DWH.
It consists of the Top, Middle and Bottom Tier.
1) Bottom Tier (Data Warehouse Server):
It is a relational database system. Data is cleansed, transformed and loaded into this layer using back-end tools.
2) Middle Tier (OLAP Server):
OLAP server is implemented using either ROLAP or MOLAP model. It also acts as a mediator between the end-user and the database.
3) Top-Tier (Front end layer):
Front-end tools are used for displaying results provided by OLAP.
It can be Query tools, reporting tools , Analysis tools and Data mining tools.