What is Data Warehousing?

A data warehousing is where anyone can store data from multiple data sources to be used for historical and trend analysis reporting.
Its acts as a central repository for many subject areas and contains the `single version of truth`.

Benefits of a Data Warehouse
A data warehouse maintains a copy of information from the source transaction systems.
This architectural complexity provides the opportunity for:
  • Maintain data history
  • Integrate data from multiple systems
  • Improve data quality
  • Present the organisation's information consistently
  • Provide a single common data model for all data of interest regardless of the data's source
  • Restructure the data so that it makes sense to the business users
  • Restructure the data so that it delivers excellent query performance, even for complex analytic queries
  • Add values to operational business applications

Types of Data Warehouses
Three types of data warehouses
  1. EDW - Enterprise Data Warehouse
  2. ODS - Operational Data Store
  3. Data Mart
  • Enterprise Data Warehouse - An enterprise data warehouse provides a central database for decision support throughout the enterprise
  • Operational Data Store - This has a broad enterprise wide scope, but unlike the real enterprise data warehouse, data is refreshed n near real time and used for routine business activity.
  • Data Mart - Data mart is a subset of data warehouse and it supports a particular region, business unit or business function.

How to build a Data Warehouse
  • Design - Top Down, Bottom Up
  • Customer Interviews and requirement gathering
  • Data profiling
  • Extract Transform Load data from source to data warehouse
  • Create Facts and Dimensions
  • Put a BI tool on top
  • Develop reports
  • Data Governance

Data Warehouse Roles
  • Business User - Client
  • Business User Support
  • Data Administrator
  • Data Analysts
  • Meta Data Administrator
  • Database Administrator
  • Developers (ETL / BI - Reports, Queries)

One Liner - a large store of data accumulated from a wide range of sources within a company and used to guide management decisions.

Comments

Popular posts from this blog

GP - Kerberos errors and resolutions

How to set Optimizer at database level in greenplum

GP - SQL Joins