Most organizations agree that data warehouses are useful. The ability to store and analyze data in one place supports sound business decisions. It is equally important that the correct information is published and that the people responsible for making decisions can access it easily.
The data warehouse environment has two main elements: the staging area and the presentation area. The staging area, also known as the acquisition area, is where the extract, transform, and load (ETL) operations run. Once the data has been prepared there, it is sent to the presentation area.
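The flow from staging to presentation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the `CleanOrder` type, and the `extract`, `transform`, and `load` functions are all hypothetical names, and a real staging area would read from actual source systems and write to a database rather than a list.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical raw rows, as they might arrive from a source system.
RAW_ORDERS = [
    {"order_id": "1001", "order_date": "2024-01-15", "amount": " 19.99 ", "region": "north"},
    {"order_id": "1002", "order_date": "2024-01-16", "amount": "5.00", "region": "SOUTH"},
    {"order_id": "1002", "order_date": "2024-01-16", "amount": "5.00", "region": "SOUTH"},  # duplicate
    {"order_id": "1003", "order_date": "not a date", "amount": "7.50", "region": "east"},  # bad date
]


@dataclass(frozen=True)
class CleanOrder:
    order_id: int
    order_date: date
    amount_cents: int
    region: str


def extract():
    """Extract: pull the raw rows from the (hypothetical) source system."""
    return list(RAW_ORDERS)


def transform(raw_rows):
    """Transform: convert types, normalize text, drop duplicates, set aside bad rows."""
    clean, rejected, seen = [], [], set()
    for row in raw_rows:
        try:
            order = CleanOrder(
                order_id=int(row["order_id"]),
                order_date=date.fromisoformat(row["order_date"]),
                amount_cents=round(float(row["amount"]) * 100),
                region=row["region"].strip().upper(),
            )
        except ValueError:
            rejected.append(row)  # keep for review instead of loading it
            continue
        if order.order_id in seen:
            continue  # skip duplicate orders
        seen.add(order.order_id)
        clean.append(order)
    return clean, rejected


def load(clean_rows, presentation_area):
    """Load: hand the prepared rows over to the presentation area."""
    presentation_area.extend(clean_rows)
    return len(clean_rows)


presentation_area = []  # stands in for the presentation tables
clean_rows, rejected_rows = transform(extract())
loaded = load(clean_rows, presentation_area)
```

Only rows that survive the transform step reach `presentation_area`; the rejected row stays behind in staging, where it can be inspected.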
Once the data is in the presentation area, a number of programs analyze and review it. While most organizations agree on the overall goal of a data warehouse, their approaches to building one differ. Relying on standalone data marts is not a good approach: because they are geared towards individual departments, they are inefficient and lead to a number of long-term problems. Two techniques for building data warehouses have become very popular: the Kimball Bus Architecture and the Corporate Information Factory (CIF).
With the Kimball technique, the raw data is pulled from the source systems and then transformed and refined within the staging area. It is important that the data is handled properly during this step. Some of the staging processes may be centralized, while others are distributed. The presentation area has a dimensional structure. A dimensional model holds the same information as a standard, normalized model, but it is easier to use and it presents the information in summarized form.
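A dimensional structure is commonly laid out as a star schema: a fact table of measurements surrounded by dimension tables that describe them. The sketch below builds a tiny one with Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_date`, `dim_product`), the columns, and the sample rows are assumptions made up for illustration, not a schema prescribed by the Kimball approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table holding the measurements, plus two dimension tables
# holding the descriptive attributes used to filter and group them.
conn.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    month     TEXT
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    category    TEXT
);
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    quantity     INTEGER,
    amount_cents INTEGER
);
""")

conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
    (20240115, "2024-01-15", "2024-01"),
    (20240203, "2024-02-03", "2024-02"),
])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Widget", "Hardware"),
    (2, "Gadget", "Hardware"),
    (3, "Manual", "Books"),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (20240115, 1, 2, 2000),
    (20240115, 3, 1, 1500),
    (20240203, 2, 1, 3000),
    (20240203, 1, 1, 1000),
])

# A typical question: total sales per product category.
# It takes a single join from the fact table to one dimension.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.amount_cents)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
```

The ease of use mentioned above shows in the query: grouping by a different attribute, such as `month` from `dim_date`, only means joining a different dimension table.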
In the Kimball approach, dimensional models are organized around business processes rather than departments. The dimensional warehouse is populated with the data once, and it does not depend on how the organization is divided into departments. Once the business processes have been modeled within the warehouse, the system becomes highly efficient.
The other popular data warehouse approach to become familiar with is the Corporate Information Factory (CIF), also called the enterprise data warehouse (EDW) approach. In this approach, the data extracted from the source systems is coordinated.
Within the CIF, a standard, normalized data warehouse acts as the central data repository, and there may also be specialized data warehouses designed for data mining. Data marts may be built for specific departments, and they may hold summary data in a dimensional structure. The atomic data itself can be obtained from the standard data warehouse. While there are some similarities between these two techniques, there are some notable differences as well.
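As a rough sketch of that layering, the Python below keeps atomic orders in a small normalized store and derives a summarized, department-facing mart from it. The tables (`customers`, `segments`, `orders`), the `build_sales_mart` function, and all values are hypothetical, chosen only to show the direction of the data flow.

```python
from collections import defaultdict

# Normalized warehouse: each fact is stored once and linked by keys.
segments = {10: "Enterprise", 20: "Small business"}
customers = {
    1: {"name": "Acme", "segment_id": 10},
    2: {"name": "Birch", "segment_id": 20},
    3: {"name": "Cedar", "segment_id": 10},
}
orders = [  # atomic data: one row per order
    {"order_id": 1, "customer_id": 1, "amount_cents": 5000},
    {"order_id": 2, "customer_id": 2, "amount_cents": 1200},
    {"order_id": 3, "customer_id": 3, "amount_cents": 800},
    {"order_id": 4, "customer_id": 1, "amount_cents": 2500},
]


def build_sales_mart(orders, customers, segments):
    """Derive a summarized mart for a (hypothetical) sales department.

    The mart holds one row per customer segment instead of one row per order.
    """
    totals = defaultdict(lambda: {"order_count": 0, "revenue_cents": 0})
    for order in orders:
        customer = customers[order["customer_id"]]
        segment_name = segments[customer["segment_id"]]
        totals[segment_name]["order_count"] += 1
        totals[segment_name]["revenue_cents"] += order["amount_cents"]
    return dict(totals)


sales_mart = build_sales_mart(orders, customers, segments)
```

The mart answers the department's summary questions directly, while any question about an individual order still has to go back to the atomic rows in the normalized store.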
One of the primary differences between these two techniques is the normalized data foundation. With the Kimball approach, the data structures needed ahead of the dimensional presentation depend on the source data and on the transformations required. In most cases, there is no need to store the same data twice, once in a normalized foundation and once in a dimensional one. Many of the people who favor a normalized data structure believe that it is faster than the dimensional structure, but they often fail to take into account the ETL work that is still needed to deliver the data in dimensional form.
Another difference between the two data warehouse approaches is the management of atomic data. With the CIF, atomic data is stored within a normalized data warehouse. In contrast, the Kimball method holds that atomic data should be placed within a dimensional structure. Once the data is there, it can be summarized in a wide variety of ways.
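To illustrate why atomic detail matters, the sketch below rolls the same set of transactions up along different dimensions. The `atomic_sales` rows and the `summarize` helper are hypothetical; the point is only that one detailed data set can serve many different summaries.

```python
from collections import defaultdict

# Hypothetical atomic data: one row per individual sale.
atomic_sales = [
    {"month": "2024-01", "store": "Oslo", "product": "Widget", "amount_cents": 2000},
    {"month": "2024-01", "store": "Bergen", "product": "Widget", "amount_cents": 1000},
    {"month": "2024-01", "store": "Oslo", "product": "Gadget", "amount_cents": 3000},
    {"month": "2024-02", "store": "Bergen", "product": "Gadget", "amount_cents": 1500},
]


def summarize(rows, *dimensions):
    """Total amount_cents for every combination of the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["amount_cents"]
    return dict(totals)


by_store = summarize(atomic_sales, "store")
by_product = summarize(atomic_sales, "product")
by_month_and_store = summarize(atomic_sales, "month", "store")
grand_total = summarize(atomic_sales)  # no dimensions: a single overall total
```

Had only the per-store totals been kept, the per-product and per-month views could not be rebuilt from them.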
It is important that the information you hold is detailed enough for users to ask relevant questions. Most users are not interested in the details of a single atomic transaction, but they often want a summary of a large number of transactions, and those summaries can only answer their questions if the underlying detail is available. The approach that you choose should be the one that best serves the needs of your company.