Monday, January 31, 2011

DS - Data Integration and Difference Between FACT & DIMENSION Tables

What is DATA integration?
Data from different sources (such as flat files, CSV, Oracle, DB2, Teradata, mainframes, etc.) is extracted and transformed into a single, specific pattern (one common structure). This is known as integration of data.
ETL tools like DataStage, Informatica and Pentaho (open source) are used to convert the data from its different source formats into that one common format.
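
As a rough illustration of the idea above (not how DataStage, Informatica or Pentaho work internally), here is a minimal Python sketch that extracts records from two differently shaped sources and transforms them into one common structure. The source data, the column names (cust_id, CUSTOMER_NO, AMT_CENTS, etc.) and the function names are all made up for this example.

```python
import csv
import io

# Hypothetical source 1: a flat file (CSV) with its own column names.
CSV_SOURCE = "cust_id,full_name,amount\n1,alice,10.50\n2,bob,7.25\n"

# Hypothetical source 2: rows as they might come back from a database query,
# with different column names and the amount stored in cents.
DB_SOURCE = [
    {"CUSTOMER_NO": 3, "NAME": "carol", "AMT_CENTS": 1999},
]


def from_csv(text):
    """Extract CSV rows and transform them into the common structure."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "customer_id": int(row["cust_id"]),
            "name": row["full_name"].strip().title(),
            "amount": round(float(row["amount"]), 2),
        }


def from_db(rows):
    """Transform database-style rows into the same common structure."""
    for row in rows:
        yield {
            "customer_id": int(row["CUSTOMER_NO"]),
            "name": row["NAME"].strip().title(),
            "amount": round(row["AMT_CENTS"] / 100, 2),
        }


def integrate():
    """Load step: combine all transformed records into one list."""
    return list(from_csv(CSV_SOURCE)) + list(from_db(DB_SOURCE))


if __name__ == "__main__":
    for record in integrate():
        print(record)
```

Every record that comes out of integrate() has the same three keys, no matter which source it started in. That is the "single common format" part of integration.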

OR

Data integration involves combining data that resides in different sources and providing users with a unified view of that data. This process becomes significant in a variety of situations, both commercial and scientific (combining research results from different bioinformatics repositories, for example). Data integration appears with increasing frequency as the volume of existing data and the need to share it grow. It has become the focus of extensive theoretical work, and numerous open problems remain unsolved.

***********************************
Difference between FACT AND DIMENSION TABLES

A fact table captures the data that measures the organization's business operations. A fact table might contain business sales events such as cash register transactions or the contributions and expenditures of a nonprofit organization. Fact tables usually contain large numbers of rows, sometimes in the hundreds of millions of records when they contain one or more years of history for a large organization.

Dimension tables contain attributes that describe fact records in the fact table. Some of these attributes provide descriptive information; others are used to specify how fact table data should be summarized to provide useful information to the analyst.

AND

A fact table stores a large amount of numerical data, so it is usually much bigger in size than a dimension table.
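
To make the difference concrete, here is a small sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_product), the columns and the sample rows are invented for illustration only. The fact table holds the numeric measures, one row per sale; the dimension table holds the descriptive attributes, and one of them (category) is used to decide how the facts get summarized.

```python
import sqlite3


def build_demo():
    """Create a tiny in-memory star schema: one dimension, one fact table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension table: descriptive attributes, few rows.
        CREATE TABLE dim_product (
            product_id   INTEGER PRIMARY KEY,
            product_name TEXT,
            category     TEXT
        );
        -- Fact table: numeric measures, one row per sales event.
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER REFERENCES dim_product(product_id),
            quantity   INTEGER,
            amount     REAL
        );
    """)
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Pen", "Stationery"), (2, "Notebook", "Stationery"), (3, "Mug", "Kitchen")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 1, 10, 15.0), (2, 2, 2, 8.0), (3, 3, 1, 6.5), (4, 1, 4, 6.0)],
    )
    return conn


def sales_by_category(conn):
    """Summarize the fact measures using a dimension attribute."""
    return conn.execute("""
        SELECT d.category, SUM(f.quantity), SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_id = f.product_id
        GROUP BY d.category
        ORDER BY d.category
    """).fetchall()


if __name__ == "__main__":
    for row in sales_by_category(build_demo()):
        print(row)
```

In a real warehouse the fact table would grow with every transaction, while the dimension table would only grow when a new product is added. That is why the fact table ends up so much larger.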

************
