Friday 7 October 2011

EDW Architecture

 

SOURCE SYSTEM:


An operational system of record whose function is to capture the transactions of the business.
A source system can be a “legacy system” in a mainframe environment, a set of “flat files”, or “multiple RDBMS systems”.
Source systems are typically diverse and uncoordinated:
–different platforms in many locations
–multiple file formats and data types



STAGING AREA:

A storage area and set of processes that clean, transform, combine, de-duplicate, archive, and prepare source data for use in the data warehouse.

The data staging area is everything in between the source system and the presentation server.

The staging area is dominated by the simple activities of sorting and sequential processing.

The data staging area does not need to be based on relational technology.

The data staging area does not provide query and presentation services.
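
The "sorting and sequential processing" idea can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the record fields (`customer_id`, `name`) and the two sample sources are hypothetical, and no relational database is involved.

```python
from itertools import groupby
from operator import itemgetter


def clean(record):
    """Trim whitespace and normalise the case of the customer name."""
    return {
        "customer_id": record["customer_id"].strip(),
        "name": record["name"].strip().title(),
    }


def stage(*sources):
    """Combine sources, clean each record, sort, then drop duplicates."""
    combined = [clean(r) for source in sources for r in source]
    combined.sort(key=itemgetter("customer_id"))
    # After sorting, duplicates sit next to each other, so one
    # sequential pass keeping the first record of each group is enough.
    return [next(group)
            for _, group in groupby(combined, key=itemgetter("customer_id"))]


# Hypothetical extracts from two source systems.
crm = [{"customer_id": " 42", "name": "ada lovelace "}]
billing = [
    {"customer_id": "42", "name": "ADA LOVELACE"},
    {"customer_id": "7", "name": "alan turing"},
]

staged = stage(crm, billing)
```

Customer 42 appears in both sources but only once in `staged`, which is the clean, combine and de-duplicate step in miniature.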

PRESENTATION LAYER:

Through the presentation layer, users can create custom reports, access standard reports, and access “dashboards” with drill-down capabilities as well as alerting functions.

Cleansed “presentation” data are moved from the data staging area to one or more servers designed for access by decision makers and others.

Presentation servers are:
–subject-oriented
–user-community driven
–for data marts: locally implemented
 
 
SEMANTIC LAYER:


The semantic layer defines a set of “logical” relationships that translate complex data terminology into business terms familiar to end users (e.g. product, customer, revenue).

For example, if a user wants to create a report showing product revenue generated from all customers for the past three years, the user simply chooses four objects: customer, product, revenue and time.

Thanks to the semantic layer, the user only has to be familiar with the business terminology relevant to the query; there is no need to understand the back-end database structure.
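
A rough Python sketch of what such a layer does behind the scenes: business objects are looked up in a mapping and turned into SQL. Every table and column name below (`fact_sales`, `dim_customer`, `cust_nm` and so on) is invented for illustration and does not come from any particular BI tool.

```python
# Hypothetical mapping from business terms to physical columns.
SEMANTIC_LAYER = {
    "customer": {"column": "dc.cust_nm", "kind": "dimension"},
    "product": {"column": "dp.prod_desc", "kind": "dimension"},
    "time": {"column": "dd.cal_year", "kind": "dimension"},
    "revenue": {"column": "SUM(fs.rev_amt)", "kind": "measure"},
}

# Hypothetical joins that the end user never has to see.
FROM_CLAUSE = (
    "FROM fact_sales fs "
    "JOIN dim_customer dc ON fs.cust_key = dc.cust_key "
    "JOIN dim_product dp ON fs.prod_key = dp.prod_key "
    "JOIN dim_date dd ON fs.date_key = dd.date_key"
)


def build_query(objects):
    """Translate a list of business objects into a SQL statement."""
    dims = [SEMANTIC_LAYER[o]["column"] for o in objects
            if SEMANTIC_LAYER[o]["kind"] == "dimension"]
    measures = [f'{SEMANTIC_LAYER[o]["column"]} AS {o}' for o in objects
                if SEMANTIC_LAYER[o]["kind"] == "measure"]
    sql = f"SELECT {', '.join(dims + measures)} {FROM_CLAUSE}"
    if dims and measures:
        # Measures are aggregated per combination of chosen dimensions.
        sql += f" GROUP BY {', '.join(dims)}"
    return sql


# The user picks four business objects; the layer produces the SQL.
query = build_query(["customer", "product", "revenue", "time"])
```

The user works only with the four business words, while the cryptic column names and joins stay hidden in the mapping. A filter such as "past three years" would be one more translated condition and is left out here to keep the sketch short.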


SUMMARY OF DIFFERENT LAYERS:

The presentation server is the target physical machine on which the data warehouse data is organized and stored for direct querying by end users, report writers, and other applications.

To summarize, three very different systems are required for a data warehouse to function:

–the source system
–the data staging area
–the presentation server

The source system should be outside the DW.

The data staging area is the initial storage and cleaning system for the DW.

The presentation server stores and presents data in a dimensional framework.
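
As a small illustration of a "dimensional framework", the sketch below builds a tiny star schema (one fact table, one dimension table) in an in-memory SQLite database using Python's standard `sqlite3` module. The table names, column names, and sample rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One dimension table and one fact table that references it.
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    revenue     REAL
);
""")

# Hypothetical rows, as they might arrive from the staging area.
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "Widget"), (2, "Gadget")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                 [(1, 10.0), (1, 5.0), (2, 7.5)])

# A typical presentation-server query: total revenue per product.
rows = conn.execute("""
    SELECT p.product_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON f.product_key = p.product_key
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
```

The fact table holds the measurements and the dimension table holds the descriptive attributes users filter and group by.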



