Wednesday, 7 December 2011

Bill Inmon vs. Ralph Kimball Approach


Bill Inmon's paradigm: The data warehouse is one part of the overall business intelligence system. An enterprise has one data warehouse, and data marts source their information from it. In the data warehouse, information is stored in third normal form (3NF).
Ralph Kimball's paradigm: The data warehouse is the conglomerate of all data marts within the enterprise. Information is always stored in the dimensional model.
There is no right or wrong between these two ideas, as they represent different data warehousing philosophies. In reality, the data warehouses in most enterprises are closer to Ralph Kimball's idea. This is because most data warehouses started out as a departmental effort, and hence they originated as a data mart. Only when more data marts are built later do they evolve into a data warehouse.
Inmon and Kimball have created a great debate in Information Technology during the last decade.  They both strove relentlessly to conceptualize information management for decision support.  They approached the problem with different philosophies, design techniques, and implementation strategies.
This article is an analysis of these two approaches based on the issues raised and discovered.
INTRODUCTION TO WILLIAM INMON AND RALPH KIMBALL
Mr. William (Bill) Inmon is known as the “Father of Data Warehousing” and is credited with coining the term “Data Warehouse” in 1991.  He defined a model to support a “single version of the truth” and championed the concept for more than a decade.  He also created the “Corporate Information Factory” in collaboration with Ms. Claudia Imhoff.  Mr. Inmon is known to have published 40+ books and 600+ articles.
Mr. Ralph Kimball is known as the “Father of Business Intelligence” for defining the concept behind “Data Marts”, for developing the science behind the analytical tools that utilize dimensional hierarchies, and for conceptualizing star-schema and snowflake data structures.  He defined a model to support analytical processing and championed data marts for more than a decade.  Though Kimball’s writings do not match Inmon’s in quantity, Kimball’s books are all-time best sellers on data warehousing.
PHILOSOPHIES: QUEST FOR A COMMON GOAL
Inmon and Kimball are two pioneers who founded different philosophies for enterprise-wide information gathering, information management, and analytics for decision support.  Inmon believes in creating a single enterprise-wide data warehouse to achieve an overall business intelligence system.  Kimball believes in creating several smaller data marts to achieve department-level analysis and reporting.
APPROACHES
Inmon’s philosophy recommends starting with a large, centralized, enterprise-wide data warehouse, followed by several satellite databases that serve the analytical needs of departments (later known as “data marts”). Hence, his approach has received the “Top Down” title.
Kimball’s philosophy recommends starting with several data marts that serve the analytical needs of departments, followed by “virtually” integrating these data marts for consistency through an Information Bus.  Hence, his approach has received the “Bottom Up” title. Mr. Kimball believes in multiple data marts that store information in dimensional models, so that the needs of individual departments and individual areas of the enterprise data can be addressed quickly.
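To make the “virtual” integration more concrete, here is a minimal Python sketch, under my own assumptions, of two departmental marts that stay separate but agree on one shared product lookup (what Kimball's writing calls a conformed dimension). The names conformed_product, sales_mart_facts, inventory_mart_facts, total_by_category and drill_across are hypothetical and the figures are made up. It illustrates the idea, not Kimball's prescribed implementation.

```python
from collections import defaultdict

# Hypothetical conformed dimension: one agreed-upon product lookup
# that every departmental mart references by the same key.
conformed_product = {
    1: {"name": "Novel", "category": "Books"},
    2: {"name": "Atlas", "category": "Books"},
    3: {"name": "Album", "category": "Music"},
}

# Each mart is built independently and keeps its own fact rows.
sales_mart_facts = [(1, 12.0), (2, 30.0), (3, 9.0)]   # (product_key, sales_amount)
inventory_mart_facts = [(1, 40), (2, 5), (3, 17)]     # (product_key, units_on_hand)


def total_by_category(facts):
    """Sum a mart's measure by the category held in the shared dimension."""
    totals = defaultdict(float)
    for product_key, measure in facts:
        category = conformed_product[product_key]["category"]
        totals[category] += measure
    return dict(totals)


def drill_across(left, right):
    """Line up two per-mart summaries on the shared category label."""
    categories = sorted(set(left) | set(right))
    return {c: (left.get(c), right.get(c)) for c in categories}


report = drill_across(
    total_by_category(sales_mart_facts),
    total_by_category(inventory_mart_facts),
)
print(report)
```

Because both marts use the same keys and category labels, their summaries line up without any reconciliation step. If each department had defined its own product list, the final join would be exactly where the inconsistencies surface.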
STRUCTURES
Besides the differences in approach, Inmon and Kimball also differ on the structure of the data.  Inmon believes in creating a relational model (third normal form: 3NF), whereas Kimball believes in creating a multi-dimensional model (star schemas and snowflakes).
Inmon argues that once the data is in a relational model, it attains the enterprise-wide consistency that makes it easier to spin off data marts in dimensional models.  Kimball argues that the actual users can understand, analyze, aggregate, and explore data inconsistencies more easily if the data is structured in a dimensional model.  Additionally, to enable the Information Bus, data marts are categorized [Imhoff, Mastering Data Warehouse Design] as atomic data marts and aggregated data marts, both of which use dimensional models.
Irrespective of the structural differences in the model, both Inmon and Kimball agree that there is a need to separate the detail-level data from the aggregate-level data.
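As a rough illustration of the two structures, the sketch below loads the same three sales into a small normalized layout and into a small star schema, using Python's built-in sqlite3 module. All table names, column names and figures are assumptions invented for this example. Real 3NF and dimensional designs are far richer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (3NF-style) layout: each attribute is stored once.
CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT,
                      category_id INTEGER REFERENCES category(category_id));
CREATE TABLE order_line (order_id INTEGER,
                         product_id INTEGER REFERENCES product(product_id),
                         order_date TEXT, amount REAL);

-- Dimensional (star) layout: one fact table plus denormalized dimensions.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY,
                          product_name TEXT, category_name TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT,
                       year INTEGER, month INTEGER);
CREATE TABLE fact_sales (product_key INTEGER REFERENCES dim_product(product_key),
                         date_key INTEGER REFERENCES dim_date(date_key),
                         amount REAL);

INSERT INTO category VALUES (1, 'Books'), (2, 'Music');
INSERT INTO product VALUES (10, 'Novel', 1), (11, 'Atlas', 1), (20, 'Album', 2);
INSERT INTO order_line VALUES (100, 10, '2011-12-01', 12.0),
                              (100, 20, '2011-12-01', 9.0),
                              (101, 11, '2011-12-07', 30.0);

INSERT INTO dim_product VALUES (10, 'Novel', 'Books'), (11, 'Atlas', 'Books'),
                               (20, 'Album', 'Music');
INSERT INTO dim_date VALUES (20111201, '2011-12-01', 2011, 12),
                            (20111207, '2011-12-07', 2011, 12);
INSERT INTO fact_sales VALUES (10, 20111201, 12.0), (20, 20111201, 9.0),
                              (11, 20111207, 30.0);
""")

# Sales by category in the normalized layout: walk two relationships.
normalized_totals = conn.execute("""
    SELECT c.category_name, SUM(o.amount)
    FROM order_line o
    JOIN product p ON p.product_id = o.product_id
    JOIN category c ON c.category_id = p.category_id
    GROUP BY c.category_name ORDER BY c.category_name
""").fetchall()

# The same question in the star schema: one hop from fact to dimension.
star_totals = conn.execute("""
    SELECT d.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product d ON d.product_key = f.product_key
    GROUP BY d.category_name ORDER BY d.category_name
""").fetchall()

print(normalized_totals)
print(star_totals)
```

Both queries return the same totals. The difference lies in where the work sits: the normalized layout stores the category name once and needs a longer join path, while the star schema repeats it on every product row so that the analytical query stays short.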
CONTENT
Another difference is in the granularity of the content. Inmon believes that the content of the data warehouse has to be at the most granular level possible and must include all the historical data available within an enterprise. His argument is that end users will eventually demand levels of detail that are not known at the time the data warehouse is built.
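The granularity argument can be shown with a few lines of Python. This is only a sketch with made-up rows, and roll_up is a hypothetical helper: it shows that atomic rows can be rolled up to any summary afterwards, while a summary cannot be broken back down into detail it never stored.

```python
from collections import defaultdict

# Hypothetical atomic rows: (date, store, product, amount).
atomic_sales = [
    ("2011-12-01", "North", "Novel", 12.0),
    ("2011-12-01", "South", "Novel", 12.0),
    ("2011-12-07", "North", "Atlas", 30.0),
]


def roll_up(rows, key_indexes):
    """Sum the last column of each row, grouped by the chosen columns."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[-1]
    return dict(totals)


# A summary someone asked for when the warehouse was built: month by product.
monthly_rows = [(date[:7], store, product, amount)
                for date, store, product, amount in atomic_sales]
monthly_by_product = roll_up(monthly_rows, (0, 2))

# A question that only came up later: sales by store. The atomic rows can
# answer it; monthly_by_product cannot, because it no longer has the store.
by_store = roll_up(atomic_sales, (1,))

print(monthly_by_product)
print(by_store)
```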
COMMON GOALS
Though Mr. Inmon and Mr. Kimball have different philosophies, they do tend to agree with each other in an indirect manner.  Though Inmon’s approach is based on a single data warehouse, he stressed an iterative approach and discouraged the “big bang” approach. On the other hand, though Kimball’s philosophy is to quickly create a few successful data marts at a time, he stresses integration for consistency via an Information Bus.
DATA WAREHOUSE vs. BUSINESS INTELLIGENCE
Business Intelligence = Inmon’s Corporate Data Warehouse + Kimball’s Data Marts + Data Mining + Unstructured Data.
ON THE STREET, IN PRACTICE
Over the years, almost every Fortune 500 company has implemented flavors of both Inmon’s and Kimball’s philosophies in pursuit of providing a “single version of truth” through easily maneuverable analytics. In the early 1990s, several conferences promoted, numerous magazines recommended, and almost all the large corporations attempted to build Inmon’s centralized data warehouses. These were huge undertakings that took several years of hard work and complex design.
Because Kimball’s philosophy is easier to implement and yields quicker returns, several mid-size companies initially implemented data marts rather than an enterprise-wide data warehouse. There are also reviews claiming that every project that tried Inmon’s approach failed and that none reported success.
However, database designers following Kimball’s approach underestimated the complexity of the relations between the data marts, and departments pressed to have their respective data marts developed independently and simultaneously. The result was “siloed” data marts. As a consequence, the essence of Inmon’s centralized data warehouse, which integrates departmental data for consistency, has been reconsidered and pursued.
INMON, KIMBALL, … AND THE OTHERS
An analysis of Inmon vs. Kimball is not complete without referring to the first published work on data warehousing, written in 1988 by Barry Devlin and Paul Murphy of IBM Ireland. They coined the less widely known term “Information Warehousing”, which is defined as “A structured environment supporting end users in managing the complete business and supporting Information Systems in ensuring data quality”. Mr. Devlin eventually published his work as a book in 1997 called “Data Warehouse: From Architecture to Implementation”.
THE LESS DISCUSSED DIMENSION: PROCESS
The one dimension that has largely been ignored, even though several experts have stressed it, is “process”. For some reason, corporate processes have been treated as if they had no influence on the “analytics”.
STILL AN UNCONQUERED SCIENCE
Despite the great efforts of Inmon, Kimball, and the others, the world of data warehousing still faces great challenges. Even in 2005, 14 years after Inmon first explained the concept, more than 50% of data warehouse projects were anticipated to fail [Robert Jaques]. In fact, Ted Friedman, a Principal Analyst at Gartner, wrote in 2005, “Many enterprises fail to recognize that they have an issue with data quality. They focus only on identifying, extracting, and loading data to the warehouse, but do not take the time to assess the quality.”
Today’s data warehouses suffer from poor data quality. Whether the quality of data was equally poor a decade ago is open to question. In the 1990s, a new breed of software products and the ease of implementing data-moving techniques opened several avenues that resulted in data duplication. As a result, data inconsistencies in source systems have been remedied by scrubbing and cleansing “local copies” of the data sets rather than by correcting them at the source.
If Inmon or Kimball had foreseen the wave of software product proliferation in the 1990s, which duplicated functionality within organizations, they might have placed more stress on architecting for better quality.
CONCLUSION
Inmon and Kimball have seen the world of accessing enterprise-wide data through different sets of eyes. They both agree that easier access to enterprise data in an accurate and timely manner is the key success factor for creating an integrated solution for corporate information. Additionally, they both agree that independent silos (often misrepresented as Kimball’s data marts) can only meet a set of specialized needs, are difficult to support and maintain in the long run, and often require reconciliation and duplicated effort to migrate into an enterprise-wide solution.
Inmon definitely foresaw the hurdles and issues of data management through integration, and he presented them in a very academic manner that cannot be ignored. Several failures in the market can be attributed to ignoring what Inmon warned about. However, Kimball has brought forward a practical approach that corporations love to execute, with a “project” mindset that has a definite budget and timeline. Inmon’s writings usually tend to generalize a concept with little attention to the technical details. Kimball’s writings try to establish a definitive science that includes implementation techniques with abundant examples.
However, data accuracy is still the monster that needs to be conquered by several organizations.
As a closing note, Inmon and Kimball have created approaches that complement each other. They only appear to contradict each other if one insists on “picking” a single approach.
REFERENCES
1. Using the Data Warehouse, W.H. Inmon, Richard D. Hackathorn, Wiley, July, 1994.
2. Managing the Data Warehouse, W.H. Inmon, John Wiley & Sons, December, 1996.
3. The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses, Ralph Kimball, John Wiley & Sons, February, 1996.
4. The Data Warehouse Challenge: Taming Data Chaos, Michael H. Brackett, John Wiley & Sons, July, 1996.
5. Data Warehouse Project Management, Sid Adelman, Larissa T. Moss, Addison-Wesley Professional, December, 2000.
6. Building the Data Warehouse (3rd Edition), W.H. Inmon, Wiley, March, 2002.
7. Mastering Data Warehouse Design, Claudia Imhoff, Nicholas Galemmo, Jonathan G. Geiger, Wiley, 2003.
