Showing posts with label Data Warehouse. Show all posts
Showing posts with label Data Warehouse. Show all posts

Saturday, February 13, 2016

Data Warehouse models–comparing Kimball to Inmon

Have recently seen various folks whiteboarding approaches to data warehouse (enterprise data, data stores, ods, etc.) One thing I’m noticing is that they generally fit one or both of the historically accepted approaches, those proffered by Ralph Kimball (and group) and from the writings of Bill Inmon. 

Here’s a comparison diagram I have referenced many times over the years.  The diagram, and article, is from 2010, but the original thought dates back to the 1990’s or earlier.

https://bennyaustin.files.wordpress.com/2010/05/kimball-vs-inmon4.jpg

https://bennyaustin.wordpress.com/2010/05/02/kimball-and-inmon-dw-models/

Take them for what they’re worth and for the era they come from.  Inmon’s model contains what looks like the current buzzword of ‘data lake’ (though Gartner and others are skeptical of the approach)  In all practicality, the only OLAP cubes developed today against a star schema layer.   Agile tools like PowerPivot and Tableau bypass this somewhat, but still contain a step where one models groupings (i.e. dims) and aggregate (i.e. facts)

Friday, January 29, 2016

Musings on Data Warehouse projects- dealing with data sources of varying quality

Had a discussion this past week with some great colleagues re: the quality of the database design of a potential source system (or lack thereof, depending on point of view)  I've racked my brain trying to find an article that provided some caution in dealing with this.  Almost sure it came from either a book or article from the Kimball Group.   I will paraphrase it:

From time to time, a data source will appear to warrant improvement of some kind.  The data source provider may be all for it.  Couldn't that spreadsheet be turned into a proper system?  
Avoid such efforts.  Your team will be constantly distracted by work of a very different cadence than that which data warehousing requires.

The closest I could find was a snippet in “The Microsoft Data Warehouse Toolkit”, from the chapter on Business Requirements Example: Adventure Works Cycles  (emphasis mine)

"….price lists and international support are important issues to his organization, but they are transaction system problems because they involve enhancing transaction systems or building new IT infrastructure. You can help with better reporting, but you shouldn't be dealing with connectivity and data capture issues if you can avoid it. "

I have taken this viewpoint into many battles over scope of a data warehouse project…  I have not won all of those battles ;)  Nonetheless, it’s a positive outcome to have this caution incorporated into your mindset when evaluating sources.  When possible move the system development 'out' to the proper parties as quickly as possible.