Friday, June 17, 2016

Performance Tuning: Oracle and SQL Server

In the past few weeks, I was able to attend an Oracle Performance Optimization training put on by my organization.  The training was delivered by Gaja Vaidyanatha (@dbperfman), a very cogent and informative trainer.  The reason for this post, though, is that I finally have the more formal training on the Oracle side to compare to the SQL Server side.  Around 2010 or so, I went to a week-long Performance Tuning and Optimization class put on by Microsoft.  That one was conducted by Josh Vickery (https://www.linkedin.com/in/joshvickery), also a blazingly smart guy.

What’s common between Oracle and SQL Server Performance Tuning?

Well, hopefully without coming across as a hanger-on: what's common is how obvious the general concepts are in hindsight.  The theory of constraints, basic analysis, and the scientific method rule the day.  Of course the tools vary, but the basic approach is the same: gather data on where your database operations are spending most of their time, determine the largest blocker, optimize that, then re-measure.  Lather, rinse, repeat.  So neither formal training was necessarily a new framework of thinking for someone who has had analytical tasks before.  But like most training, there were tricks of the trade and good mindsets shared.

Hints, connection settings, session settings

Both platforms have a framework of query hints, connection settings, and session settings, most of which you should avoid.  While the Einsteins (ref: Lhotka) may point out that you can get such and such a performance increase, or less code, etc., these tweaks come at great cost.  #1: the difficulty the next person is going to have figuring out what the heck you were trying to do.  #2: entropy: the characteristics of the database will change, and/or product upgrades will obviate the need for the custom configuration or actually make it harmful.  Every time I hear about such things, I think: what are the odds that I can outthink the engineers who designed either of these database platforms?

I will concede one point that was made: a hint or setting is justified when it provides information that wasn't previously available to the database platform, e.g. the cardinality of a temporary table, which can't be known at design time.
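To illustrate that exception, here is a minimal sketch (table names and the row estimate are made up): SQL Server's OPTION (RECOMPILE) lets the optimizer see a temp table's actual row count at execution time, and Oracle's (undocumented but widely used) CARDINALITY hint hands the optimizer an estimate it could not otherwise know.

-- SQL Server: recompile so the optimizer sees the temp table's real row count
SELECT o.OrderID, o.OrderTotal
FROM dbo.Orders AS o
JOIN #OrderBatch AS b ON b.OrderID = o.OrderID
OPTION (RECOMPILE);

-- Oracle: tell the optimizer roughly how many rows the global temporary table holds
SELECT /*+ CARDINALITY(b 100) */ o.order_id, o.order_total
FROM orders o
JOIN order_batch_gtt b ON b.order_id = o.order_id;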

It’s the application, stupid!

Other than obvious database perf issues, it's usually application design issues that drive performance: the application queries more data than it needs, queries the same data repetitively, or runs long operations on the user's thread.  While databases will continue to evolve and can be made to perform better and better for cheaper and cheaper, the application design owns the user experience.
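A trivial, hypothetical example of the 'querying too much data' pattern (table, column, and parameter names are invented):

-- Overfetching: pull every row and column, then filter and total in application code
SELECT * FROM Orders;

-- Better: let the database filter and aggregate, returning only what the screen needs
SELECT CustomerID, SUM(OrderTotal) AS TotalSpent
FROM Orders
WHERE OrderDate >= @StartDate
GROUP BY CustomerID;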

What’s different between Oracle and SQL Server Performance Tuning?

Some will read that question and say 'everything'.  I'm going to be contrarian and say that at a high level, not much.  The steps and tools are obviously different, but you still view an execution plan for a given query (either estimated or actual), or use a profiler tool to analyze waits.
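For example, both platforms expose cumulative wait statistics through system views; a sketch of each query follows (the TOP 10 cutoff is arbitrary, and the Oracle FETCH FIRST syntax assumes 12c or later):

-- SQL Server: waits accumulated since the instance started, heaviest first
SELECT TOP 10 wait_type, wait_time_ms, waiting_tasks_count
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;

-- Oracle: instance-wide wait events, heaviest first
SELECT event, time_waited, total_waits
FROM v$system_event
ORDER BY time_waited DESC
FETCH FIRST 10 ROWS ONLY;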

Tooling

The tooling for Oracle is a lot less mature than SQL Server's.  Maybe not a lot less, but much less accessible.  In SQL Server, through Management Studio (SSMS), you can view the estimated execution plan for a query (before actually running it) or the actual execution plan (after running it).  In Oracle, the provided tool, SQL Developer, does have a way to view the estimated plan.  To view the actual plan, one has to enable tracing on the session, browse to the file system of the server (!), copy down the trace file, pass it to an executable called TKPROF, and then view the output as a text file.  I will say that although SQL Server Management Studio's graphical view of the execution plan is easier to navigate, the outline view of the execution plan on the Oracle side does have its charms.
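If you would rather skip the GUI buttons, both platforms can also return an estimated plan through plain SQL; here is a sketch (the query and tables are placeholders, and DBMS_XPLAN output formatting varies by version):

-- SQL Server: return the estimated plan as XML instead of executing the statement
SET SHOWPLAN_XML ON;
GO
SELECT OrderID FROM dbo.Orders WHERE CustomerID = 42;
GO
SET SHOWPLAN_XML OFF;
GO

-- Oracle: populate PLAN_TABLE, then render it
EXPLAIN PLAN FOR SELECT order_id FROM orders WHERE customer_id = 42;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);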

In fairness, you might consider Oracle's tracing/TKPROF workflow more similar to SQL Server Profiler, and now Extended Events, since it includes wait stats and things of that nature.  So then you're back to this being somewhat limited to sysadmins vs. developers.
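For reference, a bare-bones Extended Events session that captures completed statements to a file might look like this (the session and file names are made up):

CREATE EVENT SESSION TraceLikeSession ON SERVER
ADD EVENT sqlserver.sql_statement_completed
    (ACTION (sqlserver.sql_text, sqlserver.session_id))
ADD TARGET package0.event_file (SET filename = N'TraceLikeSession.xel');
GO
ALTER EVENT SESSION TraceLikeSession ON SERVER STATE = START;
GO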

But to the point of it being less accessible, granting permission to n number of developers to access the file system of a database server is odd, or possibly untenable.  Luckily for the class I was in, one of our very sharp Oracle DBAs came up with a workaround to expose the default OS trace file location of one of our test servers over HTTP using a simple web server.  Otherwise, the 3-day class would have been spent looking at PowerPoint.  It would seem that third-party tools such as TOAD would provide this capability, but I was unable to confirm that with some regular TOAD users.

Estimated execution plan
  • SQL Server: click the 'Display Estimated Execution Plan' button in the SSMS toolbar.
  • Oracle: click the Explain Plan button in the SQL Developer toolbar (see https://markhoxey.wordpress.com/2012/07/10/obtaining-an-execution-plan/).

Actual execution plan
  • SQL Server: click the 'Include Actual Execution Plan' button in the SSMS toolbar.
  • Oracle: generate and process a trace file:

  1. Set a trace file identifier for your session:
     alter session set tracefile_identifier='MyTrace';
  2. Turn on tracing:
     alter session set sql_trace=true;
     alter session set events '10046 trace name context forever, level 12';
  3. Execute your SQL.
  4. Turn off tracing:
     alter session set events '10046 trace name context off';
     alter session set sql_trace=false;
  5. Find the trace file on the server's file system and copy it to your hard drive.
  6. Run TKPROF input.trc output.txt and view the resulting text file.

So while the steps and individual commands are different, the concepts of Performance Tuning and Optimization on the SQL Server and Oracle platforms are very similar.  If you can find third-party or open source tools to get around the rough edges of Oracle's trace file access, developers should find their optimization skills translate well across products.

Thursday, June 16, 2016

Database Architect - what does it all mean?

It's interesting to have such varied work and challenges in front of me in my work for the California Department of Corrections and Rehabilitation (CDCR).  My title is Database Architect, which could mean a lot of things and often does.  I looked back at the week and just noted the variation of tasks, in topic, depth, level of hands-on involvement, hard vs. soft skills, everything.  Blessed to get to work with and around really talented and hard-working people that keep the machine going.

Some of my week looked like:

Received training on Oracle Performance tuning
Conducted training on SQL Server Integration Services, including helping someone get the brand-new version of SQL Server Data Tools installed.
Helped a group migrate data into a SharePoint list and customize it
Trained on SQL Server Reporting Services Report Builder
Helped a developer troubleshoot Web Services over a proxy
Various network architecture plans and database questions

And luckily, since many people take vacation in summer:

Covering for our Infrastructure Architect, helping (mostly just facilitating) the build of CDCR's Inmate Network.

Covering for a database administrator, I've planned and will be performing the organization's first database backup based on differentials with staged recovery (a long way of saying that what was expected to take >15 hours should be completed in <5 hours; a sketch of the pattern follows below).
Planning portions of a vendor's data center move
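For anyone unfamiliar with that differential/staged-recovery pattern, here is a minimal T-SQL sketch (database name, file paths, and schedule are hypothetical):

-- Periodic full backup
BACKUP DATABASE BigDb TO DISK = N'E:\Backups\BigDb_full.bak' WITH INIT;

-- Much smaller differential backups: only extents changed since the last full
BACKUP DATABASE BigDb TO DISK = N'E:\Backups\BigDb_diff.bak' WITH DIFFERENTIAL, INIT;

-- Staged recovery: restore the full without recovering, then apply the latest differential
RESTORE DATABASE BigDb FROM DISK = N'E:\Backups\BigDb_full.bak' WITH NORECOVERY, REPLACE;
RESTORE DATABASE BigDb FROM DISK = N'E:\Backups\BigDb_diff.bak' WITH RECOVERY;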




Saturday, February 13, 2016

Data Warehouse models–comparing Kimball to Inmon

Have recently seen various folks whiteboarding approaches to data warehousing (enterprise data warehouses, operational data stores, etc.).  One thing I'm noticing is that they generally fit one or both of the historically accepted approaches: the one proffered by Ralph Kimball (and group) and the one found in the writings of Bill Inmon.

Here's a comparison diagram I have referenced many times over the years.  The diagram, and the article it comes from, are from 2010, but the original thinking dates back to the 1990s or earlier.

https://bennyaustin.files.wordpress.com/2010/05/kimball-vs-inmon4.jpg

https://bennyaustin.wordpress.com/2010/05/02/kimball-and-inmon-dw-models/

Take them for what they're worth and for the era they come from.  Inmon's model contains what looks like the current buzzword of 'data lake' (though Gartner and others are skeptical of that approach).  In all practicality, OLAP cubes today are only developed against a star schema layer.  Agile tools like PowerPivot and Tableau bypass this somewhat, but still include a step where one models groupings (i.e. dimensions) and aggregates (i.e. facts).
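As a hypothetical illustration of that dimensional step, a Kimball-style star schema boils down to dimension tables for the groupings and a fact table for the measures (table and column names are invented for the example):

CREATE TABLE DimDate (
    DateKey INT PRIMARY KEY,
    CalendarDate DATE,
    CalendarYear INT
);

CREATE TABLE DimProduct (
    ProductKey INT PRIMARY KEY,
    ProductName VARCHAR(100),
    Category VARCHAR(50)
);

-- The fact table holds the additive measures, keyed to the dimensions
CREATE TABLE FactSales (
    DateKey INT REFERENCES DimDate (DateKey),
    ProductKey INT REFERENCES DimProduct (ProductKey),
    SalesAmount DECIMAL(12, 2),
    Quantity INT
);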

Sunday, January 31, 2016

Prep notes for my upcoming talk on 'Partially Contained Databases'

Did a prep session this past Friday for my upcoming talk (Feb. 3) on 'Partially Contained Databases' to the Sacramento SQL Server Users Group (@SacSQLUG).  (Some #spoileralerts if anyone in the group is reading this before the Feb meeting.)  Had a few great, gracious current or former co-workers view the talk and demo over a remote screen-sharing session and provide feedback.

  1. The feedback from a group of people with differing experience with SQL Server was hugely valuable.  As a presenter, you are so close to the topic that you can present as obvious things that actually took you quite a few leaps to discover yourself.
  2. The call to action, which I will add to the slides, is to 'try it out'.  Try using partial containment for a simple case like a reporting service account that should have access to only one database (a sketch of that setup follows this list).
  3. This topic is fast becoming a standard practice to employ for high availability/disaster recovery scenarios.  (At this time, I'll leave it to the viewer's great mental ability to make the connection.)
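For the 'try it out' item above, a minimal sketch of partial containment for a reporting account might look like the following (the database name, user name, and password are placeholders, and the instance-level change requires sysadmin rights):

-- Allow contained database authentication on the instance
EXEC sp_configure 'contained database authentication', 1;
RECONFIGURE;
GO

-- Flip the reporting database to partial containment
ALTER DATABASE ReportingDB SET CONTAINMENT = PARTIAL;
GO

-- Create a contained user that authenticates at the database, not through a server login
USE ReportingDB;
GO
CREATE USER ReportSvc WITH PASSWORD = 'use-a-strong-password-here';
ALTER ROLE db_datareader ADD MEMBER ReportSvc;
GO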

Some of the core feedback (which may only make sense when the final presentation is shared):
  •  hard time following the difference between the types of accounts
  •  too much time spent on the 'intro', that is, the demonstration of backup/restore in an uncontained database to show the problem being solved
  •  appreciated the mention of the connection to certification exams
  •  pre-requisites of DBA knowledge
My thought process incorporating this feedback:
  • Simplified the login and user creation in the demo; namely, removed the example using Windows authentication
  • Removed a time-consuming section on the various permutations of recovering logins where the password is known/not known and the original server is available/unavailable.  That content is saved for a future blog post.
  • Pre-requisites: well, I think it's OK to make an assumption or two since the talk is aimed at a SQL Server user group.  And in any case, that ship has sailed for this week's talk.  I may review the abstract for future use and make sure it calls out that the viewer should be familiar with database backup/restore.
It hurts my heart a bit to give up the Windows auth and login recovery content, because that content reflects good practice.  But the time those sections eat up distracts from the two key takeaways:

  1. With a partially contained database, logins can live with the schema of a user database rather than the system database (that's a clue for the High Availability/Disaster Recovery connection)
  2. The db_owner and other database roles have additional power within a user database.
Fun stuff.  And I think if I just script out the sample database creation step, all the scripts could be published to let others run the demo themselves.

Friday, January 29, 2016

Musings on Data Warehouse projects- dealing with data sources of varying quality

Had a discussion this past week with some great colleagues re: the quality of the database design of a potential source system (or the lack thereof, depending on your point of view).  I've racked my brain trying to find an article that provided some caution on dealing with this.  I'm almost sure it came from a book or article from the Kimball Group.  I will paraphrase it:

From time to time, a data source will appear to warrant improvement of some kind.  The data source provider may be all for it.  Couldn't that spreadsheet be turned into a proper system?  
Avoid such efforts.  Your team will be constantly distracted by work of a very different cadence than that which data warehousing requires.

The closest I could find was a snippet in “The Microsoft Data Warehouse Toolkit”, from the chapter on Business Requirements Example: Adventure Works Cycles  (emphasis mine)

"….price lists and international support are important issues to his organization, but they are transaction system problems because they involve enhancing transaction systems or building new IT infrastructure. You can help with better reporting, but you shouldn't be dealing with connectivity and data capture issues if you can avoid it. "

I have taken this viewpoint into many battles over the scope of a data warehouse project…  I have not won all of those battles ;)  Nonetheless, it's a positive outcome to have this caution incorporated into your mindset when evaluating sources.  When possible, move the system development 'out' to the proper parties as quickly as possible.