Friday, December 13, 2013

Identifying the interloper

I'm often looking to identify top consumers - SQL statements that we know will regularly chew through time, CPU, memory etc.  After a while of living with an application, you will get used to seeing certain statements up there however over time, you may miss something that creeps in.

Most sites I visit as a consultant have the DBMS_WORKLOAD_REPOSITORY snapshots taken automatically every 30 or 60 minutes which may be fine for applications that run well and predictably, however finding problematic events (bad SQL, spikes in latency etc) require that snapshot creation be more frequent.  I personally like snapshots to be generated every 10 minutes which is the minimum interval permissible in Oracle 11g and 12c.  I should add that this is the most frequent that automatic snapshots will be taken - there is nothing stopping you from creating them almost as frequently as you like (beware granularity issues should you try this).

Recently I've been getting a lot of requests to identify what the top consuming SQL statements are and to identify new statements that appear as new software is deployed.  You can certainly do this through reading the AWR reports, but the comparison between snapshots requires a little manipulation unless you like scraping data into Excel (or SuperCalc 4 for those in the stone age) so I came up with some SQL to do the hard work for me.

In this particular case, the SQL looks for statements executed the most frequently, specifically by the "executions_delta" that is quite the useful statistic.  You could of course care about any of the measurements - elapsed time, CPU consumed, I/O waits etc - modify the SQL as appropriate.
So, here it is.  It will identify the top 5 ranked SQL statements for snapshots taken in the past ten hours.

  SELECT snap_id,
         MAX (CASE WHEN exec_rank = 1 THEN sql_id ELSE NULL END) AS "First",
         MAX (CASE WHEN exec_rank = 2 THEN sql_id ELSE NULL END) AS "Second",
         MAX (CASE WHEN exec_rank = 3 THEN sql_id ELSE NULL END) AS "Third",
         MAX (CASE WHEN exec_rank = 4 THEN sql_id ELSE NULL END) AS "Fourth",
         MAX (CASE WHEN exec_rank = 5 THEN sql_id ELSE NULL END) AS "Fifth"
    FROM (  SELECT a.snap_id,
                   a.sql_id,
                   a.executions_delta,
                   DENSE_RANK ()
                   OVER (PARTITION BY a.snap_id ORDER BY a.executions_delta DESC)
                      exec_rank
              FROM sys.wrh$_sqlstat a, sys.dba_hist_snapshot d
             WHERE     a.snap_id = d.snap_id
                   AND (    (d.begin_interval_time > SYSDATE - 10 / 24)
                        AND (d.end_interval_time < SYSDATE - 1 / 96))
          ORDER BY snap_id, exec_rank, sql_id)
   WHERE exec_rank < 6 GROUP BY snap_id

That done, you should end up with data that looks like this. Now it should be somewhat simpler to identify what has changed between snapshots.

SNAP_ID
First
Second
Third
Fourth
Fifth
2909
96g93hntrzjtr
3nkd3g3ju5ph1
db78fxqxwxt7r
53saa2zkr6wc3
32hbap2vtmf53
2910
cm5vu20fhtnq1
3c1kubcdjnppq
b2gnxm5z6r51n
95mpkn5xz9001
db78fxqxwxt7r
2911
cm5vu20fhtnq1
61a161nm1ttjj
5ms6rbzdnq16t
3c1kubcdjnppq
089dbukv1aanh
2912
cm5vu20fhtnq1
61a161nm1ttjj
5ms6rbzdnq16t
089dbukv1aanh
csnp95dz2r8ss
2913
cm5vu20fhtnq1
61a161nm1ttjj
089dbukv1aanh
9pvbjqqt222b4
csnp95dz2r8ss
2914
cm5vu20fhtnq1
61a161nm1ttjj
089dbukv1aanh
9pvbjqqt222b4
csnp95dz2r8ss
2915
cm5vu20fhtnq1
61a161nm1ttjj
5ms6rbzdnq16t
089dbukv1aanh
9pvbjqqt222b4
2916
cm5vu20fhtnq1
61a161nm1ttjj
089dbukv1aanh
3c1kubcdjnppq
9pvbjqqt222b4