Wednesday, March 12, 2014

Querying the Top 5 Timed Events

Querying the Top 5 Timed Events

In a previous life, I spent the better part of two years digesting other people's AWR reports.  Text, HTML, Statspack etc., all in an effort to show how slow their storage was an how much better their databases would perform on our all-flash storage array.

One of the frustrations of this job was that every problem generally only had one solution (a flash storage array, if you hadn't guessed) and along the way I encountered some pretty horribly tuned databases but of course could do nothing about that.

I did get to work with some excellent and talented people though, including the English duo of Chris Buckel and Ash Ubhi who have produced some sterling work in the field of AWR analysis - and parsing AWR reports and transforming them into lovely Excel charts...  All very useful sales stuff and actually quite essential for in-house DBA's for reporting purposes.  Most management like a pretty chart.

Writing my own AWR scripts to circumvent having to run a thousand reports led me down various rat holes - I was really looking for the top 5 timed events in query form.... and then I stumbled across the blog of my fellow Enkitecian, Alex Fatkulin.  In his blog entry (that's where the link goes), he provides a script to do just what I wanted.   Here's Alex's original script.

As you can see, it will give a nicely formatted output....

Top 5 Wait Events
Great for various purposes so utilizing the work Alex already did, I modified it slightly to give me statistics that I could turn into useful presentation materials.  Here's a snippet of the data I extracted:

More Top 5 Wait Events
In general, I only care about a few events - I/O or contention related for my purposes on this client engagement.  

Graphical wait event data
Not rocket science but it makes my life easier.  My code modifications to the original are minimal.

Have fun with your own AWR analysis.

Sunday, February 23, 2014

Beware the transparent HugePages in Linux 6

Beware the transparent HugePages in Linux 6

I came across this one earlier this week on a client site which had me scratching my head.


[root@someServer ~]# grep Huge /proc/meminfo
AnonHugePages 458752 kB
HugePages_Total 0
HugePages_Free 0
HugePages_Rsvd 0
HugePages_Surp 0
Hugepagesize 2048 kB

All looks good, except what are Anonymous HugePages?

Turns out that new with Linux 6 amongst others, comes this weird set of dynamic HugePages that are different from the static ones defined and created at boot time.

Digging into Oracle Support, I see the lovely note 1557478.1 that indicates that these are generally not good things to have around on Oracle servers and that they should be disabled.

For your edification, this is the important part of that note and describes how to tell if it's enabled and if so, how to disable it.


ALERT: Disable Transparent HugePages on SLES11, RHEL6, OL6 and UEK2 Kernels (Doc ID 1557478.1)
Starting with RedHat6, OL6, SLES11 and UEK2 kernels, Transparent HugePages are implemented and enabled (default) in an attempt to improve the memory management.  Transparent HugePages are similar to the HugePages that have been available in previous Linux releases.  The main difference is that the Transparent HugePages are set up dynamically at run time by the khugepaged thread in kernel while the regular HugePages had to be preallocated at the boot up time.

Because Transparent HugePages are known to cause unexpected node reboots and performance problems with RAC, Oracle strongly advises to disable the use of Transparent HugePages. In addition, Transparent Hugepages may cause problems even in a single-instance database environment with unexpected performance problems or delays. As such, Oracle recommends disabling Transparent HugePages on all Database servers running Oracle.

# cat /sys/kernel/mm/*transparent_hugepage/enabled
[always] never

To disable Transparent HugePages boot time either one of the following 2 methods may be used:
Add the following to the kernel boot line in /etc/grub.conf (a symlink to /boot/grub/grub.conf) and reboot the server (this is the preferred method):
transparent_hugepage=never

Once modified the line will read similar to the following example:
title Oracle Linux Server (2.6.32-300.25.1.el6uek.x86_64)
        root (hd0,0)
        kernel /vmlinuz-2.6.32-300.25.1.el6uek.x86_64 ro root=LABEL=/ transparent_hugepage=never
        initrd /initramfs-2.6.32-300.25.1.el6uek.x86_64.img

OR
Add the following lines in /etc/rc.local and reboot the server:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/transparent_hugepage/enabledfiif test -f /sys/kernel/mm/transparent_hugepage/defrag; then  echo never > /sys/kernel/mm/transparent_hugepage/defragfiPlease change the file path for RHEL kernel to /sys/kernel/mm/redhat_transparent_hugepage/ accordingly.


Tuesday, January 21, 2014

Time Warp Oddity

Time warp oddity

So a few days ago, a colleague was converting a client's database from 10.2 single instance (on Linux) to 11.2 RAC (on Linux).  Everything went smoothly.  As with the source database server, the new server uses UTC time.

However he noticed that all SYSDATE functions accessed by remote clients (or by logging in from the server using sqlplus and a connect string) were off - and whilst the current time might be 17:00 UTC, the time returned by the client was 12:00 EDT.

Puzzling - there's no timezone support for a listener which clearly is the issue here, yet the problem is from a little-known configuration file used as part of the Grid Infrastructure install.

Using "cat /proc/<pid>/environ for our listener process we saw that the process environment (derived from the configuration file below) was incorrect.

Oracle Support document 1390015.1 illustrates this problem which boils down to either editing the file shown below, or using srvctl setenv to fix the issue and then either way, bounce the clusterware.

From their note:
For 11.2.0.2 and above the TZ entry in $GRID_HOME/crs/install/s_crsconfig_<nodename>_env.txt sets to correct time zone! This enables resources started under the Grid Infrastructure home (listener, databases) using the timezone set in here and so has no requirement anymore to set TZ for DB and listener via setenv. For more details see: How To Change Timezone for 11gR2 Grid Infrastructure ( Document 1209444.1)

Nice and obscure.