Book Image

Oracle 11g R1 / R2 Real Application Clusters Handbook

Book Image

Oracle 11g R1 / R2 Real Application Clusters Handbook

Overview of this book

RAC or Real Application Clusters is a grid computing solution that allows multiple nodes (servers) in a clustered system to mount and open a single database that resides on shared disk storage. Should a single system (node) fail, the database service will still be available on the remaining nodes. RAC is an integral part of the Oracle database setup: one database, multiple users accessing it, in real time. This book will enable DBAs to get their finger on the pulse of the Oracle 11g RAC environment quickly and easily. This practical handbook documents how to administer a complex Oracle 11g RAC environment. It covers all areas of the Oracle 11g R1 RAC environment, with bonus R2 information included, and is indispensable if you are an Oracle DBA charged with configuring and implementing Oracle11g. It presents a complete method for the design, installation, and configuration of Oracle 11g RAC, ultimately enabling rapid administration of Oracle 11g RAC environments.Packed with real-world examples, expert tips, and troubleshooting advice, the book begins by introducing the concept of RAC and High Availability. It then dives deep into the world of RAC design, installation, and configuration, enabling you to support complex RAC environments for real-world deployments. Chapters cover RAC and High Availability, Oracle 11g RAC Architecture, Oracle 11g RAC Installation, Automatic Storage Management, Troubleshooting, Workload Management, and much more. By following the practical examples in the book, you will learn every concept of the RAC environment and how to successfully support complex Oracle 11g R1 and R2 RAC environments for various deployments in real-world situations.
Table of Contents (20 chapters)
Oracle 11g R1/R2 Real Application Clusters Handbook
Credits
About the Authors
About the Reviewers
Preface

Operating system-level commands for tuning and diagnosis


While the previous Oracle database commands prove useful for managing and troubleshooting Oracle RAC and Oracle Clusterware issues, there is a critical need to use operating system commands with the Linux and Unix environment to diagnose and tune difficult issues that may arise in the Oracle RAC ecosystem. We will now provide you with some useful details on tools that can benefit you in these situations.

STRACE

Linux provides a useful tracing tool for system calls that can be used for diagnosis and debugging issues with Oracle 11g RAC environments. As DBAs, we often have problems with our database environments when a runaway Oracle database process causes a performance problem in consuming excessive memory from the operating system. By understanding how to use a trace tool such as STRACE for your Linux environment, you can pinpoint the root cause of such a problem and find a solution to the failure more effectively.

To display the options for STRACE, enter the strace at a Linux or Unix shell prompt session as follows:

[oracle@raclinux1 ~]$ strace
usage: strace [-dffhiqrtttTvVxx] [-a column] [-e expr] ... [-o file]
[-p pid] ... [-s strsize] [-u username] [-E var=val] ...
[command [arg ...]]
or: strace -c [-e expr] ... [-O overhead] [-S sortby] [-E var=val] ...
[command [arg ...]]
-c -- count time, calls, and errors for each syscall and report summary
-f -- follow forks, -ff -- with output into separate files
-F -- attempt to follow vforks, -h -- print help message
-i -- print instruction pointer at time of syscall
-q -- suppress messages about attaching, detaching, etc.
-r -- print relative timestamp, -t -- absolute timestamp, -tt -- with usecs
-T -- print time spent in each syscall, -V -- print version
-v -- verbose mode: print unabbreviated argv, stat, termio[s], etc. args
-x -- print non-ascii strings in hex, -xx -- print all strings in hex
-a column -- alignment COLUMN for printing syscall results (default 40)
-e expr -- a qualifying expression: option=[!]all or option=[!]val1[,val2]...
options: trace, abbrev, verbose, raw, signal, read, or write
-o file -- send trace output to FILE instead of stderr
-O overhead -- set overhead for tracing syscalls to OVERHEAD usecs
-p pid -- trace process with process id PID, may be repeated
-s strsize -- limit length of print strings to STRSIZE chars (default 32)
operating system commandsoperating system commandsSTRACE-S sortby -- sort syscall counts by: time, calls, name, nothing (default time)
-u username -- run command as username handling setuid and/or setgid
-E var=val -- put var=val in the environment for command
-E var -- remove var from the environment for command
[oracle@raclinux1 ~]$

Now, let's take a look at how to use STRACE to trace an Oracle 11g RAC process:

[oracle@raclinux1 ~]$ strace oracle
execve("/u01/app/oracle/product/11.1.0/11g/bin/oracle", ["oracle"], [/* 41 vars */]) = 0
uname({sys="Linux", node="raclinux1.us.oracle.com", ...}) = 0
brk(0) = 0xf56b000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/u01/app/oracle/product/11.1.0/11g/lib/tls/i686/sse2/libskg

We can also use STRACE to examine system calls for an Oracle 11g database instance, in an Oracle 11g RAC environment, using the following example:

[oracle@raclinux1 bin]$ strace -c oracle
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
4.19 0.865232 41202 21 write
4.02 0.831352 207838 4 lstat64
4.02 0.831329 207832 4 _llseek
4.02 0.831327 207832 4 times
3.42 0.707903 707903 1 io_setup
3.42 0.707867 707867 1 _sysctl
3.42 0.707864 707864 1 socket
3.42 0.707843 707843 1 bind
3.42 0.707839 707839 1 set_thread_area
3.42 0.707837 707837 1 futex
3.42 0.707836 707836 1 dup
3.42 0.707836 707836 1 set_tid_address
3.42 0.707835 707835 1 gettid
3.42 0.707833 707833 1 fcntl64
3.42 0.707831 707831 1 getuid32
3.42 0.707831 707831 1 geteuid32
3.21 0.662962 82870 8 mprotect
3.21 0.662671 82834 8 gettimeofday
2.77 0.572299 26014 22 fstat64
2.71 0.560835 12192 46 old_mmap
2.66 0.549201 109840 5 shmdt
2.61 0.539206 107841 5 shmat
2.61 0.539194 107839 5 5 access
2.61 0.539158 107832 5 rt_sigprocmask
2.46 0.509131 2204 231 close
2.39 0.494136 41178 12 9 shmget
2.01 0.415677 207839 2 uname
1.57 0.325381 20336 16 16 mkdir
1.30 0.268287 5708 47 17 stat64
1.27 0.263489 909 290 61 open
1.19 0.246993 41166 6 getrlimit
1.19 0.246990 41165 6 rt_sigaction
0.87 0.179247 2938 61 read
0.60 0.123500 41167 3 brk
0.60 0.123500 41167 3 setrlimit
0.60 0.123494 41165 3 lseek
2.70 -1.1557161 2786 200 mmap2
1.52 -1.1313491 2986 105 munmap
------ ----------- ----------- --------- --------- ----------------
operating system commandsoperating system commandsSTRACE100.00 20.669398 1134 108 total

TRUSS

TRUSS is a useful utility available for the Sun Solaris Unix operating system that can be used for tracing Oracle 11g background processes.

Truss provides the ability to walk through trace, step by step, for Oracle 11g internal processes, by using the -p parameter to the truss command, followed by the Solaris process id (PID). You can also trace child processes for Oracle background processes by using the truss command with the -f parameter. The parameters available for the truss command are listed next:

truss [-fcaeil] [-[tvx] [!]syscall...] [-s [!]signal...] [-m [!]fault...] [-[rw] [!]fd...] [-o outfile] command | -p pid

Some typical options for truss are as follows:

  • -o: output to file

  • -f: trace child processes

  • -c: count system calls

  • -p: trace calls based on Unix PID

Now, let's take a look at how to trace the Oracle 11g database instance process by using the truss utility with Solaris:

solaris06$ truss -cp 7612
^C
Syscall seconds calls errors
read .900 55
Write .500 1
times .100 93
yield .500 426

GDB

GDB is a system debugger available for the Linux platform that can be used to discover problems with Oracle system processes. The man page available in Unix and Linux platforms for GDB will provide you with all of the command-line syntax options. To display the syntax for GDB, enter the man GDB command at a Unix or Linux shell prompt. Further details on GDB are also available online at http://linux.die.net/man/1/gdb.

GDB provides you with the ability to trace system memory usage for background processes as well as the ability to dump memory values for these background processes. This is useful in troubleshooting difficult problems such as when a memory leak occurs with an Oracle 11g database process. Before you can use the GDB utility to trace the Oracle 11g database instance process, you will first need to determine the specific Process ID (PID) in Linux by using the ps and grep Linux/Unix operating system commands. In the following example, we will use GDB to trace an open SQL*PLUS session within the Linux operating system.

After finding the process id (4107) for the SQL*PLUS session, we can begin the debug session with GDB:

[oracle@raclinux1 ~]$ gdb $ORACLE_HOME/bin/oracle 4107 start the new session for GDB
GNU gdb Red Hat Linux (6.1post-1.20040607.62rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols found)...Using host libthread_db library "/lib/tls/libthread_db.so.1".
Attaching to program: /u01/app/oracle/product/11.1.0/11g/bin/oracle, process 4107 The GDB dump starts by attaching to the PID we gave it earlier
Reading symbols from /u01/app/oracle/product/11.1.0/11g/lib/libskgxp11.so...(no debugging symbols found)...done.
Loaded symbols for /u01/app/oracle/product/11.1.0/11g/lib/libskgxp11.so
Reading symbols from /lib/tls/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/tls/librt.so.1 GDB loads the shared libraries for Oracle to perform the memory dump trace
Reading symbols from /u01/app/oracle/product/11.1.0/11g/lib/libnnz11.so...(no debugging symbols found)...done.
Loaded symbols for /u01/app/oracle/product/11.1.0/11g/lib/libnnz11.so
Reading symbols from /u01/app/oracle/product/11.1.0/11g/lib/libclsra11.so...done.
Loaded symbols for /u01/app/oracle/product/11.1.0/11g/lib/libclsra11.so
Reading symbols from /u01/app/oracle/product/11.1.0/11g/lib/libdbcfg11.so...done.
Loaded symbols for /u01/app/oracle/product/11.1.0/11g/lib/libdbcfg11.so
Reading symbols from /u01/app/oracle/product/11.1.0/11g/lib/libhasgen11.so...done.
Loaded symbols for /u01/app/oracle/product/11.1.0/11g/lib/libhasgen11.so
Reading symbols from /u01/app/oracle/product/11.1.0/11g/lib/libskgxn2.so...done.Loaded symbols for /u01/app/oracle/product/11.1.0/11g/lib/libskgxn2.so
Reading symbols from /u01/app/oracle/product/11.1.0/11g/lib/libocr11.so...done.
Loaded symbols for /u01/app/oracle/product/11.1.0/11g/lib/libocr11.so
Reading symbols from /u01/app/oracle/product/11.1.0/11g/lib/libocrb11.so...done.Loaded symbols for /u01/app/oracle/product/11.1.0/11g/lib/libocrb11.so
Reading symbols from /u01/app/oracle/product/11.1.0/11g/lib/libocrutl11.so...done.
Loaded symbols for /u01/app/oracle/product/11.1.0/11g/lib/libocrutl11.so
Reading symbols from /usr/lib/libaio.so.1...done.
Loaded symbols for /usr/lib/libaio.so.1
Reading symbols from /lib/libdl.so.2...done.
GDB commandGDB commanddebug sessionLoaded symbols for /lib/libdl.so.2
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libpthread.so.0...done.
(gdb) stepi we use the STEPI command for GDB to walk the dump through the memory
[Switching to Thread -1208035648 (LWP 4107)]
0x0090a7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) call ksudss(10) We issue the CALL command to move through the memory stack
$1 = 0
(gdb) detach we use the detach command for GDB to end our memory tracing session
Detaching from program: /u01/app/oracle/product/11.1.0/11g/bin/oracle, process 4107

The next step is to attach the Oracle process id (PID) 4107 found above to the GDB debugger. Before we can make use of the GDB functions, we must first use the GDB attach command to link the debugger to the Oracle process that will be traced. This will enable us to drill down further into the tracing for Oracle 11g by using the attach command with GDB, as well as show the memory dump information:

(gdb) attach 4107 we start a new memory dump trace by using the attach command for GDB
Attaching to program: /u01/app/oracle/product/11.1.0/11g/bin/oracle, process 4107
[New Thread -1208035648 (LWP 4107)]
Symbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libskgxp11.so
Symbols already loaded for /lib/tls/librt.so.1
Symbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libnnz11.so
Symbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libclsra11.so
Symbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libdbcfg11.so
Symbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libhasgen11.soSymbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libskgxn2.so
Symbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libocr11.so
Symbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libocrb11.so
Symbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libocrutl11.soSymbols already loaded for /usr/lib/libaio.so.1
Symbols already loaded for /lib/libdl.so.2
Symbols already loaded for /lib/tls/libm.so.6
Symbols already loaded for /lib/tls/libpthread.so.0
Symbols already loaded for /lib/libnsl.so.1
Symbols already loaded for /lib/tls/libc.so.6
Symbols already loaded for /lib/ld-linux.so.2
Symbols already loaded for /usr/lib/libnuma.so
Symbols already loaded for /lib/libnss_files.so.2
Symbols already loaded for /u01/app/oracle/product/11.1.0/11g/lib/libnque11.so
[Switching to Thread -1208035648 (LWP 3107)]
0x0090a7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2

Now, to get the stack dump, we will need to issue the bt command for our GDB session:

(gdb) bt We issue the BT command for GDB to obtain a listing of the memory stack dump
#0 0x0090a7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x003d14c3 in __read_nocancel () from /lib/tls/libpthread.so.0
#2 0x0e5b652e in sntpread ()
#3 0x0e5b64cf in ntpfprd ()
#4 0x0e59b3c7 in nsbasic_brc ()
#5 0x0e59e20e in nsbrecv ()
#6 0x0e5a2d20 in nioqrc ()
#7 0x0e39da65 in __PGOSF20_opikndf2 ()