What is Event Tracing ? Does it Really Help -- Explained with an Example

Does Event Tracing in Oracle really helps ?

Recently, I face an "ORA-00942: table or view does not exist" which seems very tiny to me, but soon i realize that, this was not small problem, because this error was coming from the same query but intermittent, How is this possible, that same table exists once but not another time. I assume some other situations like this could be table partition issue, but all my assumption were not helping at all.

Finally, I decide to Trace an Event for "ORA-00942" message and then solve the issue, which was due to a bug in Oracle 11.2.0.2. What if, I had not traced the event, definitely i was not able to overcome this challenge. Yes, Event tracing really helps ...

What is Event Tracing ?

I believe, every DBA has seen ORA-00600 trace file generates every time ORA-00600 or ORA-07445 occurs which has each and every detail about this error. Generating trace file for  ORA-00600 or ORA-07445 is default behavior of oracle, but Same doesn't happen in case of other errors.

To solve any ORA-00600 or ORA-07445, Database Administrator analyze it's call stack and try to match with some bug and find either some workaround or patch to fix this.

So Event Tracing is basically to note down each and every step for execution of particular query from starting to up to when issue or error Reoccurs in a file, which can be analyzed for troubleshooting.

Let's take an Example to understand Tracing an Event in Oracle:- In below example I am generating an "ORA-01422" error and will trace it to find out root cause for the error.

SQL> select * from dept;
    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 operation      BOSTON

SQL> insert into dept values(41,'SALES','USA');
1 row created.

SQL> commit;
Commit complete.

SQL> create or replace procedure test_proc as
v_two    varchar2(5);
begin
   select DNAME into v_two from dept where DNAME = 'SALES';
end;
/ 
Procedure created.

SQL> exec test_proc;

BEGIN test_proc; END;
*
ERROR at line 1:
ORA-01422: exact fetch returns more than requested number of rows
ORA-06512: at "SCOTT.TEST_PROC", line 8
ORA-06512: at line 1

This is a very small piece of code, What if this has been a code of thousands lines. It has been very difficult for Remote DBA to find out which sql query or table is generating this error message.

But DBA can easily find out culprit command or table by setting up tracing on "ORA-01422: exact fetch returns more than requested number of rows" (Error messages are called as events)

How to Trace an Event ?

For event tracing DBA has to use "alter system/session set event {event_id} trace name {action} {paremter}" command. Unix Find command is used to find event file generated at OS level. In below steps Database Administrator is tracing "ORA-01422: exact fetch returns more than requested number of rows" and locating trace file with Linux find command.  DBA has to first set the event for tracing then Re execute the query generating the issue to catch the event.
SQL> connect sys as sysdba
Enter password: 
Connected.

SQL> ALTER SYSTEM SET EVENTS '1422 TRACE NAME ERRORSTACK LEVEL 3';
System altered.

SQL> set linesize 1000
SQL> show parameter user_dump_dest
NAME                                 TYPE                             VALUE
------------------------------------ -------------------------------- ------------------------------
user_dump_dest                       string                           /etc/oracle/diag/rdbms/orcl/orcl/trace   

SQL> connect scott 
Enter password: 
Connected.

SQL> exec test_proc;

BEGIN test_proc; END;
*
ERROR at line 1:
ORA-01422: exact fetch returns more than requested number of rows
ORA-06512: at "SCOTT.TEST_PROC", line 8
ORA-06512: at line 1 

[oracle@database ~]$ cd /etc/oracle/diag/rdbms/orcl/orcl/trace
[oracle@database trace]$ find . -type f -print | xargs grep -li "ORA-01422"
./alert_orcl.log
./orcl_ora_3262.trc

Here is message from Alert log and trace file generated from database, Alert log shows name of trace file as well.
Sat Jun 08 23:58:49 2013
OS Pid: 3295 executed alter system set events '1422 TRACE NAME ERRORSTACK LEVEL 3'
Sat Jun 08 23:59:02 2013
Errors in file /etc/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_3262.trc:
ORA-01422: exact fetch returns more than requested number of rows
Sat Jun 08 23:59:11 2013
Dumping diagnostic data in directory=[cdmp_20130608235911], requested by (instance=1, osid=3262), summary=[abnormal process termination].
"alert_orcl.log" 3253L, 137636C  

Trace file orcl_ora_3262.trc

Trace file /etc/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_3262.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /etc/oracle/oracle
System name:    Linux
Node name:      database.example.com
Release:        2.6.18-194.el5
Version:        #1 SMP Mon Mar 29 20:06:41 EDT 2010
Machine:        i686
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 20
Unix process pid: 3262, image: oracle@database.example.com (TNS V1-V3)
** 2013-06-08 23:59:02.829
*** SESSION ID:(45.27) 2013-06-08 23:59:02.829
*** CLIENT ID:() 2013-06-08 23:59:02.829
*** SERVICE NAME:(SYS$USERS) 2013-06-08 23:59:02.829
*** MODULE NAME:(SQL*Plus) 2013-06-08 23:59:02.829
*** ACTION NAME:() 2013-06-08 23:59:02.829
dbkedDefDump(): Starting a non-incident diagnostic dump (flags=0x0, level=3, mask=0x0)
----- Error Stack Dump -----
ORA-01422: exact fetch returns more than requested number of rows
--- Current SQL Statement for this session (sql_id=9hts4bbxq1k14) -----
SELECT DNAME FROM DEPT WHERE DNAME = 'SALES'
---- PL/SQL Stack -----
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x3de13254         8  procedure SCOTT.TEST_PROC
                                                                                                           From the above trace file DBA can easily identify for "Current SQL Statement for this session" is "SELECT DNAME FROM DEPT WHERE DNAME = 'SALES'" which is causing "ORA-01422: exact fetch returns more than requested number of rows"
From the above trace file DBA can easily identify for "Current SQL Statement for this session" is "SELECT DNAME FROM DEPT WHERE DNAME = 'SALES'" which is causing "ORA-01422: exact fetch returns more than requested number of rows" error message. Call stack is also available for this query is trace file, Database Administrator can also analyze call stack for other issues.

This is how tracing helps in troubleshooting. Now, let's talk about detail of tracing parameters and diff-2 levels in event tracing.

Syntax for Tracing and Event

 {EVENT_ID} TRACE NAME {ACTION} {PARAMETERS}

Examples: 

 
 alter SYSTEM set events ‘WHAT TO TRACE?‘ scope = spfile;
 alter SESSION set events ‘WHAT TO TRACE?‘;

ALTER SYSTEM set events '904 trace name ERRORSTACK LEVEL 3';
ALTER SESSION set events '10046 trace name CONTEXT FOREVER, LEVEL 12';

In above two examples DBA is tracing "ORA-00904: invalid identifier" error and Enable standard SQL_TRACE functionality for sql queries. These are called as events.

Now, The most common Actions in tracing an event are:

ERRORSTACK - This action with produce session errorstack trace as well see in the above example, This is used for tracing errors in code etc.
SYSTEMSTATE - System state dump is taken in case of database hang situation.
CONTEXT - configure some behaviour. I have not seen so much use of this.

Examples of Actions:

ALTER SYSTEM set events '904 trace name ERRORSTACK LEVEL 3';
This setting causes an ERRORSTACK dump when an ORA-904 occurs.

ALTER SYSTEM set events 'IMMEDIATE trace name SYSTEMSTATE LEVEL 266';
This setting causes a SYSTEMSTATE dump to occur when the command is issued.

ALTER SESSION set events '10046 trace name CONTEXT FOREVER, LEVEL 12';
This setting enables level 12 SQL Tracing.

Parameters for Event Tracing

Every action has it's own parameters, which are given below:

ERRORSTACK Parameters:

0 - Error stack only
1 - Error stack and function call stack (if implemented)
2 - As 1 plus the ProcessState
3 - As 2 plus the context area (all cursors and current cursor highlighted)

DBA's Usually prefer to use 3 as the parameter of tracing, Since it include all details.

SYSTEMSTATE parameters

1 - Very basic process information only
10 - Most common level - includes state object trees for all processes. Includes "interesting" RAC resources in the trace (in RAC only).
11 - In RAC level 11 includes a dump of any resources not on the freelist just after the "BUSY GLOBAL CACHE ELEMENTS" section.
LEVEL + 256 - Adding 256 to the level will try to dump short stack info for eac

CONTEXT parameters

CONTEXT only seems to have two parameters:
FOREVER - enables the event until disabled
OFF - disables the event


Enabling Tracing at different Levels


Database Administrator can enable Event tracing at System or session level or both. Enabling tracing at system level or permanent needs a instance restart. DBA has to add scope=spfile at end of trace command like

ALTER SYSTEM set events '904 trace name errorstack level 3' SCOPE=SPFILE;

Enabling tracing at Session level will take effect immediately and this is default as well, DBA has not to give any scope parameter for this.

alter SYSTEM set events '1042 trace name errorstack level 3' SCOPE = MEMORY;
alter SESSION set events '1042 trace name errorstack level 1';

To set event at both Spfile and Memory use scope=both;

Once you have done with tracing don't forget to disable it otherwise it will generate lot of files at user_dump_dest location and mount point will be full.


Disabling Event Tracing


alter session set events '1555 trace name errorstack off';
ALTER SYSTEM SET EVENTS '1422 TRACE NAME ERRORSTACK off';

Tracing an Event helps me a lot to solve my day to day issue, If this also help you share your experience in comments.

1 comment: