10 Steps to Analyze AWR Report in Oracle

As you have Generated AWR Report in Oracle, Next task is to Analyze AWR Report in Oracle. By Reading AWR Report  you can easily solve issues like Slow database, high wait events, Slow query and many more. Though It's a lengthy report but Analyzing or Reading relevant part of AWR Report can help to troubleshoot issues in easy and fast manner.

AWR stands for Automatically workload repository, Though there could be many types of database performance issues, but when whole database is slow, then there are two possibilities.

1. Issue with Database Machine. OS Watcher is the best tool to start.
2. If Database performance issue, Then AWR Report is the place to look at
.

In case if a particular query is not performing well, i would suggest to look at execution plan of the query, stats of underlying table etc. In this case AWR won't help much.

Recommendations before getting an AWR Report.

1. Collect Multiple AWR Reports: It's always good to have two AWR Reports, one for good time (when database was performing well), second when performance is poor. This way Remote DBA can easily compare good and bad report to find out the culprit.

2. Stick to Particular Time: "Database is performing slow" will not help anymore to resolve performace issues. We have to have a specific time like Database was slow yesterday at 1 Pm and continue till 4Pm. Here, DBA will get a report for these three hours.

3. Split Large AWR Report into Smaller Reports: Instead of having one report for long time like one report for 4hrs. it's is better to have four reports each for one hour. This will help to isolate the problem.

In case of RAC env. generate one report for each instance. Once, you have generated AWR report. Now, it's time of analyze the report. Since, AWR report is a huge report and area to look into AWR is also depends on problem to problem. Here, I am list most common area for a DBA to look into which will give a clear picture of the issue.

Steps to Analyze AWR Report

1. Database Details:

After getting an AWR Report This is first and Top part of the report. In this part cross check for database and instance and and database version with the Database having performance issue.This report also show RAC=YES if it's an RAC database.

 

2. Host Configuration:

This will give you name, platform CUP, socket and RAM etc. Important thing to notice is number of cores into the system. In this example there are 12 CUP's in Cores.



3. Snap Shot Detail:

This are the detail about snap shot taken, Snap start time and end time. Difference between them is as "Elapsed". Here is a new term "DB Time"

DB Time= session time spent in database.

DB Time= CPU Time + Non IDLE wait time.

You can find, DB time is very large as compared to Elapse time, which is not a concern. Check if you have taken a report for the time having performance problem. If yes fine, other wise take a report for performance problem time.

Next is Cache Sizes, which is just detail about SGA components.

4. Load Profile:

Here are few important stats for a DBA to look into. Fist is "DB CPU(s)" per second. Before that let's understand how DB CUP's work. Suppose you have 12 cores into the system. So, per wall clock second you have 12 seconds to work on CPU.


 So, if "DB CPU(s)" per second in this report > cores in (Host Configuration (#2)).

means env is CPU bound and either need more CPU's or need to further check is this happening all the time or just for a fraction of time. As per my experience there are very few cases, when system is CPU bound.

In this case, machine has 12 cores and DB CPU(s) per second is 6.8. So, this is not a CPU bound case.

Next stat to look at are Parses and Hard parses. If the ratio of hard parse to parse is high, this means Database is performing more hard parse. So, needs to look at parameters like cursor_sharing and application level for bind variables etc.

5. Instance Efficiency Percentages:


In these statistics, you have to look at "% Non-Parse CPU". If this value is near 100% means most of the CPU resources are used into operations other than parsing, which is good for database health.


6. Top 5 Timed Foreground Events:

This is another most important stats to consider while looking at AWR Report for any database performance related issue. This has a list of top 5 foreground wait events.


Here, first of all check for wait class if wait class is  User I/O , System I/O,  Others etc this could be fine but if wait class has value "Concurrency" then there could be some serious problem. Next to look at is Time (s) which show how many times DB was waiting in this class and then Avg Wait (ms). If Time(s) are high but  Avg Wait (ms) is low then you can ignore this. If both are high or Avg Wait (ms) is high then this has to further investigate.

In the above screen shot, most of the resource are taken by DB CPU = 64% DB time. Taking resource by DB CUP is a normal situation.

Let's take an example,  In which event is "log file switch (checkpoint incomplete) " which has high waits, huge Time (s) and large values in Avg Wait (ms) and wait class is configuration. So, here you have to investigate and resolve log file switch (checkpoint incomplete).

Host CPU, Instance CPU and Memory Statistics are self explanatory.  Next is RAC Statistics, I did not find any issue in these stats most of the time.

7. Time Model Statistics:

This is a detailed explanations of system resource consumptions. Stats are order by Time (s) and % of DB Time.
 
A noticeable result Sum of all  % of DB time is > 100%. why is this ?

Because this is cumulative time i.e. In this case SQL execute elapsed time is taking 89% of DB time, which includes it sub parts like parse time elapsed, hard parse elapsed time etc. So, if you find Hard parse time elapsed is taking more %. So investigate further so on and so forth.

DBA has to look for stat which is taking abnormal % of DB time. 

8. Operating System Statistics - Detail:

This is the information related to OS, what is the load status on System shown here.


This report shows, system is 62 and 70% idle at time of report taken, So, there is no resource crunch at system level. But if, you found very high busy, user or sys % and indeed this will led to low idle %. Investigate what is causing this. OS Watcher is the tool which can help in this direction.

Next, very crucial part of AWR report for a DBA is SQL Statistics. Which has all sql query details executed during report time interval.


We will explore few of them, To understand, how to analyzed these reports. Let's start with

9. SQL Ordered by Elapsed Time:

As explained by name itself, this lists SQL queries ordered by Elapsed time into reported time interval.


In this report, look for query has low executions and high Elapsed time per Exec (s) and this query could be a candidate for troubleshooting or optimizations. In above report, you can see first query has maximum Elapsed time but no execution. So you have to investigate this.

In Important point, if executions is 0, it doesn't means query is not executing, this might be the case when query was still executing and you took AWR report. That's why query completion was not covered in Report.

10. SQL Ordered by CUP Time:

In this report, SQL queries are listed on the basis of CPU taken by the query i.e. queries causing high load on the system. The top few queries could be the candidate query for optimization.








From above stat, look for queries using highest CPU Times, If a query shows executions 0, this doesn't means query is not executing. It might be same case as in SQL queries ordered by Elapsed time. The query is still executing and you have taken the snapshot.

However, There are so many other stats in AWR Report which a DBA needs to consider, I have listed only ten of them but these are the most commonly used stats for any performance related information.

Please share you view about this article, Does it helps you to understand, How to analyze AWR Report.

72 comments:

  1. Nice article, me also working in oracle side as a Tech Arch.
    Keep n touch
    Thanks
    Vijay
    Bangalore
    vijay.7858@gmail.com

    ReplyDelete
    Replies
    1. Vijay, Thanks for your comments.

      Delete
    2. there are 10 stats discussed and all are very clear to me now..hope u will show some other stats as well..thanks a lot.

      Regards
      Amit Pal

      Delete
    3. Hi Umesh,

      Keep up the good work !!!

      Tx

      Delete
    4. Hi,

      It was really a nice starts for performance tuning. Please share some more explanation on the same to dig more to AWR Report.

      Delete
  2. Very Good article....thanks for sharing this....

    ReplyDelete
  3. Very nice article.....thanks for sharing...also share some books & tips which have detail information and analysis of AWR,ASH & ADDM Report.

    Thanks & Regards
    Vaibhav Jain
    Jr. Oracle DBA
    vaibhav.mca@gmail.com

    ReplyDelete
  4. Vaibhav, Thanks for your appreciation.

    ReplyDelete
  5. great doc!! Really appreciate it.

    ReplyDelete
  6. Great Doc!! Really appreciate it. Many Thanks.

    ReplyDelete
  7. Thanks for Sharing the document, its good analysis of AWR Report.

    Thanks & Regards,
    Padma
    Jr. Oracle DBA
    mailmeonpadma@gmail.com

    ReplyDelete
  8. Thanks for Document its good analysis for AWR report :)

    ReplyDelete
  9. Nice explanation.. Thanks a lot.
    May I also request you to post similar blog to analyze DB when it is under peak load using ASH?

    Thanks in advance..

    ReplyDelete
  10. Really this article helped me alot.Thanks Umesh

    ReplyDelete
  11. Thanks a lot Umesh for this useful article.

    ReplyDelete
  12. It's a very informative and helpful article.

    ReplyDelete
  13. Nicely explain all of them...Really appreciated.

    ReplyDelete
  14. Dear Umesh ,

    I really appreciate your effort for explanation of AWR.

    Many thanks for sharing this article.

    Regards
    Goulay
    Oracle DBA

    ReplyDelete
  15. Very informative

    small typo
    10. SQL Ordered by CUP Time needs to be read as 10. SQL Ordered by CPU Time

    ReplyDelete
  16. SQL execute elapsed time is taking 89% of DB should be SQL execute elapsed time is taking 98% of DB (looks like typo)

    ReplyDelete
  17. Superb article.. helped me in understanding the awr report checks.. Great job..

    Thanks

    ReplyDelete
  18. very helpful....thanks for sharing!

    ReplyDelete
  19. you need to fix this
    SQL Ordered by CUP Time
    SQL Ordered by CPU Time

    ReplyDelete
  20. This report has been taken when user sessions are active for long time. Please tell me what is sbtwrite2.

    Top 5 Timed Events Avg %Total
    ~~~~~~~~~~~~~~~~~~ wait Call
    Event Waits Time (s) (ms) Time Wait Class
    ------------------------------ ------------ ----------- ------ ------ ----------
    Backup: sbtwrite2 40,575 801 20 207.2 Administra
    RMAN backup & recovery I/O 69,380 588 8 152.2 System I/O
    CPU time 318 82.3
    Backup: sbtclose2 22 113 5131 29.2 Administra
    control file parallel write 51,693 97 2 25.0 System I/O

    ReplyDelete
  21. In the #6 you mention about "Concurrency" as serious problem - could you give any example? What does it mean concurency?

    ReplyDelete
    Replies
    1. As in real life, concurrency means to client wants to access to the same object simultaneously. Lock is the solution but could be an issue, when it is not released or take a lof of time to release the object....

      Delete
  22. Very Nice article - appreciated
    Can you please explain:
    #6 you mention about "Concurrency" as serious problem - could you give any example? What does it mean concurrency?

    -Vikas

    ReplyDelete
  23. Thank you.
    In the first part, I don't understand why you use "cores" and not "cpus". And I don't know what appends if the cpu_counts parameters is setted less?

    ReplyDelete
    Replies
    1. cores are CPUs, essentially, a multi core cpu happens to have more than one cpu on a chip, meaning it can do more than one thing at a time. duo-core= 2 cpu. Quas-core=4CPU and so on. In this post, I think the author is mistaking 24 CPUs for just 12. Correct me if I am wrong

      Delete
  24. It is the nice document...It is better if you give recomendation on below issue.
    1."% Non-Parse CPU% is just 50%
    2.recomended value for cursor_sharing

    thanks
    Lakshminarayan

    ReplyDelete
  25. It is the nice document...It is better if you give recomendation on below issue.
    1."% Non-Parse CPU% is just 50%
    2.recomended value for cursor_sharing

    thanks
    Lakshminarayan

    ReplyDelete
  26. Many Thanks for sharing this, very much informative and useful

    ReplyDelete
  27. Sergio (DBA - Brazil)December 9, 2015 at 4:50 AM

    Hi Umesh
    Very .. but very interesting article !
    Congratulations !

    ReplyDelete
  28. neat explanation. thank you

    ReplyDelete
  29. This is a good article.. I am new to AWR reports.. But this helped me get some insight in the topic.
    thank you..

    ReplyDelete
  30. Good for quick analysis..thanx for sharing with us!!

    ReplyDelete
  31. Usefull hints...
    Thanks
    Suresh

    ReplyDelete
  32. It is quite useful. Can someone tell me if the elapsed time per execution and executions show empty(no value at all) then what does it mean?

    ReplyDelete
  33. Very nice and informative article. -Ankit

    ReplyDelete
  34. Excellent explanation. Good job Umesh and keep it up.

    ReplyDelete
  35. Excellent explanation. It really helps to understand AWR.

    ReplyDelete
  36. excellent representation. Please provide more detail about AWR and ADDM report reading.

    ReplyDelete
  37. Excellent representation. Please share more detailed information on AWR and ADDM report reading and analysys.

    ReplyDelete
  38. Excellent explanation. It helped me a lot! Thanks!

    ReplyDelete
  39. Try to get more set of idea through this really fun know this
    PlSQL Training

    ReplyDelete
  40. Quite helpful for beginner like me. Thanks :)

    ReplyDelete
  41. Hi,

    I am new to awr report. for better understanding i need below details.
    Could please help me understanding the each column with description. What each column tells about.

    Thanks

    ReplyDelete
  42. This great stuff to understand keep it up.

    Thanks

    ReplyDelete
  43. Hi ..Thanks for the article.
    Am seeing "Network" in Top 10 Foreground Events by Total Wait Time . Is this an issue on Network layer ?

    ReplyDelete