Advanced Oracle Troubleshooting with Tanel Poder - 3 Days
Seminar Description:
This seminar concentrates entirely on Oracle troubleshooting: understanding exactly what Oracle is doing right now, or what it was doing when the problem occurred. You will gain the skill to systematically work out the reasons for crashes, hangs, bad performance or other misbehaviour.
The seminar takes a holistic approach to end-to-end troubleshooting. It explains the full lifecycle of a database request, from database client libraries and the network to the Oracle database kernel and the underlying OS. For each layer a troubleshooting technique is provided, along with advice on using the right tool for the right problem at the right time.
The three days are full of intensive learning, reading dumps, stack traces, network packet captures and Oracle SGA directly. You'll be using debuggers and custom tools provided to you for real-time and post-mortem diagnosis.
The emphasis is put on practical troubleshooting; safety comes first, and many techniques are designed to require no change to database schemas or instance parameters.
Objectives:
- Understand internals of core Oracle kernel components and their interaction with OS
- Be proficient in systematic end-to-end troubleshooting, using the right tool for the right problem
- Have the ability to proactively eliminate potential database performance issues
Non-objectives:
- This seminar is not a beginner database tuning course
- This seminar is not an SQL tuning or CBO course
Prerequisites: Good understanding of general Oracle database architecture, OS commands
Duration: 3 days (4 x 1.5h sessions per day)
Keywords: Oracle troubleshooting, performance tuning, internals
Target Audience: Database Administrators, Performance Engineers
Topics:
- Systematic approach to troubleshooting and tuning
- End-to-end lifecycle of a database request
- Oracle, OS and hardware interaction
- Oracle kernel core services
- Oracle Wait Interface and kernel instrumentation framework
- Lifecycle of a cursor
- SQL execution row source operators and data flow
- Logical and physical IO internals
- Lifecycle of a transaction
Speaker:
Tanel Poder is a researcher and database engineer, focusing mainly on designing Oracle Database based solutions for high performance, scalability and availability. He often gets involved in advanced end-to-end system troubleshooting, which has taken him to work for clients in Asia, the US and Europe.
He hosts an Oracle performance tuning site at http://www.tanelpoder.com and has published useful Oracle troubleshooting tools like PerfSheet, Sesspack and Session Snapper.
In addition to delivering training and seminars, Tanel is a frequent speaker at major conferences such as OracleWorld, UKOUG, Hotsos Symposium, IOUG-A Live and many regional ones.
Tanel is also one of the first Oracle Certified Masters in the world and an OakTable Network member.
Seminar outline:
Day 1
Session 1 - End-to-end lifecycle of a database request
This session gives an overview of the core hardware layers and software code paths a database request has to pass through. For every relevant layer a commonly available diagnosis technique or tracing tool is proposed.
- End-to-end call execution flow
- OCI, Oracle Net (SQL*Net) and direct wire protocol client libraries
- TCP, IPC and network services for Oracle
- Upper Oracle kernel: TNS protocol and Two-Task Common layer
- Lower Oracle kernel: From OPI to OSD
The following sessions will give you the skill and knowledge to use the right tool for the right problem at the right time.
Session 2 - Oracle, OS and hardware interaction
This session concentrates on what happens at the OS and hardware level when Oracle needs to use operating system services or when the OS needs to interact with Oracle. Both Unix derivatives and Windows server family internals are discussed.
- Oracle OSD layer
- Unix/Windows private and shared memory allocation for Oracle
- Dynamic SGA, automatic PGA, MEMORY_TARGET internals and gotchas
- OS scheduling challenges for Oracle
- IO services and system call interface
- Context switching and TLB misses
Session 3 - Oracle kernel core services
This session covers in detail the kernel services and components that most database calls interact with, and often get into trouble with.
- KGH Heap and shared/java/large/streams pool memory management
- Process startup, session creation
- State objects, task cleanup
- Latching, locking
- Sleeping, interprocess communication
Day 2
Session 4 - Oracle Wait Interface and kernel instrumentation framework
This session concentrates on Oracle wait interface internals, shortcomings and workarounds. The end-to-end instrumentation framework introduced in 10g is also covered.
- Oracle waits, sleeps, timeouts and uninstrumented time
- Getting the most out of Oracle performance counters
- Oracle Time Model statistics improvements
- Modules, actions and client_ids in the end-to-end instrumentation framework
- Diagnostic event internals
- Oradebug diagnostics and when not to use it
Session 5 - Lifecycle of a cursor
Most application code execution in Oracle is done via cursors, and this includes calls to PL/SQL stored procedures. Different workloads require different cursor management strategies; getting it wrong may prevent your system from scaling regardless of the amount of hardware thrown at it. Getting it right requires understanding the library cache data structures and concurrency mechanisms, which this session covers.
- Library cache lookup and data structures
- Cursor management, parsing and execution
- Library cache latching, locking, pins and mutexes
- Bind variable strategies, overcoming bind peeking problems
- Cursor caching benefits and problems
- Diagnosing library cache bottlenecks and inefficiencies
Session 6 - SQL execution row source operators and data flow
This session gives you a detailed understanding of what a server process has to go through when looking up records and generating result sets based on your SQL. This is crucial for reading and really understanding SQL execution plans. You will also be introduced to a way of diagnosing where a statement has got stuck when conventional diagnostic methods like tracing and V$/X$ queries fail to reveal it.
- Reading execution plans
- Row source operators, RSO/DFO trees
- Filter operators
- Join methods
- Sorting, grouping, aggregations, analytics
- Measuring SQL performance at execution plan line level
Day 3
Session 7 - Logical and physical IO internals
This session gives you an understanding of Oracle's internal buffer cache management and the key differences between many OS-level IO configuration options. The differing costs of different LIO operations are also discussed, along with tuning advice.
- Cache layer - datafile and datablock structure
- Buffer cache management, multiversion read consistency
- The varying cost of logical IOs
- Single block, multiblock physical IO, data block prefetching
- Asynchronous IO, unbuffered direct, concurrent and raw IO
- IO troubleshooting
Session 8 - Lifecycle of a transaction
In addition to covering the core internals of Oracle transaction management, this session wraps up the topics learned by studying the lifecycle of a transaction, from its initialization to post-commit cleanout.
- Oracle row-level locking and transaction management
- Undo segments, ITLs and write consistency
- Undo vectors, in-memory undo
- Cache layer - redo log structure, private redo strands
- Commit processing, commit cleanout
- Gotchas with LGWR and log file sync
Customer Feedback About Tanel:
Our recent exercise with Tanel Poder was an extremely positive one and one which I would recommend to anyone seeking help with a contention issue on their Oracle database. I had seen Tanel speak at the Hotsos Conference in Dallas and knew that he would be extremely valuable in helping us identify what was causing an unpredictable latching scenario in our production environment.
Even before he arrived he asked for and began reviewing the data we had collected (AWR reports, statspack output, etc.) and began devising a "plan of attack" for determining the cause of our issue. Upon arrival he took our input very seriously, but was quick to validate our assumptions with factual database analysis. I also appreciated his concern for our security and the well-being of our production environments; he was very careful not to run scripts which would cause contention themselves in production, or leave behind bothersome custom objects.
Tanel was always courteous and friendly and was willing to repeat his explanations and translate his very technical knowledge into terms that the lesser-trained DBA could comprehend. And the tools he brought along were bundled up nicely for us to download to our own environment and use for our troubleshooting. He summarized his findings for us in a way that was presentable to upper management and gave a clear explanation of his approach, analysis, and final recommendations.
I appreciated Tanel's offer to be a resource in the future if we have further questions after using his scripts and tools and I fully intend to take him up on it!
