
ABSTRACT

This paper discusses various bottlenecks in database processing that can be tuned for maximum throughput on an Oracle 7.1 database. It describes techniques developed in-house for performance tuning at both the application and server level. It also covers the tuning of the Multi-Threaded Server, Parallel Query, Parallel Loading and Parallel Index Creation, and highlights the importance of the shared SQL area and how it helps in tuning.


  • INTRODUCTION
  • LOCKING
  • SHARED SQL AREAS AND PRIVATE SQL AREAS
  • DATABASE WIDE STATISTICS
  • DBWR PERFORMANCE
  • MINIMISING I/O
  • LATCH STATISTICS
  • DATA DICTIONARY STATISTICS
  • REDO LOG WRITER PERFORMANCE
  • ROLLBACK SEGMENTS
  • MULTI-THREADED SERVER TUNING
  • PARALLEL QUERY TUNING
  • PARALLEL INDEX CREATION
  • PARALLEL LOADER TUNING
  • OTHER TUNING TIPS

    INTRODUCTION

    Performance bottlenecks can arise for numerous reasons: application design, database resource unavailability, operating system resources, etc. This paper describes various SQL commands and scripts with which a DBA can monitor, control and fine-tune database performance. It does not deal with operating system tuning, although that too has been found to be of tremendous use. All the database-wide statistics described in this paper can be obtained using the utlbstat.sql and utlestat.sql scripts found with the other RDBMS installation scripts (CATALOG.SQL, CATPROC.SQL, etc.).

    LOCKING

    Performance problems are not always due to resource constraints; they are often the result of bad programming or lock contention. Two scripts provided with the Oracle software, blocking.sql and locktree.sql, will help in determining whether you have a locking issue.

    Locktree.sql uses the view definitions created by blocking.sql. The DBA can run Locktree.sql to print a tree-structured listing of the processes on the system involved in a lockout, showing the waiting processes and the resources they are waiting for. As documented in the Locktree.sql script, it creates a temporary table, which has two disadvantages. First, in order to create this table a number of locks must be acquired, so the process running the script may get caught in the same locking problem it is trying to diagnose. Second, if a process waits for a lock held by more than one process (a shared lock), the wait-for graph is no longer a tree and the CONNECT BY will show those processes (and any process waiting on them) several times.

    If a process_id is printed immediately below and to the right of another process, then it is waiting for that process. The process_id's printed at the left hand side of the page are the ones that everyone is waiting for.

    Blocking.sql and Locktree.sql are not useful when a deadlock is not detected automatically by the system. In that situation processes hold locks in a circular fashion, and since the script must always find a process that is not waiting for any locks, the output will most likely be misleading. These scripts give the best results when processes, or the entire system, are at a standstill.
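    As a quick alternative to the scripts, waiters and their blockers can be listed directly from the V$LOCK view. A minimal sketch, assuming the V7 dynamic performance view columns (SID, TYPE, ID1, ID2, LMODE, REQUEST):

    ```sql
    -- Sessions waiting on a lock (REQUEST > 0) joined to the sessions
    -- holding a lock on the same resource (LMODE > 0).
    SELECT w.sid AS waiting_sid, h.sid AS holding_sid,
           w.type, w.id1, w.id2
      FROM v$lock w, v$lock h
     WHERE w.request > 0
       AND h.lmode > 0
       AND w.type = h.type
       AND w.id1 = h.id1
       AND w.id2 = h.id2;
    ```

    Unlike locktree.sql, this query acquires no additional locks, so it is safe to run while diagnosing a hang.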

    SHARED SQL AREAS AND PRIVATE SQL AREAS

    Oracle represents every SQL statement in two parts: a shared part and a private part. The shared part is common to all users executing the same SQL statement, whereas each user has his own private part. The shared part contains the parse tree and the execution plan; the private part contains the bind data and run-time buffers. A private SQL area is in turn divided into two parts, a persistent area and a runtime area. The persistent area remains in memory until the corresponding cursor is closed; therefore, for better memory utilization, we should close all open cursors that will not be used again. The runtime area is freed after the statement is executed and therefore does not retain memory.

    The shared pool in the SGA consists of three major parts: the library cache, the dictionary cache and session information. We will consider the tuning aspects of the library cache.

    The library cache can be tuned by:

    * Examining library cache activity: ensure that SQL statements share a shared SQL area whenever possible, and increase the memory available to the library cache.

    * Reducing library cache misses: increase the memory allocated to the library cache until the RELOADS statistic is near 0. You may also need to increase the number of cursors permitted per session by raising OPEN_CURSORS. While allocating more memory to the library cache, ensure that this does not lead to excessive paging and swapping. Misses can also be reduced by writing identical SQL statements in the application.

    * Speeding up access to shared SQL and PL/SQL areas in the library cache: set the initialization parameter CURSOR_SPACE_FOR_TIME. If it is set to TRUE, a shared SQL area can only be deallocated when all application cursors associated with its statement are closed.

    When users connect through the Multi-Threaded Server, Oracle needs to allocate additional space in the shared pool for storing information about the connections between the user processes, dispatchers and servers. Therefore, for each user who will connect using the Multi-Threaded Server, you must add 1K to the setting of the parameter SHARED_POOL_SIZE.

    The amount of free memory in shared pool can be determined using the following query:

    select * from v$sgastat where name = 'free memory';

    If there is always free memory in the shared pool, then increasing its size will have little or no effect. Along the same lines, if it is always full this does not necessarily mean there is a problem. The main point to consider is the hit ratios.

    UTLESTAT.SQL provides statistics for all the item types in the library cache, i.e., CLUSTER, BODY, INDEX, OBJECT, PIPE, SQL AREA, TABLE/PROCEDURE and TRIGGER. You should ensure that the pin hit ratio is high (close to 1); otherwise you need to increase SHARED_POOL_SIZE.
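    The pin hit ratio and RELOADS can also be checked on a running instance without the utlbstat/utlestat interval. A minimal sketch against V$LIBRARYCACHE (column names per the V7 dynamic performance views):

    ```sql
    -- Pin hit ratio and reloads per namespace; a ratio well below 1,
    -- or a climbing RELOADS count, suggests raising SHARED_POOL_SIZE.
    SELECT namespace, pins, pinhits, reloads,
           ROUND(pinhits / DECODE(pins, 0, 1, pins), 3) AS pin_hit_ratio
      FROM v$librarycache;
    ```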

    DATABASE WIDE STATISTICS

    Buffer Busy Waits: indicates the number of times a user process attempted to acquire a buffer and had to wait because it was already held in an incompatible mode. A non-zero value is indicative of block contention. Block contention waits should be compared against logical reads to get the relative level of contention. Logical reads are the total block reads done by all processes: the sum of the blocks read in consistent and current mode, i.e., Consistent Gets + DB Block Gets. If this ratio is about 5-10%, contention can be analyzed further by block type (data block, data segment header, rollback segment block, rollback segment header block, etc.) by querying V$WAITSTAT.
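    A sketch of the V$WAITSTAT breakdown referred to above (assuming the V7 CLASS and COUNT columns):

    ```sql
    -- Buffer busy waits by block class; compare each count against
    -- total logical reads to gauge the level of contention.
    SELECT class, count
      FROM v$waitstat
     ORDER BY count DESC;
    ```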

    If total waits are high for the undo block and undo segment header classes, additional rollback segments should be created to reduce rollback segment buffer contention.

    If total waits are high for the data block and segment header classes, additional free lists should be created to reduce data block contention. To add free lists, recreate the tables and indexes with a higher FREELISTS storage parameter. This parameter should not be increased beyond what is needed to avoid contention.
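    Since FREELISTS cannot be changed on an existing table in V7, the table must be rebuilt. A hedged sketch (the table name and storage values here are hypothetical):

    ```sql
    -- Recreate the table with additional free lists to reduce
    -- data block contention on concurrent inserts.
    CREATE TABLE emp_new
      STORAGE (INITIAL 10M NEXT 5M FREELISTS 4)
      AS SELECT * FROM emp;

    -- After validating emp_new (and recreating its indexes and grants),
    -- swap it in for the original.
    DROP TABLE emp;
    RENAME emp_new TO emp;
    ```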

    Cache Hit Ratio (CHR): the hit ratio gives an indication of how often the various processes accessing the data buffers find the blocks in the cache. It is calculated as follows:

    Hit Ratio = [ Logical Reads / ( Logical Reads + Physical Reads )] * 100

    where Logical Reads  = DB Block Gets + Consistent Gets
          Physical Reads = block requests which resulted in disk reads

    For example:

    Hit Ratio = [ 6123433 /( 6123433 + 1201212 )] * 100 = 83.60 %

    A well tuned buffer cache would have a hit ratio of greater than 90%. Hit ratio also depends on the nature of activity on the database. If this value is less than 70%, the parameter DB_BLOCK_BUFFERS should be increased.
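    The same ratio can be computed on a live instance from V$SYSSTAT, following the formula above (statistic names as in the V7 catalog):

    ```sql
    -- Hit ratio = logical reads / (logical reads + physical reads),
    -- where logical reads = 'db block gets' + 'consistent gets'.
    SELECT ROUND((cur.value + con.value) /
                 (cur.value + con.value + phy.value) * 100, 2) AS hit_ratio
      FROM v$sysstat cur, v$sysstat con, v$sysstat phy
     WHERE cur.name = 'db block gets'
       AND con.name = 'consistent gets'
       AND phy.name = 'physical reads';
    ```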

    Cluster Key Scans and Block Gets: are the number of cluster blocks accessed. Cluster Key Scans is the number of scans processed on cluster blocks. If the ratio of Cluster Key Scan Block Gets to Cluster Key Scans is greater than one, the rows for one cluster key are stored in multiple data blocks and the cluster should be analyzed for row chaining.

    For example:

    If Cluster Key Scan Block Gets is 8414 and Cluster Key Scans is 3223, the ratio is:

    (8414/3223) = 2.61

    which indicates that, on average, about 3 Oracle blocks are accessed for each cluster key.

    Migrated and chained rows in a table or cluster can also be found using the ANALYZE command with the LIST CHAINED ROWS option. It may not be possible to avoid chaining in all cases (e.g., tables with LONG columns).
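    A sketch of the ANALYZE usage (the table name is hypothetical, and the CHAINED_ROWS output table must first be created with the utlchain.sql script shipped with the RDBMS):

    ```sql
    -- Record chained/migrated rows into the CHAINED_ROWS table.
    ANALYZE TABLE emp LIST CHAINED ROWS INTO chained_rows;

    -- Inspect the result.
    SELECT table_name, head_rowid
      FROM chained_rows
     WHERE table_name = 'EMP';
    ```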

    The SIZE parameter specified in the CREATE CLUSTER command determines the number of cluster keys per block, with the default being one. If this parameter is not specified correctly, the rows for one cluster key may not fit into one data block, or there may be wasted space in the data block. If all of the data for one cluster key does not fit in one block, additional disk accesses must occur to read the data. Refer to the Server Administrator's Guide to determine how to calculate the SIZE parameter for the CREATE CLUSTER command.

    Cumulative Opened Cursors: indicates the total number of cursors opened during the script execution interval. A cursor is opened for each SQL statement which is parsed into a context area. To improve performance, cursors should be reused and reparsing avoided.

    If a cursor will not be reused, it is best to close the cursor when the SQL statement completes as this will free up all the resources used by the cursor.

    A per-transaction average of cursors opened is more relevant than the total number. This value should be analyzed in the context of the composition of the database workload during the interval. Generally this value should not exceed 5-7 cursors per transaction for online processing intervals. For batch processing, it can be higher than 10, depending on the processing being done. If the values are higher than this, the application should be analyzed for optimal cursor usage.

    DBWR PERFORMANCE:

    DBWR Checkpoints: the number of checkpoint messages sent to DBWR. This does not mean that that many checkpoints were performed, because if a second checkpoint is issued while the first is active, the first stops and the second starts. During checkpoint processing, the log writer hands the DBWR a list of modified blocks to be written to disk. This causes a slight degradation in performance as buffers are pinned and physical I/O occurs. The performance of normal database operation can be improved if checkpointing frequency is reduced, although this may increase crash recovery time.

    To reduce the number of checkpoints, increase the init.ora parameter LOG_CHECKPOINT_INTERVAL. If this parameter is set larger than the size of the redo log, a checkpoint is performed only at each log switch. To increase the number of checkpoints and decrease database recovery time, decrease LOG_CHECKPOINT_INTERVAL.

    LOG_CHECKPOINT_TIMEOUT can also be used to reduce or increase the number of checkpoints. It specifies the amount of time that passes before another checkpoint occurs. A value of 0 disables time-based checkpoints.

    DBWR timeouts : the number of timeouts that occurred while DBWR had been idle since the last timeout. These are the times that DBWR looked for buffers to idle-write.

    DBWR make free requests : is the number of messages received requesting dbwr to make some more free buffers for the lru. This includes times when dbwr messages itself.

    DBWR free buffers found : is the number of buffers that dbwr found already clean when requested to make free buffers. Divide by "dbwr make free requests" to find the average number of reusable buffers at the end of the lru.

    DBWR lru scans : is the number of times that dbwr does a scan of the lru queue looking for more buffers to write. This includes times when the scan is to fill out a batch being written for another purpose such as checkpoint. Thus this will always be greater than or equal to "dbwr make free requests".

    DBWR summed scan depth : the current scan depth is added to this statistic every time DBWR scans the lru for dirty buffers. Divide by "dbwr lru scans" to find the average scan depth. Note that this is not the number of buffers scanned. The scan is reduced by buffers moved to the dirty queue by foregrounds, and may stop early if a full write batch is accumulated.

    DBWR buffers scanned : is the number of buffers looked at when scanning the lru for dirty buffers to make clean. Divide by "dbwr lru scans" to find the average number of buffers scanned. This count includes both dirty and clean buffers. The average buffers scanned may be different from the average scan depth due to write batches filling up before a scan is complete. Note that this includes scans for reasons other than make free buffer requests.

    Summed Dirty Queue Length : is the sum of the lruw queue length after every write request completes. Divide by write requests to get average queue length after write completion. This will help in setting a proper value for _db_block_write_batch and also in figuring out the average length of the dirty list.

    Dirty Buffers Inspected : is the number of times a foreground encountered a dirty buffer which had aged out through the lru queue, when foreground is looking for a buffer to reuse. This should be zero if dbwr is keeping up with foregrounds.

    Free Buffers Inspected : is the number of buffers skipped in the buffer cache by user processes in order to find a free buffer. If this value is high compared to Free Buffer Scans, the buffer cache has too many modified blocks and the cache size should be increased. In our output it is 3553/1320858, which is low. The difference between this and "dirty buffers inspected" is the number of buffers that could not be used because they were busy or needed to be written after rapidly aging out; they may have a user or a waiter, or be in the process of being read or written.

    Free Buffer Requested : is the total number of free buffers needed in order to create or load a block.

    Free Buffer Waits : is the number of times processes needed a free buffer and one was not available. If Free Buffer Waits/Free Buffer Scans > 10%, increase init.ora DB_BLOCK_WRITE_BATCH. In our output, this is 20 / 1320858 which is very low.

    Enqueue Timeouts : indicates the number of times that an enqueue lock was requested and was not granted immediately. If this parameter is greater than zero, increase the init.ora parameter ENQUEUE_RESOURCES. In our output, this is 5 and therefore increasing the value of ENQUEUE_RESOURCES will reduce waits.

    Recursive Calls : occur because of cache misses and dynamic storage extension. If the dictionary data is found in cache, a recursive call is not made and the data is read from cache directly. In general if recursive calls is greater than 4 per process, the data dictionary cache (described below) should be optimized and segments should be rebuilt with storage clauses to have a few large extents. Segments include tables, indexes, rollback segment, and temporary segments.

    Also, recursive calls should be fewer than user calls (less than one-tenth). Where there is an imbalance, the aim should be to reduce parsing: review the ratios of recursive calls to user calls and parse count to user calls. High levels of recursive SQL may also be attributable to significant use of PL/SQL; for each SQL statement in a PL/SQL block, on each iteration, there are recursive calls to do the equivalent of bind and define.

    Sorts : SQL statements which use ORDER BY, GROUP BY, DISTINCT, sort-merge joins, set operations (UNION/INTERSECT/MINUS), etc. require sorting of the rows. The cumulative statistics for all statements processed are indicated below:

    Sorts (disk)        543
    Sorts (memory)   587318
    Sorts (rows)   11268045

    Sorts (disk) are sorts which require the creation of temporary segments on disk to store intermediate sort results. This occurs when the data being sorted cannot fit into the memory area specified by SORT_AREA_SIZE in the init.ora. Sorts (memory) are sorts completed entirely in memory. Disk sorts are several times slower than memory sorts, so performance will be optimized if most sorts are done in memory. Typically, the ratio of disk sorts to memory sorts should be less than 5%. If it is more, the SORT_AREA_SIZE parameter should be raised, depending on available memory. In our output, this ratio is 543/587318, which is less than 1% and is very good.
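    The disk-to-memory sort ratio can be read on a live instance from V$SYSSTAT. A minimal sketch using the V7 statistic names:

    ```sql
    -- Ratio of disk sorts to memory sorts; under roughly 5% is acceptable.
    SELECT d.value AS disk_sorts, m.value AS memory_sorts,
           ROUND(d.value / m.value * 100, 2) AS disk_sort_pct
      FROM v$sysstat d, v$sysstat m
     WHERE d.name = 'sorts (disk)'
       AND m.name = 'sorts (memory)';
    ```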

    Table Statistics : the following statistics give interesting information on the execution of SQL statements against tables:

                                        Total      Per/Txn
    Table Scans (long tables)            1633         1.29
    Table Scans (short tables)           5030         3.98
    Table Scan Blocks Gotten          1326513      1051.11
    Table Scan Rows Gotten           32563977     25803.46
    Table Fetch By Rowid               102254        81.02
    Table Fetch By Continued Row         9565         7.75

    Table Scans (long tables) : is the total number of full table scans performed on tables with more than 5 db_blocks. If the number of full table scans per transaction is greater than 0 (in the output it is 1.29), the SQL statements in the application generally need tuning. Use of indexes by statements should be based on the selectivity of the keys, the clustering of rows by key values, the amount of data retrieved, and several other considerations.

    Table Scans (short tables) : is the number of full table scans performed on tables with less than 5 db_blocks. It is optimal to perform full table scans on short tables rather than use indexes. Table Scans (long tables) plus Table Scans (short tables) equals the total number of full table scans. In our output, nearly 4 short tables per transaction are accessed using full scans. It is a good idea not to build indexes on small tables.

    Table Scan Blocks Gotten and Table Scan Rows Gotten : respectively are the number of blocks and rows scanned during all full table scans. To determine on average the number of rows gotten per block for all full table scans:

    Table Scan Rows Gotten/Table Scan Blocks Gotten i.e. 32563977/1326513 = 24.54 rows per block

    To determine the approximate number of rows gotten for short and long table scans:

    Table Scans (short) X 4 blocks = Blocks Scanned (short) i.e. 5030 x 4 = 20120 short table block scans

    Table Scan Blocks Gotten - Blocks Scanned (short) = Blocks Scanned (long) i.e. 1326513 - 20120 = 1306393 long table blocks scanned

    Table Fetch by Rowid : denotes the rows that were accessed using an index, plus rows accessed using statements of the form where rowid = 'xxxxxxxx.xxxx.xxxx'. Rowid is the fastest path to a row and should be used whenever applicable. In our output this is 102254, which compared with the total number of rows retrieved, i.e. 102254 + 32563977 = 32666231 (rowid access + full scan access), shows the relative use of indexes. In this case, on a per-transaction basis, one long table is being accessed by a full scan. This ratio is poor and should be improved by tuning the SQL statements to use indexes on long tables.

    Table Fetch by Continued Row : is the number of rows which are continued or chained to another block. If this number is high, additional I/O must be performed in order to read the entire row. Row chaining cannot be avoided for tables with LONG columns. The ratio of Table Fetch by Continued Row to Table Fetch by Rowid is a good indication of the amount of chained rows. To determine whether a table has row chaining, ORACLE7 provides a new command, ANALYZE, which reports the number of chained rows in a table.

    While creating tables and indexes, calculate the extent sizes with future growth in mind so that unnecessary dynamic space management calls are minimized.

    Full table scans can cause the most-needed blocks in the SGA to age out due to the LRU algorithm. In Oracle7 a new init.ora parameter called SMALL_TABLE_THRESHOLD has been added. It determines the number of buffers in the SGA that are available for full table scans. The basic assumption is that blocks read by full table scans are unlikely to be required in the near future and therefore should not age out other blocks already in the SGA.

    MINIMISING I/O

    Full table scan time is degraded by a wastefully high "high water mark" and a low block fill factor. While chaining must be avoided, it is still essential to maintain a reasonably high blocking factor, minimizing the need for costly disk I/O.

    The high water mark identifies the portion of the table that has been allocated. It should be checked to ensure it is not wastefully high (occurring because a once large table has had several deletions). Such a table should be recreated (V6) or truncated (V7) and reloaded to set the high water mark back to its optimal position.

    Finding the high water mark is probably the most difficult chore in the analysis of a table. A decent estimate can be obtained by selecting a count of the unique fileid/blockid combinations for the table. Both the fileid and blockid are components of the rowid pseudo column. A precise answer is obtained by counting consistent mode reads in a SQL trace analysis of a full table scan.
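    A sketch of the fileid/blockid estimate described above, assuming the V7 restricted rowid format BLOCK.ROW.FILE (the table name is hypothetical):

    ```sql
    -- Approximate the number of blocks below the high water mark by
    -- counting distinct file/block pairs extracted from the rowid.
    SELECT COUNT(DISTINCT SUBSTR(rowid, 15, 4) || '.' ||
                 SUBSTR(rowid, 1, 8)) AS blocks_below_hwm
      FROM emp;
    ```

    Note that this counts only blocks that contain rows, so it slightly underestimates a high water mark inflated by deletions; the SQL trace method in the text gives the precise answer.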

    Pctused and Pctfree determine the blocking factor for the table. Storage clauses control the segment spread across the volumes. At a number of sites the major tuning act we have accomplished has been to improve the blocking factor on an otherwise fixed I/O bound application.

    Pctfree, pctused, and storage clauses should be considered mandatory for all segment creation. Even best-guess starting values are likely to be better than the defaults. Pctfree should allow for the expected growth of items in a block: under Oracle6 all updates involve the creation of a new row, while in Oracle7 a row is updated in situ if and only if it has the same data length as the old. In rare circumstances CHAR may be preferable to VARCHAR2 to guarantee in-situ updates. For indexes, pctfree is only used during index creation to distribute the free space. Thus, if a table currently contains one year's worth of data and we wanted to build an index which would not suffer block splits for another year, we would create it with pctfree = 50; whereas for an index where each new key is the highest key, pctfree = 0. Space allocation is resource intensive and should be minimized. Bear in mind, however, that it is not necessary to recreate everything at maximum predicted size; data structure growth is one of the easiest mechanisms to manage within Oracle. Segment space allocation should be in standard units to avoid space bubbling.

    User Calls and Parse Statistics : User Calls is the number of times a call is made to the kernel, and Parse Count indicates the number of times a SQL statement was parsed. The number of calls to the kernel should be reduced where possible. The Performance Tuning Guide describes how to set up array processing to reduce the number of calls to the kernel.

    To calculate the ratio of parses to user calls, perform the following calculation:

    Parse Count / User Calls = average parses per user call

    For example:

    7901 / 81355 = nearly 10%.

    ALERT LOG : when an instance is shut down normally or immediately, ORACLE writes to the ALERT log file the highest number of sessions concurrently connected to the instance since it started. This number can be used for tuning purposes (for setting the TRANSACTIONS value).

    File I/O Statistics :

    TABLE            PHYS_     PHYS_     PHYS_     PHYS_    PHYS_    PHYS_
    SPACE            READS     BLKS_     RD_       WRITES   BLKS_    WRT_
                               RD        TIME               WR       TIM
    ---------   -----------    -------   -------   -------  -------  -------
    DATA_N              668     4761      4068         5       5        28543
    INDEX_N              30       30        99         5       5        25689
    SITE               1878     7579      1981        30      30     15945455
    DATA_A(a4.dbf)   187900  1173580   1008375         0       0            0
    DATA_A             2242    17310     16306         0       0            0
    INDEX_B           16367    16367     61324      2641    2641    321756495
    INDEX_B           12811    12811     44437      1496    1496    626867878
    ROLL_S              116      116       548      2395    2395     91039250
    SYSTEM            14762    78891     80594      1800    1800   1488658059
    TEMP                  5       80       163      5760    5760   2937277389
    

    (Reads are done by the user processes and writes to files are done by the DBWR.)

    File I/O should be spread evenly across multiple disk drives. The above output shows that file a4.dbf is heavily accessed whereas the remaining data files have substantially fewer accesses. The distribution of writes is also skewed. Further analysis needs to be done to find the tables/indexes used most in the heavily accessed files. The number of writes is, on average, much smaller than the number of reads. If writes to the rollback segments are small, this indicates very low DML activity. The SYSTEM tablespace file shows considerable read activity, usually because of recursive calls. By comparing PHYS_READS and PHYS_BLKS_RD we can calculate the number of blocks read from a single datafile per read request.

    In general, file I/O can be distributed by separating the types of files onto different disks. Redo logs should be located on disks that do not contain database files, because datafile I/O is scattered whereas redo log files are always written sequentially. Tables should be located on different disks than their associated indexes, since the two can be accessed concurrently in a multi-user environment. Large tables and indexes should be striped across several disks if the nature of access is highly distributed. Active database files should not be located at opposite ends of a disk, and the most active database files should be located on the highest-throughput disks. Typically, rollback segments may be placed in a separate tablespace and spread across files on different disks. Care should be taken to ensure that the creation order of rollback segments is such that consecutively numbered segments are on different disks.

    The init.ora parameter DB_FILE_MULTIBLOCK_READ_COUNT can be set to increase the number of blocks read during a single read. Increasing this parameter reduces I/O when full table scans are being performed.

    LATCH STATISTICS:

    An abstract of the output for this section is included below:

    LATCH_NAME         NOWAIT_GETS        NOWAIT_MISSES      NOWAIT_HIT_RATIO
    ------------------ ------------------ ------------------ ------------------
    cache buffers chai           43755868             515451               .988
    cache buffers lru             4452556             514120               .884
    

    No-wait gets do not wait for the latch to become free; they time out immediately. On multiple-CPU systems where CPU is a bottleneck, reducing SPIN_COUNT, to as low as 20, has proved successful; 500 is a good starting point. Decrease SPIN_COUNT only while timeouts do not occur on the SCN latch. Increasing LATCH_WAIT_MULTI (to, say, 20) gives a similar gain on a single-CPU system. This has proved most beneficial on RISC-based systems, where it dampens costly context switching.

    If contention for the cache buffers lru chain latch is high, it may be worthwhile decreasing _db_writer_max_scan_cnt; increasing _db_block_write_batch or _db_block_hash_buckets may also assist.

    DATA DICTIONARY STATISTICS:

    The data dictionary data is maintained in a separate cache, the dictionary cache, which is stored in the shared pool. It is accessed for each SQL statement at parse time, and also at runtime for dynamic storage allocation, etc. Cache hits avoid the need for recursive calls, and performance of SQL statements improves.

    The data dictionary cache statistics are generated from the table X$kqrst.

    Included below is the output of utlestat.sql for dictionary cache.

    NAME                 GET_REQS GET_MISS SCAN_REQ SCAN_MIS MOD_REQS CUR_USAG
    -------------------- -------- -------- -------- -------- -------- --------
    dc_free_extents          979        14      499        0      991      111
    dc_used_extents          965        30        0        0      965      164
    dc_segments              729        12        0        0      653       47
    ...
    

    The size of the dictionary cache is determined by the init.ora parameter SHARED_POOL_SIZE.

    If the ratio of get_miss to get_req is greater than 10%, the init.ora parameter SHARED_POOL_SIZE should be increased.
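    The overall miss ratio can also be computed on a running instance from V$ROWCACHE, the dynamic performance view over the same data. A minimal sketch:

    ```sql
    -- Aggregate dictionary cache miss ratio; above roughly 10%
    -- suggests increasing SHARED_POOL_SIZE.
    SELECT ROUND(SUM(getmisses) / SUM(gets) * 100, 2) AS miss_pct
      FROM v$rowcache;
    ```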

    REDO LOG WRITER PERFORMANCE:

    Redo Writer Latches : before a user process performs a block change, it must create the redo record in the log buffer. The allocation of space in the redo log buffer is controlled by the redo allocation latch. The user process must acquire this latch, and may copy the entry while holding it if the size of the entry is less than the parameter LOG_SMALL_ENTRY_MAX_SIZE. On multi-processing systems, copying into the buffer can be optimized by acquiring one of several redo copy latches. If the size of the redo entry is more than LOG_SMALL_ENTRY_MAX_SIZE, then the user process must obtain a redo copy latch; after obtaining it, the user process can release the allocation latch and copy the redo while holding the redo copy latch.

    On computers with multiple CPU's, multiple Redo Copy latches allow multiple processes to copy entries to the redo log buffer concurrently. The number of redo copy latches is determined by the init.ora parameter LOG_SIMULTANEOUS_COPIES. If there is contention for the redo copy latch then more latches should be added by increasing the value of LOG_SIMULTANEOUS_COPIES.
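    Contention for the two kinds of redo latch can be checked in V$LATCH. A sketch, assuming the V7 latch names and columns:

    ```sql
    -- Misses on the redo allocation latch suggest lowering
    -- LOG_SMALL_ENTRY_MAX_SIZE; misses on the redo copy latches
    -- suggest raising LOG_SIMULTANEOUS_COPIES.
    SELECT name, gets, misses, immediate_gets, immediate_misses
      FROM v$latch
     WHERE name IN ('redo allocation', 'redo copy');
    ```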

    The performance of the redo writer process, along with the effectiveness of the init.ora parameters affecting the LGWR, can be measured by looking at several statistics. Some tuning issues were discussed in the Latch section. The most important statistics are 'redo blocks written', 'redo entries linearized', 'redo small copies' and 'redo writes'.

    Large values associated with any of the above statistics are meaningless unless compared with the corresponding Redo Latch parameters. Poorly tuned redo parameters often result in latches being held for longer periods and hence resulting in poor performance.

    'redo blocks written' is useful in finding how much redo is generated for an operation or time interval. It may however not have much tuning use.

    Redo_log_space_wait_time : This indicates the time the user processes had to wait to get space in the redo file.

    Redo_log_space_request : Reflects the number of times a user process waits for space in the redo log buffer. This value ideally should be 0. If this value is non zero then the size of the log buffer should be increased by increasing the value of LOG_BUFFER initialization parameter.

    Redo_buffer_allocation_retries : This indicates the number of repeated attempts to allocate space in the redo buffer. A non-zero value indicates that the redo writer is falling behind, possibly due to a log switch. Although a log switch is a normal event, frequent log switches indicate improper sizing of the redo log files. In this output, the value is 5, which is low compared to the number of redo entries, 28481.
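    These statistics can be pulled together from V$SYSSTAT on a live instance (statistic names as in the V7 catalog; a minimal sketch):

    ```sql
    -- Waits and retries around the redo log buffer; non-zero space
    -- requests suggest increasing LOG_BUFFER.
    SELECT name, value
      FROM v$sysstat
     WHERE name IN ('redo log space requests',
                    'redo log space wait time',
                    'redo buffer allocation retries');
    ```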

    Redo_entries/Redo_size : These statistics are useful in sizing the redo log files and planning the checkpointing frequency. A log switch also posts the archiver, which copies the log file to the archive destination. 'redo size' gives the number of bytes of redo generated, and 'redo entries' the number of redo records created.

    These statistics should be analyzed for the given time interval.

    For example:

    Redo size                 : N bytes
    Time interval             : t = tend - tstart (minutes, say)
    Redo generated per minute : N / t

    The desired log switch interval can be determined by considering the impact of the archiver file copy and of checkpoints on the system. Log switches force checkpoints to occur, and if archiving is enabled the archiver performs a copy of the file. These two operations involve activity by the background processes and can affect database response if the sizing is not done properly. Large redo log files reduce the number of log switches but increase the archiver copy time, degrading user response on single-CPU systems. Frequent checkpointing may cause users to wait on buffers pinned by the DBWR.

    A good log switch interval is one where the CPU time used and the I/Os on the devices involved are not large enough for users to be blocked; e.g. on a busy VAX 6420 with 30 users, a file copy of 30000 VMS blocks takes 30 seconds and several thousand I/Os. Once the time interval is known (say, in minutes), the log file size can be calculated by simply multiplying it by the redo generated per minute.
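    The sizing arithmetic above can be written out directly. This is a minimal sketch of the multiplication described in the text; the figures in the example are illustrative, not from the paper.

```python
# Hedged sketch: given the redo generated over a measured interval and a
# desired log switch interval, estimate the redo log file size.
def redo_rate_per_minute(redo_bytes, interval_minutes):
    return redo_bytes / interval_minutes

def log_file_size(redo_bytes, interval_minutes, desired_switch_minutes):
    # log file size = redo generated per minute * desired switch interval
    return redo_rate_per_minute(redo_bytes, interval_minutes) * desired_switch_minutes

# Illustrative: 60 MB of redo over 30 minutes, switching every 20 minutes.
size = log_file_size(60 * 1024 * 1024, 30, 20)
print(size)  # 41943040.0 bytes, i.e. a 40 MB log file
```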

    'redo writes' is the total number of writes of the redo buffer to the redo log file. It may be useful to determine whether or not it is too large compared with the 'redo entries' statistic. In our example output, redo entries is 28481 and redo writes is 1907.

    ROLLBACK SEGMENTS:

    Rollback segments are used to record the before image of transactions. The amount of rollback generated by a transaction depends on the nature of the operation and the amount of data block changes. A transaction writing rollback data must first access the transaction table stored in the rollback segment header and acquire a slot in that table. This requires momentarily latching the table to serialize concurrent updates to it. If the database is update intensive and has a small number of rollback segments, user transactions will wait on the latch to access the transaction table. Conversely, a large number of rollback segments on a query-intensive database will result in wasted space. The following statistics show interesting information about the rollback segments.

    Included below is the statistics collected for this section:

    UNDO SEGMENT        TRANS_TBL_GETS      TRANS_TBL_WAITS     UNDO_BYTES_WRITTEN
    ------------------- ------------------- ------------------- -------------------
    SEGMENT_SIZE_BYTES  XACTS               SHRINKS             WRAPS
    ------------------- ------------------- ------------------- -------------------
                     0                  31                   0                   0
                735232                   0                   0                   0
                     2                2300                   1              820916
               1021952                   0                   0                   2
    ...

    In the above output, the waits on the transaction tables are negligible (at most 1 wait against 2300 gets). If the ratio of Trans_tbl_waits to Trans_tbl_gets is greater than 5%, additional rollback segments should be added to the database. In general, rollback segments should all be the same size and created with a large number of small extents. Occasionally, large transactions may enlarge rollback segments; the database administrator should periodically check for this and recreate them with equal extents. It is very important to set the OPTIMAL value of the rollback segments so that they do not SHRINK too often.
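    The 5% rule above can be sketched as a one-line check. The function name is illustrative; the threshold is the one stated in the text.

```python
# Hedged sketch of the rule in the text: add rollback segments when
# Trans_tbl_waits / Trans_tbl_gets exceeds 5%.
def needs_more_rollback_segments(tbl_waits, tbl_gets):
    if tbl_gets == 0:
        return False
    return tbl_waits / tbl_gets > 0.05

# Figures from the example output: 1 wait against 2300 gets.
print(needs_more_rollback_segments(1, 2300))  # False -> ratio well under 5%
```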

    MULTI-THREADED SERVER TUNING:

    Reducing Contention for Dispatchers and Server Processes:

    Contention for dispatchers is signaled by:

    * High busy rates for existing dispatcher processes
    * An increase in the waiting time for responses in the response queues of existing dispatcher processes

    The busy rate for each dispatcher can be queried as follows:

    select network "Protocol",
           SUM(busy) / (SUM(busy) + SUM(idle)) "Total Busy Rate"
    from v$dispatcher
    group by network;

    If the dispatcher processes for a specific protocol are busy more than 50% of the time, then we may improve the performance by adding more dispatcher processes.

    To monitor the wait time for dispatcher process response queues:

    select network "Protocol",
           DECODE(SUM(totalq), 0, 'No Responses',
                  SUM(wait)/SUM(totalq) || ' hundredths of seconds')
           "Average Wait Time per Response"
    from v$queue q, v$dispatcher d
    where q.type = 'DISPATCHER'
      and q.paddr = d.paddr
    group by network;

    If the average wait time for a specific network protocol continues to increase steadily as your application runs, you may be able to improve the performance of those user processes connected to ORACLE over that protocol by adding dispatcher processes.

    Contention for shared servers is signaled by:

    * A steady increase in the waiting time for requests in the request queue

    Monitor this statistic from time to time while the application is running, using the following query:

    select DECODE(totalq, 0, 'No Requests',
                  wait/totalq || ' hundredths of seconds')
           "Average Wait Time per Request"
    from v$queue
    where type = 'COMMON';

    This query returns the total wait time for all requests and the total number of requests for the request queue. Since ORACLE starts shared servers on its own as the load on the existing shared server processes increases, there is no benefit in explicitly increasing the number of shared servers unless that number has already reached the maximum, in which case the value of MTS_MAX_SERVERS can be increased.
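    The decision rule above can be sketched as a small function. This is a minimal sketch under the assumptions stated in the text; the function name and return strings are illustrative.

```python
# Hedged sketch of the rule in the text: ORACLE starts shared servers on
# its own, so raising the count explicitly only helps once the pool has
# already reached MTS_MAX_SERVERS and wait times are still rising.
def shared_server_advice(wait_time_rising, current_servers, mts_max_servers):
    if not wait_time_rising:
        return "no action"
    if current_servers < mts_max_servers:
        return "no action: ORACLE will start more shared servers itself"
    return "increase MTS_MAX_SERVERS"

print(shared_server_advice(True, 20, 20))  # increase MTS_MAX_SERVERS
```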

    PARALLEL QUERY TUNING:

    Points to Consider while tuning PARALLEL QUERY:

    Whenever the degree of parallelism is incremented by 1, one extra slave process is used for scans and nested loop joins, and two extra slave processes are used for sort/merge joins and index creation. Each additional process may allocate up to SORT_AREA_SIZE more memory. This process will never release that memory, although it will reuse it if it needs it.

    SVM stripe sizes should be large enough to minimize the probability of contention. A stripe of 16k seemed to cause too many hot spots; values of 64k to 128k seemed to do the trick. This makes sense if you think of it in terms of minimizing the number of processes requesting a drive at one time, in other words keeping the average queue close to 1. If you use a 16k stripe and the multiblock read count is 64k, every read issues a request to 4 drives; with 20 processes and 20 drives, your average queue will be about 5 for an I/O intensive operation. If you increase your stripe to 64k, the average queue will probably drop to around 2 on half the drives, or roughly 1.2 on average.
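    The back-of-envelope reasoning above can be sketched as a simplified model. It is an assumption-laden approximation: it counts only the drives touched per read and ignores caching, skew and the request currently in service, which is why it yields 4 rather than the text's 5 for the 16k case.

```python
import math

# Hedged model: each read touches ceil(read_size / stripe_size) drives, so
# with P processes issuing reads over D drives the expected queue per drive
# is roughly P * drives_per_read / D. A crude approximation only.
def avg_drive_queue(processes, drives, read_kb, stripe_kb):
    drives_per_read = math.ceil(read_kb / stripe_kb)
    return processes * drives_per_read / drives

print(avg_drive_queue(20, 20, 64, 16))  # 4.0 -> heavy contention per drive
print(avg_drive_queue(20, 20, 64, 64))  # 1.0 -> roughly one request per drive
```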

    I've successfully tested stripes of up to 256k and concatenations of 1 or 2 MB. Concatenations don't seem to be very useful, because you need to size your concat exactly for the data to spread out cleanly. Big stripes seem to be the best answer so far. 64k probably does the trick, since the largest multiblock I/O Oracle does is 64k! I've also used 128k extensively, which also seems to be a happy number; it is possible that values somewhere in between could be better.

    Also, if you have a lot more drives than CPUs, and your queries are not necessarily I/O bound, then having fewer processes accessing more disks may lead you to use smaller stripes.

    Deciding how much memory to allocate to the SGA versus the sort areas is crucial. While SORT_AREA_SIZE is essential for sort performance, it is also important to maintain a large SGA for when you have to go to the temp space or read the database. It's been found that beyond a certain size, around 100 MB, increasing the SGA didn't really make a difference, but tweaking SORT_AREA_SIZE made a huge difference. It also ate memory very quickly.

    Using explicit Oracle datafiles seems to be slightly faster than striping. This is probably because striping avoids multiple reads to one device only by luck (i.e. lots of drives in a stripe, and a big enough stripe size so that reads don't collide most of the time). The key here is 'most'. Separating the data through explicit loader striping (i.e. loading 1/n of the data into each of the n raw devices) guarantees that there are no collisions. Also, if there are multiple datafiles for the table being scanned in parallel, different query slaves read from different files, thereby improving the total I/O throughput and avoiding disk contention. It is therefore advisable to have your large tables reside in different datafiles on different disk drives to improve full table scans.

    Stripes have other advantages, however. They can be created a lot faster: in 1/n the time. They can be maintained with SVM, i.e. OS mirroring. And they always guarantee that data is spread out across n drives, whereas explicit loader striping may only guarantee that while the data is freshly loaded. Also, the difference in performance is almost negligible, less than 5%. So if you don't care about the last two drops of performance, use SVM; it will save you a lot of hassles.

    As far as media failure recovery is concerned, loader striping is better, since one failed drive will not affect more than one datafile; only that datafile will need to be recovered. In the case of stripes, a failure on one drive may force recovery of the whole database. To avoid this eventuality, the use of mirroring is recommended. Mirroring also improves read performance, as SVM will alternate reads; on the other hand, update performance will decrease.

    Choosing one over the other is indeed a difficult decision.

    DB_BLOCK_SIZE values larger than the default 2k or 4k are recommended. On the test machine a value of 8k seemed near optimal; this, however, is a difficult rule to apply across the board. Values of 16k seemed to degrade performance.

    Parallel Query does not support parallel scans on indexed columns. If a significant part of the processing in a query is sorting, or another CPU- or temp-space-intensive operation where parallelism is still useful, there are on occasion ways to tweak a query to make it parallelize even if you have indexes on all the columns you wish to join.

    For example, pick the table with the fewest rows (but it must have more than 1000), make it the driving table of the query (first join), and add the following hint to the query:

    select /*+ full(ename) parallel(ename,10) */ a.ename, b.dept ... etc.

    You may need to add something like a USE_MERGE(b) or ORDERED hint if the query optimizer tries to do a nested loop join on you and this bombs! Anyway, this causes a full scan on that table and parallelizes everything downstream.

    Avoid execution plans similar to:

        Nested Loop
          Full Table Scan of table emp
          Indexed Scan of table B

    because the full table scan will be parallelized, but since index access is not done in parallel, the index scan will become the bottleneck and you may not gain much by parallelizing this query.

    Partition allocation for Parallel Full Table Scan:

    Rowids are allocated to the slaves in the following manner. A rowid is a (datafile#, block#, slot) triple. The rowid range is divided into P sections, and each section is then divided into 9/13, 3/13 and 1/13 segments. The 9/13 segments are handed out, one to each of the slaves participating in the query. If any slave finishes all its segments before another slave has a chance to finish, the fast slave takes up the slack by picking up the slow slave's 3/13 or 1/13 segment. Modified algorithms exist when there are not enough blocks to divide in this manner.
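    The 9/13, 3/13, 1/13 split described above can be sketched on plain block counts. This is a minimal illustration only; the real server works on (datafile#, block#, slot) ranges, and the rounding here is an assumption.

```python
# Hedged sketch of the split: each section of the rowid range is cut into
# 9/13, 3/13 and 1/13 segments, so a fast slave can steal the smaller
# segments of a slow one. Integer rounding is illustrative.
def split_section(nblocks):
    """Split a section of nblocks into 9/13, 3/13 and 1/13 segments."""
    big = nblocks * 9 // 13
    mid = nblocks * 3 // 13
    small = nblocks - big - mid   # remainder, roughly 1/13
    return big, mid, small

print(split_section(1300))  # (900, 300, 100)
```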

    Parallel Query is definitely a very good tool to use and, depending on the query, it can provide a manyfold speedup, but it also requires the use of temp space (for sort/merge joins). In a very large database environment, its advantages and disadvantages therefore need to be weighed and it should be used accordingly.

    PARALLEL INDEX CREATION:

    Large initial extents are necessary, as allocating additional extents is a heavily serialized operation. This is especially true of the temp tablespace and the target index tablespace. In some cases it is worthwhile to create the index (or whatever) first, then execute the following query:

    select * from user_extents where tablespace_name = 'name' ;

    Then count how much space your index took by adding up the number of blocks or bytes per extent, divide that by the number of processes you are using to create the index (PARALLEL/2), and that is the size you should make your initial extent. Each of the slaves will grab an initial extent when working on a temp space or an index. Add a couple of MB for insurance. Similar exercises can pay off big in things like loads or sort/merge joins, as long as you can get accurate counts of space usage. A sample storage clause would be:

    (initial 20m next 20m pctincrease 0)
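    The extent-sizing arithmetic above can be written out as a sketch. This follows the text's PARALLEL/2 slave count and "couple of MB for insurance" literally; the figures in the example are illustrative assumptions.

```python
# Hedged sketch of the sizing rule in the text: total index size divided by
# the number of slave processes (PARALLEL/2 per the text), plus ~2 MB of
# insurance. All figures are illustrative.
MB = 1024 * 1024

def initial_extent(total_index_bytes, parallel_degree, insurance=2 * MB):
    slaves = parallel_degree // 2
    return total_index_bytes // slaves + insurance

# Illustrative: a 360 MB index built with PARALLEL 20 (10 slaves).
print(initial_extent(360 * MB, 20) // MB)  # 38 -> roughly (initial 38m)
```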

    A well tuned temp tablespace is key whenever sorting cannot be done in memory. This seems to be crucial for index creates. Never undersize your temp tablespace in very large databases; it might be even more important than your main tablespace if you do a lot of index creates or sort/merge joins.

    PARALLEL LOADER TUNING:

    With the parallel loader, you need to be careful that the input data source is as well tuned as the output datafile. It is probably also good to use the file system for the input files, to get some read-ahead. To take advantage of the parallel loader, divide the data to be loaded for each table into multiple OS files and start multiple direct loader jobs at the same time. The direct loader in ORACLE7.1 takes a "shared" lock on the table, so it is possible to have multiple jobs loading into the same table simultaneously. It is important to know that when multiple parallel loader jobs run against a single table, each acquires initial space equal to the initial extent of the table definition. Therefore make sure the INITIAL extent is not very large compared to the NEXT extent; otherwise you will either run out of space in the tablespace or waste a lot of space.

    OTHER TUNING TIPS:

  • Proper use of indexes improves query and application performance dramatically.
  • Use the cost-based optimizer whenever possible; it uses the statistics for tables, indexes and clusters to choose the fastest way to execute a SQL statement.
  • Use hints in the SQL statement to direct the optimizer to execute a SQL statement in a specific way.
  • Take advantage of the new features of Oracle7, viz. integrity constraints, stored procedures and packages, and shared SQL and PL/SQL areas. Constraints are very useful because they relieve the application of checking various conditions and do the checking at the database level, thereby reducing both the calls to the kernel and the response time.
  • On operating systems with virtual storage, be sure that the SGA fits into real memory to ensure fast access to cached data.
  • Be sure that foreground and background processes have the same operating system priority. Raising or lowering these priorities can lead to severe performance degradation.
  • Use discrete transactions to improve the performance of your transactions. Discrete transactions are simpler and have a faster rollback mechanism, which improves performance. They carry a few restrictions, so be sure to review them. Discrete transactions can be enabled by setting DISCRETE_TRANSACTIONS_ENABLED to TRUE.
  • Take advantage of the new read-only tablespace feature of ORACLE7.1. This helps in two ways: first, it does not allow anyone to update any object in that tablespace; second, you do not need to back up a read-only tablespace every time you do hot backups. Do not forget to back up the tablespace immediately after it changes from read-write to read-only.

