首页 > 数据库 > How to Perform a Healthcheck on the Database

How to Perform a Healthcheck on the Database

2008年5月24日 2,163 views 发表评论 阅读评论
Subject: How to Perform a Healthcheck on the Database
Doc ID: Note:122669.1 Type: BULLETIN
Last Revision Date: 20-MAR-2008 Status: PUBLISHED

Table of Contents
-----------------

  1. Introduction
  2. Parameter file
  3. Controlfiles
  4. Redolog files
  5. Archiving
  6. Datafiles
    6.1 Autoextend
    6.2 Location
  7. Tablespaces
    7.1 SYSTEM Tablespace
    7.2 SYSAUX Tablespace
    7.3 Locally vs Dictionary Managed Tablespaces
    7.4 Temporary Tablespace
    7.5 Tablespace Fragmentation
  8. Objects
    8.1 Number of Extents
    8.2 Next extent
    8.3 Indexes
 9. AUTO vs MANUAL undo
    9.1 AUTO Undo
    9.2 Manual undo
 10. Memory Management
    10.1 Pre-Oracle 9i
    10.2 Oracle 9i
    10.3 Oracle 10g
    10.4 Oracle 11g
 11. Logging & Tracing
    11.1 Alert File
    11.2 Max_dump_file_size
    11.3 User and core dump size parameters
    11.4 Audit files
    11.5 Sqlnet

1. Introduction
---------------
This article explains how to perform a health check on the database.  General 
guidelines are given on what areas to investigate to get a better overview on
how the database is working and evolving.  These guidelines will reveal common 
issues regarding configuration as well as problems that may occur in the future.

The areas investigated here are mostly based on scripts and are brought to you 
without any warranty, these scripts may need to be adapted for next database 
releases and features.  This article will probably need to be extended to serve 
specific application needs/checks.

Although some performance areas are discussed in this article, it is not the 
intention of this article to give a full detailed explanation of optimizing the
database performance.  

General scripts that help track information on the database:
=)> Note 250262.1 Health Check / Validation Engine Guide
=)> Note 136697.1 “hcheck8i.sql” script to check for known problems in Oracle8i, Oracle9i, and Oracle10g 

2. Parameter file
-----------------
The parameter file can exists in 2 forms. First of all we have the text-based 
version, commonly referred to as init.ora or pfile, and a binary-based file, 
commonly referred to as spfile. The pfile can be adjusted using a standard Operating 
System editor, while the spfile needs to be managed through the instance itself.
It is important to realize that the spfile takes presedence above the pfile, meaning 
whenever there is an spfile available this will be automatically taken unless 
specified otherwise.

NOTE:  Getting an RDA report after making changes to the database configuration is
also a recommendation.   Keeping historical RDA reports will ensure you have 
an overview of the database configuration as the database evolves.

Reference:
  Note 249664.1: Pfile vs SPfile 

3. Controlfiles
---------------
It is highly recommended to have at least two copies of the controlfile. This can 
be done by mirroring the controlfile, strongly recommended on different physical 
disks. If a controlfile is lost, due to a disk crash for example, then you can 
use the mirrored file to startup the database. In this way fast and easy recovery 
from controlfile loss is obtained.

  connect as sysdba
  SQL> select status, name from v$controlfile;

  STATUS  NAME
  ------- ---------------------------------
          /u01/oradata/L102/control01.ctl
          /u02/oradata/L102/control02.ctl

The location and the number of controlfiles can be controlled by the 'control_files' 
initialization parameter.

4. Redolog files
----------------
The Oracle server maintains online redo log files to minimize loss of data in the 
database. Redo log files are used in a situation such as instance failure to recover 
commited data that has not yet been written to the data files. Mirroring the 
redo log files, strongly recommended on different physical disks, makes recovery more 
easy in case one of the redo log files is lost due to a disk crash, user delete, etc. 

  connect as sysdba
  SQL> select * from v$logfile;

     GROUP# STATUS  TYPE   MEMBER
  --------- ------- ------ -----------------------------------
          1         ONLINE /u01/oradata/L102/redo01_A.log
          1         ONLINE /u02/oradata/L102/redo01_B.log

          2         ONLINE /u01/oradata/L102/redo02_A.log
          2         ONLINE /u02/oradata/L102/redo02_B.log

          3         ONLINE /u01/oradata/L102/redo03_A.log
          3         ONLINE /u02/oradata/L102/redo03_B.log

At least two redo log groups are required, although it is advisable to have at least 
three redo log groups when archiving is enabled (see the following chapter). It is 
common, in environments where there are intensive log switches, to see the ARCHiver 
background process fall behind of the LGWR background process. In this case the LGWR 
process needs to wait for the ARCH process to complete archiving the redo log file.

References :
 Note 102995.1 Maintenance of Online Redo Log Groups and Members

5. Archiving
------------
Archiving provides the mechanism needed to backup the changes of the database.
The archive files are essential in providing the necessary information to recover the 
database. It is advisable to run the database in archive log mode, although you may 
have reasons for not doing this, for example in case of a TEST environment where you 
accept to loose the changes made between the current time and the last backup. 
You may ignore this chapter when the database doesn't run in archive log mode.

There are several ways of checking the archive configuration, below is one of them:

  connect as sysdba
  SQL> archive log list

  Database log mode              No Archive Mode --OR-- Archive Mode
  Automatic archival             Disabled        --OR-- Enabled
  Archive destination            <arch. dest.>   --OR-- USE_DB_RECOVERY_FILE_DEST
  Oldest online log sequence     seq. no
  Current log sequence           seq. no

Pre-10g, if the database is running in archive log mode but the automatic archiver 
process is disabled, then you were required to manually archive the redolog files.  
If this is not done in time then the database is frozen and any activity is prevented.
Therefore you should enable automatic archiving when the database is running in archive 
log mode.  This can be done by setting the 'log_archive_start' parameter to true in 
the parameter file.
Starting from 10g, this parameter became obsolete and is no longer required to be set 
explicitly. It is important that there is enough free space on the dedicated disk(s) 
for the archive files, otherwise the ARCHiver process can't write and a crash is inevitable.

References:
  Note 69739.1  How to Turn Archiving ON and OFF
  Note 122555.1 Determine how many disk space is needed for the archive files

6. Datafiles
------------

  6.1 Autoextend
  ~~~~~~~~~~~~~~~
  The autoextend command option enables or disables the automatic extension of
  data files.  If the given datafile is unable to allocate the space needed, it
  can increase the size of the datafile to make space for objects to grow.

  A standard Oracle datafile can have, at most, 4194303 Oracle datablocks. 
  So this also implies that the maximum size is dependant on the Oracle Block size used.

        DB_BLOCK_SIZE	  Max Mb value to use in any command
	~~~~~~~~~~~~~	  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
		2048		 8191 M
		4096		16383 M
		8192		32767 M
	       16384		65535 M

  starting from Oracle 10g, we have a new functionality called BIGFILE, which 
  allows for bigger files to be created. Please also consider that every Operating 
  System has its limits, therefore you should make sure that the maximum size of 
  a datafile cannot be extended past the Operating System allowed limit.

  To determine if a datafile and thus, a tablespace, has AUTOEXTEND capabilities:

    SQL> select file_id, tablespace_name, bytes, maxbytes, maxblocks, increment_by, file_name
         from dba_data_files
         where autoextensible = 'YES';

  Reference:
    Note 112011.1: ALERT: RESIZE or AUTOEXTEND can "Over-size" Datafiles and Corrupt the Dictionary
    Note 262472.1: 10g BIGFILE Type Tablespaces Versus SMALLFILE Type 

  6.2 Location
  ~~~~~~~~~~~~~
  Verify the location of your datafiles.  Overtime a database will grow and datafiles 
  may be added to the database.  Avoid placing datafiles on a 'wherever there is space' 
  basis as this will complicate backup strategies and maintenance.

  Below is an example of bad usage:

    SQL> select * from v$dbfile;

          FILE# NAME
    --------- --------------------------------------------------
            1 D:\DATABASE\SYS1D806.DBF
            2 D:\DATABASE\D806\RBS1D806.DBF
            3 D:\DATABASE\D806\TMP1D806.DBF
            5 D:\DATABASE\D806\USR1D806.DBF
            6 D:\USR2D806.DBF
            7 F:\ORACLE\USR3D806.DBF

7. Tablespaces
--------------

  7.1 SYSTEM Tablespace
  ~~~~~~~~~~~~~~~~~~~~~~
  User objects should not be created in the system tablespace.  Doing so can lead
  to unnecessary fragmentation and preventing system tables of growing.  The following query
  returns a list of objects that are created in the system tablespace but not owned
  by SYS or SYSTEM.

    SQL> select owner, segment_name, segment_type
         from dba_segments
         where tablespace_name = 'SYSTEM'
           and owner not in ('SYS','SYSTEM');

  7.2 SYSAUX Tablespace (10g Release and above)
  ~~~~~~~~~~~~~~~~~~~~~~
  The SYSAUX tablespace was automatically installed as an auxiliary tablespace to 
  the SYSTEM tablespace when you created or upgraded the database. Some database 
  components that formerly created and used separate tablespaces now occupy the 
  SYSAUX tablespace.

  If the SYSAUX tablespace becomes unavailable, core database functionality will 
  remain operational. The database features that use the SYSAUX tablespace could 
  fail, or function with limited capability.

  The amount of data stored in this tablespace can be significant and may grow 
  over time to unmanageble sizes if not configured properly. There are a few 
  components that need special attention.

  To check which components are occupying space:

     select space_usage_kbytes, occupant_name, occupant_desc 
     from v$sysaux_occupants
     order by 1 desc; 

  Reference:
    Note 329984.1: Usage and Storage Management of SYSAUX tablespace occupants SM/AWR, SM/ADVISOR, SM/OPTSTAT and SM/OTHER 

  7.3 Locally vs Dictionary Managed Tablespaces
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Locally Managed Tablespaces are available since Oracle 8i, however they became 
  the default starting from Oracle 9i. Locally Managed Tablespaces, also referred to 
  as LMT, have some advantage over Data Dictionary managed tablespaces.

  To verify which tablespace is Locally Managed or Dictionary Managed, you can run 
  the following query:

    SQL> select tablespace_name, extent_management 
         from dba_tablespaces;

  Reference:
   Note 93771.1:  Introduction to Locally-Managed Tablespaces
   Note 105120.1: Advantages of Using Locally Managed vs Dictionary Managed Tablespaces 

  7.4 Temporary Tablespace
  ~~~~~~~~~~~~~~~~~~~~~~~~~
  o  Locally Managed Tablespaces use tempfiles to serve the temporary tablespace, 
     whereas Dictionary Managed Tablespaces use a tablespace of the type temporary. 
     When you are running an older version (pre Oracle 9i), then it is important to 
     check the type of tablespace used to store the temporary segments. By default, 
     all tablespaces are created as PERMANENT, therefore you should make sure that 
     the tablespace dedicated for temporary segments is of the type TEMPORARY.

      SQL> select tablespace_name, contents 
           from dba_tablespaces;

      TABLESPACE_NAME                CONTENTS
      ------------------------------ ---------
      SYSTEM                         PERMANENT
      USER_DATA                      PERMANENT
      ROLLBACK_DATA                  PERMANENT
      TEMPORARY_DATA                 TEMPORARY

  o Make sure that the users on the database are assigned a tablespace of the 
    type temporary.  The following query lists all the users that have a permanent
    tablespace specified as their default temporary tablespace.

      SQL> select u.username, t.tablespace_name
           from dba_users u, dba_tablespaces t
           where u.temporary_tablespace = t.tablespace_name
             and t.contents <> 'TEMPORARY';

    Note: User SYS and SYSTEM will show the SYSTEM tablespace as there default 
    temporary tablespace.  This value can be altered as well to prevent fragmentation
    in the SYSTEM tablespace.

       SQL> alter user SYSTEM temporary tablespace TEMP;

  o The space allocated in the temporary tablespace is reused. This is done for 
    performance reasons to avoid the bottleneck of constant allocating and de-allocating 
    of extents and segments. Therefore when looking at the free space in the temporary 
    tablespace, this may appear as full all the time. The following are a few queries 
    that can be used to list more meaningful information about the temporary segment usage:

    This will give the size of the temporary tablespace:

      SQL> select tablespace_name, sum(bytes)/1024/1024 mb 
           from dba_temp_files
           group by tablespace_name;

    This will give the "high water mark" of that temporary tablespace (= max used at one time): 

      SQL> select tablespace_name, sum(bytes_cached)/1024/1024 mb
           from v$temp_extent_pool
           group by tablespace_name;

    This will give current usage:

      SQL> select ss.tablespace_name,sum((ss.used_blocks*ts.blocksize))/1024/1024 mb 
           from gv$sort_segment ss, sys.ts$ ts  
           where ss.tablespace_name = ts.name 
           group by ss.tablespace_name;

  7.5 Tablespace Fragmentation
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Heavly fragmented tablespaces can have an impact on the performance, especially
  when a lot of Full Table Scans are occurring on the system.  Another disadvantage
  of fragmentation is that you can get out-of-space errors while the total sum of
  all free space is much more then you had requested.

  The only way to resolve fragmentation is drop and recreate the object.  In most
  cases doing an export and import will solve the problem.  If you need to 
  defragment your system tablespace, you must rebuild the whole database since 
  it is NOT possible to drop the system tablespace.

  References:
    Note 1020182.6 SCRIPT to detect tablespace fragmentation
    Note 1012431.6 Common causes of Fragmentation

8. Objects
----------

  8.1 Number of Extents
  ~~~~~~~~~~~~~~~~~~~~~~
  While the performance hit on over extended objects is not significant, the
  aggregate effect on many over extended objects does impact performance.  The
  following query will list all the objects that have allocated more extents
  than a specified minimum.  Change the <--minext--> value by an actual number,
  in general objects allocating more then 100 a 200 extents can be recreated
  with larger extent sizes:

    SQL> select owner, segment_type, segment_name, tablespace_name, 
                count(blocks), SUM(bytes/1024) "BYTES K", SUM(blocks)
         from dba_extents
         where owner NOT IN ('SYS','SYSTEM')
         group by owner, segment_type, segment_name, tablespace_name
         having count(*) > <--minext-->>
         order by segment_type, segment_name;

  8.2 Next extent
  ~~~~~~~~~~~~~~~~
  It is important that segments can grow and therefore allocate their next extent
  when needed.  If there is not enough free space in the tablespace then the next
  extent can not be allocated and the object will fail to grow.
  The following query returns all the segments that are unable to allocate their
  next extent :

    SQL> select s.owner, s.segment_name, s.segment_type, 
                s.tablespace_name, s.next_extent
         from dba_segments s
         where s.next_extent > (select MAX(f.bytes)
                                from dba_free_space f
                                where f.tablespace_name = s.tablespace_name);

  Note that if there is a lot of fragmentation in the tablespace, then this query
  may give you objects that still are able to grow.  The above query is based on
  the largest free chunk in the tablespace available.  If there are a lot of 
  'small' free chunks after each other, then Oracle will coalesce these to serve
  the extent allocation. 

  Therefore it can be interesting to adapt the script in Note 1020182.6 'SCRIPT 
  to detect tablespace fragmentation' to compare the next extent for each object
  with the 'contiguous' bytes (table space_temp) in the tablespace.

  8.3 Indexes
  ~~~~~~~~~~~~
  An index needs to be maintained, every delete or insert on a table result 
  indirectly on a delete or insert on the underlying index.  Over time an index 
  structure can get fragmented and therefore the index should be rebuilt.

9. AUTO vs MANUAL undo
-----------------------
Starting from Oracle 9i we introduced a new way of managing the before-images. 
Previously this was achieved through the RollBack Segments or also referred to as 
manual undo. Automatic undo is used when the UNDO_MANAGEMENT parameter is set to 
AUTO. When not set or set to MANUAL then we use the 'old' rollback segment mechanism.
Although both versions are still available in current release, automatic undo is preferred.

  9.1 AUTO UNDO
  ~~~~~~~~~~~~~~~
  There is little to no configuration involved to AUM (Automatic Undo Management).
  You basically define the amount of time the before image needs to be kept available. 
  This is controlled through the parameter UNDO_RETENTION, defined in seconds. So a 
  value of 900 indicates 15 minutes.

  It is important to realize that this value is not honored when we are under space 
  pressure in the undo tablespace. 

  Therefore the following formula can be used to calculate the optimal undo tablespace size:

    Note 262066.1: How To Size UNDO Tablespace For Automatic Undo Management

  Starting from Oracle 10g, you may choose to use the GUARANTEE option, to make sure 
  the undo information does not get overwritten before the defined undo_retention time.

    Note 311615.1: Oracle 10G new feature - Automatic Undo Retention Tuning 

  9.2 MANUAL UNDO
  ~~~~~~~~~~~~~~~~~~
  o Damaged rollback segments will prevent the instance to open the database.  Only if 
    names of rollback segments are known, corrective action can be taken.  Therefore
    specify all the rollback segments in the 'rollback_segments' parameter in the
    init.ora

  o Too small or not enough rollback segments can have serious impact on the behavior
  of your database.  Therefore several issues must be taken into account.  The 
  following query will show you if there are not enough rollback segments online
  or if the rollback segments are too small.

  SQL> select d.segment_name, d.tablespace_name, s.waits, s.shrinks,
              s.wraps, s.status
       from v$rollstat s, dba_rollback_segs d
       where s.usn = d.segment_id
       order by 1;

  SEGMENT_NAME     TABLESPACE_NAME             WAITS   SHRINKS     WRAPS STATUS
  ---------------- ----------------------- --------- --------- --------- --------
  RB1              ROLLBACK_DATA                   1         0       160 ONLINE
  RB2              ROLLBACK_DATA                  31         1       149 ONLINE
  SYSTEM           SYSTEM                          0         0         0 ONLINE

  The WAITS indicates which rollback segment headers had waits for them.  Typically
  you would want to reduce such contention by adding rollback segments. 

  If SHRINKS is non zero then the OPTIMAL parameter is set for that particular
  rollback segment, or a DBA explicitly issued a shrink on the rollback segment.
  The number of shrinks indicates the number of times a rollback segment shrinked
  because a transaction has extended it beyond the OPTIMAL size.  If this value is
  too high then the value of the OPTIMAL size should be increased as well as the
  overall size of the rollback segment (the value of minextents can be increased 
  or the extent size itself, this depends mostly on the indications of the WRAPS 
  column).

  The WRAPS column indicate the number of times the rollback segment wrapped to
  another extent to serve the transaction.  If this number is significant then you
  need to increase the extent size of the rollback segment.

Reference:
  Note 62005.1 Creating, Optimizing, and Understanding Rollback Segments

10. Memory Management
---------------------
This chapter is very version driven. Depending on which version you are running the 
option available will be different. Overtime Oracle has invested a great deal of 
time and effort in managing the memory more efficiently and transparently for the end-user. 
Therefore it is advisable to use the automation features as much as possible.

  10.1 Pre Oracle 9i
  ~~~~~~~~~~~~~~~~~~~
  The different memory components (SGA & PGA) needed to be defined at the startup of the 
  database. These values were static. So if one of the memory components was too low the 
  database needed to be restarted to make the changes effective. 
  How to determine the optimal or best value for the different memory components is not 
  covered in this note, since this would lead us too far. However a parameter that was
  often misused in these versions is the sort_area_size.

  The 'sort_area_size' parameter in the init.ora defines the amount of memory that can be 
  used for sorting.  This value should be chosen carefully since this is part of the
  User Global Area (UGA) and therefore is allocated for each user individually. 
  If there are a lot of concurrent users performing large sort operation on the database 
  then the system can run out of memory.

  E.g.: You have a sort_area_size of 1Mb, with 200 concurrent users on the database. 
        Although this memory is allocated dynamically, it can allocate up to 200Mb
        and therefore can cause extensive swapping on the system.

  10.2 Oracle 9i
  ~~~~~~~~~~~~~~~
  Starting from Oracle 9i we introduced the parameters 

    workarea_size_policy = [AUTO | MANUAL]
    pga_aggregate_target = <value>

  This allows you define 1 pool for the PGA memory, which will be shared across sessions. 
  When you often receive ORA-4030 errors, then this can be an indication that this value is 
  specified too low.  

  10.3 Oracle 10g
  ~~~~~~~~~~~~~~~~
  Automatic Shared Memory Management (ASMM) was introduced in 10g. The automatic shared memory
  management feature is enabled by setting the SGA_TARGET parameter to a non-zero value.

  This feature has the advantage that you can share memory resources among the different components. 
  Resources will be allocated and deallocated as needed by Oracle automatically.

  Automatic PGA Memory management is still available through the 'workarea_size_policy' and 
  'pga_aggregate_target' parameters.

  10.4 Oracle 11g
  ~~~~~~~~~~~~~~~~
  Automatic Memory Management (AMM) is being introduced in 11g. This enables automatic tuning 
  of PGA and SGA with use of two new parameters named MEMORY_MAX_TARGET and MEMORY_TARGET. 

Reference:
  Note 443746.1: Automatic Memory Management(AMM) on 11g 

11. Logging & Tracing
---------------------

  11.1 Alert File
  ~~~~~~~~~~~~~~~~
  The alert log file of the database is written chronologically.  Data is always 
  appended and therefore this file can grow to an enormous size.  It should be 
  cleared or truncated on a regular basis, as a large alert file occupies 
  unnecessary disk space and can slow down OS write performance to the file.

    SVRMGR> connect internal/<password>
    SVRMGR> show parameter background_dump_dest
    NAME                           TYPE    VALUE
    ------------------------------ ------- ----------------------------------
    background_dump_dest           string  D:\Oradata\Admin\PROD\Trace\BDump

  Note: starting from Oracle 11g, this location is controlled by the 'diagnostic_dest' parameter

  11.2 Max_dump_file_size
  ~~~~~~~~~~~~~~~~~~~~~~~~
  Oracle Server processes generate trace files for certain errors or conflicts.
  These trace files are of use for further analyzing the probleThe init.ora
  parameter 'max_dump_file_size' limits the size of these trace files.  The value
  of this parameter should be specified in Operating System blocks.
  Make sure the disk space can handle the maximum size specified, if not then 
  this value should be changed.

    SVRMGR> connect internal/<password> 
    SVRMGR> show parameter max_dump_file_size
    NAME                                TYPE    VALUE
    ----------------------------------- ------- ---------------------
    max_dump_file_size                  integer 10240

  11.3 User and core dump size parameters
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  The parameters 'user_dump_size' and 'core_dump_size' can contain a lot of trace information.
  It is important to clear this directory at regular times as this can take up a significant
  amount of space.

  Note: starting from Oracle 11g, this location is controlled by the 'diagnostic_dest' parameter

  11.4 Audit files
  ~~~~~~~~~~~~~~~~~
  By default, every connection as SYS or SYSDBA is logged in an operating system file. 
  The location is controlled through the parameter 'audit_file_dest'. If this parameter is 
  not set then the location defaults to $ORACLE_HOME/rdbms/audit.
  Overtime this directory may contain a lot of auditing information and can take up a 
  significant amount of space.

  11.5 Sqlnet
  ~~~~~~~~~~~~
  By default sqlnet tracing is disabled while logging for sqlnet connections and
  the listener is enabled.  These log files are written chronologically and data is 
  always appended.  This causes some log files to grow to enormous sizes.  These
  large log files should be cleared or truncated on a regular basis.

  Activate tracing in case there is a problem that needs to be analyzed.  Unnecessary
  tracing will slow down the system and occupy free disk space.  Therefore you
  should disable sqlnet and listener tracing when it is not required.

  To find the location of the logfiles or to activate tracing see:
    Note 219968.1: SQL*Net, Net8, Oracle Net Services - Tracing and Logging at a Glance
<<E
 » 如果喜欢可以: 点此订阅本站
分类: 数据库 标签: , ,
  1. 本文目前尚无任何评论.
  1. 本文目前尚无任何 trackbacks 和 pingbacks.