Comprehensive Oracle Database Health Check

Overview

An Oracle database health check involves a thorough review of the database's performance, configuration, and security to ensure optimal operation. It typically includes examining system resource usage (CPU, memory, and storage), checking database logs for errors, and verifying backup and recovery processes. Performance metrics such as query response times, index efficiency, and I/O bottlenecks are analyzed to identify potential issues. A health check also assesses database security by reviewing user access, roles, and privileges. Regular checks help detect early signs of failure or inefficiency, ensuring that the database runs smoothly and securely.

Oracle database deployment and design are critical to ensuring the system is robust, scalable, and efficient. The deployment process involves installing the database on physical or virtual servers, configuring the storage, memory, and network settings for optimal performance, and setting up failover options like Oracle RAC or Data Guard for high availability.

Fundamentally, Oracle's architecture follows a multi-tiered approach with the instance (memory and background processes), the database (physical storage), and clients (applications) interacting with it. Key components include tablespaces, data files, redo logs, and control files, which manage data and recovery. Security, backup strategies, and efficient indexing are essential in the design phase to meet business continuity, performance, and security requirements.

The design should also account for scalability, by utilizing Oracle's partitioning, clustering, and load balancing features, while maintaining security through role-based access and encryption.
Improving Oracle Database performance involves several strategies. First, optimizing SQL queries can significantly reduce execution time. Second, proper indexing can enhance data retrieval speed. Third, regular database maintenance, such as statistics collection and defragmentation, ensures optimal performance. Fourth, using Oracle's built-in performance tuning tools, like Automatic Workload Repository (AWR) and SQL Tuning Advisor, helps identify and resolve bottlenecks. Lastly, scaling hardware resources, such as CPU and memory, can provide the necessary power for high-demand operations.

Below are some major categories involving performing Health check of Oracle Database.

Oracle Database Architecture Health Check
Oracle Database Engine Health Check
Oracle Database Processes Level Health Check

Oracle Database Architecture Health Check

Oracle Database Architecture Health Check ensures optimal performance, much like reviewing the core components of a computer. The database’s memory (SGA and PGA) must be balanced to handle workloads efficiently. ASM (Automatic Storage Management) is used to manage datafiles for efficient disk I/O. Critical components like Redo Log files are stored on fast, high-performance storage for quick recovery, while Archive logs are placed on separate storage for point-in-time recovery. Regular backups are stored on tape drives or other media, isolated from the main database storage, ensuring data protection.

Check CPU Utilization

Top

cat /proc/stat

yum install sysstat
mpstat

Check Storage I/O

iostat

yum install iotop
iotop

Using Resource Monitor:
• Press Ctrl + Shift + Esc to open Task Manager.
• Go to the Performance tab and click Open Resource Monitor at the bottom.
• Switch to the Disk tab in Resource Monitor.
• Here, you can monitor disk I/O statistics, including:
o Disk Reads/sec: Number of read operations per second.
o Disk Writes/sec: Number of write operations per second.
o Disk Queue Length: The length of the queue of I/O operations waiting for disk access.

Using Performance Monitor (PerfMon):
• Press Windows + R, type perfmon, and press Enter.
• In Performance Monitor, add counters for disk-related metrics:
o Right-click on Performance Monitor in the left panel.
o Click Add Counters.
o Choose PhysicalDisk and select counters such as:
 Disk Reads/sec
 Disk Writes/sec
 Avg. Disk sec/Read
 Avg. Disk sec/Write
 Disk Queue Length
These counters give detailed real-time data about I/O performance.

Check Memory Utilization

Free -g

cat /proc/meminfo

vmstat 1

Using Task Manager:
• Press Ctrl + Shift + Esc to open Task Manager.
• Go to the Performance tab, and select Memory on the left.

Using Resource Monitor:
• Open Task Manager (Ctrl + Shift + Esc), go to the Performance tab, and click on Open Resource Monitor.
• Switch to the Memory tab to get detailed information about memory utilization.

Using Performance Monitor (PerfMon):
• Press Windows + R, type perfmon, and press Enter.
• In Performance Monitor, you can add memory-related counters to track usage.
o Right-click in the window, choose Add Counters.
o Select Memory, and choose counters like:
 Available MBytes
 Committed Bytes
 Cache Faults/sec
 Page Faults/sec

Redo Log File Should be on Fast Storage

dd if=/dev/zero of=testfile bs=1G count=1 oflag=direct

dd if=testfile of=/dev/null bs=1G count=1 iflag=direct

winsat disk -drive c

Backups Should Be Copy External Storage Like Tape Drive or Other Storage.

Oracle Database Engine Health Check

An Oracle Database Engine Health Check is essential for maintaining optimal performance and involves monitoring several critical components. The System Global Area (SGA) should be appropriately sized to minimize memory contention, while the Program Global Area (PGA) needs careful management to ensure efficient memory allocation for session data and sorting operations. Additionally, the size of the redo logs plays a significant role in recovery performance, and resizing them can prevent frequent log switches that may hinder throughput. Monitoring log switches is crucial, as frequent occurrences can indicate transaction volume issues, and analyzing wait events provides insights into resource contention, helping identify bottlenecks and optimize the database environment effectively.

• System Global Area (SGA): This shared memory area contains data and control information for the Oracle instance, and its size should be tuned to accommodate the workload, minimizing memory contention.

• Program Global Area (PGA): This memory region is allocated to each Oracle process, and monitoring its usage helps ensure efficient memory allocation for sorting, hashing, and managing session data.

• Redo Log Size: The size of the redo log files impacts recovery performance; appropriately sizing these logs can enhance database recovery speed and prevent frequent log switches.

• Log Switches: Frequent log switches can indicate issues with transaction volume or log size, and monitoring this helps in optimizing the log configuration for better throughput.

• Wait Events: Analyzing wait events provides insights into resource contention, helping identify bottlenecks and performance issues, which is essential for tuning and optimizing the database environment.

• Shared Server and Dedicated Server Process

For Configure Shared Server
select name,value from gv$parameter where name in ('shared_servers','dispatchers','max_shared_servers','shared_server_sessions','max_dispatchers');
SELECT * FROM V$SHARED_SERVER;
SELECT * FROM V$DISPATCHER;
ALTER SYSTEM SET SHARED_SERVERS = 5;
ALTER SYSTEM SET DISPATCHERS = '(PROTOCOL=TCP)(DISPATCHERS=3)';
ALTER SYSTEM SET MAX_SHARED_SERVERS = 10;
ALTER SYSTEM SET SHARED_SERVER_SESSIONS = 100;
ALTER SYSTEM SET MAX_DISPATCHERS = 5;
SELECT * FROM V$SHARED_SERVER;
SELECT * FROM V$DISPATCHER;
• Monitor Block Tracking File Size which use for Incremental Backup.

ALTER DATABASE Disable BLOCK CHANGE TRACKING;
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING USING FILE ‘/u01/blocktracking.ora';
SELECT filename, status, bytes FROM v$block_change_tracking;
• Configure Keep_Pool and Move Frequently Used Objects in Keep Pool

select table_name,BUFFER_POOL from dba_tables where OWNER='T24' AND not BUFFER_POOL='DEFAULT';
alter TABLE T24.F_BATCH storage (buffer_pool keep);

Oracle Database Processes Execution Level Health Check

An Oracle Database Execution Level Health Check focuses on monitoring and optimizing SQL query performance and execution plans. It involves analyzing SQL statements using tools like EXPLAIN PLAN, SQL Trace, and TKPROF to identify inefficient queries, missing indexes, or suboptimal joins. The check also reviews bind variable usage, index efficiency, and parallel execution settings to enhance performance. By detecting long-running queries or heavy resource-consuming statements, this health check ensures that database operations are executed optimally, improving overall query performance and resource utilization.
Operational Health Check

Archive Log Location Space Monitoring

SELECT TRUNC (first_time) "Date", inst_id, TO_CHAR (first_time, 'Dy') "Day",
COUNT (1) "Total",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '00', 1, 0)) "h0",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '01', 1, 0)) "h1",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '02', 1, 0)) "h2",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '03', 1, 0)) "h3",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '04', 1, 0)) "h4",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '05', 1, 0)) "h5",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '06', 1, 0)) "h6",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '07', 1, 0)) "h7",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '08', 1, 0)) "h8",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '09', 1, 0)) "h9",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '10', 1, 0)) "h10",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '11', 1, 0)) "h11",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '12', 1, 0)) "h12",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '13', 1, 0)) "h13",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '14', 1, 0)) "h14",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '15', 1, 0)) "h15",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '16', 1, 0)) "h16",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '17', 1, 0)) "h17",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '18', 1, 0)) "h18",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '19', 1, 0)) "h19",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '20', 1, 0)) "h20",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '21', 1, 0)) "h21",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '22', 1, 0)) "h22",
SUM (DECODE (TO_CHAR (first_time, 'hh24'), '23', 1, 0)) "h23",
ROUND (COUNT (1) / 24, 2) "Avg"
FROM gv$log_history
WHERE thread# = inst_id
AND first_time > sysdate -7
GROUP BY TRUNC (first_time), inst_id, TO_CHAR (first_time, 'Dy')
ORDER BY 1,2;

Monitor Flash Recover Area

select name, floor(space_limit / 1024 / 1024/1024) "Size GB", ceil(space_used / 1024 / 1024/1024) "Used GB" from v$recovery_file_dest order by name;

Check Tablespace Size

SELECT ts.tablespace_name,
size_info.megs_alloc,
size_info.megs_free/1024,
size_info.megs_used,
size_info.pct_free,
size_info.pct_used,
size_info.MAX,
size_info.Available /1024 Available_GB,
Available_Percentage
FROM (SELECT a.tablespace_name,
ROUND (a.bytes_alloc / 1024 / 1024) megs_alloc,
ROUND (NVL (b.bytes_free, 0) / 1024 / 1024) megs_free,
ROUND ( (a.bytes_alloc - NVL (b.bytes_free, 0)) / 1024 / 1024)
megs_used,
ROUND ( (NVL (b.bytes_free, 0) / a.bytes_alloc) * 100)
Pct_Free,
100 - ROUND ( (NVL (b.bytes_free, 0) / a.bytes_alloc) * 100)
Pct_used,
ROUND (maxbytes / 1048576) MAX,
((ROUND (maxbytes / 1048576))- (ROUND ( (a.bytes_alloc - NVL (b.bytes_free, 0)) / 1024 / 1024))) as Available
, (((ROUND (maxbytes / 1048576))- (ROUND ( (a.bytes_alloc - NVL (b.bytes_free, 0)) / 1024 / 1024))) /(ROUND (maxbytes / 1048576)))*100 as Available_Percentage
FROM ( SELECT f.tablespace_name,
SUM (f.bytes) bytes_alloc,
SUM (
DECODE (f.autoextensible,
'YES', f.maxbytes,
'NO', f.bytes))
maxbytes
FROM dba_data_files f
GROUP BY tablespace_name) a,
( SELECT f.tablespace_name, SUM (f.bytes) bytes_free
FROM dba_free_space f
GROUP BY tablespace_name) b
WHERE a.tablespace_name = b.tablespace_name(+)
UNION ALL
SELECT h.tablespace_name,
ROUND (SUM (h.bytes_free + h.bytes_used) / 1048576) megs_alloc,
ROUND (
SUM (
(h.bytes_free + h.bytes_used) - NVL (p.bytes_used, 0))
/ 1048576)
megs_free,
ROUND (SUM (NVL (p.bytes_used, 0)) / 1048576) megs_used,
ROUND (
( SUM (
(h.bytes_free + h.bytes_used)
- NVL (p.bytes_used, 0))
/ SUM (h.bytes_used + h.bytes_free))
* 100)
Pct_Free,
100
- ROUND (
( SUM (
(h.bytes_free + h.bytes_used)
- NVL (p.bytes_used, 0))
/ SUM (h.bytes_used + h.bytes_free))
* 100)
pct_used,
ROUND (
SUM (
DECODE (f.autoextensible,
'YES', f.maxbytes,
'NO', f.bytes)
/ 1048576))
MAX,
(((ROUND (SUM (DECODE (f.autoextensible,'YES', f.maxbytes,'NO', f.bytes) / 1048576))) -(ROUND (SUM (NVL (p.bytes_used, 0)) / 1048576)))/1024) as Available_GB
,((((ROUND (SUM (DECODE (f.autoextensible,'YES', f.maxbytes,'NO', f.bytes) / 1048576))) -(ROUND (SUM (NVL (p.bytes_used, 0)) / 1048576)))) /(ROUND (SUM (DECODE (f.autoextensible,'YES', f.maxbytes,'NO', f.bytes) / 1048576)))) * 100 as Available_Percentage
FROM sys.v_$TEMP_SPACE_HEADER h,
sys.v_$Temp_extent_pool p,
dba_temp_files f
WHERE p.file_id(+) = h.file_id
AND p.tablespace_name(+) = h.tablespace_name
AND f.file_id = h.file_id
AND f.tablespace_name = h.tablespace_name
GROUP BY h.tablespace_name) size_info,
sys.dba_tablespaces ts,
sys.dba_tablespace_groups tsg
WHERE ts.tablespace_name = size_info.tablespace_name
AND ts.tablespace_name = tsg.tablespace_name(+)
order by Available_Percentage asc;

Monitor Database Backups

SELECT status status,
object_type object_type,
input_bytes / 1024 / 1024 / 1024 input_bytes,
output_bytes / 1024 / 1024 / 1024 output_bytes,
TO_CHAR (start_time, 'mm/dd/yyyy:hh:mi:ss') AS start_time,
TO_CHAR (end_time, 'mm/dd/yyyy:hh:mi:ss') AS end_time
FROM v$rman_status
WHERE start_time > SYSDATE - 1 AND operation = 'BACKUP';

Monitor Database Alert Log

Performance Related Health Check

Check Database Session Locking

select blocking_session, sid, serial#, wait_class, seconds_in_wait
from
v$session where blocking_session is not NULL order by blocking_session;
select username, sid,serial#,sql_id , wait_time from v$session where (sid ) in
(
select blocking_session from v$session
);
select inst_id,last_call_et,username,machine,event,status from gv$session where username='T24' and lower(event) like '%contention%';
select * from gv$session a , gv$lock b where a.sid=b.sid and A.INST_ID=b.inst_id and username='T24' and lower(event) like '%contention%';
select * from dba_blockers;

Check the Queries which last_call_et column value high and increasing in your Database

select event,last_call_et,username,module,program,machine,sql_id,prev_sql_id,
((select max(sql_text) from gv$sql a where a.sql_id=s.sql_id)) as sql1,
((select max(sql_text) from gv$sql a where a.sql_id=s.prev_sql_id)) as sql2
from gv$session s where status='ACTIVE' and not username is null order by last_call_et desc;

Check Active / In Active Database Sessions

select inst_id,sid,last_call_et,username,machine,event,status from gv$session where username='T24' order by last_call_et desc;
select inst_id,sid,last_call_et,username,machine,event,status from gv$session where username='T24' and status='ACTIVE' order by last_call_et desc;
Check User Level Session Locking.
SELECT se.inst_id,se.last_call_et,
lk.SID, se.username, se.OSUser, se.Machine,DECODE (lk.TYPE,'TX', 'Transaction','TM', 'DML','UL', 'PL/SQL User Lock',lk.TYPE)
lock_type, DECODE (lk.lmode, 0, 'None', 1, 'Null',
2, 'Row-S (SS)',
3, 'Row-X (SX)',
4, 'Share',
5, 'S/Row-X (SSX)',
6, 'xclusive',
TO_CHAR (lk.lmode))
mode_held,
DECODE (lk.request,
0, 'None',
1, 'Null',
2, 'Row-S (SS)',
3, 'Row-X (SX)',
4, 'Share',
5, 'S/Row-X (SSX)',
6, 'Exclusive',
TO_CHAR (lk.request))
mode_requested,
TO_CHAR (lk.id1) lock_id1,
TO_CHAR (lk.id2) lock_id2,
ob.owner,
ob.object_type,
ob.object_name,
DECODE (lk.Block, 0, 'No', 1, 'Yes', 2, 'Global') block,
se.lockwait
FROM GV$lock lk, dba_objects ob, GV$session se
WHERE lk.TYPE IN ('TX', 'TM', 'UL')
AND lk.SID = se.SID
AND lk.id1 = ob.object_id(+)
AND lk.inst_id = se.inst_id
and last_call_et > 100;

Generate AWR Report
Check "DB Time"

Check "SQL ordered by Elapsed Time"

Check Top 10 Foreground Events by Total Wait Time

Wait Classes by Total Wait Time

Check Table Stats

select max(LAST_ANALYZED),'execute dbms_stats.gather_table_stats(ownname=>' || '''' || 't24' || '''' || ',' || 'tabname=>' || '''' || table_name || '''' || ',DEGREE=> 10, estimate_percent=> 10 );' from dba_tab_statistics where owner='T24' and stale_stats ='YES'
group by table_name order by max(LAST_ANALYZED) asc;
select max(LAST_ANALYZED),'execute dbms_stats.gather_table_stats(ownname=>' || '''' || 't24' || '''' || ',' || 'tabname=>' || '''' || table_name || '''' || ',DEGREE=> 10, estimate_percent=> 10 );' from dba_tab_statistics where owner='T24' and stale_stats ='YES' group by table_name order by max(LAST_ANALYZED) asc;
select owner,table_name,LAST_ANALYZED from dba_tables;
Check Invalid Objects
select * from dba_ind_partitions where index_owner='T24' and not status='USABLE' ;
select index_name,status from dba_indexes where not status='VALID' and owner='T24';
Check Execution Plan for Query
explain plan for insert into test1 select * from test1;
select * from table(dbms_xplan.display);

Check Usage of Indexes

ALTER INDEX abc1 MONITORING USAGE;
SELECT index_name, table_name, monitoring, used FROM V$OBJECT_USAGE;
ALTER INDEX abc1 NO MONITORING USAGE;

Check List of Queries Perform Full Scan

SELECT sql_id,
child_number,
object_owner,
object_name,
operation,
options
FROM v$sql_plan
WHERE operation = 'TABLE ACCESS'
AND options = 'FULL' ;

Overview

Oracle Database Architecture Health Check

Oracle Database Engine Health Check

Oracle Database Processes Execution Level Health Check

about

Contacts

Quick Links