
Wednesday, October 24, 2012

11gR2 Rac Administration



11g RAC Administration and Maintenance Tasks and Utilities:

1.    Checking CRS Status
2.    Viewing Cluster name
3.    Viewing No. Of Nodes configured in Cluster
4.    Viewing Votedisk Information
5.    Viewing OCR Disk Information
6.    Various Timeout Settings in Cluster
7.    Add/Remove OCR file in Cluster
8.    Add/Remove Votedisk file in Cluster
9.    Backing Up OCR
10.  Restoring Votedisk
11.  Changing Public and Virtual IP Address


1.Checking CRS Status:

The two commands below are generally used to check the status of CRS. The first lists the status of CRS
on the local node, whereas the second shows the CRS status across all the nodes in the cluster.

crsctl check crs <<-- for the local node

[root@rak3 bin]# ./crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online



crsctl check cluster <<-- for remote nodes in the cluster
[root@rak3 bin]# ./crsctl check cluster
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online


Checking Viability of CSS across nodes:

crsctl check cluster

For this command to run, CSS needs to be running on the local node. An "ONLINE" status for a remote node indicates that CSS is running on that node.
When CSS is down on a remote node, "OFFLINE" is displayed for that node.

[root@rac1]# crsctl check cluster
rac1 ONLINE
rac2 ONLINE

2.Viewing Cluster name:

I use the command below to get the name of the cluster. You can also dump the OCR and read the cluster name from the dump file.

[oracle@rak1 ~]$ cd /u01/app/11.2.0/grid/bin
[oracle@rak1 bin]$ ./cemutlo -n
rak-scan

 or

[oracle@rak1 bin]$ ./olsnodes -c
rak-scan


 or
 Only the master node takes automatic backups of the OCR, and any node can become the master (for example after a node eviction). In my case this "ocrdump" command worked on the master node only.

ocrdump -stdout -keyname SYSTEM | grep -A 1 clustername | grep ORATEXT | awk '{print $3}'

[root@rac1]# ocrdump -stdout -keyname SYSTEM | grep -A 1 clustername | grep ORATEXT | awk '{print $3}'
rak-scan
[root@rac1]#

or

ocrconfig -export /tmp/ocr_exp.dat -s online
for i in `strings /tmp/ocr_exp.dat | grep -A 1 clustername` ; do if [ $i != 'SYSTEM.css.clustername' ]; then echo $i; fi; done

[root@rac1]# ocrconfig -export /tmp/ocr_exp.dat -s online
[root@rac1]# for i in `strings /tmp/ocr_exp.dat | grep -A 1 clustername` ; do if [ $i != 'SYSTEM.css.clustername' ]; then echo $i; fi; done
rak-scan

[root@rac1]#
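The same grep/awk extraction can be exercised offline against a saved copy of the ocrdump output. The sample text below is a hypothetical fragment written to /tmp for illustration, not real dump output:

```shell
# Hypothetical fragment of "ocrdump -stdout -keyname SYSTEM" output
cat > /tmp/ocrdump_sample.txt <<'EOF'
[SYSTEM.css.clustername]
ORATEXT : rak-scan
EOF

# Same pipeline as above, run against the saved file instead of live ocrdump
grep -A 1 clustername /tmp/ocrdump_sample.txt | grep ORATEXT | awk '{print $3}'
```

This prints the cluster name (rak-scan) because grep -A 1 keeps the line after the clustername key, and awk's third field is the value after "ORATEXT :".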

OR

Oracle creates a directory with the same name as the cluster under $ORA_CRS_HOME/cdata, so you can get the cluster name from this directory as well.

[root@rac1]# ls /u01/app/11.2.0/grid/cdata
localhost 
rak-scan


3.Viewing No. Of Nodes configured in Cluster:

The command below can be used to find out the number of nodes registered in the cluster.
It also displays each node's number and, with the appropriate options, its public, private, and virtual names.

olsnodes

[oracle@rak1 bin]$ olsnodes -n -s
rak1  1 Active
rak3 2 Active

Usage: olsnodes [ [-n] [-i] [-s] [-t] [<node> | -l [-p]] | [-c] ] [-g] [-v]
        where
                -n print node number with the node name
                -p print private interconnect address for the local node
                -i print virtual IP address with the node name
                <node> print information for the specified node
                -l print information for the local node
                -s print node status - active or inactive
                -t print node type - pinned or unpinned
                -g turn on logging
                -v Run in debug mode; use at direction of Oracle Support only.
                -c print clusterware name 

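As a quick offline illustration, the "olsnodes -n -s" output can be parsed to count active nodes. The sample file below mimics the output shown above and is purely illustrative:

```shell
# Hypothetical saved copy of "olsnodes -n -s" output
cat > /tmp/olsnodes_sample.txt <<'EOF'
rak1 1 Active
rak3 2 Active
EOF

# Count the nodes whose status field is "Active"
active=$(awk '$3 == "Active"' /tmp/olsnodes_sample.txt | wc -l)
echo "active nodes: $active"
```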


4.Viewing Votedisk Information:

The command below shows the number of votedisks configured in the cluster.

crsctl query css votedisk

[oracle@rak1 bin]$ ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   1c8b4a5e50684fc0bfd2cfc5fd3a1df0 (ORCL:DISK1) [DATA]
 2. ONLINE   1cc7906f37314f8ebfd7a737ad917cd2 (ORCL:DISK2) [DATA]
 3. ONLINE   814c85dfefb04f7cbfeb175d0e7b7831 (ORCL:DISK3) [DATA]
Located 3 voting disk(s).
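Since a strict majority of voting disks must be online for the cluster to survive, a small script run against a saved copy of the query output can verify that an odd number of voting disks is configured. The sample file below is illustrative:

```shell
# Hypothetical saved copy of "crsctl query css votedisk" output
cat > /tmp/votedisk_sample.txt <<'EOF'
 1. ONLINE   1c8b4a5e50684fc0bfd2cfc5fd3a1df0 (ORCL:DISK1) [DATA]
 2. ONLINE   1cc7906f37314f8ebfd7a737ad917cd2 (ORCL:DISK2) [DATA]
 3. ONLINE   814c85dfefb04f7cbfeb175d0e7b7831 (ORCL:DISK3) [DATA]
EOF

online=$(grep -c ONLINE /tmp/votedisk_sample.txt)
echo "online voting disks: $online"
# An odd count lets a strict majority survive the loss of one disk
if [ $((online % 2)) -eq 1 ]; then echo "odd count: OK"; else echo "even count: review"; fi
```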



5.Viewing OCR Disk Information:

The command below shows the OCR files configured in the cluster, along with the OCR version and storage space information. In 11gR2 you can configure up to five OCR locations (earlier releases allowed only two: the OCR and one mirror). Run this command as root; if you run it as the oracle user you get the message "logical corruption check bypassed due to non-privileged user".

ocrcheck

[root@rak3 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2776
         Available space (kbytes) :     259344
         ID                       :   33615009
         Device/File Name         :      +DATA
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

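If you want to monitor OCR space from a script, the ocrcheck output can be parsed offline. The sample fragment below reuses the figures shown above and is written to /tmp purely for illustration:

```shell
# Hypothetical saved fragment of ocrcheck output (figures copied from above)
cat > /tmp/ocrcheck_sample.txt <<'EOF'
Total space (kbytes)     :     262120
Used space (kbytes)      :       2776
Available space (kbytes) :     259344
EOF

# Pull out the available space as a bare number
avail=$(awk -F: '/Available space/ {gsub(/ /, "", $2); print $2}' /tmp/ocrcheck_sample.txt)
echo "OCR available space: ${avail} KB"
```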

6. Various Timeout Settings in Cluster:

Disktimeout:
Maximum disk I/O latency in seconds from node to votedisk. The default value is 200 seconds. (Disk I/O)

Misscount:
Maximum network latency in seconds from node to node over the interconnect. The default value is 60 seconds on Linux and 30 seconds on other Unix platforms. (Network I/O)

Misscount < Disktimeout

NOTE: Do not change these values without contacting Oracle Support; doing so may cause logical corruption of the data.

IF
(Disk IO Time > Disktimeout) OR (Network IO time > Misscount)
THEN
REBOOT NODE
ELSE
DO NOT REBOOT
END IF;
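The reboot rule above can be sketched as a small shell check. The threshold values match the defaults queried below; the I/O times are made-up sample values for illustration:

```shell
# Thresholds as reported by "crsctl get css ..." (Linux defaults)
disktimeout=200
misscount=60

# Made-up sample I/O times for illustration only
disk_io_time=50
network_io_time=20

# Same decision logic as the pseudocode above
if [ "$disk_io_time" -gt "$disktimeout" ] || [ "$network_io_time" -gt "$misscount" ]; then
    echo "REBOOT NODE"
else
    echo "DO NOT REBOOT"
fi
```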

crsctl get css disktimeout
crsctl get css misscount
crsctl get css reboottime


Disktimeout:
[root@rac1]# crsctl get css disktimeout
200
 

Misscount:
[root@rac1]# crsctl get css misscount
Configuration parameter misscount is not defined. 
 <<<<< This message indicates that misscount has not been set manually, so it takes its default value (60 seconds on Linux). If you want to change it, you can do so as below. (Not recommended.)

[root@rac1]# crsctl set css misscount 80
Configuration parameter misscount is now set to 80
 
[root@rac1]# crsctl get css misscount
80

The below command sets the value of misscount back to its Default values:

crsctl unset css misscount
[oracle@rak1 bin]$ ./crsctl unset css misscount
Configuration parameter misscount is reset to default operation value.


[oracle@rak1 bin]$ ./crsctl get css misscount 

60


Reboottime:
[oracle@rak1 bin]$ ./crsctl get css reboottime
3



7.Add/Remove OCR file in Cluster:

Removing OCR File

(1) Get the Existing OCR file information by running ocrcheck utility.

[root@rac1]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 262120
Used space (kbytes) : 3852
Available space (kbytes) : 258268
ID : 744414276
Device/File Name : /u02/ocfs2/ocr/OCRfile_0 <-- OCR
Device/File integrity check succeeded
Device/File Name : /u02/ocfs2/ocr/OCRfile_1 <-- OCR Mirror
Device/File integrity check succeeded

Cluster registry integrity check succeeded

(2) The first command removes the OCR mirror (/u02/ocfs2/ocr/OCRfile_1). If instead you want to remove the OCR
file itself (/u02/ocfs2/ocr/OCRfile_0), run the second command.

ocrconfig -replace ocrmirror
ocrconfig -replace ocr

[root@rac1]# ocrconfig -replace ocrmirror
[root@rac1]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 262120
Used space (kbytes) : 3852
Available space (kbytes) : 258268
ID : 744414276
Device/File Name : /u02/ocfs2/ocr/OCRfile_0 <<-- OCR File
Device/File integrity check succeeded

Device/File not configured <-- OCR Mirror not existed any more

Cluster registry integrity check succeeded

Adding OCR

You need to add an OCR or OCR mirror file when you want to move the existing OCR to a different device.
The command below adds the OCR mirror file when an OCR file already exists.

(1) Get the Current status of OCR:

[root@rac1]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 262120
Used space (kbytes) : 3852
Available space (kbytes) : 258268
ID : 744414276
Device/File Name : /u02/ocfs2/ocr/OCRfile_0 <<-- OCR File
Device/File integrity check succeeded

Device/File not configured <-- OCR Mirror does not exist

Cluster registry integrity check succeeded

As You can see, I only have one OCR file but not the second file which is OCR Mirror.
So, I can add second OCR (OCR Mirror) as below command.

ocrconfig -replace ocrmirror <File name>

[root@rac1]# ocrconfig -replace ocrmirror /u02/ocfs2/ocr/OCRfile_1
[root@rac1]# ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 262120
Used space (kbytes) : 3852
Available space (kbytes) : 258268
ID : 744414276
Device/File Name : /u02/ocfs2/ocr/OCRfile_0
Device/File integrity check succeeded
Device/File Name : /u02/ocfs2/ocr/OCRfile_1
Device/File integrity check succeeded

Cluster registry integrity check succeeded

In this configuration you can have at most 2 OCR devices (the OCR itself and a single mirror); adding an extra mirror gives the error message below. (On 11gR2 with ASM, up to five OCR locations can be configured.)

[root@rac1]# ocrconfig -replace ocrmirror /u02/ocfs2/ocr/OCRfile_2
PROT-21: Invalid parameter
[root@rac1]#

8.Add/Remove Votedisk file in Cluster:

Adding Votedisk:

Get the existing voting disks associated with the cluster. To be safe, bring the CRS stack down on all the nodes
except the one from which you are going to add the votedisk.

(1) Stop CRS on all the nodes in cluster but one.

[root@rac2]# crsctl stop crs

(2) Get the list of Existing Vote Disks

crsctl query css votedisk

[root@rac1]# crsctl query css votedisk
0. 0 /u02/ocfs2/vote/VDFile_0
1. 0 /u02/ocfs2/vote/VDFile_1
2. 0 /u02/ocfs2/vote/VDFile_2
Located 3 voting disk(s).

(3) Backup the VoteDisk file

Backup the existing votedisks as below as oracle:

dd if=/u02/ocfs2/vote/VDFile_0 of=$ORACLE_BASE/bkp/vd/VDFile_0

[root@rac1]# su - oracle
[oracle@rac1 ~]$ dd if=/u02/ocfs2/vote/VDFile_0 of=$ORACLE_BASE/bkp/vd/VDFile_0
41024+0 records in
41024+0 records out
[oracle@rac1 ~]$
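The dd-based backup can be rehearsed offline against a dummy file. The /tmp paths below are illustrative stand-ins, not a real votedisk:

```shell
# Create a small dummy file standing in for a votedisk (illustrative path)
dd if=/dev/zero of=/tmp/VDFile_demo bs=1024 count=4 2>/dev/null

# Back it up with dd exactly as in the step above, then verify byte-for-byte
dd if=/tmp/VDFile_demo of=/tmp/VDFile_demo.bak 2>/dev/null
cmp -s /tmp/VDFile_demo /tmp/VDFile_demo.bak && echo "backup verified"
```

A byte-for-byte comparison after the copy is cheap insurance before you rely on the backup for a restore.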

(4) Add an Extra Votedisk into the Cluster:

If the votedisk is on OCFS, touch the file as oracle. On raw devices, initialize the raw device using the "dd" command.

touch /u02/ocfs2/vote/VDFile_3 <<-- as oracle
crsctl add css votedisk /u02/ocfs2/vote/VDFile_3 <<-- as oracle
crsctl query css votedisk

[root@rac1]# su - oracle
[oracle@rac1 ~]$ touch /u02/ocfs2/vote/VDFile_3
[oracle@rac1 ~]$ crsctl add css votedisk /u02/ocfs2/vote/VDFile_3
Now formatting voting disk: /u02/ocfs2/vote/VDFile_3.
Successful addition of voting disk /u02/ocfs2/vote/VDFile_3.

(5) Confirm that the file has been added successfully:

[root@rac1]# ls -l /u02/ocfs2/vote/VDFile_3
-rw-r----- 1 oracle oinstall 21004288 Oct 6 16:31 /u02/ocfs2/vote/VDFile_3
[root@rac1]# crsctl query css votedisks
Unknown parameter: votedisks
[root@rac1]# crsctl query css votedisk
0. 0 /u02/ocfs2/vote/VDFile_0
1. 0 /u02/ocfs2/vote/VDFile_1
2. 0 /u02/ocfs2/vote/VDFile_2
3. 0 /u02/ocfs2/vote/VDFile_3
Located 4 voting disk(s).
[root@rac1]#

Removing Votedisk:

Removing a votedisk from the cluster is very simple. The command below removes the given votedisk from the cluster configuration.

crsctl delete css votedisk /u02/ocfs2/vote/VDFile_3

[root@rac1]# crsctl delete css votedisk /u02/ocfs2/vote/VDFile_3
Successful deletion of voting disk /u02/ocfs2/vote/VDFile_3.
[root@rac1]#

[root@rac1]# crsctl query css votedisk
0. 0 /u02/ocfs2/vote/VDFile_0
1. 0 /u02/ocfs2/vote/VDFile_1
2. 0 /u02/ocfs2/vote/VDFile_2
Located 3 voting disk(s).
[root@rac1]#

9.Backing Up OCR:

Oracle takes a physical backup of the OCR every 4 hours under the default backup directory $ORA_CRS_HOME/cdata/<CLUSTER_NAME>
and then rolls it forward into daily, weekly, and monthly backups. You can get the backup information by executing the command below.

ocrconfig -showbackup

[root@rac1]# ocrconfig -showbackup
rac2 2007/09/03 17:46:47 /u01/app/crs/cdata/test-crs/backup00.ocr
rac2 2007/09/03 13:46:45 /u01/app/crs/cdata/test-crs/backup01.ocr
rac2 2007/09/03 09:46:44 /u01/app/crs/cdata/test-crs/backup02.ocr
rac2 2007/09/03 01:46:39 /u01/app/crs/cdata/test-crs/day.ocr
rac2 2007/09/03 01:46:39 /u01/app/crs/cdata/test-crs/week.ocr
[root@rac1]#

Manually backing up the OCR

ocrconfig -manualbackup <<--Physical Backup of OCR

The above command backs up OCR under the default Backup directory. You can export the contents of the OCR using below command (Logical backup).

ocrconfig -export /tmp/ocr_exp.dat -s online <<-- Logical Backup of OCR

Restoring OCR

The below command is used to restore the OCR from the physical backup. Shutdown CRS on all nodes.

ocrconfig -restore <file name>

Locate the available backups:

[root@rac1]# ocrconfig -showbackup

rac2 2007/09/03 17:46:47 /u01/app/crs/cdata/test-crs/backup00.ocr
rac2 2007/09/03 13:46:45 /u01/app/crs/cdata/test-crs/backup01.ocr
rac2 2007/09/03 09:46:44 /u01/app/crs/cdata/test-crs/backup02.ocr
rac2 2007/09/03 01:46:39 /u01/app/crs/cdata/test-crs/day.ocr
rac2 2007/09/03 01:46:39 /u01/app/crs/cdata/test-crs/week.ocr
rac1 2007/10/07 13:50:41 /u01/app/crs/cdata/test-crs/backup_20071007_135041.ocr
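Because -showbackup lists the most recent automatic backup first, a script can pick the newest backup path from a saved copy of the listing. The sample file below mimics the output above:

```shell
# Hypothetical saved copy of "ocrconfig -showbackup" output
cat > /tmp/showbackup_sample.txt <<'EOF'
rac2 2007/09/03 17:46:47 /u01/app/crs/cdata/test-crs/backup00.ocr
rac2 2007/09/03 13:46:45 /u01/app/crs/cdata/test-crs/backup01.ocr
rac2 2007/09/03 09:46:44 /u01/app/crs/cdata/test-crs/backup02.ocr
EOF

# The newest automatic backup is listed first; grab its path (4th field)
awk 'NR == 1 {print $4}' /tmp/showbackup_sample.txt
```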

Perform Restore from previous Backup

[root@rac2]# ocrconfig -restore /u01/app/crs/cdata/test-crs/week.ocr

The above command restores the OCR from the week-old backup.
If you have a logical backup of the OCR (taken using the export option), you can import it with the command below.

ocrconfig -import /tmp/ocr_exp.dat

10.Restoring Votedisks:

· Shutdown CRS on all the nodes in Cluster.
· Locate the current location of the Votedisks
· Restore each of the votedisks using "dd" command from the previous good backup of Votedisk taken using the same "dd" command.
· Start CRS on all the nodes.
crsctl stop crs
crsctl query css votedisk
dd if=<backup of Votedisk> of=<Votedisk file> <<-- do this for all the votedisks
crsctl start crs

11.Changing Public and Virtual IP Address:

Node 1:                 Current Config        Changed To

Public IP:              216.160.37.154        192.168.10.11
VIP:                    216.160.37.153        192.168.10.111
Subnet:                 216.160.37.159        192.168.10.0
Netmask:                255.255.255.248       255.255.255.0
Interface used:         eth0                  eth0
Hostname:               rac1.crs.com          rac1.crs.com

Node 2:                 Current Config        Changed To

Public IP:              216.160.37.156        192.168.10.22
VIP:                    216.160.37.157        192.168.10.222
Subnet:                 216.160.37.159        192.168.10.0
Netmask:                255.255.255.248       255.255.255.0
Interface used:         eth0                  eth0
Hostname:               rac2.crs.com          rac2.crs.com
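Before touching the cluster, it is worth sanity-checking that the new VIP really falls on the new public subnet. The POSIX-shell sketch below applies the netmask to the Node 1 VIP from the table above:

```shell
# Values taken from the "Changed To" column above
ip=192.168.10.111; mask=255.255.255.0; subnet=192.168.10.0

# Split the dotted quads (POSIX shell, no bashisms)
oldIFS=$IFS; IFS=.
set -- $ip;   i1=$1; i2=$2; i3=$3; i4=$4
set -- $mask; m1=$1; m2=$2; m3=$3; m4=$4
IFS=$oldIFS

# AND the VIP with the netmask and compare with the expected subnet
net="$((i1 & m1)).$((i2 & m2)).$((i3 & m3)).$((i4 & m4))"
if [ "$net" = "$subnet" ]; then echo "VIP $ip is on subnet $subnet"; else echo "MISMATCH"; fi
```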

=======================================================================
(A)

Take the services, database, ASM instances, and nodeapps down on both nodes in the cluster. Also disable the nodeapps, ASM, and database instances to prevent them from restarting in case a node gets rebooted during this process.
srvctl stop service -d test
srvctl stop database -d test
srvctl stop asm -n rac1
srvctl stop asm -n rac2
srvctl stop nodeapps -n rac1,rac2
srvctl disable instance -d test -i test1,test2
srvctl disable asm -n rac1
srvctl disable asm -n rac2
srvctl disable nodeapps -n rac1
srvctl disable nodeapps -n rac2

(B)
Modify the /etc/hosts and/or DNS, ifcfg-eth0 (local node) with the new IP values
on All the Nodes

(C)
Restart the specific network interface in order to use the new IP.
ifconfig eth0 down
ifconfig eth0 up

Or, you can restart the network.
CAUTION: on NAS, restarting the entire network may cause the node to reboot.

(D)
Update the OCR with the New Public IP.
In case of public IP, you have to delete the interface first and then add it back with the new IP address.

As the oracle user, issue the commands below:
oifcfg delif -global eth0
oifcfg setif -global eth0/192.168.10.0:public

(E)
Update the OCR with the New Virtual IP.
Virtual IP is part of the nodeapps and so you can modify the nodeapps to update the Virtual IP information.

As the privileged user (root), issue the commands below:
srvctl modify nodeapps -n rac1 -A 192.168.10.111/255.255.255.0/eth0 <-- for Node 1
srvctl modify nodeapps -n rac2 -A 192.168.10.222/255.255.255.0/eth0 <-- for Node 2

(F)
Enable the nodeapps, ASM, and database instances on all the nodes.
srvctl enable instance -d test -i test1,test2
srvctl enable asm -n rac1
srvctl enable asm -n rac2
srvctl enable nodeapps -n rac1
srvctl enable nodeapps -n rac2

(G)
Update the listener.ora file on each node with the correct IP addresses if it uses IP addresses instead of hostnames.

(H)
Restart the Nodeapps, ASM and Database instance
srvctl start nodeapps -n rac1
srvctl start nodeapps -n rac2
srvctl start asm -n rac1
srvctl start asm -n rac2
srvctl start database -d test
 

---------------------------------------------------------------------------------

1) Check status of cluster resources

[oracle@Rac2 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
ora.eons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.registry.acfs
ONLINE ONLINE rac1
ONLINE ONLINE rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac1
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE rac2
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE rac2
ora.oc4j
1 OFFLINE OFFLINE
ora.rac.contact.svc
1 ONLINE ONLINE rac2
ora.rac.db
1 ONLINE ONLINE rac1 Open
2 ONLINE ONLINE rac2 Open
ora.rac1.vip
1 ONLINE ONLINE rac1
ora.rac2.vip
1 ONLINE ONLINE rac2
ora.scan1.vip
1 ONLINE ONLINE rac1
ora.scan2.vip
1 ONLINE ONLINE rac2
ora.scan3.vip
1 ONLINE ONLINE rac2

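Output like the above can be scanned for resources that are not ONLINE. The sketch below parses a hypothetical saved sample (mimicking the ora.gsd entries, which are expectedly OFFLINE in 11gR2):

```shell
# Hypothetical saved fragment of "crsctl stat res -t" output
cat > /tmp/crs_res_sample.txt <<'EOF'
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
EOF

# Remember the current resource name; print it whenever a state line is OFFLINE
awk '/^ora\./ {res=$1} $1 == "OFFLINE" {print res}' /tmp/crs_res_sample.txt | sort -u
```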
2) Check status of local RAC Background Processes

[oracle@Rac2 ~]$ crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac2 Started
ora.crsd
1 ONLINE ONLINE rac2
ora.cssd
1 ONLINE ONLINE rac2
ora.cssdmonitor
1 ONLINE ONLINE rac2
ora.ctssd
1 ONLINE ONLINE rac2 ACTIVE:0
ora.diskmon
1 ONLINE ONLINE rac2
ora.drivers.acfs
1 ONLINE ONLINE rac2
ora.evmd
1 ONLINE ONLINE rac2
ora.gipcd
1 ONLINE ONLINE rac2
ora.gpnpd
1 ONLINE ONLINE rac2
ora.mdnsd
1 ONLINE ONLINE rac2

[oracle@Rac1 ~]$ crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE rac1 Started
ora.crsd
1 ONLINE ONLINE rac1
ora.cssd
1 ONLINE ONLINE rac1
ora.cssdmonitor
1 ONLINE ONLINE rac1
ora.ctssd
1 ONLINE ONLINE rac1 ACTIVE:0
ora.diskmon
1 ONLINE ONLINE rac1
ora.drivers.acfs
1 ONLINE ONLINE rac1
ora.evmd
1 ONLINE ONLINE rac1
ora.gipcd
1 ONLINE ONLINE rac1
ora.gpnpd
1 ONLINE ONLINE rac1
ora.mdnsd
1 ONLINE ONLINE rac1

3) Check the status of the OCR

[root@Rac1 ~]# cd /u01/app/11.2.0/grid/bin/
[root@Rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 3044
Available space (kbytes) : 259076
ID : 9093549
Device/File Name : +DATA
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

[root@Rac1 bin]# cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE


4) Get information about the voting disk

[root@Rac1 bin]# ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE 2a3ec883eca14fd9bf55866be66341ef (ORCL:DISK2) [DATA]
Located 1 voting disk(s).



5) Find the active and software versions of the Grid installation


[root@Rac1 bin]# ./crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.1.0]
[root@Rac1 bin]# ./crsctl query crs softwareversion
Oracle Clusterware version on node [rac1] is [11.2.0.1.0]


6) Enable, Disable and check status of auto-restart of Clusterware


[root@Rac1 bin]# crsctl config crs
CRS-4622: Oracle High Availability Services autostart is enabled.

[root@Rac1 bin]# crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.

[root@Rac1 bin]# crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.


7) Check nodes in cluster

[root@Rac1 bin]# olsnodes -n
rac1 1
rac2 2

You can also give the -s option to see which nodes are active or inactive in case of node eviction.