11gR2 CRS doesn't startup after node reboot [ID 1050164.1]
Applies to:
Oracle Server - Enterprise Edition - Version: 11.2.0.1.0 to 11.2.0.1.0 - Release: 11.2 to 11.2
Generic Linux
Symptoms
- Installation of the 11gR2 Grid Infrastructure on a Linux cluster completed successfully
- OCR & Voting files located in ASM diskgroup
- using ASMLIB driver
- ASM disks are located on multipath devices (/dev/mapper/)
- following a node reboot CRS does not start up
- CSS daemon log shows the following messages:
2010-01-13 09:04:15.075: [ CSSD][1150449984]clssnmvDDiscThread: using discovery string for initial discovery
2010-01-13 09:04:15.075: [ SKGFD][1150449984]Discovery with str::
2010-01-13 09:04:15.075: [ SKGFD][1150449984]UFS discovery with ::
2010-01-13 09:04:15.075: [ SKGFD][1150449984]OSS discovery with ::
2010-01-13 09:04:15.076: [ SKGFD][1150449984]Discovery with asmlib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: str ::
2010-01-13 09:04:15.076: [ SKGFD][1150449984]Fetching asmlib disk :ORCL:DATA1:
2010-01-13 09:04:15.076: [ SKGFD][1150449984]Fetching asmlib disk :ORCL:DATA2:
2010-01-13 09:04:15.076: [ SKGFD][1150449984]Fetching asmlib disk :ORCL:DATA3:
2010-01-13 09:04:15.076: [ SKGFD][1150449984]Fetching asmlib disk :ORCL:DATA4:
2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
2010-01-13 09:04:15.077: [ CSSD][1150449984]clssnmvDiskVerify: Successful discovery of 0 disks
2010-01-13 09:04:15.077: [ CSSD][1150449984]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2010-01-13 09:04:15.077: [ CSSD][1150449984]clssnmvFindInitialConfigs: No voting files found
2010-01-13 09:04:15.077: [ CSSD][1150449984]###################################
2010-01-13 09:04:15.077: [ CSSD][1150449984]clssscExit: CSSD signal 11 in thread clssnmvDDiscThread
2010-01-13 09:04:15.077: [ CSSD][1150449984]###################################
2010-01-13 09:04:15.077: [ CSSD][1139960128]clssgmClientShutdown: total iocapables 0
2010-01-13 09:04:15.077: [ CSSD][1139960128]clssgmClientShutdown: graceful shutdown completed.
2010-01-13 09:04:15.077: [ CSSD][1150449984]
- running the cluster verification utility returns the following messages:
/cluvfy stage -post crsinst -n racnode1
Performing post-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "racnode1"
Checking user equivalence...
User equivalence check passed for user "grid"
Checking time zone consistency...
Time zone consistency check passed.
ERROR:
Cluster manager integrity check failed
PRVF-5434 : Cannot identify the current CRS software version
UDev attributes check for OCR locations started...
UDev attributes check passed for OCR locations
UDev attributes check for Voting Disk locations started...
ERROR:
PRVF-5197 : Failed to retrieve voting disk locations
UDev attributes check failed for Voting Disk locations
Default user file creation mask check passed
Checking cluster integrity...
Cluster integrity check failed This check did not run on the following node(s):
racnode1
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ERROR:
PRVF-5300 : Failed to retrieve active version for CRS on this node
OCR integrity check failed
Checking CRS integrity...
ERROR:
PRVF-5300 : Failed to retrieve active version for CRS on this node
CRS integrity check failed
OCR detected on ASM. Running ACFS Integrity checks...
Starting check to see if ASM is running on all cluster nodes...
PRVF-5137 : Failure while checking ASM status on node "racnode1"
Starting Disk Groups check to see if at least one Disk Group configured...
PRVF-5112 : An Exception occurred while checking for Disk Groups
PRVF-5114 : Disk Group check failed. No Disk Groups configured
Task ACFS Integrity check failed
Checking Oracle Cluster Voting Disk configuration...
ERROR:
PRVF-5434 : Cannot identify the current CRS software version
PRVF-5431 : Oracle Cluster Voting Disk configuration check failed
User "grid" is not part of "root" group. Check passed
Post-check for cluster services setup was unsuccessful on all the nodes.
Changes
Node was rebooted after install.
Cause
The CSS daemon crashes because it cannot locate any Voting files in any of the
discovered ASM disks, which is indicated by the following message in the CSS
daemon log (<grid_home>/log/<node_name>/cssd/ocssd.log):
2010-01-13 09:04:15.077: [ CSSD][1150449984]clssnmvFindInitialConfigs: No voting files found
This error is preceded by the following ASMLIB error:
2010-01-13 09:04:15.077: [ SKGFD][1150449984]ERROR: -15(asmlib ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so op asm_open error Operation not permitted)
which suggests that ASMLIB has a problem accessing the ASM disks.
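The two messages above can be checked for together. The following is a minimal sketch, assuming a POSIX shell and grep; check_cssd_log is a hypothetical helper name and is not part of the Oracle software. It only counts matching lines in the log file it is given.

```shell
#!/bin/sh
# Sketch only: check_cssd_log is a hypothetical helper, not an Oracle tool.
# It reports how often the ASMLIB open error appears in a CSS daemon log
# and whether the "No voting files found" message is present.
check_cssd_log() {
    log="$1"

    # grep -c prints the number of matching lines (0 if there are none).
    asm_errors=$(grep -c 'asm_open error' "$log")

    if grep -q 'No voting files found' "$log"; then
        no_voting_files=yes
    else
        no_voting_files=no
    fi

    echo "asm_open errors: $asm_errors"
    echo "no voting files found: $no_voting_files"
}

# Example call; substitute the real Grid home and node name:
# check_cssd_log <grid_home>/log/<node_name>/cssd/ocssd.log
```

A non-zero error count together with "no voting files found: yes" matches the situation described in this note.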
Solution
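1. Either edit the file /etc/sysconfig/oracleasm-_dev_oracleasm and change the lines:
ORACLEASM_SCANORDER=""
ORACLEASM_SCANEXCLUDE=""
to
ORACLEASM_SCANORDER="dm"
ORACLEASM_SCANEXCLUDE="sd"
or alternatively run the following command as user root, replacing user and group with the owner and group of the ASMLIB driver interface:
/usr/sbin/oracleasm configure -i -e -u user -g group -o "dm" -x "sd"
With these settings ASMLIB scans the device-mapper multipath devices (dm) first and excludes the underlying single-path devices (sd).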
2. Stop and restart ASMLIB as user root using:
/usr/sbin/oracleasm exit
/usr/sbin/oracleasm init
3. Restart CRS or reboot the node.
The above steps need to be executed on all nodes.
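For repeating the file edit from step 1 on several nodes, the change could be scripted. This is a minimal sketch, assuming a POSIX shell and sed; set_asmlib_scan is a hypothetical helper name, and the .bak backup suffix is a choice made for this example only. It changes the two variables only if they are already present in the file.

```shell
#!/bin/sh
# Sketch only: set_asmlib_scan is a hypothetical helper, not part of ASMLIB.
# It rewrites the two scan settings in the given configuration file and
# keeps a copy of the original next to it with a .bak suffix.
set_asmlib_scan() {
    cfg="$1"

    # Keep the original file so the change can be reverted.
    cp -p "$cfg" "$cfg.bak" || return 1

    # Replace whatever value the two variables currently have.
    # Reading from the backup and writing to the original keeps the
    # original file's owner and permissions.
    sed -e 's/^ORACLEASM_SCANORDER=.*/ORACLEASM_SCANORDER="dm"/' \
        -e 's/^ORACLEASM_SCANEXCLUDE=.*/ORACLEASM_SCANEXCLUDE="sd"/' \
        "$cfg.bak" > "$cfg"
}

# As user root, on each node:
# set_asmlib_scan /etc/sysconfig/oracleasm-_dev_oracleasm
# /usr/sbin/oracleasm exit
# /usr/sbin/oracleasm init
```

After the helper has run, steps 2 and 3 still apply as written.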