Monday, October 5, 2015

What does root.sh perform on RAC Cluster Installation

I thought it might be useful for others to understand what actions are performed while running root.sh during a RAC cluster installation. I spent some time today and wrote this topic up on my blog for everyone's benefit :-)

At the end of the RAC cluster installation, we run the orainstRoot.sh and root.sh scripts.

You will see this screen right before the installation completes. Please note, I captured this screen on an 11.2.0 installation. In Oracle 12c you may not see this screen, because the GUI tool can run root.sh for you.


What does orainstRoot.sh perform?

  1. It creates the inventory pointer file (/etc/oraInst.loc).
  2. It adjusts the permissions of the oraInventory directory and changes its group ownership to the oinstall group.

The log below is from the first node.

[root@RAC1 grid]# sh /grid/app/oraInventory/orainstRoot.sh
Changing permissions of /grid/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.
Changing groupname of /grid/app/oraInventory to oinstall.
The execution of the script is complete.
[root@RAC1 grid]# 

What does root.sh perform?
  1. It sets the ORACLE_HOME and ORACLE_OWNER environment variables.
  2. Copies dbhome, oraenv and coraenv to /usr/local/bin.
  3. Creates the /etc/oratab file.
  4. Verifies the super user privileges of the user.
  5. Creates the trace directory.
  6. Creates OCR keys for the 'root' user.
  7. Adds the clusterware daemon entry to the inittab file.
  8. Starts the Oracle High Availability Services daemon (OHASD).
  9. Starts the clusterware daemons (gipcd, mdnsd, gpnpd, cssdmonitor, cssd, diskmon, ctssd, etc.).
  10. Creates, configures and starts an ASM instance.
  11. Creates the required ASM disk groups, if ASM is used to store the OCR and voting files.
  12. Configures the OCR and voting disks (only on the first node).
  13. Starts the Cluster Ready Services daemon (CRSD).
  14. Updates the profile with the voting disk information.
  15. Puts the voting disk on the ASM disk group, if ASM storage was selected.
  16. Displays the voting disk details.
  17. Stops and restarts the cluster resources (crsd, asm, ctssd, cssdmonitor, cssd, gpnpd, gipcd, etc.).
  18. Backs up and updates the OLR file.
  19. Installs the cvuqdisk-1.0.7-1 package.
  20. Configures Oracle Grid Infrastructure for the cluster.
  21. Updates the inventory properties for clusterware.
  22. Starts the Oracle Universal Installer.
  23. Checks the swap space to make sure there is enough.
  24. Completes with a successful 'UpdateNodeList' operation.
The root.sh log below is from the first node.


[root@RAC1 grid]# pwd
/grid/app/11.2.0/grid
[root@RAC1 grid]# sh root.sh
Running Oracle 11g root.sh script...

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /grid/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...


Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2015-10-04 17:08:18: Parsing the host name
2015-10-04 17:08:18: Checking for super user privileges
2015-10-04 17:08:18: User has super user privileges
Using configuration parameter file: /grid/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
  root wallet
  root wallet cert
  root cert export
  peer wallet
  profile reader wallet
  pa wallet
  peer wallet keys
  pa wallet keys
  peer cert request
  pa cert request
  peer cert
  pa cert
  peer root cert TP
  profile reader root cert TP
  pa root cert TP
  peer pa cert TP
  pa peer cert TP
  profile reader pa cert TP
  profile reader peer cert TP
  peer user cert
  pa user cert
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on oraclelinux-release-6Server-0.0.5.x86_64



CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded

ASM created and started successfully.

DiskGroup DATA created successfully.

clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded
CRS-4256: Updating the profile
Successful addition of voting disk 5df60344ce984fd3bf6a36825778db82.
Successfully replaced voting disk group with +DATA.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   5df60344ce984fd3bf6a36825778db82 (ORCL:DISK1) [DATA]
Located 1 voting disk(s).
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'rac1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'rac1'
CRS-2676: Start of 'ora.evmd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'rac1'
CRS-2676: Start of 'ora.DATA.dg' on 'rac1' succeeded

rac1     2015/10/04 17:13:29     /grid/app/11.2.0/grid/cdata/rac1/backup_20151004_171329.olr
Preparing packages for installation...
cvuqdisk-1.0.7-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3998 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /grid/app/oraInventory
'UpdateNodeList' was successful.
[root@RAC1 grid]#

The root.sh log from the second node.

[root@RAC2 grid]# sh root.sh
Running Oracle 11g root.sh script...

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /grid/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2015-10-04 19:46:01: Parsing the host name
2015-10-04 19:46:01: Checking for super user privileges
2015-10-04 19:46:01: User has super user privileges
Using configuration parameter file: /grid/app/11.2.0/grid/crs/install/crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on oraclelinux-release-6Server-0.0.5.x86_64

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node rac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac2'
CRS-2676: Start of 'ora.mdnsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac2'
CRS-2676: Start of 'ora.gipcd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac2'
CRS-2676: Start of 'ora.gpnpd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac2'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2676: Start of 'ora.diskmon' on 'rac2' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac2'
CRS-2676: Start of 'ora.ctssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac2'
CRS-2676: Start of 'ora.asm' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac2'
CRS-2676: Start of 'ora.crsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'rac2'
CRS-2676: Start of 'ora.evmd' on 'rac2' succeeded

rac2     2015/10/04 19:48:17     /grid/app/11.2.0/grid/cdata/rac2/backup_20151004_194817.olr
Preparing packages for installation...
cvuqdisk-1.0.7-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3999 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /grid/app/oraInventory
[root@RAC2 grid]#
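
Once root.sh completes on both nodes, it is worth running a quick sanity check of the clusterware stack. The commands below are only a minimal sketch, assuming the same Grid home shown above; run them as root from either node.

/grid/app/11.2.0/grid/bin/crsctl check cluster -all   # CRS, CSS and EVM should be online on every node
/grid/app/11.2.0/grid/bin/crsctl stat res -t          # all ora.* resources should report ONLINE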




Hope this helps! Please let me know if you have any corrections or suggestions on this topic.

Enjoy reading!

Thursday, October 1, 2015

CRS-4124: Oracle High Availability Services startup failed

While installing Oracle 11g (11.2.0.1) clusterware on Red Hat Linux 6, I got the error below while running the root.sh file. It is a two-node RAC, and the node names are RACNODE1 and RACNODE2.

CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
ohasd failed to start: Inappropriate ioctl for device
ohasd failed to start at /grid/app/11.2.0/grid/crs/install/rootcrs.pl line 443.

Here is the complete log.
[root@racnode1 grid]# sh root.sh
Running Oracle 11g root.sh script...
The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /grid/app/11.2.0/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]:
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2015-09-28 11:17:19: Parsing the host name
2015-09-28 11:17:19: Checking for super user privileges
2015-09-28 11:17:19: User has super user privileges
Using configuration parameter file: /grid/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
  root wallet
  root wallet cert
  root cert export
  peer wallet
  profile reader wallet
  pa wallet
  peer wallet keys
  pa wallet keys
  peer cert request
  pa cert request
  peer cert
  pa cert
  peer root cert TP
  profile reader root cert TP
  pa root cert TP
  peer pa cert TP
  pa peer cert TP
  profile reader pa cert TP
  profile reader peer cert TP
  peer user cert
  pa user cert
Adding daemon to inittab
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
ohasd failed to start: Inappropriate ioctl for device
ohasd failed to start at /grid/app/11.2.0/grid/crs/install/rootcrs.pl line 443.
After googling, I found that 11.2.0.1 does not support RHEL 6.

Here is the workaround to fix the issue.

Step 1

Log in to racnode1 and open the file $GRID_HOME/crs/install/s_crsconfig_lib.pm.
Add the following code just before the # Start OHASD section.

# Start OHASD through the Upstart service instead of relying on inittab (workaround for RHEL/OL 6)
my $UPSTART_OHASD_SERVICE = "oracle-ohasd";
my $INITCTL = "/sbin/initctl";

($status, @output) = system_cmd_capture ("$INITCTL start $UPSTART_OHASD_SERVICE");
if (0 != $status)
{
   error ("Failed to start $UPSTART_OHASD_SERVICE, error: $!");
   return $FAILED;
}

Step 2

Create the file /etc/init/oracle-ohasd.conf with the content below.

# Oracle OHASD startup
start on runlevel [35]
stop on runlevel [!35]
respawn
exec /etc/init.d/init.ohasd run >/dev/null 2>&1
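
Before rerunning root.sh, you can optionally confirm that Upstart has picked up the new job. This is only a quick sanity check; on some Upstart versions you may need to reload the configuration first:

/sbin/initctl reload-configuration
/sbin/initctl list | grep oracle-ohasd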
Step 3

Roll back the root.sh changes on racnode1.

cd $GRID_HOME/crs/install


[root@racnode1 install]# ./roothas.pl -deconfig -force -verbose
2015-09-28 13:03:50: Checking for super user privileges
2015-09-28 13:03:50: User has super user privileges
2015-09-28 13:03:50: Parsing the host name
Using configuration parameter file: ./crsconfig_params
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Delete failed, or completed with errors.
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
/grid/app/11.2.0/grid/bin/acfsdriverstate: line 51: /lib/acfstoolsdriver.sh: No such file or directory
/grid/app/11.2.0/grid/bin/acfsdriverstate: line 51: exec: /lib/acfstoolsdriver.sh: cannot execute: No such file or directory

Successfully deconfigured Oracle Restart stack

Step 4

Rerun the root.sh on racnode1

[root@racnode1 grid]# sh root.sh
Running Oracle 11g root.sh script...

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /grid/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]: y
   Copying dbhome to /usr/local/bin ...
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]: y
   Copying oraenv to /usr/local/bin ...
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]: y
   Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2015-09-28 13:04:35: Parsing the host name
2015-09-28 13:04:35: Checking for super user privileges
2015-09-28 13:04:35: User has super user privileges
Using configuration parameter file: /grid/app/11.2.0/grid/crs/install/crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on oraclelinux-release-6Server-0.0.5.x86_64

CRS-2672: Attempting to start 'ora.gipcd' on 'racnode1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'racnode1'
CRS-2676: Start of 'ora.gipcd' on 'racnode1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'racnode1'
CRS-2676: Start of 'ora.gpnpd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'racnode1'
CRS-2676: Start of 'ora.cssdmonitor' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'racnode1'
CRS-2672: Attempting to start 'ora.diskmon' on 'racnode1'
CRS-2676: Start of 'ora.diskmon' on 'racnode1' succeeded
CRS-2676: Start of 'ora.cssd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'racnode1'
CRS-2676: Start of 'ora.ctssd' on 'racnode1' succeeded

ASM created and started successfully.

DiskGroup DATA created successfully.

clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-2672: Attempting to start 'ora.crsd' on 'racnode1'
CRS-2676: Start of 'ora.crsd' on 'racnode1' succeeded
CRS-4256: Updating the profile
Successful addition of voting disk 2c2253ff0c544fdabf043b4752630d9e.
Successfully replaced voting disk group with +DATA.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   2c2253ff0c544fdabf043b4752630d9e (ORCL:DISK1) [DATA]
Located 1 voting disk(s).
CRS-2673: Attempting to stop 'ora.crsd' on 'racnode1'
CRS-2677: Stop of 'ora.crsd' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'racnode1'
CRS-2677: Stop of 'ora.asm' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'racnode1'
CRS-2677: Stop of 'ora.ctssd' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'racnode1'
CRS-2677: Stop of 'ora.cssdmonitor' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'racnode1'
CRS-2677: Stop of 'ora.cssd' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'racnode1'
CRS-2677: Stop of 'ora.gpnpd' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'racnode1'
CRS-2677: Stop of 'ora.gipcd' on 'racnode1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'racnode1'
CRS-2677: Stop of 'ora.mdnsd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.mdnsd' on 'racnode1'
CRS-2676: Start of 'ora.mdnsd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'racnode1'
CRS-2676: Start of 'ora.gipcd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'racnode1'
CRS-2676: Start of 'ora.gpnpd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'racnode1'
CRS-2676: Start of 'ora.cssdmonitor' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'racnode1'
CRS-2672: Attempting to start 'ora.diskmon' on 'racnode1'
CRS-2676: Start of 'ora.diskmon' on 'racnode1' succeeded
CRS-2676: Start of 'ora.cssd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'racnode1'
CRS-2676: Start of 'ora.ctssd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'racnode1'
CRS-2676: Start of 'ora.asm' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'racnode1'
CRS-2676: Start of 'ora.crsd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'racnode1'
CRS-2676: Start of 'ora.evmd' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'racnode1'
CRS-2676: Start of 'ora.asm' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'racnode1'
CRS-2676: Start of 'ora.DATA.dg' on 'racnode1' succeeded

racnode1     2015/09/28 13:08:57     /grid/app/11.2.0/grid/cdata/racnode1/backup_20150928_130857.olr
Preparing packages for installation...
cvuqdisk-1.0.7-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3999 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /grid/app/oraInventory
'UpdateNodeList' was successful.

[root@racnode1 grid]#

Now the OHASD service has started successfully on racnode1. Let us move on to racnode2 and run root.sh.

Step 5

[root@racnode2 oraInventory]#  sh orainstRoot.sh
Creating the Oracle inventory pointer file (/etc/oraInst.loc)
Changing permissions of /grid/app/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /grid/app/oraInventory to oinstall.
The execution of the script is complete.
[root@racnode2 oraInventory]#  cd /grid/app/11.2.0/grid
[root@racnode2 grid]# sh root.sh
Running Oracle 11g root.sh script...

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /grid/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2015-09-28 11:30:55: Parsing the host name
2015-09-28 11:30:55: Checking for super user privileges
2015-09-28 11:30:55: User has super user privileges
Using configuration parameter file: /grid/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
  root wallet
  root wallet cert
  root cert export
  peer wallet
  profile reader wallet
  pa wallet
  peer wallet keys
  pa wallet keys
  peer cert request
  pa cert request
  peer cert
  pa cert
  peer root cert TP
  profile reader root cert TP
  pa root cert TP
  peer pa cert TP
  pa peer cert TP
  profile reader pa cert TP
  profile reader peer cert TP
  peer user cert
  pa user cert
Adding daemon to inittab
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
ohasd failed to start: Inappropriate ioctl for device
ohasd failed to start at /grid/app/11.2.0/grid/crs/install/rootcrs.pl line 443.
[root@racnode2 grid]#
[root@racnode2 grid]#

Oops... RACNODE2 has the same issue we had on RACNODE1.

Step 6

Repeat step 1, step 2 and step 3 on RACNODE2, then run root.sh on racnode2.

The OHASD service failed again, and the information below was found in the log.

2015-09-28 18:23:38.934: [    CSSD][1588156160]clssnmvDHBValidateNCopy: node 1, , has a disk HB, but no network HB, DHB has rcfg 338904345, wrtcnt, 4437, LATS 10349194, lastSeqNo 4437, uniqueness 1443489200, timestamp 1443490040/4294892050
2015-09-28 18:23:38.934: [    CSSD][1588156160]clssnmvDHBValidateNCopy: node 2, , has a disk HB, but no network HB, DHB has rcfg 338904344, wrtcnt, 733, LATS 10349194, lastSeqNo 733, uniqueness 0, timestamp 1443478983/10313614
2015-09-28 18:23:38.954: [    CLSF][1588156160]Closing handle:0x1a6ed70
2

I realized there was a firewall issue on the private network, blocking traffic between racnode1 and racnode2.

The above information was found in $GRID_HOME/log/racnode2/cssd/ocssd.log.

Execute the commands below on both nodes to fix the firewall issue on the private network.

service iptables stop
service ip6tables stop
chkconfig iptables off
chkconfig ip6tables off

Step 7

Run root.sh on racnode2 as below

[root@racnode2 ~]# pwd
/root
[root@racnode2 ~]# cd /grid/app/11.2.0/grid
[root@racnode2 grid]# pwd
/grid/app/11.2.0/grid
[root@racnode2 grid]# cd crs
[root@racnode2 crs]# cd install
[root@racnode2 install]# ./roothas.pl -deconfig -force -verbose
2015-09-28 18:41:44: Checking for super user privileges
2015-09-28 18:41:44: User has super user privileges
2015-09-28 18:41:44: Parsing the host name
Using configuration parameter file: ./crsconfig_params
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Delete failed, or completed with errors.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'racnode2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'racnode2'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'racnode2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'racnode2'
CRS-2677: Stop of 'ora.cssdmonitor' on 'racnode2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'racnode2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'racnode2'
CRS-2677: Stop of 'ora.mdnsd' on 'racnode2' succeeded
CRS-2677: Stop of 'ora.cssd' on 'racnode2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'racnode2'
CRS-2673: Attempting to stop 'ora.diskmon' on 'racnode2'
CRS-2677: Stop of 'ora.gpnpd' on 'racnode2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'racnode2'
CRS-2677: Stop of 'ora.gipcd' on 'racnode2' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'racnode2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'racnode2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
ADVM/ACFS is not supported on oraclelinux-release-6Server-0.0.5.x86_64

ACFS-9201: Not Supported
Successfully deconfigured Oracle Restart stack
[root@racnode2 install]# cd ..
[root@racnode2 crs]# cd ..
[root@racnode2 grid]# pwd
/grid/app/11.2.0/grid
[root@racnode2 grid]# sh root.sh
Running Oracle 11g root.sh script...

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /grid/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2015-09-28 18:42:17: Parsing the host name
2015-09-28 18:42:17: Checking for super user privileges
2015-09-28 18:42:17: User has super user privileges
Using configuration parameter file: /grid/app/11.2.0/grid/crs/install/crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on oraclelinux-release-6Server-0.0.5.x86_64
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node racnode1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'racnode2'
CRS-2676: Start of 'ora.mdnsd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'racnode2'
CRS-2676: Start of 'ora.gipcd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'racnode2'
CRS-2676: Start of 'ora.gpnpd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'racnode2'
CRS-2676: Start of 'ora.cssdmonitor' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'racnode2'
CRS-2672: Attempting to start 'ora.diskmon' on 'racnode2'
CRS-2676: Start of 'ora.diskmon' on 'racnode2' succeeded
CRS-2676: Start of 'ora.cssd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'racnode2'
CRS-2676: Start of 'ora.ctssd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'racnode2'
CRS-2676: Start of 'ora.asm' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'racnode2'
CRS-2676: Start of 'ora.crsd' on 'racnode2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'racnode2'
CRS-2676: Start of 'ora.evmd' on 'racnode2' succeeded

racnode2     2015/09/28 18:44:22     /grid/app/11.2.0/grid/cdata/racnode2/backup_20150928_184422.olr
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 3999 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /grid/app/oraInventory
[root@racnode2 grid]# ps -ef|grep pmon
oracle   10130     1  0 18:43 ?        00:00:00 asm_pmon_+ASM2
root     10689  8438  0 18:44 pts/0    00:00:00 grep pmon
[root@racnode2 grid]#

Step 8

Validate both nodes and make sure everything looks good.
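
A minimal validation sketch (paths assume the same Grid home as above); run these as root from either node:

/grid/app/11.2.0/grid/bin/olsnodes -n                 # both racnode1 and racnode2 should be listed
/grid/app/11.2.0/grid/bin/ocrcheck                    # OCR integrity check
/grid/app/11.2.0/grid/bin/crsctl query css votedisk   # the voting disk should be ONLINE in +DATA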


Wednesday, September 30, 2015

Tablespace missing in DBA_FREE_SPACE

Today I came across an interesting issue in one of my development databases. A developer was loading data, got the error ORA-01653: unable to extend table XXX.YYYYY by 128 in tablespace ZZZZ, and the data load failed. He immediately called me and reported the error. I went and checked DBA_FREE_SPACE and could not find the tablespace...

hmmm... interesting!

I found that a tablespace will disappear from DBA_FREE_SPACE in the following scenarios.


1. The tablespace has no free space.
2. The tablespace is offline.
3. The tablespace is a temporary tablespace.

Let us test the first case.  


 Create the tablespace.

SQL> create tablespace test
  2  datafile '/u02/oradata/test.dbf' size 10M;

Tablespace created.

Verify that the tablespace is reported in DBA_FREE_SPACE.

SQL> select tablespace_name from dba_free_space
  2  where tablespace_name='TEST';

TABLESPACE_NAME
------------------------------
TEST
Load data into the table until the tablespace runs out of space.

SQL> create table gt_objects as select *
  2  from dba_objects;

Table created.

SQL> alter table gt_objects move tablespace test;

Table altered.

SQL> begin
  2  for i in 1..1000000 loop
  3  insert into gt_objects select * from dba_objects;
  4  commit;
  5  end loop;
  6  end;
  7  /
begin
*
ERROR at line 1:
ORA-01653: unable to extend table SYS.GT_OBJECTS by 128 in tablespace TEST
ORA-06512: at line 3
Check whether the tablespace has disappeared from DBA_FREE_SPACE.

SQL> select tablespace_name from dba_free_space
  2  where tablespace_name='TEST';

no rows selected

Drop the table and see if it appears in DBA_FREE_SPACE.

SQL> drop table gt_objects;

Table dropped.

SQL> select tablespace_name from dba_free_space
  2  where tablespace_name='TEST';

TABLESPACE_NAME
------------------------------
TEST

Drop the tablespace
SQL> drop tablespace test including contents and datafiles;
Tablespace dropped.
SQL>

Conclusion: the minimum free space required for a tablespace to be reported in DBA_FREE_SPACE is one extent of that tablespace. We have to be careful if we rely on DBA_FREE_SPACE in tablespace monitoring queries.
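
If you rely on SQL for space monitoring, one way to avoid missing a completely full tablespace is to drive the query from DBA_DATA_FILES and outer-join to DBA_FREE_SPACE, so a tablespace with no free extents still shows up with 0 MB free. This is only a sketch, not a production-ready monitoring script (it also ignores temporary tablespaces, which use tempfiles):

select d.tablespace_name,
       d.total_mb,
       nvl(f.free_mb, 0) free_mb
from  (select tablespace_name, sum(bytes)/1024/1024 total_mb
       from   dba_data_files
       group  by tablespace_name) d,
      (select tablespace_name, sum(bytes)/1024/1024 free_mb
       from   dba_free_space
       group  by tablespace_name) f
where d.tablespace_name = f.tablespace_name(+)
order by d.tablespace_name;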

Friday, August 21, 2015

Oracle11g Data Guard manual switch over steps

The manual switchover steps below were tested on Oracle 11.2.0.4. These steps have been followed many times on a critical production database in the past, and the switchovers completed successfully!
We have one primary and two standby databases in this Data Guard environment. I tested the switchover between the primary (devdb12) and the first standby (devdb12_dg).
Step 1 Database verification
  1. Verify that the standby database is in sync with the primary database (a sample check is shown after this list).
  2. Tail the alert log on both the primary and the standby to monitor the databases during the entire switchover. Use this command:  tail -f alert_$ORACLE_SID.log
  3. Make sure the Data Guard parameters (log_archive_dest_n, log_archive_dest_state_n, log_archive_config, fal_client, fal_server, etc.) are correctly configured on the standby site. This helps log shipping resume smoothly after the switchover.
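
A simple sync check (a minimal sketch; adjust it for multiple threads/instances as needed) is to compare the highest archived sequence on the primary with the highest applied sequence on the standby:

-- On the primary
select thread#, max(sequence#) last_archived from v$archived_log group by thread#;

-- On the standby
select thread#, max(sequence#) last_applied from v$archived_log where applied = 'YES' group by thread#;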
Let the application team bring the application down. Now the DBA can start switching over the database!
Step 2 Database Active Connection Verification
Make sure there are no application connections to the database.
select username, count(*) from v$session group by username;
To be on the safe side, run the script below to kill any remaining non-local server processes so the switchover completes smoothly.
tokill=`ps -ef | grep LOCAL=NO | grep -v grep | awk '{print $2}'`; echo $tokill;
tokill=`ps -ef | grep LOCAL=NO | grep -v grep | awk '{print $2}'`; kill -9 $tokill;

Step 3 The primary and standby should show the status as below.
set linesize 200
col DB_UNIQUE_NAME format a30
col DATABASE_ROLE format a20
col OPEN_MODE format a30
col SWITCHOVER_STATUS format a30
select db_unique_name, database_role, open_mode, switchover_status from v$database;

The Primary database SWITCHOVER_STATUS can be either TO STANDBY or SESSIONS ACTIVE.  The standby database SWITCHOVER_STATUS should be NOT ALLOWED.

Step 4: Convert Primary to Standby:
On Primary
Execute the below command on primary (devdb12)
alter database commit to switchover to physical standby with session shutdown;

At this stage, the primary database (devdb12) is not yet completely converted to a standby. The primary database is down and is ready to be converted to a standby database. The standby databases (devdb12_dg and devdb12_dr) are ready to be converted to primary. After completing Step 6, the old primary (devdb12) will have been converted to a standby. Step 6 could be executed at this point, but it is good practice to start the standby after starting the new primary, to reduce the application downtime. All three databases should show the status below after running the above command.

The devdb12 SWITCHOVER_STATUS should be RECOVERY NEEDED. The devdb12_dg and devdb12_dr SWITCHOVER_STATUS should be TO PRIMARY.

Step 5:  Convert Standby to Primary
We are converting devdb12_dg to the primary database, so log in to devdb12_dg and run the commands below.

alter database commit to switchover to primary with session shutdown;
alter database open;

After opening the database, the status should be as below; the SWITCHOVER_STATUS is FAILED_DESTINATION.

Now devdb12_dg has become the primary and is ready to take transactions, although it is not yet shipping archive log files. At this moment, the DBA can ask the application team to start the application, which now points to the new primary (devdb12_dg). The old primary (devdb12) and the second standby (devdb12_dr) statuses should be as below: the old primary (devdb12) switchover status is RECOVERY NEEDED, and the second standby (devdb12_dr) switchover status is NOT ALLOWED.

Step 6: Convert Original Primary (devdb12) to Standby
Log in to the old primary (devdb12) and run the commands below.

shutdown immediate;
startup nomount;
alter database mount standby database;
alter database recover managed standby database disconnect from session;
-- Wait for few minutes and make sure archive logs are shipping to standby database
alter database recover managed standby database cancel;
alter database open;
alter database recover managed standby database using current logfile disconnect;

Step 7  Verify the database role on both the primary and the standby, using the same v$database query from Step 3. Now devdb12 is a standby and devdb12_dg is the primary database. devdb12_dr is still a standby, with no change.

Step 8 Verify that archive logs are shipping correctly.
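
A quick way to confirm this on the new standby (a minimal sketch) is to check the Data Guard processes; you would typically expect RFS processes receiving redo and MRP0 applying it:

select process, status, thread#, sequence# from v$managed_standby;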

Standby database UNNAMED file error issue

This Scenario was tested in Active Data Guard 11.2.0.4.

Sometimes a DBA creates or adds a data file on the primary site without verifying the file system size on the standby site. In this case, the standby file creation fails and log shipping breaks on the standby database.

Let us test this scenario. For this test, devdb12 will be the primary DB and devdb12_dg will be the standby DB. The /data01 file system on the standby site is smaller than on the primary site.

Let us add a data file on the primary site.

sys@devdb12> create tablespace test datafile
'/data01/oradata/devdb12/test.f01' size 3000m
/
Tablespace created.

sys@devdb12>

The data file was created successfully on the primary, but creation failed on the standby due to insufficient disk space. The error is reported in the standby alert log, and the MRP background process shuts down because of this issue.

As shown below, the data file exists on the primary site but is not found in the standby database.


sys@devdb12> select status,bytes/1024/1024 from
  2   v$datafile where name= '/data01/oradata/devdb12/test.f01';

STATUS                BYTES/1024/1024
--------------------- ---------------
ONLINE                           3000
sys@devdb12>

sys@devdb12_dg> select status,bytes/1024/1024 from
  2   v$datafile where name= '/data01/oradata/devdb12/test.f01';

no rows selected

sys@devdb12_dg>

However, the standby controlfile has registered the new data file as an UNNAMED placeholder, as shown below.

SQL> select name from v$datafile where name like '%UNNAME%';

NAME
----------------------------------------------------
/ora/app/oracle/product/11.2.0/db_1/dbs/UNNAMED00006

SQL>


Now Data Guard replication is broken, because the data file could not be created on the standby due to insufficient space on the /data01 file system.

How do we fix this issue?

Step 1  Add more space to the /data01 file system on the standby. If you don't have the luxury of adding space, you can map the file to a different file system on the standby database.


Step 2 Execute the commands below on the standby database. If you decide to create the data file on a different file system, modify the script accordingly.

shutdown immediate;
startup nomount;
alter database mount standby database;
alter system set standby_file_management='MANUAL' scope=both;
alter database create datafile  '/ora/app/oracle/product/11.2.0/db_1/dbs/UNNAMED00006' as '/data01/oradata/devdb12/test.f01';
alter system set standby_file_management='AUTO' scope=both;
alter database recover managed standby database disconnect from session;
-- Wait a few minutes for recovery to apply the pending redo, then cancel, open the database and restart real-time apply
alter database recover managed standby database cancel;
alter database open;
alter database recover managed standby database using current logfile disconnect;


Step 3  Verify the data file in the standby database.

devdb12_dg> select status,bytes/1024/1024 from
  2  v$datafile where name= '/data01/oradata/devdb12/test.f01';

STATUS                BYTES/1024/1024
--------------------- ---------------
ONLINE                           3000

devdb12_dg>


Step 4 Monitor the alert logs on the primary and standby and make sure there are no issues.
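
As a final check (a minimal sketch), you can confirm on the standby that no UNNAMED placeholder files remain and that the lag is back to normal:

select file#, name from v$datafile where name like '%UNNAMED%';   -- should return no rows
select name, value, unit from v$dataguard_stats where name in ('transport lag', 'apply lag');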