You may have a small RAC database that is managed with the dbconsole (Database Control) rather than Grid Control. The documentation states that the console must be started manually on the applicable cluster node(s) whenever it is required. Couldn't the CRS cluster start and stop the dbconsole automatically?
The following setup is assumed.
- A three-node RAC cluster.
- Grid infrastructure installed as user oragrid under /opt/oracle/app/11.2.0/grid.
- Oracle Database installed as user oracle under /opt/oracle/app/oracle/product/11.2.0/dbhome_1.
- The database Oracle Home is mounted at /opt/oracle/app/oracle using an ASM clustered filesystem.
- The unique name of the database to be monitored is racd3.
- The dbconsole (or its agent) will be available to start on all nodes.
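Before registering anything with CRS, it is worth confirming that these assumptions hold on each node. A minimal check, assuming the paths and names above (the ACFS resource name ora.c03vol1.vorabase.acfs is specific to this cluster and will differ on yours):

# Run as oracle, with the database home environment set
srvctl config database -d racd3        # database is registered with the cluster
df -h /opt/oracle/app/oracle           # Oracle Home filesystem is mounted
/opt/oracle/app/11.2.0/grid/bin/crsctl status resource ora.c03vol1.vorabase.acfs   # ACFS resource exists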
As oracle (the user that runs the dbconsole), create a new script /opt/oracle/app/11.2.0/grid/crs/public/local.agent.racd3_actionScript.sh.
#!/bin/sh
set -e
PATH=/usr/local/bin:/usr/bin:/bin

# Set the environment for the racd3 database; oraenv reads ORACLE_SID
ORACLE_UNQNAME=racd3; export ORACLE_UNQNAME
ORACLE_SID=$ORACLE_UNQNAME
ORAENV_ASK=NO
. oraenv
unset ORACLE_SID

# Oracle Clusterware calls this script with start, stop or check
case "$1" in
    start)
        $ORACLE_HOME/bin/emctl start dbconsole
        ;;
    stop)
        $ORACLE_HOME/bin/emctl stop dbconsole
        ;;
    check)
        $ORACLE_HOME/bin/emctl status dbconsole
        ;;
    *)
        exit 1
        ;;
esac
exit 0
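Before handing the script to the cluster, test it by hand as the oracle user; CRS will invoke it with start, stop and check in exactly the same way, and it must be executable. A quick sanity check (the echoed exit status is what the clusterware agent acts on):

chmod +x /opt/oracle/app/11.2.0/grid/crs/public/local.agent.racd3_actionScript.sh
/opt/oracle/app/11.2.0/grid/crs/public/local.agent.racd3_actionScript.sh check
echo $?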
Now use this 'action script' as a dbconsole wrapper for a new cluster-managed resource named local.agent.racd3.
crsctl add resource local.agent.racd3 -type local_resource -attr \
    "ACTION_SCRIPT=/opt/oracle/app/11.2.0/grid/crs/public/local.agent.racd3_actionScript.sh, \
    DESCRIPTION=Oracle agent for racd3 database, DEGREE=1, ENABLED=1, \
    AUTO_START=restore, START_TIMEOUT=0, UPTIME_THRESHOLD=1h, CHECK_INTERVAL=60, \
    STOP_TIMEOUT=0, SCRIPT_TIMEOUT=300, RESTART_ATTEMPTS=3, OFFLINE_CHECK_INTERVAL=60, \
    START_DEPENDENCIES=hard(ora.c03vol1.vorabase.acfs), \
    STOP_DEPENDENCIES=hard(ora.c03vol1.vorabase.acfs)"
The above will copy the action script to all cluster nodes. Being a local resource, it will start on each node. It has hard start and stop dependencies on the Oracle Home's filesystem (the executables in this filesystem are not available immediately after a reboot; they appear later, once the cluster has initialised and the cluster filesystem has been mounted). With AUTO_START=restore, the dbconsole will be restarted after a reboot if it was running when the cluster was shut down.
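To confirm what was registered, dump the resource's full attribute list. If an attribute needs tuning later, crsctl modify resource takes the same -attr syntax (CHECK_INTERVAL=120 below is just an illustrative value), and crsctl delete resource removes the resource entirely if you want to start over:

/opt/oracle/app/11.2.0/grid/bin/crsctl status resource local.agent.racd3 -f
/opt/oracle/app/11.2.0/grid/bin/crsctl modify resource local.agent.racd3 -attr "CHECK_INTERVAL=120"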
Check the status of this new resource.
[root@h01-c03-test ~]# /opt/oracle/app/11.2.0/grid/bin/crsctl status resource local.agent.racd3
NAME=local.agent.racd3
TYPE=local_resource
TARGET=OFFLINE, OFFLINE, OFFLINE
STATE=OFFLINE, OFFLINE, OFFLINE
Start the resource and recheck the status.
[root@h01-c03-test ~]# /opt/oracle/app/11.2.0/grid/bin/crsctl start resource local.agent.racd3
CRS-2672: Attempting to start 'local.agent.racd3' on 'h02-c03-test'
CRS-2672: Attempting to start 'local.agent.racd3' on 'h01-c03-test'
CRS-2672: Attempting to start 'local.agent.racd3' on 'h03-c03-test'
CRS-2676: Start of 'local.agent.racd3' on 'h02-c03-test' succeeded
CRS-2676: Start of 'local.agent.racd3' on 'h03-c03-test' succeeded
CRS-2676: Start of 'local.agent.racd3' on 'h01-c03-test' succeeded
[root@h01-c03-test ~]# /opt/oracle/app/11.2.0/grid/bin/crsctl status resource local.agent.racd3
NAME=local.agent.racd3
TYPE=local_resource
TARGET=ONLINE , ONLINE , ONLINE
STATE=ONLINE on h01-c03-test, ONLINE on h02-c03-test, ONLINE on h03-c03-test
[root@h01-c03-test ~]# netstat -ant|grep 1158
tcp        0      0 :::1158                 :::*                    LISTEN
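The dbconsole is now listening on its usual port (1158 here) on every node. Stopping is symmetrical, and crsctl stop resource also accepts -n to stop the dbconsole on a single node only (the node name below is from this cluster):

/opt/oracle/app/11.2.0/grid/bin/crsctl stop resource local.agent.racd3
/opt/oracle/app/11.2.0/grid/bin/crsctl stop resource local.agent.racd3 -n h02-c03-test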
Comment from Thana: CRS-4000: Command Start failed, or completed with errors.
[grid@tes1 ~]$ crsctl start res em_dbconsole
CRS-2672: Attempting to start 'em_dbconsole' on 'tes1'
CRS-2672: Attempting to start 'em_dbconsole' on 'tes2'
CRS-2672: Attempting to start 'em_dbconsole' on 'tes3'
CRS-2672: Attempting to start 'em_dbconsole' on 'tes4'
CRS-2674: Start of 'em_dbconsole' on 'tes2' failed
CRS-2674: Start of 'em_dbconsole' on 'tes1' failed
CRS-2674: Start of 'em_dbconsole' on 'tes4' failed
CRS-2674: Start of 'em_dbconsole' on 'tes3' failed
CRS-4000: Command Start failed, or completed with errors.
Any idea how to resolve this?
Reply: Could you try to run your action script manually and scan for errors? Also, please review the cluster's alert log (alert<hostname>.log). Let me know what you find.
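For 11.2, the cluster alert log the reply refers to lives under the grid home. A sketch of where to look, assuming the grid home used in this article (substitute your own path and hostname):

tail -100 /opt/oracle/app/11.2.0/grid/log/$(hostname -s)/alert$(hostname -s).log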