
There is a very handy new command available in Oracle Data Guard Broker 18c. Too bad I discovered it AFTER I solved my problem 😉
This quick blog post demonstrates what can be easily spotted with this new VALIDATE NETWORK CONFIGURATION
command.
In this example, I have 2 sites, each with a 2-node Oracle 18.6 Grid Infrastructure cluster. On both clusters, there is an Oracle 18.6 RAC database.
The primary database on site 1 is called tool
, and the standby database on site 2 is called opeth
. Both RAC databases have 2 instances.
I had trouble setting a correct Oracle Data Guard configuration between database tool
and database opeth
.
After all the steps to build and synchronize the standby database were completed, I would get the following output when issuing show configuration verbose
with DGMGRL
from the first primary instance :
DGMGRL> show configuration verbose
Configuration - dgconf_tool_opeth
Protection Mode: MaxPerformance
Members:
TOOL - Primary database
Error: ORA-16778: redo transport error for one or more members
OPETH - Physical standby database
Warning: ORA-16854: apply lag could not be determined
[...]
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
And from the second primary instance, I would get :
DGMGRL> show configuration verbose
Configuration - dgconf_TOOL_OPETH
Protection Mode: MaxPerformance
Members:
TOOL - Primary database
Error: ORA-16778: redo transport error for one or more members
OPETH - Physical standby database
Error: ORA-16883: unable to translate DGConnectIdentifier property
[...]
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
There is obviously a problem with DGConnectIdentifier
on database opeth
. When issuing show database verbose OPETH
, I got the following output :
DGMGRL> show database verbose OPETH
Database - OPETH
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: (unknown)
Apply Lag: (unknown)
Average Apply Rate: (unknown)
Active Apply Rate: (unknown)
Maximum Apply Rate: (unknown)
Real Time Query: OFF
Instance(s):
OPETH1
OPETH2 (apply instance)
Properties:
DGConnectIdentifier = 'OPETH'
[...]
When trying to reach DGConnectIdentifier OPETH
from node 1 of primary database, everything went well. But from node 2 of primary database, I got :
DGMGRL> host tnsping opeth
Executing operating system command(s):" tnsping opeth"
TNS Ping Utility for Linux: Version 18.0.0.0.0 - Production on 17-JUN-2019 14:00:53
Copyright (c) 1997, 2018, Oracle. All rights reserved.
Used parameter files:
<my_tns_admin_path>/sqlnet.ora
TNS-03505: Failed to resolve name
Of course, standby database opeth
was unreachable from the 2nd node of the primary database. Indeed, all members of the Data Guard configuration should be able to connect to all the other members. In other words, all instances should be able to contact each other. This connectivity is achieved using the DGConnectIdentifier
property for each member. And in particular, the documentation specifies : “The connect identifier for a configuration member must […] allow all instances of an Oracle RAC database to be reached.”
That’s where the VALIDATE NETWORK CONFIGURATION
command comes in handy : using VALIDATE NETWORK CONFIGURATION FOR ALL
, at first glance, I can directly spot which instance is not able to reach another member, as this command tries all the possible combinations :
DGMGRL> VALIDATE NETWORK CONFIGURATION FOR ALL ;
Connecting to instance "TOOL1" on database "TOOL" ...
Connected to "TOOL"
Checking connectivity from instance "TOOL1" on database "TOOL to instance "OPETH2" on database "OPETH"...
Succeeded.
Checking connectivity from instance "TOOL1" on database "TOOL to instance "OPETH1" on database "OPETH"...
Succeeded.
Connecting to instance "TOOL2" on database "TOOL" ...
Connected to "TOOL"
Checking connectivity from instance "TOOL2" on database "TOOL to instance "OPETH2" on database "OPETH"...
ORA-16883: unable to translate DGConnectIdentifier property
ORA-06512: at "SYS.DBMS_DRS", line 1845
ORA-06512: at line 1
Failed.
Checking connectivity from instance "TOOL2" on database "TOOL to instance "OPETH1" on database "OPETH"...
ORA-16883: unable to translate DGConnectIdentifier property
ORA-06512: at "SYS.DBMS_DRS", line 1845
ORA-06512: at line 1
Failed.
Connecting to instance "OPETH2" on database "OPETH" ...
Connected to "OPETH"
Checking connectivity from instance "OPETH2" on database "OPETH to instance "TOOL1" on database "TOOL"...
Succeeded.
Checking connectivity from instance "OPETH2" on database "OPETH to instance "TOOL2" on database "TOOL"...
Succeeded.
Connecting to instance "OPETH1" on database "OPETH" ...
Connected to "OPETH"
Checking connectivity from instance "OPETH1" on database "OPETH to instance "TOOL1" on database "TOOL"...
Succeeded.
Checking connectivity from instance "OPETH1" on database "OPETH to instance "TOOL2" on database "TOOL"...
Succeeded.
Oracle Clusterware on database "TOOL" is available for database restart.
Oracle Clusterware on database "OPETH" is available for database restart.
I can now easily fix the problem : instance TOOL2
(primary database on node 2) is unable to reach any instance of standby databse OPETH
. Of course I forgot an entry in the corresponding tnsnames.ora
on node 2 of primary database TOOL
. (I usually avoid using tnsnames.ora
but this test runs on a lab environment.)
As soon as I added the right entry in the corresponding tnsnames.ora
file, the problem was fixed, the Data Guard configuration was up and running, and the output of VALIDATE NETWORK CONFIGURATION FOR ALL
showed no error anymore :
DGMGRL> VALIDATE NETWORK CONFIGURATION FOR ALL ;
Connecting to instance "TOOL1" on database "TOOL" ...
Connected to "TOOL"
Checking connectivity from instance "TOOL1" on database "TOOL to instance "OPETH2" on database "OPETH"...
Succeeded.
Checking connectivity from instance "TOOL1" on database "TOOL to instance "OPETH1" on database "OPETH"...
Succeeded.
Connecting to instance "TOOL2" on database "TOOL" ...
Connected to "TOOL"
Checking connectivity from instance "TOOL2" on database "TOOL to instance "OPETH2" on database "OPETH"...
Succeeded.
Checking connectivity from instance "TOOL2" on database "TOOL to instance "OPETH1" on database "OPETH"...
Succeeded.
Connecting to instance "OPETH2" on database "OPETH" ...
Connected to "OPETH"
Checking connectivity from instance "OPETH2" on database "OPETH to instance "TOOL1" on database "TOOL"...
Succeeded.
Checking connectivity from instance "OPETH2" on database "OPETH to instance "TOOL2" on database "TOOL"...
Succeeded.
Connecting to instance "OPETH1" on database "OPETH" ...
Connected to "OPETH"
Checking connectivity from instance "OPETH1" on database "OPETH to instance "TOOL1" on database "TOOL"...
Succeeded.
Checking connectivity from instance "OPETH1" on database "OPETH to instance "TOOL2" on database "TOOL"...
Succeeded.
Oracle Clusterware on database "TOOL" is available for database restart.
Oracle Clusterware on database "OPETH" is available for database restart.
The lesson to be learned for me is to read more thoroughly the documentation, especially the list of changes in a release 😉
Hi,
thanks for this great post, could You tell if DGConnectIdentifier is pointing to vips or scan addreses ?
Regards.
Grzegorz
LikeLike
Hi Grzegorz.
Thank you for your feedback!
In this case, it points to SCAN.
LikeLike