Oracle Data Guard Broker 18 : new VALIDATE NETWORK CONFIGURATION command

(Photo by Chrissa Giannakoudi, on Unsplash)

There is a very handy new command available in Oracle Data Guard Broker 18c. Too bad I discovered it AFTER I solved my problem 😉
This quick blog post demonstrates what can be easily spotted with this new VALIDATE NETWORK CONFIGURATION command.

In this example, I have 2 sites, each with a 2-node Oracle 18.6 Grid Infrastructure cluster. On both clusters, there is an Oracle 18.6 RAC database.
The primary database on site 1 is called tool, and the standby database on site 2 is called opeth. Both RAC databases have 2 instances.

I had trouble setting a correct Oracle Data Guard configuration between database tool and database opeth.

After all the steps to build and synchronize the standby database were completed, I would get the following output when issuing show configuration verbose with DGMGRL from the first primary instance :

DGMGRL> show configuration verbose

Configuration - dgconf_tool_opeth

  Protection Mode: MaxPerformance
  Members:
  TOOL - Primary database
    Error: ORA-16778: redo transport error for one or more members

    OPETH - Physical standby database
      Warning: ORA-16854: apply lag could not be determined

[...]

Fast-Start Failover: DISABLED

Configuration Status:
ERROR

And from the second primary instance, I would get :

DGMGRL> show configuration verbose

Configuration - dgconf_TOOL_OPETH

  Protection Mode: MaxPerformance
  Members:
  TOOL - Primary database
    Error: ORA-16778: redo transport error for one or more members

    OPETH - Physical standby database
      Error: ORA-16883: unable to translate DGConnectIdentifier property

[...]

Fast-Start Failover: DISABLED

Configuration Status:
ERROR

There is obviously a problem with DGConnectIdentifier on database opeth. When issuing show database verbose OPETH, I got the following output :

DGMGRL> show database verbose OPETH

Database - OPETH

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      (unknown)
  Apply Lag:          (unknown)
  Average Apply Rate: (unknown)
  Active Apply Rate:  (unknown)
  Maximum Apply Rate: (unknown)
  Real Time Query:    OFF
  Instance(s):
    OPETH1
    OPETH2 (apply instance)

  Properties:
    DGConnectIdentifier             = 'OPETH'

[...]

When trying to reach DGConnectIdentifier OPETH from node 1 of primary database, everything went well. But from node 2 of primary database, I got :

DGMGRL> host tnsping opeth
Executing operating system command(s):" tnsping opeth"

TNS Ping Utility for Linux: Version 18.0.0.0.0 - Production on 17-JUN-2019 14:00:53

Copyright (c) 1997, 2018, Oracle.  All rights reserved.

Used parameter files:
<my_tns_admin_path>/sqlnet.ora

TNS-03505: Failed to resolve name

Of course, standby database opeth was unreachable from the 2nd node of the primary database. Indeed, all members of the Data Guard configuration should be able to connect to all the other members. In other words, all instances should be able to contact each other. This connectivity is achieved using the DGConnectIdentifier property for each member. And in particular, the documentation specifies : “The connect identifier for a configuration member must […] allow all instances of an Oracle RAC database to be reached.”

That’s where the VALIDATE NETWORK CONFIGURATION command comes in handy : using VALIDATE NETWORK CONFIGURATION FOR ALL, at first glance, I can directly spot which instance is not able to reach another member, as this command tries all the possible combinations :

DGMGRL> VALIDATE NETWORK CONFIGURATION FOR ALL ;
Connecting to instance "TOOL1" on database "TOOL" ...
Connected to "TOOL"
Checking connectivity from instance "TOOL1" on database "TOOL to instance "OPETH2" on database "OPETH"...
Succeeded.
Checking connectivity from instance "TOOL1" on database "TOOL to instance "OPETH1" on database "OPETH"...
Succeeded.
Connecting to instance "TOOL2" on database "TOOL" ...
Connected to "TOOL"
Checking connectivity from instance "TOOL2" on database "TOOL to instance "OPETH2" on database "OPETH"...
ORA-16883: unable to translate DGConnectIdentifier property
ORA-06512: at "SYS.DBMS_DRS", line 1845
ORA-06512: at line 1

Failed.
Checking connectivity from instance "TOOL2" on database "TOOL to instance "OPETH1" on database "OPETH"...
ORA-16883: unable to translate DGConnectIdentifier property
ORA-06512: at "SYS.DBMS_DRS", line 1845
ORA-06512: at line 1

Failed.
Connecting to instance "OPETH2" on database "OPETH" ...
Connected to "OPETH"
Checking connectivity from instance "OPETH2" on database "OPETH to instance "TOOL1" on database "TOOL"...
Succeeded.
Checking connectivity from instance "OPETH2" on database "OPETH to instance "TOOL2" on database "TOOL"...
Succeeded.
Connecting to instance "OPETH1" on database "OPETH" ...
Connected to "OPETH"
Checking connectivity from instance "OPETH1" on database "OPETH to instance "TOOL1" on database "TOOL"...
Succeeded.
Checking connectivity from instance "OPETH1" on database "OPETH to instance "TOOL2" on database "TOOL"...
Succeeded.

Oracle Clusterware on database "TOOL" is available for database restart.

Oracle Clusterware on database "OPETH" is available for database restart.

I can now easily fix the problem : instance TOOL2 (primary database on node 2) is unable to reach any instance of standby databse OPETH. Of course I forgot an entry in the corresponding tnsnames.ora on node 2 of primary database TOOL. (I usually avoid using tnsnames.ora but this test runs on a lab environment.)

As soon as I added the right entry in the corresponding tnsnames.ora file, the problem was fixed, the Data Guard configuration was up and running, and the output of VALIDATE NETWORK CONFIGURATION FOR ALL showed no error anymore :

DGMGRL> VALIDATE NETWORK CONFIGURATION FOR ALL ;
Connecting to instance "TOOL1" on database "TOOL" ...
Connected to "TOOL"
Checking connectivity from instance "TOOL1" on database "TOOL to instance "OPETH2" on database "OPETH"...
Succeeded.
Checking connectivity from instance "TOOL1" on database "TOOL to instance "OPETH1" on database "OPETH"...
Succeeded.
Connecting to instance "TOOL2" on database "TOOL" ...
Connected to "TOOL"
Checking connectivity from instance "TOOL2" on database "TOOL to instance "OPETH2" on database "OPETH"...
Succeeded.
Checking connectivity from instance "TOOL2" on database "TOOL to instance "OPETH1" on database "OPETH"...
Succeeded.
Connecting to instance "OPETH2" on database "OPETH" ...
Connected to "OPETH"
Checking connectivity from instance "OPETH2" on database "OPETH to instance "TOOL1" on database "TOOL"...
Succeeded.
Checking connectivity from instance "OPETH2" on database "OPETH to instance "TOOL2" on database "TOOL"...
Succeeded.
Connecting to instance "OPETH1" on database "OPETH" ...
Connected to "OPETH"
Checking connectivity from instance "OPETH1" on database "OPETH to instance "TOOL1" on database "TOOL"...
Succeeded.
Checking connectivity from instance "OPETH1" on database "OPETH to instance "TOOL2" on database "TOOL"...
Succeeded.

Oracle Clusterware on database "TOOL" is available for database restart.

Oracle Clusterware on database "OPETH" is available for database restart.

The lesson to be learned for me is to read more thoroughly the documentation, especially the list of changes in a release 😉

2 thoughts on “Oracle Data Guard Broker 18 : new VALIDATE NETWORK CONFIGURATION command

  1. goryszewskig

    Hi,
    thanks for this great post, could You tell if DGConnectIdentifier is pointing to vips or scan addreses ?
    Regards.
    Grzegorz

    Like

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s