10 Comparing and Converging Data

This chapter describes how to compare data in shared database objects at two different databases. It also describes how to converge divergent data in shared database objects.

This chapter contains the following sections:

About Comparing and Converging Data in Different Databases

You can share database objects at two or more databases. When copies of the same database object exist at multiple databases, the database object is a shared database object. Shared database objects might be maintained by data replication. For example, materialized views or Oracle Streams components might replicate the database objects and maintain them at multiple databases. A custom application might also maintain shared database objects. Typically, replication environments share database objects that contain data, such as tables, and other types of databases objects, such as indexes.

When a change is made to a shared database object at one database, the change is transferred to and made at each of the other databases that share the database object. In this way, the replication environment keeps the shared database object synchronized at each database.

Sometimes, shared database objects that contain data can become inconsistent at different databases. That is, the data might diverge in the different instances of the shared database object. For example, if the database object is a table, then one instance of the table might have more rows than another instance of the table, or two instances of the table might have different data in the same rows.

When shared database objects diverge in an Oracle Streams replication environment, it is usually for one of the following reasons:

  • Data changes are not being captured at one or more of the databases.

  • Data changes are being captured, but they are not being transferred from one database to another.

  • Data changes are being captured and transferred from one database to another, but they are not being made to shared database objects at the other databases.

Common causes of data divergence are network problems, incorrect configurations, or user errors. When shared database objects diverge in a replication environment that uses materialized views, it might be because there is a problem with the materialized view refresh.

The DBMS_COMPARISON package enables you to compare database objects at different databases and identify differences. This package also enables you to converge the database objects so that they are consistent at different databases. The DBMS_COMPARISON package is an Oracle-supplied PL/SQL package that is always installed with Oracle Database.

The DBMS_COMPARISON package can compare and converge the following types of database objects:

  • Tables

  • Single-table views

  • Materialized views

  • Synonyms for tables, single-table views, and materialized views

Database objects of different types can be compared and converged at different databases. For example, a table at one database and a materialized view at another database can be compared and converged.

In the examples in this guide, the shared database object has the same name at the two databases, and the entire database object is compared and converged. However, the DBMS_COMPARISON package provides flexibility for differences in the shared database object at different databases. The database objects being compared do not need to have the same name. In addition, column names can also be different in the database objects, if the corresponding columns are the same data type. You can compare and converge the entire shared database object or subsets of columns and rows.

To create a comparison, use the CREATE_COMPARISON procedure in the DBMS_COMPARISON package. This procedure identifies one or more index columns in the shared database object. The DBMS_COMPARISON package must be able to identify at least one column that it can use as an index column. If the specified database object does not have a column that can be used as an index column, then the CREATE_COMPARISON procedure cannot create a comparison for the database object.

Note:

If your environment has shared database objects that have diverged, then you should investigate the cause of the problem and correct it. Although the DBMS_COMPARISON package can compare and converge shared database objects, it is better if these database objects do not diverge.

See Also:

Tutorial: Preparing to Compare and Converge Data

Suppose you share the hr.departments table in two databases. You want to compare this table at these databases to see if their data is consistent. If the tables have diverged at the two databases, then you want to converge them to make them consistent.

Meet the following prerequisites to complete this tutorial:

  • Configure network connectivity so that the two databases can communicate with each other. See Oracle Database 2 Day DBA for information about configuring network connectivity between databases.

  • Ensure that the hr sample schema is installed on both databases. The hr sample schema is installed by default with Oracle Database.

In this example, the global names of the databases are ii1.example.com and ii2.example.com, but you can substitute any two databases in your environment that meet the prerequisites.

To prepare for comparison and convergence of the hr.departments table at the ii1.example.com and ii2.example.com databases: 

  1. For the purposes of this example, make the hr.departments table diverge at the two databases:

    1. On a command line, open SQL*Plus and connect to the ii2.example.com database as hr user.

      See Oracle Database 2 Day DBA for more information about starting SQL*Plus.

    2. Delete the department in the hr.departments table with the department_id equal to 270:

      DELETE FROM hr.departments WHERE department_id=270;
      COMMIT;
      
    3. Modify the data in a row in the hr.departments table:

      UPDATE hr.departments SET manager_id=114 WHERE department_id=10;
      COMMIT;
      
    4. Insert a row into the hr.departments table:

      INSERT INTO hr.departments VALUES(280, 'Bean Counters', 108, 2700);
      COMMIT;
      
    5. Exit SQL*Plus:

      EXIT;
      

    Note:

    Usually, Step 1 is not required. It is included in this example to ensure that the hr.departments table diverges at the two databases.
  2. Create a database link from the ii1.example.com database to the ii2.example.com database.

    The database link should connect from an administrative user in ii1.example.com to an administrative user schema in ii2.example.com. The administrative user at both databases should have the necessary privileges to access and modify the hr.departments table and the necessary privileges to run subprograms in the DBMS_COMPARISON package. If you are not sure which user has these privileges, then use SYSTEM user. Also, both the name and the service name of the database link must be ii2.example.com. See "Tutorial: Creating a Database Link" for instructions.

Tutorial: Comparing Data in Two Different Databases

This example continues the scenario described in "Tutorial: Preparing to Compare and Converge Data". Complete the steps in that topic before continuing.

You can use the CREATE_COMPARISON procedure in the DBMS_COMPARISON package to define a comparison of a shared database object at two different databases. Once the comparison is defined, you can use the COMPARE function in this package to compare the database object specified in the comparison at the current point in time. You can run the COMPARE function multiple times for a specific comparison. Each time you run the function, it results one or more scans of the database objects, and each scan has its own scan ID.

To compare the entire hr.departments table at the ii1.example.com and ii2.example.com databases: 

  1. On a command line, open SQL*Plus and connect to the ii1.example.com database as the administrative user who owns the database link created in "Tutorial: Preparing to Compare and Converge Data". For example, if SYSTEM user owns the database link, then connect as SYSTEM user:

    sqlplus system@ii1.example.com
    Enter password: password
    

    See Oracle Database 2 Day DBA for more information about starting SQL*Plus.

  2. Run the CREATE_COMPARISON procedure to create the comparison for the hr.departments table:

    BEGIN
      DBMS_COMPARISON.CREATE_COMPARISON(
        comparison_name => 'compare_departments',
        schema_name     => 'hr',
        object_name     => 'departments',
        dblink_name     => 'ii2.example.com');
    END;
    /
    

    Note that the name of the new comparison is compare_departments. This comparison is owned by the user who runs the CREATE_COMPARISON procedure.

  3. Run the COMPARE function to compare the hr.departments table at the two databases:

    SET SERVEROUTPUT ON
    DECLARE
      consistent   BOOLEAN;
      scan_info    DBMS_COMPARISON.COMPARISON_TYPE;
    BEGIN
      consistent := DBMS_COMPARISON.COMPARE(
                      comparison_name => 'compare_departments',
                      scan_info       => scan_info,
                      perform_row_dif => TRUE);
      DBMS_OUTPUT.PUT_LINE('Scan ID: '||scan_info.scan_id);
      IF consistent=TRUE THEN
        DBMS_OUTPUT.PUT_LINE('No differences were found.');
      ELSE
        DBMS_OUTPUT.PUT_LINE('Differences were found.');
      END IF;
    END;
    /
    
    Scan ID: 1
    Differences were found.
     
    PL/SQL procedure successfully completed.
    

    Specify the name of the comparison created in Step 2 for the comparison_name parameter.

    The function prints the scan ID for the comparison. The scan ID is important when you are querying data dictionary views for information about the comparison and when you are converging the database objects.

    The function also prints whether differences were found in the table at the two databases:

    • If the function prints 'No differences were found', then the table is consistent at the two databases.

    • If the function prints 'Differences were found', then the table has diverged at the two databases.

  4. Make a note of the scan ID returned by the function in the previous step. In this example, assume the scan ID is 1.

  5. If differences were found in Step 3, then run the following query to show the number of differences found:

    COLUMN OWNER HEADING 'Comparison Owner' FORMAT A16
    COLUMN COMPARISON_NAME HEADING 'Comparison Name' FORMAT A20
    COLUMN SCHEMA_NAME HEADING 'Schema Name' FORMAT A11
    COLUMN OBJECT_NAME HEADING 'Object Name' FORMAT A11
    COLUMN CURRENT_DIF_COUNT HEADING 'Differences' FORMAT 9999999
     
    SELECT c.OWNER, 
           c.COMPARISON_NAME, 
           c.SCHEMA_NAME, 
           c.OBJECT_NAME, 
           s.CURRENT_DIF_COUNT 
      FROM DBA_COMPARISON c, DBA_COMPARISON_SCAN s
      WHERE c.COMPARISON_NAME = s.COMPARISON_NAME AND
            c.OWNER           = s.OWNER AND
            s.SCAN_ID         = 1;
    

    Specify the scan ID you recorded in Step 4 in the WHERE clause of the query.

    The output will be similar to the following:

    Comparison Owner Comparison Name      Schema Name Object Name Differences
    ---------------- -------------------- ----------- ----------- -----------
    SYSTEM           COMPARE_DEPARTMENTS  HR          DEPARTMENTS           3
    
  6. To see which rows were different in the database object being compared, run the following query:

    COLUMN COLUMN_NAME HEADING 'Index Column' FORMAT A15
    COLUMN INDEX_VALUE HEADING 'Index Value' FORMAT A15
    COLUMN LOCAL_ROWID HEADING 'Local Row Exists?' FORMAT A20
    COLUMN REMOTE_ROWID HEADING 'Remote Row Exists?' FORMAT A20
     
    SELECT c.COLUMN_NAME,
           r.INDEX_VALUE, 
           DECODE(r.LOCAL_ROWID,
                    NULL, 'No',
                          'Yes') LOCAL_ROWID,
           DECODE(r.REMOTE_ROWID,
                    NULL, 'No',
                          'Yes') REMOTE_ROWID
      FROM DBA_COMPARISON_COLUMNS c,
           DBA_COMPARISON_ROW_DIF r,
           DBA_COMPARISON_SCAN s
      WHERE c.COMPARISON_NAME = 'COMPARE_DEPARTMENTS' AND
            r.SCAN_ID         = s.SCAN_ID AND
            s.PARENT_SCAN_ID  = 1 AND
            r.STATUS          = 'DIF' AND
            c.INDEX_COLUMN    = 'Y' AND
            c.COMPARISON_NAME = r.COMPARISON_NAME AND
            c.OWNER           = r.OWNER
      ORDER BY r.INDEX_VALUE;
    

    In the WHERE clause, specify the name of the comparison and the scan ID for the comparison. In this example, the name of the comparison is compare_departments and the scan ID is 1.

    The output will be similar to the following:

    Index Column    Index Value     Local Row Exists?    Remote Row Exists?
    --------------- --------------- -------------------- --------------------
    DEPARTMENT_ID   10              Yes                  Yes
    DEPARTMENT_ID   270             Yes                  No
    DEPARTMENT_ID   280             No                   Yes
    

    This output shows the index column for the table being compared and the index value for each row that is different in the shared database object. In this example, the index column is the primary key column for the hr.departments table (department_id). The output also shows the type of difference for each row:

    • If Local Row Exists? and Remote Row Exists? are both Yes for a row, then the row exists in both instances of the database object, but the data in the row is different.

    • If Local Row Exists? is Yes and Remote Row Exists? is No for a row, then the row exists in the local database object but not in the remote database object.

    • If Local Row Exists? is No and Remote Row Exists? is Yes for a row, then the row exists in the remote database object but not in the local database object.

Tutorial: Converging Divergent Data

This example continues the scenario described in "Tutorial: Comparing Data in Two Different Databases". Complete the steps in that topic before continuing.

When a shared database object has diverged at two different databases, you can use the CONVERGE procedure in the DBMS_COMPARISON package to converge the two instances of the database object. After the CONVERGE procedure runs successfully, the shared database object is consistent at the two databases. To run the CONVERGE procedure, you must specify the following information:

  • The name of an existing comparison created using the CREATE_COMPARISON procedure in the DBMS_COMPARISON package

  • The scan ID of the comparison that you want to converge

The scan ID contains information about the differences that will be converged. In this example, the name of the comparison is compare_departments and the scan ID is 1.

Also, when you run the CONVERGE procedure, you must specify which database "wins" when the shared database object is converged. If you specify that the local database wins, then the data in the database object at the local database replaces the data in the database object at the remote database when the data is different. If you specify that the remote database wins, then the data in the database object at the remote database replaces the data in the database object at the local database when the data is different. In this example, the local database ii1.example.com wins.

To converge divergent data in the hr.departments table at the ii1.example.com and ii2.example.com databases: 

  1. On a command line, open SQL*Plus and connect to the ii1.example.com database as the administrative user who owns the database link created in "Tutorial: Preparing to Compare and Converge Data". For example, if the SYSTEM user owns the database link, then connect as the SYSTEM user:

    sqlplus system@ii1.example.com
    Enter password: password
    

    See Oracle Database 2 Day DBA for more information about starting SQL*Plus.

  2. Run the CONVERGE procedure to converge the hr.departments table at the two databases:

    SET SERVEROUTPUT ON
    DECLARE
      scan_info    DBMS_COMPARISON.COMPARISON_TYPE;
    BEGIN
      DBMS_COMPARISON.CONVERGE(
        comparison_name  => 'compare_departments',
        scan_id          => 1,
        scan_info        => scan_info,
        converge_options => DBMS_COMPARISON.CMP_CONVERGE_LOCAL_WINS);
      DBMS_OUTPUT.PUT_LINE('Local Rows Merged: '||scan_info.loc_rows_merged);
      DBMS_OUTPUT.PUT_LINE('Remote Rows Merged: '||scan_info.rmt_rows_merged);
      DBMS_OUTPUT.PUT_LINE('Local Rows Deleted: '||scan_info.loc_rows_deleted);
      DBMS_OUTPUT.PUT_LINE('Remote Rows Deleted: '||scan_info.rmt_rows_deleted);
    END;
    /
    
    Local Rows Merged: 0
    Remote Rows Merged: 2
    Local Rows Deleted: 0
    Remote Rows Deleted: 1
    
    PL/SQL procedure successfully completed.
    

The CONVERGE procedure synchronizes the portion of the database object compared by the specified scan and returns information about the changes it made. Some scans might compare a subset of the database object. In this example, the specified scan compared the entire table. So, the entire table is synchronized, assuming no new differences appeared after the comparison scan completed.

The local table wins in this example because the converge_options parameter is set to DBMS_COMPARISON.CMP_CONVERGE_LOCAL_WINS in the procedure. That is, for the rows that are different in the two databases, the rows at the local database replace the corresponding rows at the remote database. If some rows exist at the remote database but not at the local database, then the extra rows at the remote database are deleted. If instead you want the remote database to win, then set the converge_options parameter to DBMS_COMPARISON.CMP_CONVERGE_REMOTE_WINS in the procedure.

In addition, if you run the CONVERGE procedure on a shared database object that is part of an Oracle Streams replication environment, then you might not want the changes made by the procedure to be replicated to other databases. In this case, you can set the following parameters in the CONVERGE procedure to values that will prevent the changes from being replicated:

  • local_converge_tag

  • remote_converge_tag

When one of these parameters is set to a non-NULL value, a tag is set in the session that makes the changes during convergence. The local_converge_tag parameter sets the tag in the session at the local database, while the remote_converge_tag parameter sets the tag in the session at the remote database. If you do not want the changes made by the CONVERGE procedure to be replicated, then set these parameters to a value that will prevent Oracle Streams capture processes and synchronous captures from capturing the changes.