1 Introduction to Data Replication and Integration

As a database administrator with multiple databases to manage, you are responsible for making information available when and where it is needed.

This chapter contains the following sections:

About This Guide

Oracle Database 2 Day + Data Replication and Integration Guide teaches you how to perform common tasks that are necessary to configure and administer several different types of data replication and integration environments. See "About Data Replication and Integration" and "About Data Replication and Integration Features" for information about the types of environments.

This guide helps you decide which data replication and integration environments are best for you. This guide also provides basic, task-oriented instructions for configuring, maintaining, monitoring, and troubleshooting common types of data replication and integration environments.

The primary interfaces used in this guide are Oracle Enterprise Manager and SQL*Plus.

This section contains the following topics:

Before Using This Guide

Before using this guide, you must:

Because this guide describes data replication and integration at multiple databases, more than one database is required for most of the tasks described in this guide.

What This Guide Is Not

The Oracle Database 2 Day + Data Replication and Integration Guide is task-oriented. The objective is to describe common data replication and integration tasks. Where appropriate, it describes the concepts necessary for understanding and completing the current task.

Data replication and integration involves several Oracle Database features. These features include distributed SQL, Oracle Database Gateway, Oracle Streams, and materialized views. This guide does not provide exhaustive information about these features. For complete conceptual information about these features and detailed instructions for using them, see the appropriate Oracle documentation:

Also, this guide describes using some data replication and integration features available in Oracle Enterprise Manager, but this guide does not provide exhaustive information about Enterprise Manager. For information about Enterprise Manager, see the Enterprise Manager online Help.

About Data Replication and Integration

As organizations expand, it becomes increasingly important for them to be able to share information among multiple databases and applications. Data replication and integration enables you to access information when and where you need it in a distributed environment. Oracle Database provides secure and standard mechanisms that enable communication between databases, applications, and users. These mechanisms include queues, data replication, messaging, and distributed access in both homogeneous and heterogeneous environments.

This guide describes using distributed SQL, replication, and message queuing. You can make efficient use of your computing resources by using these features to complete the following types of tasks:

  • Replicate data between databases

  • Provide easy access to data in distributed databases

  • Exchange data between Oracle databases and non-Oracle databases

  • Enable communication between applications

  • Exchange information with customers, partners, and suppliers

  • Provide event notification and workflow

Oracle Database provides the following types of data replication and integration solutions to address your specific requirements:

  • Consolidation: All data is moved into a single database and managed from a central location. Oracle Real Application Clusters (Oracle RAC), Grid computing, and Virtual Private Database (VPD) can help you consolidate information into a single database that is highly available, scalable, and secure.

  • Federation: Data appears to be integrated in a single virtual database, while actually remaining in its current distributed locations. Distributed queries, distributed SQL, and Oracle Database Gateway can help you create a federated database.

  • Sharing: Multiple copies of same information are maintained in multiple databases and application data stores. Data replication and messaging can help you share information at multiple databases.

About Data Replication and Integration Features

It is not always possible for an organization to consolidate all of its data into a single database. The data might be spread over several geographic locations, and some remote locations might not have good connectivity with a primary site. In some cases, the data might be consolidated, but the organization might need a method for best of breed applications to communicate with each other. These are only a few reasons why organizations might need to share information between locations or applications.

Oracle Database provides several ways for organizations to achieve their data replication and integration goals. This topic helps you decide which data replication and integration features are best for your organization.

The following topics describe when to use different data replication and integration features:

You can choose to use one of these features or a combination of them to meet your requirements.

Note:

In addition to the data replication and integration features described in this guide, Oracle Warehouse Builder is another option that you can use to integrate information. Oracle Warehouse Builder is a flexible tool that enables you to design and deploy various types of data integration strategies. Projects commonly implemented using Warehouse Builder involve mission critical operational systems, migration scenarios, integration of disparate operational systems, and traditional data warehousing. Oracle Warehouse Builder includes a set of graphical user interfaces to assist you in implementing solutions for integrating data. See Oracle Warehouse Builder Concepts.

When to Access and Modify Information in Multiple Databases

Despite their best efforts to consolidate information, many organizations find themselves with multiple, distributed databases. Even if these organizations might prefer to centralize this data, at least in the short term, it might not be possible. These organizations must have a method of accessing these distributed data sources as if they were a single, centralized database. Using distributed SQL, applications and users can access and modify information at multiple Oracle or non-Oracle databases as if it resided in a single Oracle database.

Because information does not need to be moved or copied, using distributed SQL to federate their distributed data sources provides organizations with the fastest, and easiest, path to information integration. If information is later moved, then it is not necessary to rewrite an application. This is especially useful for organizations that are transitioning to a consolidated approach, but need a method for accessing the distributed data now.

For example, by using distributed SQL with the appropriate Oracle Database Gateway, applications can access legacy data immediately, without waiting until it can be imported into an Oracle Database. Distributed SQL is also useful to organizations that want to perform ad hoc queries or updates on infrequently accessed data that is more appropriately located elsewhere.

When to Replicate Data with Oracle Streams

If connectivity is not an issue, then organizations might prefer to replicate data in a near-real-time manner. Doing so ensures that the data is up to date at all locations as soon as possible. Oracle Streams supports near-real-time data replication in a variety of configurations, depending on an organization's specific requirements. In an Oracle Streams replication environment, databases push changes to each other automatically.

Common uses for Oracle Streams replication include:

  • Creating a reporting site to offload processing from a primary online transaction processing (OLTP) site

  • Providing load balancing and improved scalability and availability for a call center or similar application

  • Providing site autonomy between locations to satisfy certain common business requirements

  • Transforming and consolidating data from multiple locations, such as regional offices

  • Replicating data between different platforms and Oracle Database releases, and across a wide area network (WAN)

There are two common types of Oracle Streams replication configurations: n-way and hub-and-spoke. Specifically, a multimaster (or n-way) configuration is frequently used by organizations that must provide scalability and availability of data. Often, these applications use a "follow the sun" model, with replicas located around the globe. For example, an organization might have call centers in the United States, Europe, and Asia, each with a complete copy of the customer data. Customer calls can be routed to the appropriate call center depending on the time of day. Each call center has fast, local access to the data. If a site becomes unavailable for any reason, then transactions can be routed to a surviving location. This type of configuration can also be used to provide load balancing between multiple locations.

Another common configuration is hub-and-spoke. For example, an insurance company might use this configuration to share customer data between its headquarters and local sales offices. A networked version of this configuration can be especially useful in cases of limited connectivity between the end spokes and the hub. Suppose local sales offices have direct connectivity to regional offices, which in turn connect to headquarters, but the local offices have no direct connectivity to headquarters. This type of networked routing can eliminate some complexity that results when there are direct connections between all locations. The hub-and-spoke configuration is also useful in data warehousing environments, where detailed data is maintained at each store or spoke, and higher-level data can be shared with the data warehouse or hub.

In both n-way and hub-and-spoke configurations, organizations can configure Oracle Streams replication to allow updates to the replicated data at multiple locations. In such replication environments, data conflicts are possible. Oracle Streams provides conflict resolution methods that can resolve these conflicts automatically.

Because Oracle Streams provides a flexible infrastructure for all information sharing requirements, including messaging and replication, it is easy for an organization to change its configuration as its needs change.

When to Replicate Data with Materialized Views

It might not always be practical for all users to access data that is stored in a single location. For example, field sales personnel might need access to a price list when they are at a customer site, without immediate access to the corporate databases. They might want to process an order, even if they cannot connect to the primary or master database. These users require a replica of the database or a portion of the database.

Unlike Oracle Streams replication, materialized views do not continuously replicate data at all times. A materialized view is a replica of a table or a subset of a table that can be refreshed to a transactionally consistent point in time. During a refresh, only the final values of the changed rows are pulled down and applied to the materialized view, no matter how many updates were applied to the master table. This reduces the amount of time that the remote site must be connected to the master site.

Materialized views are especially useful for locations with limited connectivity to the master site. Updatable materialized views allow these locations to function autonomously, even when connectivity is unavailable. When updates are allowed at multiple locations, ownership is typically partitioned in some manner between the locations to prevent conflicting updates. When conflicts are possible, Oracle provides conflict resolution methods that can resolve these conflicts automatically.

In addition to supporting disconnected computing, organizations can also use materialized views to improve performance and scalability by providing local access to data and by off loading processing at the primary location. For example, one or more materialized views might be used to off load reporting activity from an order-entry system.

When to Send Messages Between Databases

As organizations grow, they typically develop a variety of applications to automate processes and manage tasks. Although these applications do not share data directly, they might not operate entirely autonomously. These applications need a way to communicate with one another to coordinate tasks and exchange information.

Using Oracle Streams Advanced Queuing (AQ), applications can securely and reliably communicate with one another in an asynchronous manner. Oracle Streams AQ supports all of the standard features of message queuing systems, including multiconsumer queues, publish and subscribe, content-based routing, Internet propagation, and transformations. So, for example, the shipping department can easily notify the billing department when a product has shipped, and the customer can be billed accordingly.

By combining Oracle Streams AQ with the appropriate messaging gateway, applications can even interoperate with other message queuing systems, such as TIBCO Rendezvous or IBM Websphere MQ. This ability can be especially useful when it is necessary to share information with business partners or customers.