<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html> <head profile="http://internetalchemy.org/2003/02/profile"> <link rel="foaf" type="application/rdf+xml" title="FOAF" href="http://www.openlinksw.com/dataspace/uda/about.rdf" /> <link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" /> <meta name="dc.title" content="11. Data Replication, Synchronization and Transformation Services" /> <meta name="dc.subject" content="11. Data Replication, Synchronization and Transformation Services" /> <meta name="dc.creator" content="OpenLink Software Documentation Team ; " /> <meta name="dc.copyright" content="OpenLink Software, 1999 - 2009" /> <link rel="top" href="index.html" title="OpenLink Virtuoso Universal Server: Documentation" /> <link rel="search" href="/doc/adv_search.vspx" title="Search OpenLink Virtuoso Universal Server: Documentation" /> <link rel="parent" href="repl.html" title="Chapter Contents" /> <link rel="prev" href="replintro.html" title="Introduction" /> <link rel="next" href="proctransrepl.html" title="Transactional Replication " /> <link rel="shortcut icon" href="../images/misc/favicon.ico" type="image/x-icon" /> <link rel="stylesheet" type="text/css" href="doc.css" /> <link rel="stylesheet" type="text/css" href="/doc/translation.css" /> <title>11. Data Replication, Synchronization and Transformation Services</title> <meta http-equiv="Content-Type" content="text/xhtml; charset=UTF-8" /> <meta name="author" content="OpenLink Software Documentation Team ; " /> <meta name="copyright" content="OpenLink Software, 1999 - 2009" /> <meta name="keywords" content="" /> <meta name="GENERATOR" content="OpenLink XSLT Team" /> </head> <body> <div id="header"> <a name="SNAPSHOT" /> <img src="../images/misc/logo.jpg" alt="" /> <h1>11. Data Replication, Synchronization and Transformation Services</h1> </div> <div id="navbartop"> <div> <a class="link" href="repl.html">Chapter Contents</a> | <a class="link" href="replintro.html" title="Introduction">Prev</a> | <a class="link" href="proctransrepl.html" title="Transactional Replication ">Next</a> </div> </div> <div id="currenttoc"> <form method="post" action="/doc/adv_search.vspx"> <div class="search">Keyword Search: <br /> <input type="text" name="q" /> <input type="submit" name="go" value="Go" /> </div> </form> <div> <a href="http://www.openlinksw.com/">www.openlinksw.com</a> </div> <div> <a href="http://docs.openlinksw.com/">docs.openlinksw.com</a> </div> <br /> <div> <a href="index.html">Book Home</a> </div> <br /> <div> <a href="contents.html">Contents</a> </div> <div> <a href="preface.html">Preface</a> </div> <br /> <div class="selected"> <a href="repl.html">Data Replication, Synchronization and Transformation Services</a> </div> <br /> <div> <a href="replintro.html">Introduction</a> </div> <div class="selected"> <a href="SNAPSHOT.html">Snapshot Replication</a> <div> <a href="#NONINC" title="Non incremental snapshot replication">Non incremental snapshot replication</a> <a href="#INC" title="Incremental snapshot replication">Incremental snapshot replication</a> <a href="#COMMANDS" title="Command reference">Command reference</a> <a href="#bidirrepl" title="Bi-Directional Snapshot Replication">Bi-Directional Snapshot Replication</a> <a href="#snpreg" title="Registry variables">Registry variables</a> <a href="#snpheter" title="Heterogeneous snapshot replication">Heterogeneous snapshot replication</a> <a href="#snpmap" title="Data type mappings">Data type mappings</a> <a href="#snpincobj" title="Objects created by incremental snapshot replication">Objects created by incremental snapshot replication</a> <a href="#snpbidirobj" title="Objects created by bi-directional snapshot replication">Objects created by bi-directional snapshot replication</a> <a href="#replSYSTABLES" title="Replication system tables">Replication system tables</a> <a href="#SNLOG" title="Table snapshot logs">Table snapshot logs</a> </div> </div> <div> <a href="proctransrepl.html">Transactional Replication </a> </div> <div> <a href="SCHEDULER.html">Virtuoso scheduler</a> </div> <div> <a href="replexamples.html">Transactional Replication Example</a> </div> <div> <a href="replsample.html">Replication Logger Sample</a> </div> <br /> </div> <div id="text"> <a name="SNAPSHOT" /> <h2>11.2. Snapshot Replication</h2> <a name="NONINC" /> <h3>11.2.1. Non incremental snapshot replication</h3> <p> The Virtuoso Server periodically evaluates a query and inserts the result into a table, replacing the content. The source and target can be anywhere but typically will be on different databases of which at least one is remote. </p> <p>This makes a two party distributed transaction. Only one of the transaction branches is writing, hence this goes with a one phase commit. </p> <div class="note"> <div class="notetitle">Note:</div> <p> This can be used to replicate between two non-Virtuoso databases. </p> </div> <p> The prerequisite of this mode of replication is that all tables exist. The schema is never replicated. </p> <br /> <a name="INC" /> <h3>11.2.2. Incremental snapshot replication</h3> <p> A table can be replicated from a generic, possibly non-Virtuoso source in an incremental fashion if the data source meets certain criteria: </p> <p> The data should come from a single table. </p> <p> The source table should have an explicit primary key. </p> <p> There source query should have the form </p> <div> <pre class="programlisting">SELECT fields FROM source_table WHERE scalar_conditions </pre> </div> <p> A snapshot log on the source table should exist. </p> <p> There should be insert, delete and update triggers on the source table to update the snapshot log. </p> <p> The incremental update is done in the following way: </p> <p> All the changed records with a snaptime greater or equal from REPL_START_TIME(SN_LAST_TS) are processed. If the record to insert already exists, then it is updated. If the record to update does not exist in the destination table, then it is inserted. This contributes for conflict resolving. </p> <br /> <a name="COMMANDS" /> <h3>11.2.3. Command reference</h3> <a name="CREATELOG" /> <h4>11.2.3.1. CREATE SNAPSHOT LOG FOR source_table</h4> <div> <pre class="programlisting">CREATE SNAPSHOT LOG FOR source_table</pre> </div> <div> <pre class="programlisting">repl_create_snapshot_log(in source_table varchar)</pre> </div> <p> Creates a snapshot log of a native (non VDB) table and the appropriate log update triggers. </p> <p> Note that when replicating data from a remote Virtuoso a snapshot log table can be created on the remote virtuoso and then the source and log tables can be attached into the local Virtuoso. However the snapshot log table and the update triggers should be created manually on a non-Virtuoso data source. </p> <p> Examples (based on Virtuoso demo database): </p> <div> <pre class="programlisting"> create snapshot log for Orders; </pre> </div> <br /> <a name="DROPLOG" /> <h4>11.2.3.2. DROP SNAPSHOT LOG FOR source_table</h4> <div> <pre class="programlisting">DROP SNAPSHOT LOG FOR source_table</pre> </div> <div> <pre class="programlisting">repl_drop_snapshot_log(in source_table varchar)</pre> </div> <p> Drops a snapshot log of a table and the log update triggers. </p> <p> This results in 4 SQL commands : </p> <div> <pre class="programlisting"> drop trigger xxx_I_log; drop trigger xxx_U_log; drop trigger xxx_D_log; drop table RLOG_xxx; </pre> </div> <p> where xxx is the source table name (fully qualified - DB.DBA.Orders becomes DB_DBA_Orders). </p> <p> Examples (based on Virtuoso demo database): </p> <div> <pre class="programlisting"> drop snapshot log for Orders; </pre> </div> <br /> <a name="PURGELOG" /> <h4>11.2.3.3. PURGE SNAPSHOT LOG FOR source_table</h4> <div> <pre class="programlisting">PURGE SNAPSHOT LOG FOR source_table</pre> </div> <div> <pre class="programlisting">repl_purge_snapshot_log(in source_table varchar)</pre> </div> <p> There can be old snapshot log rows which have been replayed in all the snapshots of a given source table. These rows are no longer needed and can be purged from the snapshot log for faster operation. </p> <p> This command checks if there are any rows in SYS_REPLICATION referring to that source table and if there are, then it deletes all the rows in the log table with SNAPTIME earlier then REPL_START_TIME(MIN(SN_LAST_TS)). </p> <p> If there are no rows in SYS_SNAPSHOT it assumes that the log is used for "pull" replication and does not do anything. </p> <p> Examples (based on Virtuoso demo database): </p> <div> <pre class="programlisting"> purge snapshot log for Orders; </pre> </div> <br /> <a name="CREATENONINC" /> <h4>11.2.3.4. CREATE NONINCREMENTAL SNAPSHOT</h4> <div> <pre class="programlisting">CREATE NONINCREMENTAL SNAPSHOT dest_table as 'query_def'</pre> </div> <div> <pre class="programlisting">repl_create_snapshot(in query_def varchar, in dest_table varchar)</pre> </div> <p> Creates a non-incremental snapshot log of the data returned by the query query_def. </p> <p> The server first checks for the existence of a table with the same name. If it does not exist, then a table with a layout to accommodate the resultset produced by query_def is created. </p> <p> Then a entry containing query_def and tablename is added to SYS_SNAPSHOT and the destination table is filled in with an insert into tablename query_def. </p> <p> Examples (based on Virtuoso demo database): </p> <div> <pre class="programlisting">create nonincremental snapshot sub_orders as 'select * from Orders where OrderID < 5' </pre> </div> <br /> <a name="CREATEINC" /> <h4>11.2.3.5. CREATE INCREMENTAL SNAPSHOT</h4> <div> <pre class="programlisting">CREATE SNAPSHOT dest_table FROM source_table [ 'source_column_list' ] [ WHERE 'condition' ]</pre> </div> <div> <pre class="programlisting">repl_create_inc_snapshot( in source_column_list varchar, in source_table varchar, in condition varchar, in dest_table varchar)</pre> </div> <p> Creates a incremental snapshot log using source query like that : </p> <div> <pre class="programlisting"> SELECT source_table_pk_cols, source_column_list FROM source_table WHERE condition </pre> </div> <p> If the dest table already exists, then it's column count is checked to be greater than the primary key parts count of the source table. </p> <p> If the source_column_list is omitted, then the destination table is created to have the same columns as the source table. </p> <p> Examples (based on Virtuoso demo database): </p> <div> <pre class="programlisting"> CREATE SNAPSHOT sub_orders FROM Orders WHERE 'OrderID < 5' </pre> </div> <br /> <a name="UPDATESNAP" /> <h4>11.2.3.6. UPDATE SNAPSHOT</h4> <div> <pre class="programlisting">UPDATE SNAPSHOT snapshot_table_name [ AS NONINCREMENTAL ]</pre> </div> <div> <pre class="programlisting">repl_refresh_inc_snapshot( in snapshot_table_name varchar)</pre> </div> <div> <pre class="programlisting">repl_refresh_noninc_snapshot( in snapshot_table_name varchar)</pre> </div> <p> Without the optional argument updates the snapshot according to it's type. If an incremental snapshot needs to be updated as a nonincremental, then the full form should be used. </p> <p> Examples (based on Virtuoso demo database): </p> <div> <pre class="programlisting"> UPDATE SNAPSHOT sub_orders </pre> </div> <br /> <a name="DROPSNAP" /> <h4>11.2.3.7. DROP SNAPSHOT</h4> <div> <pre class="programlisting">DROP SNAPSHOT snapshot_table_name [ WITH DELETE ]</pre> </div> <div> <pre class="programlisting">repl_drop_snapshot( in snapshot_table_name varchar, in delete integer)</pre> </div> <p> Removes a snapshot definition from SYS_SNAPSHOT and optionally drops the snapshot destination table. </p> <p> Examples (based on Virtuoso demo database): </p> <div> <pre class="programlisting"> DROP SNAPSHOT sub_orders WITH DELETE </pre> </div> <br /> <br /> <a name="bidirrepl" /> <h3>11.2.4. Bi-Directional Snapshot Replication</h3> <p>Bi-directional snapshot replication allows you to set up snapshot replication between multiple servers where updates can be performed on all servers. Bidirectional snapshot replication uses the publisher-subscriber model where each table or DAV collection has only one publisher and when an update is performed on subscriber it goes to publisher first, and then to all other subscribers. Conflict resolution may need to take place on the publisher when data coming from a subscriber is processed. Bi-directional snapshot replication uses snapshot logs on publisher and subscribers to track changes in published table and its replicas.</p> <p>It is assumed that all tables published using bi-directional snapshot replication have primary keys columns that are never modified.</p> <p>To perform conflict resolution a ROWGUID column is added to every published table. This column should never be modified manually. All INSERT statements should specify exact column-select-lists that do not include the ROWGUID column. Likewise the ROWGUID column should never feature in any UPDATE statements.</p> <p>Each server participating in bi-directional snapshot replication must have unique name (replication name). For Virtuoso servers replication name is assigned to the server in its virtuoso.ini file in the DBName setting. For other RDBMS servers replication name is the name of instance or the name of database. Replication name of remote server can be obtained using <span class="computeroutput">REPL_SERVER_NAME()</span> (after this server is defined, see below).</p> <p> <span class="computeroutput">REPL_CREATE_SNAPSHOT_PUB()</span> function should be used to create bi-directional snapshot publication.</p> <p>To create subscription for the publication, replication subscriber server should be defined first using <span class="computeroutput">REPL_SNP_SERVER()</span> function. The name of the server can be obtained later using <span class="computeroutput">REPL_SERVER_NAME()</span> function. After this <span class="computeroutput">REPL_CREATE_SNAPSHOT_SUB()</span> function should be used to create a subscription (and create replicated table on subscriber if table replication takes place).</p> <p>To load initial data on subscriber <span class="computeroutput">REPL_INIT_SNAPSHOT()</span> should be used after subscription is created. Loading initial data can take some time so <span class="computeroutput">REPL_INIT_SNAPSHOT()</span> performs commits after every 100 rows are copied from source table to the table on subscriber to prevent running out of transaction log or deadlocks. It is possible to specify an alternative value for number of rows per transaction (last parameter of <span class="computeroutput">REPL_INIT_SNAPSHOT()</span> function). In DAV case commits are performed per copied resource. </p> <p> <span class="computeroutput">REPL_UPDATE_SNAPSHOT()</span> should be called periodically after initial data is loaded on subscriber to sync published items (tables or DAV collections) on publisher and subscribers. This function reads snapshot logs attached from subscribers and replays them with possible conflict resolution. After all snapshot logs from subscribers are processed an updating procedure reads snapshot log on publisher and replays it on all subscribers.</p> <div class="note"> <div class="notetitle">Note:</div> <p>Please note that all operations in bi-directional snapshot replication (publication, subscription, doing initial copy, syncing) should be performed on publisher.</p> </div> <a name="" /> <div class="example"> <div class="exampletitle">Creating bi-directional snapshot publication</div> <p>This example demonstrates creating bi-directional snapshot publication of table 'Demo.demo.Shippers'.</p> <div> <pre class="screen"> SQL> REPL_CREATE_SNAPSHOT_PUB ('Demo.demo.Shippers', 2); </pre> </div> </div> <a name="" /> <div class="example"> <div class="exampletitle">Creating bi-directional snapshot subscription</div> <p>This example demonstrates creating bi-directional snapshot subscription for table 'Demo.demo.Shippers' and loading initial data on subscriber with DSN 'localhost:1121'.</p> <div> <pre class="screen"> SQL> REPL_SNP_SERVER ('localhost:1121', 'dba', 'dba'); SQL> REPL_CREATE_SNAPSHOT_SUB (REPL_SERVER_NAME ('localhost:1121'), 'Demo.demo.Shippers', 2); SQL> REPL_INIT_SNAPSHOT (REPL_SERVER_NAME ('localhost:1121'), 'Demo.demo.Shippers', 2); </pre> </div> </div> <a name="" /> <div class="example"> <div class="exampletitle">Syncing bi-directional snapshot publication</div> <p>This example demonstrates syncing bi-directional snapshot publication of table 'Demo.demo.Shippers'.</p> <div> <pre class="screen"> SQL> REPL_UPDATE_SNAPSHOT ('Demo.demo.Shippers', 2); </pre> </div> </div> <div class="tip"> <div class="tiptitle">See Also:</div> <p>The following functions are used for creating, dropping and updating publications and subscriptions to them:</p> <p> <a href="fn_REPL_CREATE_SNAPSHOT_SUB.html">REPL_CREATE_SNAPSHOT_SUB()</a> </p> <p> <a href="fn_REPL_CREATE_SNAPSHOT_PUB.html">REPL_CREATE_SNAPSHOT_PUB()</a> </p> <p> <a href="fn_REPL_DROP_SNAPSHOT_SUB.html">REPL_DROP_SNAPSHOT_SUB()</a> </p> <p> <a href="fn_REPL_DROP_SNAPSHOT_PUB.html">REPL_DROP_SNAPSHOT_PUB()</a> </p> <p> <a href="fn_REPL_INIT_SNAPSHOT.html">REPL_INIT_SNAPSHOT()</a> </p> <p> <a href="fn_REPL_UPDATE_SNAPSHOT.html">REPL_UPDATE_SNAPSHOT()</a> </p> <p> <a href="fn_REPL_SNP_SERVER.html">REPL_SNP_SERVER()</a> </p> <p> <a href="fn_REPL_SERVER_NAME.html">REPL_SERVER_NAME()</a> </p> </div> <a name="bidireplconflictrsln" /> <h4>11.2.4.1. Conflict Resolution</h4> <p>Since every table can have only one publisher, conflicts can only occur on the publisher when modifications from a subscriber are attempted. When DML operations originating on a subscriber are being replayed on the publisher, three types of conflicts can arise:</p> <ol> <li> <div class="formalpara"> <strong>uniqueness conflict (insert conflict)</strong> <p>occurs when the row with some primary key <PK> already exists in publisher's table.</p> </div> </li> <li> <div class="formalpara"> <strong>update conflict</strong> <p>occurs when UPDATE modifies a row which has already been modified on publisher (by the publisher or another subscriber)</p> </div> </li> <li> <div class="formalpara"> <strong>delete conflict</strong> <p>occurs when DELETE deletes a row that does not exist on publisher anymore.</p> </div> </li> </ol> <p>Delete conflicts when UPDATE modifies a row that does not exist on publisher can't be detected in snapshot replication case.</p> <p>Every table has a number of conflict resolvers which are used for conflict resolution which are enlisted in the <span class="computeroutput">DB.DBA.SYS_SNAPSHOT_CR</span> system table. Each conflict resolver has a type, one of ('I', 'U', or 'D'), and an order. Conflict resolvers are applied in ascending order.</p> <p>The conflict resolver is a Virtuoso/PL procedure that receives a conflicting row from a subscriber and some other arguments. The conflict resolver can modify the row, which is passed as an 'inout' argument. The conflict resolver should return an integer value, which will be used for conflict resolution.</p> <p>Conflict resolvers of different types have different signatures:</p> <ul> <li> <p> <strong>'I' - Insert conflict resolvers</strong> </p> <p>(<ALLCOLS>, inout _origin varchar)</p> </li> <li> <p> <strong>'U' - Update conflict resolvers</strong> </p> <p>(<ALLCOLS>, inout _origin varchar)</p> </li> <li> <p> <strong>'D' - Deletion conflict resolvers</strong> </p> <p>(<PK>, inout _origin varchar)</p> </li> </ul> <p>where</p> <p><ALLCOLS> are the new values of all columns (including the ROWGUID column), <PK> are the values of primary key columns, and _origin is transaction originator.</p> <p>Conflict resolvers can return the following integer values; The conflict resolver types concerned for each are listed in parentheses:</p> <ul> <li> <div class="formalpara"> <strong>0 - un-decide (I, U, D)</strong> <p>next conflict resolver will be fired.</p> </div> </li> <li> <div class="formalpara"> <strong>1 - subscriber wins (I, U, D)</strong> <p>DML operation will be applied with <ALLCOLS> All the subscribers except originator will receive modifications (originator already has them).</p> </div> </li> <li> <div class="formalpara"> <strong>2 - subscriber wins, change origin (I, U)</strong> <p>DML operation will be applied with <ALLCOLS> and origin of transaction will be changed to publisher's server name. All the subscribers (including originator) will receive modifications. This return value is useful when conflict resolver changed some of the columns of the row that were passed in. Although all parameters of conflict resolver are inout only changing of <ALLCOLS> (non-PK columns) parameters makes sense.</p> </div> </li> <li> <div class="formalpara"> <strong>3 - publisher wins (U)</strong> <p>DML operation will be applied with <ALLCOLS> taken from publisher's table. All the subscribers will receive modifications.</p> </div> </li> <li> <div class="formalpara"> <strong>4 - reserved</strong> <p /> </div> </li> <li> <div class="formalpara"> <strong>5 - ignore (D)</strong> <p>DML operation is ignored.</p> </div> </li> </ul> <p>Conflict resolution stops when some conflict resolver returns a non-zero value meaning that it has made a decision.</p> <a name="ex_conflictresln" /> <div class="example"> <div class="exampletitle">Conflict Resolution</div> <p>Suppose we have the following table:</p> <div> <pre class="programlisting"> create table items( item_id integer primary key, name varchar, price decimal ); </pre> </div> <p>"Publisher wins" 'I' conflict resolver will look like:</p> <div> <pre class="programlisting"> create procedure items_cr( inout _item_id integer, inout _name varchar, inout _price decimal, inout _origin varchar) returns integer { return 3; } </pre> </div> <p>The conflict resolver that will make a decision based on the minimal price column will look like:</p> <div> <pre class="programlisting"> create procedure items_cr( inout _item_id integer, inout _name varchar, inout _price decimal, inout _rowguid varchar, inout _origin varchar) returns integer { declare p decimal; -- get current price value select price into p from items where item_id = _item_id; if (p < _price) return 3; -- publisher wins else if (p > _price) return 1; -- subscriber wins return 0; -- can't decide } </pre> </div> <p>The conflict resolver that will change the price to the minimal value will look like:</p> <div> <pre class="programlisting"> create procedure items_cr( inout _item_id integer, inout _name varchar, inout _price decimal, inout _rowguid varchar, inout _origin varchar) returns integer { declare p decimal; -- get current price value select price into p from items where item_id = _item_id; if (p < _price) { _price := p; return 2; -- publisher wins, change origin } return 1; -- subscriber wins } </pre> </div> </div> <p>Conflict resolution occurs differently for each kind of DML operation:</p> <ul> <li> <div class="formalpara"> <strong>INSERT</strong> <p>When INSERT of some row with primary key <PK> is replayed, the row in the publisher's table with such <PK> is looked-up. If the row does not exist then there is no conflict, conflict resolution stops and the INSERT is replayed. If the row exists then we have a "uniqueness conflict". In this case 'I' conflict resolvers are fired-up. If none of the 'I' conflict resolvers were able to make a decision (return non-zero value) the default action is 'publisher wins'.</p> </div> </li> <li> <div class="formalpara"> <strong>UPDATE</strong> <p>When there is an UPDATE of some row with primary key <PK> is replayed, the row (and its ROWGUID) in publisher's table with such <PK> is looked-up. If the row does not exist then we have a "delete conflict", 'D' conflict resolvers are fired up. If none of the 'D' conflict resolvers were able to make a decision the default action will be to 'ignore'. If the row exists in the publisher's table and its ROWGUID is the same as that from the subscriber then there is no conflict. Conflict resolution stops and the UPDATE is replayed. If the row exists and its ROWGUID differs from the one that came from subscriber then we have an "update conflict". In this case the 'U' conflict resolvers are fired-up. If none of the 'U' conflict resolvers were able to make a decision (return non-zero value) the default action will be 'publisher wins'.</p> </div> </li> <li> <div class="formalpara"> <strong>DELETE</strong> <p>When DELETE operation of some row with primary key <PK> is replayed, the row in the publisher's table with such <PK> is looked-up. If the row does not exist or if the row exists but its ROWGUID differs from the one that came from subscriber then we have "delete conflict". The 'D' conflict resolvers are fired-up. If none of the 'D' conflict resolvers were able to make a decision then the default action will be taken to 'ignore'. Otherwise it is assumed that there is no conflict and DELETE statement is replayed.</p> </div> </li> </ul> <br /> <a name="bidirreplconflictdav" /> <h4>11.2.4.2. Conflict Resolution in WebDAV</h4> <p>Conflict resolvers in DAV are found based on the collection. The closest collection that specifies resolvers will be the one providing the resolver set. Resolvers from any enclosing collection will not be invoked.</p> <p>There is special subcollection in each replicated collection named '_SYS_REPL_BACKUP'. This subcollection is never replicated and is used to store backup copies of resources that lose conflict resolution.</p> <p>If a resource is locked during an update it will be stored in temporary location. Conflict resolution will take place after the lock is released. If the update was performed on the publisher then no conflict resolution will be performed on the subscribers and the resource from temporary location will simply replace an existing resource after the lock is released.</p> <p>Default conflict resolution is 'publisher wins'.</p> <p>Delete conflicts are not handled when replicating DAV because updates are the same as inserts -- if the resource that was updated does not exist it will be created. INSERT or uniqueness conflict for the same reasons.</p> <p>Subcollections are not considered in conflict resolution. Subcollections are always created as needed during an update. This means that if a resource is updated locally but collection that holds this resource does not exist on remote peer it will be created.</p> <br /> <a name="bidirreplsutogenres" /> <h4>11.2.4.3. Automatically Generated Conflict Resolvers</h4> <p>Simple table conflict resolvers can be generated automatically by calling the <span class="computeroutput">REPL_ADD_SNAPSHOT_CR()</span> function. DAV conflict resolvers be generated by calling <span class="computeroutput">REPL_ADD_DAV_CR()</span> function.</p> <div class="tip"> <div class="tiptitle">See Also:</div> <p> <a href="fn_REPL_ADD_DAV_CR.html">REPL_ADD_DAV_CR()</a> </p> <p> <a href="fn_REPL_ADD_SNAPSHOT_CR.html">REPL_ADD_SNAPSHOT_CR()</a> </p> </div> <p>The generated procedures can be modified afterwards. In particular it is possible to change the notification e-mail address by setting the _notify_email parameter, and the notification text by setting the _notify_text parameter.</p> <p>The default behaviour for generated procedures is 'pub_wins', making a backup and notifying the owner by e-mail.</p> <br /> <br /> <a name="snpreg" /> <h3>11.2.5. Registry variables</h3> <ul> <li> <p> <strong>snp_repl_tolerance_offset</strong> (default 15)</p> <p>In incremental and bi-directional snapshot time stamp of last replayed snapshot log entry (LAST_TS) is kept in system tables. LAST_TS is used to determine the starting point of the next update. This value can't be used as is for the following reasons: <ul> <li> <p>Local time may be adjusted occasionally on the servers which participate in replication. </p> </li> <li> <p>Virtuoso snapshot replication triggers use <span class="computeroutput">now()</span> function to get snapshot log entry time stamp. If transaction is started before updating snapshot log and ended after updating snapshot log is finished, data modifications made in this transaction will be missed.</p> </li> </ul> In order to resolve the issues mentioned above Virtuoso reads snapshot logs starting from LAST_TS minus some tolerance offset. The value of tolerance offset (in minutes) is kept in "snp_repl_tolerance_offset" registry variable. For the reasons mentioned above tolerance offset should be longer than length of any transaction which modifies published tables. </p> </li> <li> <p> <strong>snp_repl_purge_offset</strong> (default 30)</p> <p>In order to prevent snapshot log entries to be replayed more than once Virtuoso keeps log of replayed snapshot log entries (rplog). Rplog entries need to be purged periodically. Virtuoso automatically schedules rplog purger when snapshot log for some table is created. All the records in rplog with time stamp less than MIN(LAST_TS) minus some offset are purged. The value of purge offset (in minutes) is kept in "snp_repl_purge_offset" registry variable. Normally, purge offset should be greater than tolerance offset. </p> </li> </ul> <br /> <a name="snpheter" /> <h3>11.2.6. Heterogeneous snapshot replication</h3> <p> Virtuoso allows incremental and bi-directional snapshot replication flavors to be used with non-Virtuoso databases. </p> <p> The following databases are supported in incremental snapshot replication: <ul> <li> <p>SQL Server 2000 and later</p> </li> <li> <p>Oracle 8i and later</p> </li> <li> <p>IBM DB2 8.1 and later</p> </li> <li> <p>Informix IDS 9.40 and later</p> </li> </ul> </p> <p> The following databases are supported in bi-directional snapshot replication: <ul> <li> <p>SQL Server 2000 and later</p> </li> <li> <p>Oracle 8i and later</p> </li> <li> <p>IBM DB2 8.1 and later</p> </li> </ul> </p> <p> DBMS-specific notes: <ul> <li> <p> <strong>SQL Server 2000 and later</strong> </p> <p>Bi-directional snapshot triggers for SQL Server require "nested triggers" server option to be "On" (which is "On" by default). </p> </li> <li> <p> <strong>Oracle 8i</strong> </p> <ul> <li> <p>Oracle 8i does not have support for "localtimestamp" data type out of box. Required functionality can be enabled by editing $ORACLE_HOME/rdbms/admin/standard.sql script. The following statement </p> <div> <pre class="programlisting"> pragma new_names('8.1.6', dbtimezone, sessiontimezone, localtimestamp, localtime, to_local_tz, to_db_tz, cube, rollup, grouping, "TIMESTAMP WITH LOCAL TIME ZONE"); </pre> </div> <p> should be changed to </p> <div> <pre class="programlisting"> --pragma new_names('8.1.6', -- dbtimezone, sessiontimezone, localtimestamp, -- localtime, to_local_tz, to_db_tz, -- cube, rollup, grouping, "TIMESTAMP WITH LOCAL TIME ZONE"); </pre> </div> <p> After $ORACLE_HOME/rdbms/admin/standard.sql is edited it should be executed with SYS user privileges:</p> <div> <pre class="programlisting"> $sqlplus SYS/CHANGE_ON_INSTALL @$ORACLE_HOME/rdbms/admin/standard.sql </pre> </div> <p>where "CHANGE_ON_INSTALL" is password for user SYS.</p> <p>Oracle 9i and later has this type in the default installation and this step is not necessary.</p> </li> <li> <p>ODBC driver from Oracle 8.1.6 (and earlier versions) is known to be buggy and may cause Virtuoso server to crash when snapshot replication with Oracle database is set up. ODBC driver from Oracle 8.1.7 and later is recommended.</p> </li> <li> <p>Oracle ODBC driver uses Oracle NLS settings by default for determining decimal separator. If decimal separator in the locale is not '.' (period) there will be interoperability errors with Virtuoso. Oracle ODBC driver can be forced to use US Settings for numerics: open Oracle ODBC driver configuration dialog, select "Application" tab and set "Numeric Settings" to "Use US Settings".</p> </li> </ul> </li> <li> <p> <strong>DB2 8.1 ESE</strong> </p> <ul> <li> <p>FixPak 3 is recommended because ODBC driver from earlier DB2 8 versions may cause Virtuoso server to crash when BLOB columns exist in replicated tables.</p> </li> <li> <p>DB2 CLI/ODBC driver uses default locale's decimal separator. If decimal separator in the locale is not '.' (period) Virtuoso fails to parse numeric values. The default behavior of the DB2 CLI/ODBC driver can be modified by specifying PATCH2 CLI/ODBC configuration keyword in the db2cli.ini initialization file. Add the following line to your DSN config in db2cli.ini: </p> <div> <pre class="programlisting"> PATCH2=15 </pre> </div> <p>This will force DB2 CLI/ODBC driver to always use period ('.') as decimal separator.</p> </li> </ul> </li> <li> <p> <strong>Informix IDS 9.40</strong> </p> <p>Informix IDS can't report current time in UTC. This may cause snapshot replication to fail when timezone change occurs. In order to set up snapshot replication with Informix server properly any of the following steps must be taken: <ul> <li>Informix server must be run in UTC time zone.</li> <li>"snp_repl_time_offset" registry variable should be set to value greater than 60.</li> </ul> The second method may cause additional system load on updates and will cause transactions to be replayed out of order when timezone change occurs. </p> </li> </ul> </p> <br /> <a name="snpmap" /> <h3>11.2.7. Data type mappings</h3> <p>Heterogeneous replication requires data type mapping to be performed when Virtuoso table is created on replica.</p> <table class="data"> <caption>Table: 11.2.7.1. Data type mappings</caption> <tr> <td class="data"> <strong>Virtuoso</strong> </td> <td class="data"> <strong>SQL Server</strong> </td> <td class="data"> <strong>Oracle</strong> </td> <td class="data"> <strong>DB2</strong> </td> <td class="data"> <strong>Informix</strong> </td> </tr> <tr> <td class="data"> <strong>varchar</strong> </td> <td class="data">varchar(8000)</td> <td class="data">VARCHAR2(4000)</td> <td class="data">VARCHAR(32000)</td> <td class="data">VARCHAR(254)</td> </tr> <tr> <td class="data"> <strong>varchar(n)</strong> </td> <td class="data">varchar(n)</td> <td class="data">VARCHAR2(n)</td> <td class="data">VARCHAR(n)</td> <td class="data">VARCHAR(n)</td> </tr> <tr> <td class="data"> <strong>integer</strong> </td> <td class="data">int</td> <td class="data">INTEGER</td> <td class="data">INTEGER</td> <td class="data">INTEGER</td> </tr> <tr> <td class="data"> <strong>smallint</strong> </td> <td class="data">smallint</td> <td class="data">SMALLINT</td> <td class="data">SMALLINT</td> <td class="data">SMALLINT</td> </tr> <tr> <td class="data"> <strong>real</strong> </td> <td class="data">real</td> <td class="data">FLOAT</td> <td class="data">REAL</td> <td class="data">SMALLFLOAT</td> </tr> <tr> <td class="data"> <strong>double precision</strong> </td> <td class="data">float</td> <td class="data">DOUBLE PRECISION</td> <td class="data">DOUBLE</td> <td class="data">FLOAT</td> </tr> <tr> <td class="data"> <strong>numeric</strong> </td> <td class="data">numeric(38, 15)</td> <td class="data">NUMERIC(38, 15)</td> <td class="data">NUMERIC(32, 15)</td> <td class="data">DECIMAL(31, 15)</td> </tr> <tr> <td class="data"> <strong>numeric(p, s)</strong> </td> <td class="data">numeric(p, s)</td> <td class="data">NUMERIC(p, s)</td> <td class="data">NUMERIC(p, s)</td> <td class="data">DECIMAL(p, s)</td> </tr> <tr> <td class="data"> <strong>date</strong> </td> <td class="data">datetime</td> <td class="data">DATE</td> <td class="data">DATE</td> <td class="data">DATE</td> </tr> <tr> <td class="data"> <strong>time</strong> </td> <td class="data">datetime</td> <td class="data">DATE</td> <td class="data">TIME</td> <td class="data">DATETIME HOUR TO SECOND</td> </tr> <tr> <td class="data"> <strong>datetime, timestamp</strong> </td> <td class="data">datetime</td> <td class="data">DATE</td> <td class="data">TIMESTAMP</td> <td class="data">DATETIME YEAR TO FRACTION(5)</td> </tr> <tr> <td class="data"> <strong>varbinary</strong> </td> <td class="data">varbinary(8000)</td> <td class="data">RAW(2000)</td> <td class="data">VARCHAR(32000) FOR BIT DATA</td> <td class="data">BYTE</td> </tr> <tr> <td class="data"> <strong>varbinary(n)</strong> </td> <td class="data">varbinary(n)</td> <td class="data">RAW(n)</td> <td class="data">VARCHAR(n) FOR BIT DATA</td> <td class="data">BYTE</td> </tr> <tr> <td class="data"> <strong>long varbinary</strong> </td> <td class="data">image</td> <td class="data">BLOB</td> <td class="data">BLOB</td> <td class="data">BYTE</td> </tr> <tr> <td class="data"> <strong>long varchar</strong> </td> <td class="data">text</td> <td class="data">CLOB</td> <td class="data">CLOB</td> <td class="data">TEXT</td> </tr> <tr> <td class="data"> <strong>nvarchar</strong> </td> <td class="data">nvarchar(4000)</td> <td class="data">NVARCHAR2(4000)</td> <td class="data">VARCHAR(32000) FOR MIXED DATA</td> <td class="data">NVARCHAR(254)</td> </tr> <tr> <td class="data"> <strong>nvarchar(n)</strong> </td> <td class="data">nvarchar(n)</td> <td class="data">NVARCHAR2(n)</td> <td class="data">VARCHAR(n) FOR MIXED DATA</td> <td class="data">NVARCHAR(n)</td> </tr> <tr> <td class="data"> <strong>long nvarchar</strong> </td> <td class="data">ntext</td> <td class="data">NCLOB</td> <td class="data">DBCLOB</td> <td class="data">TEXT</td> </tr> </table> <br /> <br /> <a name="snpincobj" /> <h3>11.2.8. Objects created by incremental snapshot replication</h3> <p>Table "DB.DBA.RPLOG_<name>" (replay log) is created in Virtuoso database for replayed snapshot log entries.</p> <p>Table "RLOG_<name>" (snapshot log) is created in the source DSN. This table is attached as "<qual>"."<dsn>"."RLOG_<name>" if source table is an attached table. Other objects created by incremental snapshot replication in the source DSN are:</p> <ul> <li> <p> <strong>Virtuoso</strong> </p> <p>Triggers "<qual>_<owner>_<name>_I_log", "<qual>_<owner>_<name>_U_log" and "<qual>_<owner>_<name>_D_log" on replicated table. </p> </li> <li> <p> <strong>SQL Server</strong> </p> <p>Triggers "<name>_I_log", "<name>_U_log" and "<name>_D_log" on replicated table.</p> </li> <li> <p> <strong>Oracle</strong> </p> <p>Triggers "<name>_I_log", "<name>_U_log" and "<name>_D_log" on replicated table.</p> <p>Function OPL_GETUTCDATE().</p> </li> <li> <p> <strong>DB2</strong> </p> <p>Triggers "<name>_I", "<name>_U" and "<name>_D" on replicated table.</p> <p>Sequence opl_seq_rowguid.</p> </li> <li> <p> <strong>Informix</strong> </p> <p>Triggers "<name>_I_log", "<name>_U_log" and "<name>_D_log" on replicated table.</p> <p>Sequence opl_seq_rowguid.</p> <p>Stored procedures "<name>_R_proc" and "<name>_U_proc". </p> </li> </ul> <br /> <a name="snpbidirobj" /> <h3>11.2.9. Objects created by bi-directional snapshot replication</h3> <p>Table "DB"."DBA"."RPLOG_<name>" (replay log) is created in Virtuoso database for replayed snapshot log entries.</p> <p>Table "RLOG_<name>" (snapshot log) is created on the publisher and on subscribers. "RLOG_<name>" table and replicated table itself are attached from subscribers as "<qual>"."<dsn>"."RLOG_<name>" and "<qual>"."<dsn>"."<name>" respectively. Other objects created by bi-directional snapshot replication on the publisher and on subscribers are:</p> <ul> <li> <p> <strong>Virtuoso</strong> </p> <p>Triggers "<qual>_<owner>_<name>_I_log", "<qual>_<owner>_<name>_U_log" and "<qual>_<owner>_<name>_D_log" on replicated table. </p> </li> <li> <p> <strong>SQL Server</strong> </p> <p>Triggers "<name>_I_log", "<name>_U_log" and "<name>_D_log" on replicated table.</p> </li> <li> <p> <strong>Oracle</strong> </p> <p>Triggers "<name>_I_log", "<name>_IR_log", "<name>_U_log", "<name>_UR_log", "<name>_UD_log", "<name>_UDR_log", and "<name>_D_log" on replicated table.</p> <p>Function OPL_GETUTCDATE().</p> <p>Global temporary table "DLOG_<name>".</p> </li> <li> <p> <strong>DB2</strong> </p> <p>Triggers "<name>_I", "<name>_U", "<name>_UD" and "<name>_D" on replicated table.</p> <p>Sequence opl_seq_rowguid.</p> </li> </ul> <br /> <a name="replSYSTABLES" /> <h3>11.2.10. Replication system tables</h3> <a name="replSYS_SNAPSHOT" /> <h4>11.2.10.1. SYS_SNAPSHOT</h4> <div> <pre class="programlisting"> CREATE TABLE SYS_SNAPSHOT ( SN_NAME VARCHAR(255), SN_QUERY VARCHAR(1024), SN_LAST_TS VARBINARY(30), SN_IS_INCREMENTAL INTEGER, SN_SOURCE_TABLE VARCHAR(255), PRIMARY KEY (SN_NAME)) </pre> </div> <p> This table describes all defined snapshot replication relations. SN_NAME is the snapshot's name (destination table name). SN_QUERY is the query to get the data from. SN_LAST_TS is the time of the last update reflected for incremental snapshots (from the snapshot log's SNAPTIME). SN_IS_INCREMENTAL is null for nonincremental snapshots and non-null for the incremental ones. SN_SOURCE_TABLE is the source table for the incremental snapshots, NULL otherwise. </p> <p> This implies that an incremental snapshot can be updated in two ways - incrementally and nonincrementally as the nonincremental definition is a subset of the incremental definition. </p> <br /> <a name="replSYS_SNAPSHOT" /> <h4>11.2.10.2. SYS_SNAPSHOT_LOG</h4> <div> <pre class="programlisting"> CREATE TABLE SYS_SNAPSHOT_LOG ( SNL_SOURCE varchar (320) NOT NULL, SNL_RLOG varchar (320) NOT NULL, SNL_RPLOG varchar (320) NOT NULL, primary key (SNL_SOURCE) ) </pre> </div> <p> This table describes all defined snapshot replication relation logs. SNL_SOURCE is the base table for which a snapshot log is defined. SNL_RLOG is the name of the RLOG snapshot log auxiliary table. SNL_RPLOG is the name of the RPLOG snapshot log auxiliary table. </p> <br /> <br /> <a name="SNLOG" /> <h3>11.2.11. Table snapshot logs</h3> <p> In order to be able to create incremental snapshots on a table there should be a snapshot log table defined in the following fashion: </p> <div> <pre class="programlisting"> create table RLOG_sourcetable ( RLOG_sourcetable_primarykey_parts, ... SNAPTIME DATETIME, DMLTYPE CHAR(1), RLOG_ROWGUID VARCHAR(255), PRIMARY KEY (RLOG_*)); </pre> </div> <p> The name of the snapshot log table is constructed from the name of the source table by prefixing it with RLOG_ . </p> <p> Snapshot log table contains all the primary key parts of the source table prefixed with RLOG_ . </p> <p> Snapshot log table contains at most one row per updated source table row with the update time and update type in DMLTYPE (I - for insert, U - for update, D - for delete). </p> <p> Snapshot log table contains RLOG_ROWGUID column which uniquely identifies this snapshow log row. </p> <p>An updating procedure for incremental snapshot or bi-directional snapshot (table case) reads records from snapshot log table, ordered by SNAPTIME. Commits are performed each time an updating procedure notices that SNAPTIME has changed. In Virtuoso case (when native table is snapshot-replicated) all snapshot log records with equal SNAPTIME belong to the same transaction and all such records constitute almost a transaction (it is not exactly a transaction because only the last DML operation for each primary key in the source table is recorded in snapshot log table). So in Virtuoso case an updating procedure commits almost per transaction on source table. Snapshot log records are not transaction-bound in heterogeneous case however. Nevertheless, the technique used to determine when to commit described above prevents running out of transaction log or deadlocks even in heterogeneous case.</p> <p>Replay log is created on the server which performs sync (server on which snapshot log was create in ordinary case, publisher in bi-directional snapshot case) to handle correct snapshot log replaying. Replay log is purged periodically using <span class="computeroutput">REPL_PURGE_RPLOGS()</span> function in ordinary case and <span class="computeroutput">REPL_PURGE_URPLOGS()</span> in bi-directional case. The calls to this functions are automatically scheduled when snapshot log or bi-directional snapshot publication is created and there is no need to call them manually. </p> <p> There should be triggers on the source table to update the log like that: </p> <div> <pre class="programlisting"> create trigger xxx_I_log after insert on xxx { insert replacing RLOG_xxx values (xxx_pk, now(), 'I', uuid()); }; create trigger xxx_D_log after delete on xxx { insert replacing RLOG_xxx values (xxx_pk, now(), 'D', uuid()); }; create trigger xxx_U_log after update on xxx { if (OLD.xxx_pk differs from NEW.xxx_pk) { insert replacing RLOG_xxx values (OLD.xxx_pk, now(), 'D', uuid()); insert replacing RLOG_xxx values (NEW.xxx_pk, now(), 'I', uuid()); } else insert replacing RLOG_xxx values (xxx_pk, now(), 'U', uuid()); }; </pre> </div> <p> where xxx is the source table name, RLOG_xxx is the snapshot log table name, xxx_pk is a comma separated list of the source table's primary key parts. The names of the triggers are constructed from the fully qualified name (DB.DBA.Orders makes DB_DBA_Orders_I_log). </p> <p>Table snapshot logs and triggers are created automatically for Virtuoso and databases listed in <a href="SNAPSHOT.html#snpheter">Heterogeneous Snapshot Replication</a> section. </p> <br /> <table border="0" width="90%" id="navbarbottom"> <tr> <td align="left" width="33%"> <a href="replintro.html" title="Introduction">Previous</a> <br />Introduction</td> <td align="center" width="34%"> <a href="repl.html">Chapter Contents</a> </td> <td align="right" width="33%"> <a href="proctransrepl.html" title="Transactional Replication ">Next</a> <br />Transactional Replication </td> </tr> </table> </div> <div id="footer"> <div>Copyright© 1999 - 2009 OpenLink Software All rights reserved.</div> <div id="validation"> <a href="http://validator.w3.org/check/referer"> <img src="http://www.w3.org/Icons/valid-xhtml10" alt="Valid XHTML 1.0!" height="31" width="88" /> </a> <a href="http://jigsaw.w3.org/css-validator/"> <img src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!" height="31" width="88" /> </a> </div> </div> </body> </html>