| |
|
Easy Access
Replication: Part III Conflict Resolution and Error Atonement
The previous installments of this three-part series examined how to perform replication tasks - such as creating replicas and synchronizing databases - by using VBA code, which is quite straightforward. Now let's look at possible conflicts and data errors, and how to handle them - again, using VBA.
Conflicts and Errors Replicated databases introduce a new problem: What happens when users at different replicas simultaneously make incompatible modifications to the data or design? There are two broad categories of replication problems:
Understanding Conflicts Consider the simple case of two users simultaneously updating the same record in different replicas. (Of course, in a traditional, non-replicated database, the locking mechanism will prevent such a problem.) Each record is uniquely identified by the ReplicaID in the s_GUID field, which is a system field - one the user can't modify.
When the replicas try to synchronize, Access recognizes a conflict. The "winner" is the record that has been changed more times. (A change occurs when the record is committed.) In the event of a tie, the replica with the lowest ReplicaID prevails. However, Access automatically allocates the ReplicaID when the replica database is created. The ReplicaID can't be set or subsequently modified by the user; it's best described as "randomly deterministic." Although the assignment of ReplicaID is random, the choice of a conflict winner isn't.
The winning record is updated at both replicas. The "losing" replica creates, with the same design, a new table named TableName_Conflict, and inserts the losing record.
There's one exception to this data-conflict rule: Delete always "wins." For example, if one replica modifies a record that another replica deletes, the record will be deleted in both replicas.
When Access users open the table, they're informed that a conflict has occurred, and that the replica "lost" the conflict. ("Winning" users are never told that a conflict occurred.) The users are asked whether they want to resolve the conflict immediately. If so, the Access Conflict Resolver is called, and simply displays the "winning" and "losing" records side by side. Users have two choices:
Synchronization between replicas occurs on a record-by-record basis. Therefore, a conflict is reported on a single record in a single table. A group of records can't be tied together in an integral transaction during a synchronization. This means that it's possible for a database transaction at one replica, which modified perhaps six related records, to apply only five of the records with another replica during a synchronization. The sixth record isn't applied immediately, because it "lost" a conflict.
Using the Access Conflict Resolver Let's configure a sample database to demonstrate the Access Conflict Resolver. In Access, create a simple database called db3.mdb, with one table and one field:
Table = 'Customers' Field = 'Name', of type Text No primary key
Follow these steps:
2) From the Tools menu, choose Replication | Create Replica to make the database replicable and to replicate it. Call the replica Replica of db3.mdb. The original database is now the Design Master for the replica set. 3) In the Design Master (db3.mdb), edit the first record in the Customers table so that it now reads "Thomas." Close the Design Master. 4) Open Replica of db3.mdb, and edit the first record in the Customers table so that it now reads "Tommy". 5) Synchronize the replicas. (From the Tools menu choose Replication | Synchronize Now; then choose OK at any prompts.) A conflict will have occurred in one of the two replicas. It's impossible to predetermine which replica will "win" and which will "lose." You may need to reopen each replica; Access will prompt you that a conflict has occurred (see FIGURE 2). The replica with the conflict will have a new table named Customers_Conflict.
If you didn't receive a dialog box warning that a conflict had occurred, do the following:
6) Click Yes to resolve the conflicts now. 7) You're informed that there's one conflict in the Customers table. Click the Resolve Conflict button. The existing (winning) record is displayed next to the conflicting (losing) record (see FIGURE 3).
If you click the Keep Existing Record button, the conflict record will be deleted.
If you click the Overwrite with Conflict Record button, the losing record will update the database exactly as if the user had committed a regular update, by re-entering the losing data. At the next synchronization, the "new" record will be inserted at the remote replica.
When you resolve the conflict, the Customers_Conflict table is deleted.
Alternative Methods You may be tempted to replace the Access Conflict Resolver with a conflict resolver that better meets your business needs. For example, you may want to resolve conflicts so the most recent update automatically wins. You should consider a number of issues before doing this:
Care must be taken in selecting a conflict-resolution algorithm:
Your algorithm must not only deal with data conflicts, but also with any data and design errors (covered later in this article) that may occur during synchronizations. It's not prudent to simply register your new function to deal exclusively with data conflicts, and expect the Access Conflict Resolver to deal with the rest.
Only experienced users should try to resolve conflicts themselves, rather than let Access handle them. You should estimate at least 500 lines of VBA code to replace the Access Conflict Resolver. If your replacement doesn't correctly handle all the conflict and error conditions described in the following sections, your replicas may get corrupted and never converge.
A conflict-identifying algorithm. The following algorithm identifies replica tables that have outstanding conflicts. It takes the name of a replica and prints out any tables with conflicts.
' Find tables in conflict. Sub Resolve(strDBName As String)
Dim dbs As Database Dim tdf As TableDef
Set dbs = OpenDatabase(strDBName)
For Each tdf In dbs.TableDefs If (tdf.ConflictTable <> "") Then Debug.Print tdf.Name & " had a conflict" End If Next tdf
End Sub
The last-update-wins algorithm. The following algorithm is designed to work with replicas that, in each row, have a special Date/Time field (called Date) that records the time an update was made. It makes no allowances for the time zone of each replica (such as GMT or PST).
This is the pseudo-code for the algorithm:
Open database For Each table If there is a conflict table For each conflict row GUID == base data row GUID If conflict has a later date/time stamp field Then replace base row with conflict row delete conflict table row End If Next conflict table row End If Next table
Listing One shows the actual code.
Getting to Know Replication Errors Most users will only encounter data conflicts. However, replication errors can cause data inconsistencies between replicas. Each of the four categories of errors - locking errors, unique-key errors, referential-integrity errors, and table-level validation errors - require specific resolution.
Locking errors. These are the easiest to understand, and usually fix themselves without user intervention. They occur when one replica tries to synchronize with another, but the exchange fails because of a regular multi-user locking conflict at one of the replicas. The exchange tries a specified number of times to get a lock; if unsuccessful, it returns an error. However, if one of the replicas is opened exclusively, any attempt to get a lock will fail.
The solution is to ensure that the replicas won't always create locking failures - either by making sure there's no exclusive opening of the database, or by synchronizing at night, when there are no other users. If the synchronization is successful, the previous locking-error record will be deleted from the database.
Unique-key errors. These can occur when a unique-key constraint is placed on a table field. In a regular, non-replicated database, an attempt to enter two records with the same value - into a field defined as accepting only unique values - will fail. In a replicated database, it's quite possible to enter - in separate replicas - two records with, paradoxically, identical unique-key values. When an exchange is attempted, the uniqueness rule will reject each entry.
For example, consider a table with a unique index on order number. There are two replicas: one in Redmond and one in London. The Redmond replica enters an order and gives it the number 123. The London replica simultaneously enters a different order, and also gives it an order number of 123. Then the replicas synchronize. A unique-key synchronization error occurs (see FIGURE 4).
The attempt to insert the Redmond order into the London replica is rejected, because there's already an order with the number 123; therefore, the unique-key constraint rejects the second order. Similarly, the attempt to insert the London order into the Redmond replica is rejected, because there's already an order with the number 123. Hence, the unique-key constraint rejects the second order also.
The two replicas now have inconsistent data. The solution is to delete one of the records and re-enter the data using a new order number, so the uniqueness rule isn't violated.
Referential-integrity synchronization errors. Another source of errors can occur when there's an enforced relationship between tables, and cascading updates and deletes aren't enabled. Consider a simple database with two tables: Customers and Orders. Cust# is the primary key in the Customers table, and the foreign key in the Orders table. Referential integrity is enforced, and cascades aren't switched on.
Create a record in the Customers table with Cust# 123. Now create two replicas - one in London and one in Redmond. In the London replica, add a new record in the Orders table, with the foreign key Cust# 123. In the Redmond replica's Customers table, delete the record with Cust# 123. Now synchronize. The situation shown in FIGURE 5 occurs.
The Redmond replica tries to enter the new order received from London, but fails because the primary key no longer exists. The London replica tries to delete the Customers record, but fails because a related order exists. The solution is to delete the foreign-key record (the entry in the London replica's Orders table) and synchronize twice again. The first synchronization removes the error caused by the foreign key; the second removes the error caused by the primary key.
You can avoid this type of error by careful application design. If cascades for updates and deletes are always selected whenever referential integrity is enforced, many potential errors will never occur, and other potential errors will be resolved automatically after the subsequent synchronization. Whether errors are avoided or reported depends on the direction of the exchange.
For example, consider the earlier London/Redmond example, where London had deleted a Customers record (with the primary key) for which Redmond had entered a customer-order record (with the foreign key). Referential integrity is enforced, and cascading deletes are selected. If the synchronization first sends the foreign-key customer-order record from Redmond to London, an error is reported, because a customer record doesn't exist for that customer order. Simultaneously, the primary-key Customers-record delete is sent from London to Redmond, and succeeds. Cascades will delete the foreign-key customer-order record in the Redmond replica. The next synchronization sends the deleted customer-order record from Redmond to London, automatically removing the previous error.
Table-level validation errors. Table-level validation (TLV) errors can occur when a validation rule isn't applied to existing data. Suppose you have two replicas - one in London and one in Redmond. In the Redmond Design Master replica, you apply a rule that all employees must be age 18 or older. However, the London replica has simultaneously added an employee of age 17. Now the replicas exchange. The new London employee data is rejected at the Redmond Design Master replica.
The way to prevent such errors is to synchronize with all replicas immediately before applying a TLV rule, and again immediately after. If an error does occur and you want to insert the errant data, the rule must be removed, the synchronization retried, and the TLV rule reapplied.
Conclusion You may not run into all the errors discussed in this article when working with replication. But foreknowledge is crucial in making sure they don't occur, or at least being ready when they do.
This series has covered a lot of material about replication and Access. Now get in there - if you haven't already - and start adding replication functionality to your applications. Have fun!
This article was partially excerpted from F. Scott Barker's book, Access 97 Power Programming [QUE, 1997]. The author wishes to thank Microsoft's Tony Poll (known as "Replication Man").
F. Scott Barker is a former member of the Microsoft Access and FoxPro development teams, and now does contract development at Microsoft and other companies. Scott has trained for three years for Application Developers Training Company, and also is a regular speaker at Office conferences both national and international, including European Tech Ed 96. Besides having written for various Office magazines, including Smart Access and Access/Visual Basic Advisor, Scott is also the author of several books, including User Friendly Using Microsoft Access 97 [QUE, 1997]. He can be reached at mailto:FSBarker@MSN.Com.
Listing One - The Last-Update-Wins Algorithm Private Sub Conflicts_Click()
' Resolve conflicts for DB. Dim dbs As DATABASE Dim tdf As TableDef Dim rstWin As Recordset ' Create record set for conflict winner. Dim rstConflict As Recordset ' Create record set for conflict Loser. Dim fld As Field
' Open the database and tables. Set dbs = CurrentDb()
For Each tdf In dbs.TableDefs ' Does an associated conflict table exist? If (tdf.ConflictTable <> "") Then ' Get the records from the conflict table. Set rstConflict = dbs.OpenRecordset( _ "SELECT * FROM " & tdf.ConflictTable & _ " ORDER BY s_GUID")
' Get the records from the base table. Set rstWin = dbs.OpenRecordset("SELECT * FROM " & _ tdf.Name & " ORDER BY s_GUID")
' Process each record, starting at the first. rstWin.MoveFirst rstConflict.MoveFirst
While Not rstConflict.EOF ' If date on the conflict row > date on winning ' row, then resubmit the conflict row. If rstConflict("s_GUID") = rstWin("s_GUID") Then If rstConflict("Date") > rstWin("Date") Then rstWin.Edit For Each fld In rstConflict.Fields rstWin(fld.Name) = _ rstConflict(fld.Name) Next fld rstWin.UPDATE End If
' Remove conflicting record and clean up. rstConflict.Delete rstConflict.MoveNext End If rstWin.MoveNext Wend
rstWin.Close rstConflict.Close End If
Next
End Sub End Listing One
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| |||||||