Part 1 – Upload Synchronization where the Client and Server Primary Keys are different


Intrigued by this post Strategies for retrofitting existing databases with sync in the Sync Forums, I decided to write a short test if this could be done.

This is not an uncommon scenario when retrofitting existing databases for synchronization where a lot of times we don’t have the flexibility to re-design the table schemas to make them work with Sync Framework. And this is further complicated by the ease of use of using Identity columns as Primary Keys (mostly with the same seed value). However, when the eventual aggregation/consolidation of multiple client databases into a central store is not taken into consideration, this results to primary key collisions in the central store.

To summarize the scenario in the forum post, there are multiple client databases whose primary key is based on an identity column. Naturally, when this multiple client databases are aggregated/consolidated to a central store, the PK values will collide. As a workaround, a composite PK is implemented on the central table not only to guarantee uniqueness but to identify which client a particular row has come from.

So let me share how to go about doing an Upload only scenario where the Client and Server have different PKs and address the scenario being asked in the forum post above.

Here’s the SQL script to represent the client table:

Code Snippet
  1. SET ANSI_NULLS ON
  2. GO
  3.  
  4. SET QUOTED_IDENTIFIER ON
  5. GO
  6.  
  7. SET ANSI_PADDING ON
  8. GO
  9.  
  10. CREATE TABLE [dbo].[testtable](
  11.     [id] [int] IDENTITY(1,1) NOT NULL,
  12.     [testcolumn] [varchar](50) NULL
  13. CONSTRAINT [PK_testtable] PRIMARY KEY CLUSTERED
  14. (
  15.     [id] ASC
  16.   )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS= ON) ON [PRIMARY]
  17. ) ON [PRIMARY]
  18.  
  19. GO
  20.  
  21. SET ANSI_PADDING OFF
  22. GO

Note the Identity column used as the Primary Key

Here’s the SQL script to represent the table where the different client databases will be uploaded:

Code Snippet
  1. SET ANSI_NULLS ON
  2. GO
  3.  
  4. SET QUOTED_IDENTIFIER ON
  5. GO
  6.  
  7. SET ANSI_PADDING ON
  8. GO
  9.  
  10. CREATE TABLE [dbo].[TestTable](
  11.     [ClientId] [uniqueidentifier] NOT NULL,
  12.     [Id] [int] NOT NULL,
  13.     [testcolumn] [varchar](50) NULL
  14. CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
  15. (
  16.     [ClientId] ASC,
  17.     [Id] ASC
  18.   )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS= ON) ON [PRIMARY]
  19. ) ON [PRIMARY]
  20.  
  21. GO
  22.  
  23. SET ANSI_PADDING OFF
  24. GO

Note the composite Primary Key which is based on the client’s original Id column and a ClientId GUID to uniquely identify each client.

Let’s provision the  client:

Code Snippet
  1. //setup the client connection
  2. var clientConn = new SqlConnection("Data Source=localhost; Initial Catalog=Client1; Integrated Security=True");
  3.  
  4. // define a new scope named TestScope
  5. var clientScope = new DbSyncScopeDescription("TestScope");
  6.  
  7. // get the description of the TestTable table from the client database
  8. var clientTable = SqlSyncDescriptionBuilder.GetDescriptionForTable("TestTable", clientConn);
  9.  
  10. // add the table description to the sync scope definition
  11. clientScope.Tables.Add(clientTable);
  12.  
  13. // create provisioning scope based on the scope definition
  14. var clientProvision = new SqlSyncScopeProvisioning(clientConn, clientScope);
  15.  
  16. if (!clientProvision.ScopeExists("TestScope"))
  17. {
  18.     // skipping the creation of table since table already exists on client
  19.     clientProvision.SetCreateTableDefault(DbSyncCreationOption.Skip);
  20.                 
  21.     //apply the scope definition
  22.     clientProvision.Apply();
  23. }

Let’s provision the server:

Code Snippet
  1. //setup the server connection
  2. var serverConn = new SqlConnection("Data Source=localhost; Initial Catalog=Aggregator; Integrated Security=True");
  3.             
  4. // define a new scope named TestScope
  5. var scopeDesc = new DbSyncScopeDescription("TestScope");
  6.  
  7. // get the description of the TestTable table from the server database
  8. var tableDesc = SqlSyncDescriptionBuilder.GetDescriptionForTable("TestTable", serverConn);
  9.             
  10. // add the table description to the sync scope definition
  11. scopeDesc.Tables.Add(tableDesc);
  12.             
  13. // create provisioning scope based on the scope definition
  14. var serverProvision = new SqlSyncScopeProvisioning(serverConn, scopeDesc);
  15.  
  16. if (!serverProvision.ScopeExists("TestScope"))
  17. {
  18.     // skipping the creation of table since table already exists on server
  19.     serverProvision.SetCreateTableDefault(DbSyncCreationOption.Skip);
  20.  
  21.     //apply the scope definition
  22.     serverProvision.Apply();
  23. }

The above provisioning script actually looks exactly the same except that they’re ran against different connections and the client and server GetDescriptionForTable calls will retrieve different table structures and PKs.

Since the client PK is only based on the Id column and the server has a composite key based on ClientId and Id, we need to inject the ClientId before the row get’s transmitted to the server.

The server provider needs both ClientId and Id to determine if it already has those rows.

So let’s try to inject a ClientId based on some arbitrary value (you can retrieve the ClientId whichever way you want, but for purposes of this test, we just hardcode a specific GUID).

Let’s subscribe to the Client’s (LocalProvider)  ChangesSelected event:

Code Snippet
  1. static void ClientChangesSelected(object sender, DbChangesSelectedEventArgs e)
  2. {
  3.     //let's check if we're synching the table we're interested
  4.     if (e.Context.DataSet.Tables.Contains("testtable"))
  5.     {
  6.         var dataTable = e.Context.DataSet.Tables["testtable"];
  7.                 
  8.         //let's add the new column
  9.         dataTable.Columns.Add(new DataColumn("ClientId"));
  10.  
  11.         //let's assign the client guid
  12.         var clientId = "89198182-24C9-4CD7-8427-F6790270E62C";
  13.  
  14.         foreach (DataRow row in dataTable.Rows)
  15.         {
  16.             // if it's an update or an insert, just assign the clientId
  17.             if (row.RowState == DataRowState.Modified || row.RowState == DataRowState.Added)
  18.             {
  19.                 row["ClientId"] = clientId;
  20.             }
  21.             // if it's a delete, undo the delete status,
  22.             // assign the guid, accept the changes for the quid,
  23.             // then revert back the row as deleted
  24.             else if (row.RowState == DataRowState.Deleted)
  25.             {
  26.                 row.RejectChanges();
  27.                 row["ClientId"] = clientId;
  28.                 row.AcceptChanges();
  29.                 row.Delete();
  30.             }
  31.         }
  32.                 
  33.     }
  34. }

To expound on the code above, what it essentially does is add the ClientId column to the DataTable, then loop thru each row in the DataTable to assign the ClientId value.

However, for Deletes, we cannot simply assign the ClientId value since you cannot access the row and assign a value to any of its columns when it’s flagged as deleted. So we need to undo the Deleted status by calling RejectChanges. We then assign the value and call AcceptChanges to set the ClientId value. Then we revert back the row as Deleted.

That’s it, the client should be able to synchronize/upload its changes to the server.

Code Snippet
  1. // create the sync orchestrator
  2. var syncOrchestrator = new SyncOrchestrator();
  3.  
  4. // set local provider of orchestrator
  5. syncOrchestrator.LocalProvider = new SqlSyncProvider("TestScope", clientConn);
  6.  
  7. // set the remote provider of orchestrator
  8. syncOrchestrator.RemoteProvider = new SqlSyncProvider("TestScope", serverConn);
  9.  
  10. // set the direction of sync session to Upload and Download
  11. syncOrchestrator.Direction = SyncDirectionOrder.UploadAndDownload;
  12.  
  13. // subscribe for errors that occur when applying changes to the client
  14. ((SqlSyncProvider)syncOrchestrator.LocalProvider).ChangesSelected += ClientChangesSelected;
  15. ((SqlSyncProvider)syncOrchestrator.RemoteProvider).ChangesSelected += ServerChangesSelected;
  16.            
  17. // execute the synchronization process
  18. var syncStats = syncOrchestrator.Synchronize();
  19.  
  20. // print statistics
  21. Console.WriteLine("Start Time: " + syncStats.SyncStartTime);
  22. Console.WriteLine("Total Changes Uploaded: " + syncStats.UploadChangesTotal);
  23. Console.WriteLine("Total Changes Downloaded: " + syncStats.DownloadChangesTotal);
  24. Console.WriteLine("Complete Time: " + syncStats.SyncEndTime);
  25. Console.WriteLine(String.Empty);
  26. Console.ReadKey();

Give it a go and let me know how it goes.

32 comments

  1. […] Idle Thoughts ← Part 1 – Upload Synchronization where the Client and Server Primary Keys are diffe… […]

  2. Can I match server rows not by primary key, but just by pair of columns?
    I still want to have on server primary key as Identity, e.g.
    CREATE TABLE [dbo].[testtable](
    [id] [int] IDENTITY(1,1) NOT NULL,–Server Identity ID
    [ClientId] [uniqueidentifier] NULL,
    [IdFromClient] [int] NULL,
    [testcolumn] [varchar](50) NULL
    )
    and pair ClientId, [IdFromClient] should be used to match.
    I also can’t put uniqueIndex on this pair, because both these columns can be null.

    1. if pair ClientId, IdFromClient can be null, how do you find the matching record in the server? will you update everything with a null pair ClientId, IdFromClient pair?
      the actual matching of PK is not just limited to the SQL used by Sync Framework, it is also used to keep track of what was synched in the “sync knowledge”. in your case, the sync knowledge knows about the identity column PK but not the pair ClientId, IdFromClient pair.
      When it does a select of what has changed, Sync Framework will examine its knowledge if the rows selected by the selectchanges has already been synched and it will use the identity column PK to find the matching entry in the knowledge.

      let me know if got your question right.

      cheers,

      junet

  3. Rich Thurman · · Reply

    How do you deal with Identity cloumns that are on the server but are not part of a primary key? I have a situation where the server has an identity column but its not part of the primary key. I tried doing reverse of what was suggested in this post, or rather just removing the identity column before it gets to the client.

    However, why I try to provision the server I get this error: “Cannot update identity column ‘ID’.

    ID is my identity column on the server and the client really doesnt care about it, but anytime a client creates a new row and syncs it up to the server, the server needs to keep track with an int value.

    I found this thread, but no answer is clear:

    http://social.msdn.microsoft.com/Forums/en-NZ/sqlazurelabssupport/thread/b0843084-70f2-4470-9a29-c65e3d83baa8#x_mce_temp_url#

    Do you have any suggestions?

    1. the error you’re getting is not actually specific to Sync Framework. by default SQL Server doesnt allow you to insert a value or update a column tagged as an identiy column unless you explicitly allow it using SET IDENTITY_INSERT. In the case of Sync Framework, it’s treating the Identity column just like any other column that is not a PK and is including it in the insert/update SPs. you can directly edit the insert/update SPs in the server to exclude the identity column or exclude the column during the provisioning on the server.

      let me know if i didnt get you scenario right.

      cheers,

      junet

      1. Felix Fuentes · ·

        I have a situation where I need to the data in the Identity column so I cant remove it altogether. Also, I can’t edit the Insert/Update SPs because they are not created yet . I am seeing the “Cannot update identity column ID” during the provision process. I get the error at serverTemplate.Apply() :

        private static void ProvisionDatabase()
        {
        //Step 1.) Provision database for all clients
        SqlConnection serverConn = new SqlConnection(@”Data Source=R9MT8C6-30338-L\SQLSERVER2008R2;Initial Catalog=SyncDB;Integrated Security=True”);

        String scopeName = “TestScope”;
        DbSyncScopeDescription scopeDesc = new DbSyncScopeDescription(scopeName);
        scopeDesc.UserComment = “Template for Orders and Products table”;

        DbSyncTableDescription productTableDesc = SqlSyncDescriptionBuilder.GetDescriptionForTable(“Products”, serverConn);
        scopeDesc.Tables.Add(productTableDesc);

        scopeDesc.Tables[“Products”].Columns[“GUID”].IsPrimaryKey = true;

        scopeDesc.Tables[“Products”].Columns[“ID”].IsPrimaryKey = false;
        scopeDesc.Tables[“Products”].Columns[“ID”].IsNullable = true;

        DbSyncTableDescription ordertableDesc = SqlSyncDescriptionBuilder.GetDescriptionForTable(“Orders”, serverConn);
        scopeDesc.Tables.Add(ordertableDesc);

        SqlSyncScopeProvisioning serverTemplate = new SqlSyncScopeProvisioning (serverConn, scopeDesc, SqlSyncScopeProvisioningType.Template);

        serverTemplate.Tables[“Products”].AddFilterColumn(“GUID”);
        serverTemplate.Tables[“Products”].FilterClause = “([side].[GUID] = @GUID)”;

        SqlParameter param = new SqlParameter(“@GUID”, SqlDbType.UniqueIdentifier);
        serverTemplate.Tables[“Products”].FilterParameters.Add(param);

        serverTemplate.SetCreateProceduresDefault(DbSyncCreationOption.Create);

        if (!serverTemplate.TemplateExists(scopeName))
        {
        serverTemplate.Apply();
        }
        }

        I also set the column to allow nulls so when I insert a record, Sync will auto update the server db with the next seed value. Suggestions?
        -F

      2. try running SQL Profiler to figure out he offending statement. Provisioning is not supposed to be updating an identity column and am not entirely sure what would cause the error that you got.

      3. good to hear ti worked.

  4. Excellent Post.

    In my case I have a database where the majority of the PK’s are identity autoincrementing ints.

    we are now in a position were we need to make this data available in a distributed environment (i.e mobile devices, external software etc) and of course kept in sync.

    As I cant change from using the current PK sync I was thinking that i could add a GUID to the columns which would be used as sync.

    on the client I have complete control over how that database structure is created so I was thinking there i could have the GUID as the identifying key
    on first sync the server would send across the required data (and GUID).

    if the client wants to do inserts they can do and then sync (sending across GUID) – although server would auto increment the PK (but we can still match on the GUID for future syncs).

    Does this sound like the right solution and if so can the Microsoft Sync framework handle this task?

    1. it should be possible. when you provision the server, exclude the Identity column and specify that GUID is the PK via Sync Fx. So your physical table still has the Identity column but as far as Sync Fx is concerned, it thinks the Guid is the PK.

      as for the client, you would just provision it as it is with Guid as the PK.

      If you’re not in a hurry, i’ll do blog post after office hours.

      1. That would be fantastic if you could do a post

        sounds great if its possible using that method though!

  5. […] have previously blogged about synching tables with different Primary keys (see: Part 1 – Upload Synchronization where the Client and Server Primary Keys are different  and Part 2 – Bidirectional Synchronization where the Client and Server Primary Keys are […]

  6. How to sync between two tables having only one common column.

  7. Is it possible to sync between MS SQL server and Pervasive database.
    In my case the Source is MS sql server and the destination is Pervasive.
    I want to sync from both the side.
    Please let me know if this is possible.

    Thanks

    1. Sync Framework doesnt come with a sync provider for Pervasive database so you will have to build your own custom database sync provider.

  8. Is it a requirement to have a uniqueidentifier column for PK in Sync Framework?
    Tables in our existing centralDB use the ClientID as int, This ClientID is getting from Clients table, where the actual ID is an identity column. Can we re-use this instead of creating a new uniqueidentifier column together with new [Id] [int] NOT NULL column?

    It would look like this:

    CREATE TABLE [dbo].[TestTable](
    [ClientId] [int] NOT NULL,
    [Id] [int] NOT NULL,
    [testcolumn] [varchar](50) NULL
    CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
    (
    [ClientId] ASC,
    [Id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS= ON) ON [PRIMARY]
    ) ON [PRIMARY]

    1. it’s not a requirement to use a GUID. It’s a requirement though to make sure that PKs is unique in the sync community. you don’t two replicas generating the same PK as that will cause a PK conflict on the server during insert and will be raised as a conflict.

  9. I am trying to use this scenario (Client and Server have different PKs) with Parameter-based Filtered Templates in N-Tier (hub – spoke).

    I am following N-Tier sample in http://code.msdn.microsoft.com/windowsdesktop/Database-SyncSQL-Server-e97d1208

    Background:
    Server: SQL Server 2008 R2
    Clients: SQL Express 2008 and SQL Express LocalDB
    Client should get only relevant data filtered based on ClientID.

    How can we apply this scenario to remove clientId before Synchronize in N-Tier system?

    There is no ChangeSelected event in ((SqlSyncProviderProxy)syncOrchestrator.RemoteProvider)..

    As in your example if it is direct connection we can use this
    ((SqlSyncProvider)syncOrchestrator.RemoteProvider).ChangesSelected += new EventHandler(ServerChangesSelected);

    But with remote proxy it requires type SqlSyncProviderProxy.

    We can’t use something like this:
    ((SqlSyncProviderProxy)syncOrchestrator.RemoteProvider).ChangesSelected += new EventHandler(Server.ChangesSelected);

    Can we do this on server side ?

    1. what you have on the client side is just a proxy. you will have to catch the ChangesSelected event on the WCF service side. besides, i dont think you want to server side to be firing an event on the client side over WCF.

      1. Thanks for quick reply. I figured it out. And now it is working well. Your other blogs regarding templates and filtering also helped lot. Thanks again.

  10. I don’t have relations between the columns of two table like if “id” column in students and “studentId” column in student subjects and i have about more the 150 table to sync give me any advice because its very difficult to make relation between them all.

    1. not sure exactly what help you’re asking here. are you asking how create the relationships? sync tables with relationships? etc…? btw, i deleted your duplicate comments on my other posts. no need to post the same question over and over again.

      1. in your given code all table have primary and foreign key relation defined but the problem i am facing is i am unable to create those relation.

      2. Sorry, but I still don’t follow. What’s stopping you creating the relations?

  11. Hi , I have a scenario Upload only.
    client has primary key as recordId , server has recordId and clientId as primary key. All the ids are Guid.
    the synchonization works when there are no reference conflict errors. how ever , when delete changes failed to apply because of reference conflict . it will not pick up in the next synchonization and the change become discarded. this does not happen if I remove the clientId from server primary key, the failed changes retried in the next sync.

    Any suggestions appreciated.
    Thanks

    1. how are you resolving the conflict? you can try setting it to RetryOnNextSync

      1. Hi JuneT
        I have already set the RetryOnNextSync but it does not work. right now I have added another scopes only for Delete , but it does not work very well. I saw somewhere that RetryNextSync only work on 2 tier model ? I’m using the 3 tier sync with wcf, so im not sure if that is the case. but it worked before I include the clientId to the primary key even without setting RetryNextSync it just pick it on next sync.

      2. RetryNextSync should work 2-tier or n-tier. Sync Fx is not actually aware of whether it’s running N-tier or not. You should enable verbose logging so you get to see what’s going on.

      3. Hi June,
        Is it because I have clientId column on server as filter column ? The verbose tracing show the action RetryNextSync but it does not retry . If i don’t include the clientId in primary keys it works.
        The problem started after I added the clientId in to server primary keys.

  12. and what’s the actual error/conflict you’re getting on ApplyChangeFailed event? Sync Fx do not detect or sync schema changes and your PK change may have broken the sync.

    1. The Error is delete statement conflict with reference constraint. why would sync break after change PK?. I did deprovisioned and re provision templates and scopes after making schema changes.

Leave a reply to jtabadero Cancel reply