A Sitecore Upgrade Devlog - Part III: Stop content tests, update XM and xDB databases, clean up content databases

This is the third part of a devlog I started weeks ago. Today, I am covering the upgrade of the Sitecore databases, what went wrong and how we fixed that.

Disclaimer: This series of blog posts do not release you from reading the Upgrade Guide. There is quite a lot of material that I do not cover here.

The process of upgrading databases has changed in the version 10.1.0. In the past, you had to install a Sitecore package and hope for the best. Now, you simply turn off your Sitecore instances, run a couple of scripts - some might take a day or two to finish - and voilà!

As I mentioned in the previous blog post, I am upgrading a Sitecore XP + SXA from version 9.0.2 to 10.1.0.

Stop all content tests

As described in the upgrade guide, you have to stop all running content tests. This process is pretty simple, although you can only do it through the application and there is no supported way via script or command line. See Cancel active tests for more information.

Once all tests have been stopped, we now stop the Sitecore environment.

Stop the Sitecore environment

This step has been introduced in Sitecore 10.1.0.

Turn off the application roles like CM or CD by stopping the sites in the IIS Management Console. You can also stop the websites by running the command iisreset /stop.

Also stop any xDB web sites and services. You can stop Sitecore services with the following Powershell commands:

Get-Service *prefix.* | Stop-Service

All the upgrades will be carried out by the SQL scripts and you don't have to run an .update package.

Upgrade the databases

Extract the database upgrade scripts from the Database Upgrade Script.zip file that you downloaded earlier. Also, mostly important, refer to the Upgrade Guide for any caveats.

XM databases

The upgrade guide has a table which gives you what scripts and which order you should run.

To avoid confusion, you can rename the scripts in the file system accordingly:


These scripts are quite straightforward, they run in a couple of minutes and we didn't have any issues.

xDB databases

After finishing the XM databases, now it’s time for the xDB databases. These are the ones that take a long time, and ideally you want to run it in a VM that will not sleep and disconnect you from the SQL Server instance.

If you do not have the option of running these SQL scripts in a VM, you can then use the following Powershell script to keep your computer awake for as long as it needs:

Write-Host "Keeping awake..."
while (1) {
  $wsh = New-Object -ComObject WScript.Shell
  $wsh.SendKeys('+{F15}')
  Start-Sleep -seconds 59
}

Not everything went smoothly as we expected. I describe below some of the issues we encountered.

Issue #1: Reference Data Script Part 1 failed

After running the SXP_referencedata_Part1.sql, it returned the following message:

Migrating definition types...
15 have been migrated.
Migrating definitions...
15165753 definitions have been migrated.
340 definitions cannot be migrated.
Upgrade stopped. Fix all the issues and run this script again.
Upgrade (part 1) has not been completed.

This message is missing the piece "Length of # definition monikers exceeds 300 characters. Consider truncate, change or remove them." - as documented in the Upgrade Guide.

I ran the following query, but no results were returned. That meant, there are no definitions to be migrated despite the script saying otherwise.

SELECT COUNT_BIG(*) FROM [xdb_refdata].[DefinitionMonikers] WHERE LEN([Moniker]) > 300

It turns out, the problem was the in way the Reference Data database was exported from the production environment.

This error occurs because the .bacpac file is not transactional, so new rows written to the database while the .bacpac is being generated will end up corrupting the index. The database either must have no other users connected making writes (not practical in a production environment), or it should be copied to a new database and only then create a .bacpac from the copy.

How to solve that?

  1. Copy the target database, which will return straight away, but the database will take some time to copy. This operation will create a full transactional copy:
CREATE DATABASE <name> AS COPY OF <original_name>

2. Find the status of your copy operation:

SELECT * FROM sys.dm_database_copies

3. Generate a bacpac file on the copied database, which isn't being used by anyone.

4. Delete the copied database, and you'll have a working bacpac file.

More on Unable to restore bacpac due to foreign key conflict.

Run the SXP_referencedata_Part1.sql script again.

Issue #2: Reference Data Script Part 2 failed

Exporting database as a copy, fixed the Reference Data Script Part 1 issue. However, the Reference Data Script Part 2 failed with the following:

Deleting [xdb_refdata].[SaveDefinitions]...
Deleting [xdb_refdata].[SaveDefinitionCultures]...
Deleting [xdb_refdata].[GetDefinitions]...
Deleting [xdb_refdata].[GetByDefinitionTypes]...
Deleting [xdb_refdata].[DeleteDefinitions]...
Altering [xdb_refdata].[DefinitionBatch]...
Altering [xdb_refdata].[DefinitionCriteria]...
Altering [xdb_refdata].[DefinitionCultureBatch]...
Altering [xdb_refdata].[DefinitionCultureResults]...
Altering [xdb_refdata].[DefinitionKeys]...
Altering [xdb_refdata].[DefinitionResults]...
Altering [xdb_refdata].[MonikerDefinitions]...
Creating [xdb_refdata].[DeleteDefinitions]...
Creating [xdb_refdata].[GetByDefinitionTypes]...
Msg 468, Level 16, State 9, Procedure GetByDefinitionTypes, Line 147 [Batch Start Line 272]
Cannot resolve the collation conflict between "Latin1_General_CS_AS" and "SQL_Latin1_General_CP1_CI_AS" in the equal to operation.
Creating [xdb_refdata].[GetDefinitions]...
Msg 468, Level 16, State 9, Procedure GetDefinitions, Line 75 [Batch Start Line 463]
Cannot resolve the collation conflict between "Latin1_General_CS_AS" and "SQL_Latin1_General_CP1_CI_AS" in the equal to operation.
Creating [xdb_refdata].[SaveDefinitionCultures]...
Creating [xdb_refdata].[SaveDefinitions]...
Removing [xdb_refdata].[GetDefinitionID]...
Removing [xdb_refdata].[GetDefinitionMoniker]...
Removing [xdb_refdata].[GetDefinitionMonikerHash]...
Creating [xdb_refdata].[DeleteSingleDefinition]...
Creating [xdb_refdata].[GetSingleDefinition]...
Creating [xdb_refdata].[SaveSingleDefinition]...
Creating [xdb_refdata].[SaveSingleDefinitionCulture]...
Altering [xdb_refdata].[GrantLeastPrivilege]...
Upgrade (part 2) has been done successfully.

According to the Sitecore Support:

It looks like the described issue is caused by bug #351688. As a workaround, they provided us with a modified query script to run in place of SXP_referencedata_Part2.sql.

If you face the same issue, please file a ticket to the Sitecore Support.

Issue #3: Sitecore Xdb Collection Part 2 failed

After we run the Sitecore Xdb Collection Update tool as described in the Appendix B of the Upgrade Guide, we run the SXP_collection_Part2.sql against the Shards databases.

We saw the following exceptions in the results:

Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.ContactFacets_Staging'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.ContactIdentifiers'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.ContactIdentifiers_Staging'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.ContactIdentifiersIndex'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.ContactIdentifiersIndex_Staging'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.Contacts_Staging'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.DeviceProfiles'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.DeviceProfiles_Staging'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.InteractionFacets_Staging'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.Interactions'; column does not allow nulls. UPDATE fails.
Msg 515, Level 16, State 2, Line 151
Cannot insert the value NULL into column 'ShardKey', table 'prefix-Shard0.xdb_collection.Interactions_Staging'; column does not allow nulls. UPDATE fails.

For some records in the tables mentioned above, the ShardKey field was empty. Since we are in the development stage, we deleted those records, but there were thousands of those and we were not sure if this was the right solution.

The Sitecore Xdb Collection Update tool ran successfully and did not list any failed processed items.

How to solve that?

This section of the Upgrade Guide was a bit confusing for us, but here is what we missed:

1) .bacpac files

We had the same issues with the Shards databases due to the way the databases were exported. We had to follow the copying procedure as described in the above before exporting them as .bacpac. Since those databases are constantly being written, the chances of having orphan records and corrupting the indexes are high.

Here are the exceptions we saw when importing the .bacpac files of the Shards databases.

Microsoft.Data.Tools.Diagnostics.Tracer Error: 19 : 2021-03-18T13:03:13 : Retry requested: Retry count = 1. Delay = 00:00:00, SQL Error Code = -2146232060, SQL Error Number = 547, Can retry error = False, Will retry = False System.Data.SqlClient.SqlException (0x80131904): The ALTER TABLE statement conflicted with the FOREIGN KEY constraint "FK_InteractionFacets_Interactions". The conflict occurred in database "prefix-Shard0", table "xdb_collection.Interactions", column 'InteractionId'.

Those exceptions were caused because the database was exported while being written.

2) Running the XConnect Upgrade Tool

I had only run the XConnect Upgrade Tool with the -c contacts arguments. You should run it in sequence with all the arguments.

E.g.

\XConnect Upgrade Tool>.\SitecoreXdbCollectionUpdateTool.exe -s "user id=sa;password=Pass12345;data source=ServerName;Initial Catalog=xp90-shard0-db" -c deviceProfiles
\XConnect Upgrade Tool>.\SitecoreXdbCollectionUpdateTool.exe -s "user id=sa;password=Pass12345;data source=ServerName;Initial Catalog=xp90-shard0-db" -c deviceProfileFacets
\XConnect Upgrade Tool>.\SitecoreXdbCollectionUpdateTool.exe -s "user id=sa;password=Pass12345;data source=ServerName;Initial Catalog=xp90-shard0-db" -c contacts
\XConnect Upgrade Tool>.\SitecoreXdbCollectionUpdateTool.exe -s "user id=sa;password=Pass12345;data source=ServerName;Initial Catalog=xp90-shard0-db" -c contactIdentifierIndexes
\XConnect Upgrade Tool>.\SitecoreXdbCollectionUpdateTool.exe -s "user id=sa;password=Pass12345;data source=ServerName;Initial Catalog=xp90-shard0-db" -c contactPercentile
\XConnect Upgrade Tool>.\SitecoreXdbCollectionUpdateTool.exe -s "user id=sa;password=Pass12345;data source=ServerName;Initial Catalog=xp90-shard0-db" -c interactionPercentile

3) YAGNI

In the end, we didn’t need to run the script SXP_collection_Part2.sql nor run the XConnect Upgrade Tool.

If you do not plan on splitting the shards databases into more databases (default is 2), then you should simply run the SXP_collection_Part1.sql script.

A note from the Sitecore Support Team:

I have clarified this with the product team.
The documentation flow should be corrected as follows:
1) Run SXP_collection_Part1.sql script.
2) If there are no plans to use the Shard Split/Merge tool, one can stop here.
3) If there are plans to use Shard Split/Merge tool, the following actions need to be performed:
3.1) Clear all *_Staging tables.
3.2) Run SitecoreXdbCollectionUpdateTool.exe for all operations.
3.3) Run SXP_collection_Part2.sql script.

Clean up the content databases

The Sitecore.UpdateApp tool is designed to remove the unmodified Sitecore items from the Core, Master, and Web databases. The Sitecore items are not removed from a database if you have modified them.

In the upgrade guide you follow the instructions on how to setup this tool and how to run it. After everything is properly configured, you simply run the following in Command Prompt.

Sitecore.UpdateApp.exe clean

You should see something like the following:

This process ran with no issues.

Conclusion

This ends the third part.

We started with stopping content tests, followed by turning off the Sitecore instances. We went through upgrading the databases and solving some issues along the path.

Coming next, I'll be writing about the next steps where you install and configure Sitecore XP 10.1.0 (no, I'm not writing about how to install Sitecore), upgrading SXA and some third-party modules.

Credits

Photo by Aditya Vyas on Unsplash

comments powered by Disqus