Comparing data across databases

SQL code across databases with different collation sequences

The main issue with collation sequences is when you are comparing table string fields across databases.  

The below example assumes the tempdb database uses a different collation sequence to the current database.

 

Collation example
CREATE TABLE #test
(
  ID        INTEGER,
  Surname   VARCHAR(50),
  Given1    VARCHAR(30)
)
SELECT * FROM #test WHERE Surname = 'rose' -- this will work as we are not comparing with data from a table in the current database
 
-- however if we try
SELECT * FROM #test WHERE Surname IN (SELECT Surname FROM community)
-- this causes a collation conflict as we are comparing string values across databases with different collation sequences

 

The resolution is to specify the collation on each column that will be compared with a field in the source database.

 

Correct collation order handling
CREATE TABLE #test
(
  ID        INTEGER,
  Surname   VARCHAR(50) COLLATE DATABASE_DEFAULT,
  Given1    VARCHAR(30)
)

SELECT * FROM #test WHERE Surname IN (SELECT Surname FROM community) -- now it works

 

Note that you only need to specify collation sequence on fields you are doing cross database comparisons on.  The Given1 field in the example above does not need it as we are only displaying it.

It is also good practice to specify the collation with table variables too as they may utilise the temp database with larger volumes of data.

Note also if the temp table is created by a SELECT INTO there are no collation issues as it copies the table structure including collation sequences.  The issue occurs only when we CREATE and then INSERT.

 

 

Collation example using system tables
-- In this example we are using system tables therefore we cannot change the collation sequence of the table itself
-- We need to override the collation sequence on the actual field comparison
SELECT
    sd.DatabaseName
FROM SynDatabases sd
WHERE NOT EXISTS
    (
        SELECT
            *
        FROM master.sys.databases md
        WHERE md.name = sd.DatabaseName COLLATE DATABASE_DEFAULT
    )

 

For most client organisations this change will be invisible.  However, if the organisation has created their own user triggers or user stored procedures that use the tempdb and the Synergetic database uses a different collation sequence to tempdb, they may encounter an error such as the below:

Collation error
Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "Latin1_General_CI_AS" in the equal to operation.

If this type of error is encountered the collation sequence will need to be explicitly specified as outlined in the next section.