Page Comparison

Excerpt

Background

A collation sequence is a database setting that controls how string values are ordered.

In Synergetic V67 and all prior versions the collation sequence was required to be the SQL collation 'SQL_Latin1_General_CP1_CI_AS'. The SQL server must have been installed with that specific collation sequence and that determined the collation sequence of the master and temp databases of that SQL Server instance.

This caused the following limitations at a client organisation:

The Synergetic databases had to be installed on a SQL instance configured with the specific SQL collation sequence required by Synergetic.
If the host SQL Server instance was installed with a different collation sequence the instance could not be used and had to be reinstalled.
The client organisation could not change to an accent insensitive collation sequence to change search functionality.

Alongside these limitations, it seems that SQL_ collation sequences are no longer being updated and will eventually be phased out in favour of Windows collation sequences. Some of the newer Unicode characters are not in the SQL collation but do exist in Windows ones.

Changes in V68

Changes have been made to support different collation sequences

...

to remove these limitations and future proof Synergetic.

In V68, the collation sequence of the Synergetic database can be changed to any Case Insensitive (CI) collation sequence and is also permitted to be different to the collation sequence of the SQL Server instance.

By default, the Synergetic database in V68 will be changed to the Windows collation sequence Latin1_General_CI_

...

Code Block

language	text
title	Collation error

Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "Latin1_General_CI_AS" in the equal to operation.

If this type of error is encountered the collation sequence will need to be explicitly specified as outlined in the next section.

SQL code across databases with different collation sequences

The main issue with collation sequences is when you are comparing table string fields across databases.

The below example assumes the tempdb database uses a different collation sequence to the current database.

Code Block

language	sql
title	Collation example

CREATE TABLE #test
(
  ID        INTEGER,
  Surname   VARCHAR(50),
  Given1    VARCHAR(30)
)
SELECT * FROM #test WHERE Surname = 'rose' -- this will work as we are not comparing with data from a table in the current database
 
-- however if we try
SELECT * FROM #test WHERE Surname IN (SELECT Surname FROM community)
-- this causes a collation conflict as we are comparing string values across databases with different collation sequences

The resolution is to specify the collation on each column that will be compared with a field in the source database.

Code Block

language	sql
title	Correct collation order handling

CREATE TABLE #test
(
  ID        INTEGER,
  Surname   VARCHAR(50) COLLATE DATABASE_DEFAULT,
  Given1    VARCHAR(30)
)

SELECT * FROM #test WHERE Surname IN (SELECT Surname FROM community) -- now it works

Note that you only need to specify collation sequence on fields you are doing cross database comparisons on. The Given1 field in the example above does not need it as we are only displaying it.

It is also good practice to specify the collation with table variables too as they may utilise the temp database with larger volumes of data.

Note also if the temp table is created by a SELECT INTO there are no collation issues as it copies the table structure including collation sequences. The issue occurs only when we CREATE and then INSERT.

Code Block

language	sql
title	Collation example using system tables

-- In this example we are using system tables therefore we cannot change the collation sequence of the table itself
-- We need to override the collation sequence on the actual field comparison
SELECT
    sd.DatabaseName
FROM SynDatabases sd
WHERE NOT EXISTS
    (
        SELECT
            *
        FROM master.sys.databases md
        WHERE md.name = sd.DatabaseName COLLATE DATABASE_DEFAULT
    )

AI

Benefits of Alternate Collation Sequences

The main noticeable difference of using alternate collation sequences is when performing searching:

You can have case sensitive/insensitive searching.
Accent insensitive searching, e.g. looking for "Andre" will reveal results for Andre and André

For most client organisations this change will be invisible. However, if the organisation has created their own user triggers or user stored procedures that use the tempdb and the Synergetic database uses a different collation sequence to tempdb, they may encounter an error such as the below:

Code Block

language	text
title	Collation error

Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "Latin1_General_CI_AS" in the equal to operation.

If this type of error is encountered the collation sequence will need to be explicitly specified as outlined in the next section.

SQL code across databases with different collation sequences

The main issue with collation sequences is when you are comparing table string fields across databases.

The below example assumes the tempdb database uses a different collation sequence to the current database.

Code Block

language	sql
title	Collation example

CREATE TABLE #test
(
  ID        INTEGER,
  Surname   VARCHAR(50),
  Given1    VARCHAR(30)
)
SELECT * FROM #test WHERE Surname = 'rose' -- this will work as we are not comparing with data from a table in the current database
 
-- however if we try
SELECT * FROM #test WHERE Surname IN (SELECT Surname FROM community)
-- this causes a collation conflict as we are comparing string values across databases with different collation sequences

The resolution is to specify the collation on each column that will be compared with a field in the source database.

Code Block

language	sql
title	Correct collation order handling

CREATE TABLE #test
(
  ID        INTEGER,
  Surname   VARCHAR(50) COLLATE DATABASE_DEFAULT,
  Given1    VARCHAR(30)
)

SELECT * FROM #test WHERE Surname IN (SELECT Surname FROM community) -- now it works

Note that you only need to specify collation sequence on fields you are doing cross database comparisons on. The Given1 field in the example above does not need it as we are only displaying it.

It is also good practice to specify the collation with table variables too as they may utilise the temp database with larger volumes of data.

Note also if the temp table is created by a SELECT INTO there are no collation issues as it copies the table structure including collation sequences. The issue occurs only when we CREATE and then INSERT.

Code Block

language	sql
title	Collation example using system tables

-- In this example we are using system tables therefore we cannot change the collation sequence of the table itself
-- We need to override the collation sequence on the actual field comparison
SELECT
    sd.DatabaseName
FROM SynDatabases sd
WHERE NOT EXISTS
    (
        SELECT
            *
        FROM master.sys.databases md
        WHERE md.name = sd.DatabaseName COLLATE DATABASE_DEFAULT
    )

Contained Databases

Further issues are apparent when using contained databases with OPENXML and getting errors along the line of Cannot resolve the collation conflict between "Latin1_General_100_CI_AS_KS_WS_SC" and "Latin1_General_CI_AS" in the equal to operation.

...

language	sql
title	OPENXML Sample

...

Further information:

Child pages (Children Display)

Version	Old Version 26	New Version Current
Changes made by	Andrew Paydon	James Overell
Saved on	Apr 03, 2017	Oct 29, 2018

Versions Compared

Key

Background

Changes in V68

SQL code across databases with different collation sequences

Benefits of Alternate Collation Sequences

SQL code across databases with different collation sequences

Contained Databases