Overview
SQL Availability Groups are a great way of providing system high availability for the Synergetic database layer with an exact mirror copy of the database located on additional servers on separate hardware along with auto-failover option. Several clients have actively deployed their Synergetic instance to multi-node clusters providing their organisation with increased server up-time and redundancy.
Synergetic recommend engaging with third party SQL Server specialists to help with configuration of the platform and synchronisation of the required server level dependencies. Please note the setup below is considered an advanced setup and would therefore only be recommended for schools with in-house DBA expertise or ongoing IT managed services.
Sample Diagram - Synergetic environment with 3x node SQL Availability Group
**Note that the above includes a third node for read-only replica server which is considered optional for data extracts or backups for example. This is not actively used by the Synergetic applications however it may be possible to direct reports traffic to this node with manual configuration. Clients have commonly set up two node clusters without the read-only replica and it is still possible to make the second node readable if required (however additional SQL Server licencing would be required). Regardless of the number of clusters it is recommended to use dynamic quorum and implement either a file share or Azure cloud witness to provide improved cluster health.
Licencing
Adequate SQL Server licencing is required, also if the failover nodes are being used for workloads, see here for details:
https://www.kevinrchant.com/2020/02/17/significant-sql-server-2019-licensing-changes/
Licencing can be very costly depending on the allocated CPU cores and number of failover nodes required - before commencing any design work implementing SQL AGs, check your Microsoft agreements to see what is included.
The above three node setup would require SQL Enterprise licence for each CPU core on both server SQL1 AND SSRS. Server SQL2 does not need additional licence as it is purely for DR purposes, so the licence is assumed transfer from SQL1 in the event of failover.
If SQL Enterprise is not an option for your organisation (due to $$$) then SQL 2016 Standard "Basic Availability Groups" may be an alternative option. See here for FAQ:
Basic Availability Groups limit read access to the primary node only and also to one database per availability group and has not yet tested with Synergetic environments.
Cluster Quorum
In order to maintain access to the Availability Group (cluster) it is important that quorum be maintained. Therefore careful consideration and configuration must be done to ensure that the system can maintain the required number of votes in the event of any of the servers going offline.
A file share or Azure Cloud Witness is generally recommended so that the cluster can utlise dynamic quorum.
https://docs.microsoft.com/en-us/windows-server/failover-clustering/deploy-cloud-witness
See below sample (from the above link) of a four node AG with Azure Cloud Witness assisting with dynamic quorum vote. Keep in mind that the setup below would require three of the four nodes to have SQL Server licence per core so it is generally more feasible to implement two node AG with SQL Enterprise or with SQL Standard Basic Availability Groups.
Server Dependency Synchronisation
Whilst the databases can be configured in a mirrored or warm standby state, there are several design considerations and additional maintenance steps required to ensure that the required server level dependencies. Without these dependencies in place on the active SQL Server, there are some components in of Synergetic that will not function correctly.
The main issues from Synergetic perspective are keeping the server dependencies in sync which are namely:
- SQL agent jobs
- dbmail profiles and operators
- server level logins - It is easier to maintain with v68+ due to the contained database logins but there are still likely to be some server level logins required.
- There is one server level login required to send emails (zyoursynergeticdatabasename_ServerLogin), use the following SQL Agent Job on each no
- server level properties and database trust settings
See here for external recommendations for potential automation of the server level object sync:
- https://sqlsunday.com/2016/10/11/how-to-sync-logins-between-availability-group-replicas/
- https://www.sqlshack.com/synchronizing-sql-server-instance-objects-availability-group/
Database Settings
Synergetic's database and server level properties and the required settings are as follows (current for v68/v69):
- is_trustworthy_on, --1
- suser_sname(owner_sid), --sa
- is_db_chaining_on, --0
- is_auto_shrink_on, --0
- is_broker_enabled, --1
- snapshot_isolation_state_desc, --1
- is_read_committed_snapshot_on, --1
- compatibility_level, --130
- page_verify_option_desc, --CHECKSUM
- is_auto_update_stats_on, --1
- is_auto_update_stats_async_on, --1
is_auto_close_on --0
Synergetic have a standard database restore script/procedure that should be used to ensure the initial defaults are set correctly for any system, regardless of if SQL AGs are implemented or not. When SQL AGs are implemented, some of the above settings need to be set on each node, for example the is_trustworthy_on property will need to be enabled on each node and it can only be changed when the node is active. Please refer to the SQL Agent Job - Synergetic DB Health Check.sql listed under Useful Scripts below.
Failover Support
SQL AG Failover mode can be set to automatic or manual and it is recommended to review this Microsoft Knowledge Base article for the required configuration and conditions that will trigger a failover.
On a failover event the Synergetic Windows client and Arrival/Departure Terminal do not support automatic resume however prompts are displayed to the user when the software detects the session has failed, asking the user if they would like to re-connect to SQL Server. This is because the Windows client requires a persistent session (and in many cases multiple sessions) to SQL Server. The web applications do not keep persistent sessions open to SQL Server so failover occurs more seamlessly for the user. To cleanly re-establish the sessions it is recommended to restart the application if the connection ever drops from a failover event.
The window below is displayed in Synmain when the SQL session drops:
Click 'Yes' then the warning below is then displayed (as stated above it is best to restart the application if the error above appears):
If the session can re-connect the message below is displayed then the application can still generally be used without restarting:
Other Considerations
Nodes should share the same hardware configuration, notably drive assignment. If synchronous commit is utilised then the server should have identical hardware for performance. Also need to consider network latency between cluster nodes and failover behaviour, in particular if setting synchronous commit with auto failover.
A minimum number of cluster votes is required to maintain an active cluster, it is therefore important to consider placement of the second and subsequent nodes and ensure adequate platform configuration to maintain a healthy cluster.
Useful scripts
**The scripts below have been tested on SQL 2016.
Description | Script file |
---|---|
SQL Agent job that can be run on each node to sync the zServerLogin email user account and create the required permission set. The permission set in the script is valid and confirmed for v68+v69.
| SQL Agent Job - Sync zServerLogin User.sql |
SQL Agent job to ensure that Synergetic database has the required properties set - this calls a standard Synergetic function to initialise the database on the server. Alerts are then configured by another script to trigger execution of this job on failover event.
| SQL Agent Job - Synergetic Database Initialise.sql |
Four SQL events that indicate a failover event has occurred - this is to send notifications and also trigger the above initialisation job if required.
| |
SQL Agent job that executes a stored procedure to ensure that the required properties exist for the database. Script checks are valid for v68/v69 properties.
| |
Script to automate fail back to a preferred node - run the attached job on the primary replica This script uses a function call to sys.fn_hadr_is_primary_replica which requires SQL 2016+ replace <database name here> with the required Synergetic database and <availabilty group name here> with the AG name.
|
Useful Resources
- Brent Ozar provides free detailed checklists for server and Availability Group provisioning: https://www.brentozar.com/sql/sql-server-alwayson-availability-groups/
- SQL Server Setup Checklist: https://www.brentozar.com/archive/2014/06/sql-server-setup-checklist-free-ebook-download/.
- Extract of Brent Ozar video overview of availability groups: Intro to SQL Availability Groups.mp4
- Microsoft's Overview of Always On Availability Groups (SQL Server): https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/overview-of-always-on-availability-groups-sql-server?view=sql-server-2017
- Utilising Azure Cloud Witness for assisting with dynamic quorum: https://docs.microsoft.com/en-us/windows-server/failover-clustering/deploy-cloud-witness