Wednesday, April 15, 2020

Sql Standards


SQL Server Standards and Best Practices

Overview

This document is intended as a guideline for naming conventions, coding standards, development strategies and recommendations.  Deviations from these standards and best practices should be documented in the script where the deviation occurs.

Naming Standards

General Naming Practices

·         Be descriptive with object names while keeping names to a reasonable length.
·         Limit the use of abbreviations.
·         Use Camel Case for object names; exception the ‘d’ in ID should be capitalized
·         All objects names should include their schema.
·         Avoid underscores within object names unless documented as a separator below.
·         Avoid white space and special characters.
·         Avoid SQL Keywords (if brackets are required it’s a Keyword).

Table/View Names

·         Tables should use plural form of nouns and accurately reflect the data contained. 
·         Tables should not be named for specific applications or companies.
  --Correct
  dbo.Players, dbo.Users, Security.Roles, Bonusing.Events, CrusieLine.Voyages, Patron.Groups

  --Incorrect
  dbo.Player, dbo.Values, SpeedMediaEvents, dbo.RCCLVoyages, dbo.CDS_Group

Column Names

·         Columns should use singular form of nouns.
·         If the column is a datetime, date, or time datatype, the name should include the base type.
·         If the column contains JSON include that as you would datetime, date, time.
·         If the column is a temporal duration it should include the units (min, sec, ms, …) in the name.
·         Use Begin and End in place of To, From, Start in column names.
·         Use InsertedDate[time] and UpdatedDate[time] in place of CreatedDatetime, LastEdited, etc.
  --Correct
  PlayerID, UserID, UpdatedDatetime, PlayedSecs, BeginTime

  --Incorrect
  Playerid, UsersID, LastEdited, PlayedDuration, StartTime

Default Names

·         All defaults must be named except those on table variables and temp tables.
·         Defaults on temp tables should NEVER be named.
·          Use the pattern df__
  --Correct
  df_Players_InsertedDatetime

  --Incorrect
  PlayersInsertedDatetime, CurDate, Now

Constraint Names

·         All constraints must be named except those on table variables and temp tables.
·         Constraints on temp tables should NEVER be named.
·         Check Constraint should use the pattern ck__.
·         Foreign Key Constraints should use the pattern fk___.
·         See Indexes section for primary key constraints.
  --Correct
  ck_Config_FirstDayOfWeekValidDay, fk_Players_User_HostUserID

  --Incorrect
  CHECK_FirstDayOfWeek, ck_ValidDay, fk_PlayersHost

Indexes

·         All indexes should be named except those on table variables and temp tables.
·         Indexes on temp tables should NEVER be named.
·          Use the pattern __
·         Prefix in order are:
o    pk = Primary Key
o    u  = Unique (not required for primary keys)
o    c  = Clustered
o    nc = Nonclustered
o    f  = Filtered
·         If the column names are too long they can be replace with a description

Procedures

·         Use the pattern usp_<[Table/Object]Description>_.
·         The prefix isp can replace usp for internal stored procedure not called by anything outside of the database.
·         Table/Object should be the first part of the name, so all the procedures for the given Table/Object are listed together.
·         If the procedure is inserting or updating multiple records use the plural form of the table/object name.
·         If the procedure is used for a report, use the report name instead of the table/object name.
·         Suffixes:
o    ins = insert
o    upd = update
o    del = delete
o    sel = select
o    rpt = report data (if used by report for selecting parameters use sel)
o    mrg = merge (Any combination of merge, update, insert and/or delete on primary object)
  --Correct
  dbo.usp_Player_ins, dbo.usp_PlayersOnFloor_sel

  --Incorrect
  dbo.usp_PlayersInsert, dbo.usp_getOnFloorPlayers

Functions

·         Use the pattern udf_<[Object/Noun]Description>[_sel]
·         If the return type is a table, use the suffix _sel. 
·         If the return type is a singleton, there should be no suffix.
·         Do not use a verb at the beginning of a function name such as Get, Select, Fetch, Calculate, Current, etc.
  --Correct
  dbo.udf_GamingDate, dbo.udf_TripCalc_sel

  --Incorrect
  dbo.udf_GetGamingDate, dbo.function_TripCalc

Table Types

·         Table types created for Table Value Parameters should be declared in the procedure script that uses them as a parameter type.
·         Table types created for Table Value Parameters should not be used as parameter types by multiple procedures. 
·          Table types should follow the pattern t.

Configuration Keys

·         Configuration Keys , such as those used in the SystemSettings table, should use periods or underscores as the separator to indicate grouping.  Periods are the preferred choice.
·         All system settings should have at least one parent group.
·          As with column names Configuration Keys should include units such as Hrs, Mins, Secs, MS for durations or other forms of measurement.

Schema Design

General Design Goals

·         Data validation should be handled in all layers.
·         Logic should be done where it is the most efficient.
·         Formatting of data should typically not be done in the database.
·         All data access should be done using Stored Procedures. 

Tables Design

·         All table changes most include logic to convert the existing data.  The only exceptions being when the data is unable to be converted, table is 100% system initialized data or the table is for staging data holding only temporary data.

Primary/Clustered Key

·         All tables should have a primary key.
·         All tables should have a clustered key
o    Should be narrow to save space in table and all indexes since it is the pointer back to the table
o    Should be incremental to allow all inserts to create a hotspot that can be cached and reduce fragmentation
o    Should be DECLARED unique otherwise 4 bytes are added to key for uniquification.
·         Common pattern is to use an identity column as the primary key and the clustered key.

Columns

·         Use data type best suited to the data
o    Use nchar and nvarchar over char or varchar for new designs to support Unicode data. 
o    Use nchar for text of fixed length values
o    Use nvarchar for text of variable length
o    Try and keep varchar lengths to exponents of 2, (16,32,64,128,256..).  This makes it easier to remember lengths.
o    Use bit for toggle-types values (Yes/No, On/Off ..) rather than char
o    Avoid using LOBS such as varchar(max), varbinary(max), or XML in large tables.  Large tables should be kept as narrow as possible and prevented at all costs from a page exceeding 8K causing Row-Overflows.
o    Use XML/JSON for loosely structured data and keep it as short as possible.  It’s better to have more rows of shorter XML/JSON then few rows of larger XML/JSON.
o    Use JSON columns over XML.  JSON should use camelCase, include JSON in name and have a check constraint using the ISJSON function to make sure the string is valid JSON.
o    Use the smallest precision that is guaranteed to meet the requirements.
o    Use date when time is not needed.
o    Use time when date is not needed.
o    Use datetime2 when both date and time are required.
§  Use precision of 0 for datetime2 for schedules or when ms are not required
§  Use precision of 7 for datetime2 for transactions and everything else
·         All columns should be defined with NULL or NOT NULL without exception. 
·         Minimize the use of NULLs as they often confusing for the business logic and UI code.  Any expression that refers to a nullable column is at risk of a false NULL output if the NULL value is not handled properly.

Defaults

·         Limit the use of defaults; defaults may hide bugs by allowing procedures to not reference columns that are required.

Constraints

·         Use check constraints and foreign keys to guarantee data validation.
·         Constraints both check and foreign keys do affect performance, so consider the cost.

Indexes (nonclustered)

·         Create indexes based on expected usage of the table and to cover referential integrity conditions.
·         If adding additional columns to cover a query that doesn’t add selectivity use the include list instead of keyed columns which requires less space and overhead. 
·         The clustered key is inherently an “included” column so avoid adding it to an index, unless it’s needed for selectivity. 
·         If the keyed value is mostly NULL, and null is never searched on, consider using a filter to exclude the null values so the selectivity is higher and the size and cost of the index is greatly reduced. 

Trigger Design

·         Avoid triggers, they slow down IO by causing reads on the log; and increase the complexity of debugging and trouble shooting. 
·         Move logic from triggers to constraints and procedures.

Procedure Design

·         Only return records and columns actually needed, to reduce overhead.
·         Design procedures to reduce the number of round trips to the database server.
·         Return aggregates (SUM, AVG…) of data over details to reduce overhead.
·         Return multiple records sets for Master/Detail searches instead of forcing clients/mid-tiers to make calls for each master’s details.
·         If returning a single ID, use an output parameter it’s more efficient for a single value.
·         If returning more than a few columns or multiple rows use a result set.
·         Use TVP (Table Value Parameters) when inserting multiple records.
·         Don’t create over head of TVPs if there is not a need to insert multiple records.

Function Design

·         Create functions for frequently used or complex logic.
·         Use functions for more readable check constraints or check constraints that need to query other records and/or tables.

SQL CLR Design

·         SQL CLR should only be used for small simple functions that can’t otherwise be done in T-SQL.
·         The use of CLR for a function must be approved by all Data Architects.
·         NO DML statements should be performed.
·         Functions only, no procedures, triggers or datatypes should be created.
·         CLR functions should only be consumed by T-SQL code.
·         All functions just like existing functions, should be fully covered by unit tests.

Syntax Formatting

Comments and Documentation

·         Comments within code should be reserved for clarification of complex logic or notes as to why a best practice was not followed. 
·         Comments within code should use the single comment line syntax (two dashes) or entered on the line above the line being addressed.
·         Documentation for the database object is maintained in the script header and enclosed in block comment (/* */).
·         Each entry in the object header change should contain  Date, Version/Branch, Author, and Reference Numbers
·          All dates should be in the YYYY-MM-DD format.
/*

DATE          VERSION               AUTHOR                  REFERENCE
2015-05-10    Dev\12.10.ProjectX    FirstName LastName      39383,39384
Initial Version

2016-01-18    Dev\12.10.ProjectY    FirstName LastName      45871,45872
Added new parameter @OptionalParam

*/

General T-SQL Guidelines

·         All statements should be terminated with a semicolon.
·         Keywords and system functions should be in upper case.
·         Datatype should be in lower case.
·         Object names should be in CamelCase.
·         All indentions should be 2 spaces.
·          Column and value lists should be organized appropriately for readability.  Avoid the need to scroll to view entire list.
  --Correct
  SELECT
    PlayerID,
    FirstName,
    LastName
  FROM dbo.Players;

  SELECT PlayerID, FirstName, LastName
  FROM dbo.Players;

  --Incorrect
  SELECT PlayerID,
    FirstName, LastName
  FROM PLAYERS;

·         When breaking up long statements use keywords FROM, WHERE, GROUP BY, HAVING and ORDER BY to start new lines, which should be aligned vertically.
  --Correct
  SELECT p.PlayerID, p.GUID, COUNT(c.PlayerID) NumberOfCards
  FROM dbo.Players p
    LEFT JOIN dbo.Cards c on c.PlayerID=p.PlayerID
  WHERE p.Active = 1
  GROUP BY p.PlayerID, p.GUID
  HAVING COUNT(c.PlayerID) <= 3;

  --Incorrect
  SELECT p.PlayerID, p.GUID, COUNT(1) NumberOfCards
  FROM dbo.Players p LEFT JOIN dbo.Cards c on c.PlayerID=p.PlayerID
  WHERE p.Active = 1 GROUP BY p.PlayerID, p.GUID
  HAVING COUNT(c.PlayerID) <= 3;

·         Column lists, JOINs or conditions when starting a new line should be intended from the parent SELECT, FROM, WHERE, GROUP BY, HAVING or ORDER BY clauses. 
·         JOIN keywords should be qualified with a join type (LEFT, RIGHT, INNER, FULL, CROSS) . 
·         If CASE statements are broken across multiple lines indent and vertically align WHEN/ELSE clauses.
·         All lock hints including NOLOCK should use the WITH keyword.
·         Closing parenthesis of derived tables, sub queries and column lists should line up vertically with the opening parenthesis when spanning multiple lines.
  --Correct
  SELECT p.PlayerID, c.NumberOfCards
  FROM dbo.Players p
    LEFT JOIN
    (
      SELECT PlayerID, COUNT(1) NumberOfCards
      FROM dbo.Cards WITH (NOLOCK)
      WHERE Active=1
    ) c on c.PlayerID=p.PlayerID;

  --Incorrect
  SELECT p.PlayerID, c.NumberOfCards
  FROM dbo.Players p (NOLOCK)
    LEFT JOIN (SELECT PlayerID, COUNT(1) NumberOfCards FROM dbo.Cards
  WHERE Active=1) c on c.PlayerID=p.PlayerID;

SELECT

·         Do not use the equal operator to alias a column.
  --Correct
  SELECT PlayerID, Int1+Int2 Ints FROM dbo.Players;
  SELECT PlayerID, Int1+Int2 as Ints FROM dbo.Players;

  -- Incorrect
  SELECT PlayerID, Ints = Int1+Int2 FROM dbo.Players;

INSERT

·         INTO keyword is optional in an insert statement.
·         Use the SELECT form in favor of the VALUES form of an INSERT statement.
·         If using a constant or variable that is not clearly named after the column, alias the column.
  --Correct
  INSERT dbo.Players
  (PlayerID, FirstName, LastName, DefaultDenom)
  SELECT @PlayerID, @FirstName, @LastName, 0 DefaultDenom;

  INSERT INTO dbo.Players
  (
    PlayerID, FirstName,
    LastName, DefaultDenom
  )
  SELECT
    @PlayerID, @FirstName,
    @LastName, 0 DefaultDenom;

  --Incorrect
  INSERT dbo.Players
  (PlayerId, FirstName, LastName, DefaultDenom)
  VALUES (@PlayerID, @FirstName, @LastName, 0);

UPDATE

·         SET keyword should be on the same line as the UPDATE keyword.
  --Correct
  UPDATE dbo.Players SET
    FirstName=@FirstName
  WHERE PlayerID=@PlayerID;

  UPDATE dbo.Players SET FirstName=@FirstName WHERE PlayerID=@PlayerID;

  --Incorrect
  UPDATE dbo.Players
  SET FirstName=@FirstName
  WHERE PlayerID=@PlayerID

MERGE

·         All WHEN clauses should line up vertically with MERGE clause.
·         Avoid using DELETE without a delete bit, which frequently causes unintentional deletes. 
·         UPDATE, INSERT and DELETE keywords maybe on same line as WHEN or separate line
  --Correct
  MERGE dbo.Players t
  USING @PlayersTVP s on t.PlayerID=s.PlayerID
  WHEN MATCHED AND s.Deleted=0 THEN UPDATE SET
    FirstName=s.FirstName
  WHEN NOT MATCHED THEN INSERT
    (PlayerID, FirstName)
    VALUES (s.PlayerID, s.FirstName)
  WHEN MATCHED AND s.Deleted=1 THEN DELETE;

  WITH Params AS (SELECT @PlayerID PlayerID, @FirstName FirstName, @Deleted Deleted)
  MERGE dbo.Players t USING Params s on t.PlayerID=s.PlayerID
  WHEN MATCHED AND s.Deleted=0 THEN
    UPDATE SET
      t.FirstName=s.FirstName
  WHEN NOT MATCHED THEN
    INSERT (PlayerID, FirstName)
    VALUES (s.PlayerID, s.FirstName)
  WHEN MATCHED AND s.Deleted=1 THEN
    DELETE;

  --Incorrect
  MERGE dbo.Players t USING @PlayersTVP s on t.PlayerID=s.PlayerID
  WHEN MATCHED AND s.Deleted=0 THEN UPDATE SET FirstName=s.FirstName,
  LastName=s.LastName WHEN NOT MATCHED THEN INSERT (PlayerID, FirstName)
  VALUES (s.PlayerID, s.FirstName) WHEN MATCHED AND s.Deleted=1 THEN DELETE;

Patterns & Practices

Queries

Object Names should always include schema name

·         All object names should include the schema name.  This prevents multiple lookups to resolve the name.  It also prevents some issues with plan caches not being used properly.

Never use SELECT * or INSERT without a column list

·         Column lists should always be provided so additional columns are not returned unless they are actually needed.  Developers reading the code in the future will know what columns are actually used.

Avoid using DISTINCT when possible

·         DISTINCT is a very expensive operator, make sure it’s actually required and not just used to address a bad join condition.  If it’s actually needed make sure to comment the reason it is needed.

Favor UNION ALL over UNION when possible

·         Use UNION ALL as opposed to UNION whenever possible which performs a DISTINCT.  As noted above DISTINCT is an expensive operator and frequency is over used. If a UNION is actually needed make sure to comment the reason it is needed.

Always use table hint “WITH (NOEXPAND)” with Index Views

·         SQL Server Developer and Enterprise editions will use the indexed view without this hint; however SQL Server Standard edition will not. 
·         Example: a view which uses two indexed views without this hint may cause SQL Server standard to preform table scans against the base tables instead of index seeks against the index views.
  CREATE VIEW dbo.AccountBalance
  AS
  SELECT AccountID, SUM(Amount) Balance
  FROM
  (
    SELECT AccountID, Amount FROM dbo.AccountBalanceLive WITH (NOEXPAND)
    UNION ALL
    SELECT AccountID, Amount FROM dbo.AccountBalanceArchive WITH (NOEXPAND)
  ) b
  GROUP BY AccountID;
  go

NOLOCK Hints

·         Use the NOLOCK hints on tables that frequently have lock contention such as Ticketing or CDS_SlotMast.
·         Use the NOLOCK hints on long running queries particularly reports that are more likely to cause lock contention.
·         DO NOT use NOLOCK hints on every table of every query.

OPTION (RECOMPILE)

·         Use OPTION (RECOMPILE) query hint to force SQL Server to calculate the best plan for the given statement each time it runs by considering the values for each variables and/or parameters.
·         Use when there are multiple optional parameters on indexed columns.
·         Use when the cost of recompilation far outweighs having specific plans based on different parameter values.
·         DO NOT use for high frequency queries where the plan should not change.
  --Correct
  SELECT PlayerID, FirstName
  FROM dbo.Players
  WHERE (@PlayerID IS NULL OR PlayerID=@PlayerID)
    AND (@FirstName IS NULL OR FirstName=@FirstName)
  OPTION (RECOMPILE);

  SELECT PlayerID, FirstName FROM dbo.Players WHERE PlayerID=@PlayerID;

  --Incorrect
  SELECT PlayerID, FirstName
  FROM dbo.Players
  WHERE (@PlayerID IS NULL OR PlayerID=@PlayerID)
    AND (@FirstName IS NULL OR FirstName=@FirstName);

  SELECT PlayerID, FirstName
  FROM dbo.Players
  WHERE PlayerID=@PlayerID
  OPTION(RECOMPILE);

Avoid Query,  Join and Index Hints

·         Avoid query, join and index hints not referenced in this document as hints generally limit the ability of SQL Server to adjust to changing data patterns and engine optimizations. 
·         Always document non-standard Query, Join or Index hints that are needed, so they can be removed when no longer required but not before.

Parameter/Variable Data Types

·         Always make sure variables or parameters used in a WHERE condition are the same type as the column.  If the types don’t match exactly SQL can choose to convert the column instead of the variable there by preventing the use of an index and/or increasing the cost of the conversion from one value to the number of rows in the table, which could be millions.
·         If column and parameter are not the same type use a CONVERT on the variable in the WHERE clause.  Never CONVERT the column.

Be cautious of using Functions in SELECT, WHERE and APPLY clauses

·         Uses of functions in SELECT, WHERE and APPLY clauses can be ran for each record.
·         When possible, call the function before the query and store the results into a variable.
·         For large result sets favor system functions and T-SQL expressions over udfs.
  --Correct
  DECLARE @NextDay date = DATEADD(dd, 1, @InsertedDate);

  SELECT PlayerID, RTRIM(LTRIM(FirstName)) FirstName
  FROM dbo.Players
  WHERE InsertedDatetime >= @InsertedDate
    AND InsertedDatetime < @NextDay;

  --Incorrect
  SELECT PlayerID, dbo.udf_AllTrim(FirstName) FirstName
  FROM dbo.Players
  WHERE @InsertedDate >= InsertedDatetime
    AND @InsertedDate < DATEADD(dd, +1, InsertedDatetime);

MERGE Statements

·         MERGE statements do not necessarily perform better then separate insert, update and/or delete statements.
·         If the vast majority of the time an UPDATE is needed you should use the pattern of UPDATE … IF @@ROWCOUNT=0 INSERT…
·         If the vast majority of the time an INSERT is needed you should use the pattern of INSERT … WHERE NOT EXISTS … IF @@ROWCOUNT=0 UPDATE …
·         If inserting a single record, generally use one of the previously mentioned patterns.
·         If merging large sets of data from one table to another use separate INSERT and UPDATE statements over a single MERGE statement.
·         If a TVP is the source and it is generally an even mix of UPDATEs and INSERTs then MERGE is a good choice.

Datetime range condition; Date math

·         Instead of using a BETWEEN which is inclusive on both ends it’s generally a better pattern to use >= and < when filtering on a Datetime or Datetime2 values to make sure we don’t include the end value in the current next range.
·         NEVER assume you can subtract 3ms from the next day and use an inclusive pattern.  Datetime2 has varying degrees of accuracy which break this pattern.
·         Never do date math outside of system functions, or without specifically converting the values to Datetime first.  Datetime2, Date, Time, DatetimeOffset do not support math without system functions.
  --Correct
  DECLARE @NextDay datetime2(7) = DATEADD(dd, 1, @InsertedDate);

  SELECT PlayerID, FirstName
  FROM dbo.Players
  WHERE InsertedDatetime >= @InsertedDate
    AND InsertedDatetime < @NextDay;

  DECLARE @BeginDatetime datetime2 = CONVERT(datetime,@BeginDate)+CONVERT(datetime,@BeginTime);

  --Incorrect
  DECLARE @NextDay datetime2 = DATEADD(ms, -3, @InsertedDate+1);

  SELECT PlayerID, FirstName
  FROM dbo.Players
  WHERE InsertedDatetime BETWEEN @InsertedDate AND @NextDay;

  DECLARE @BeginDatetime datetime2 = @BeginDate+@BeginTime;

Join conditions should favor parameters over columns

·         This provides the SQL engine with the information that only a single value will be looked up, allowing it to create a better plan.
  --Correct
  SELECT p.PlayerID, a.AccountType
  FROM dbo.Players p
    INNER JOIN dbo.Accounts a ON a.PlayerID=@PlayerID
  WHERE p.PlayerID=@PlayerID;

  --Incorrect
  SELECT p.PlayerID, a.AccountType
  FROM dbo.Players p
    INNER JOIN dbo.Accounts a ON a.PlayerID=p.PlayerID
  WHERE p.PlayerID=@PlayerID;

Procedures

Call Procedures using Parameter by Name not by Position

·         Code is much easier to read and understand.
·         Errors from named parameter changes are checked at compile time instead of run time.
·         Defaulted and/or deprecated parameters can be skipped until all references are removed.
  --Correct
  EXEC dbo.utd_CDSSlotMast @SlotMastID=45124, @SlotNumber=60484, @Active='Y', @CurrentRevision='Y';
  EXECUTE dbo.utd_CDSSlotMast
    @SlotMastID = 45114,
    @SlotNumber = 60488,
    @Active = 'Y',
    @CurrentRevision ='N';

  --Incorrect
  EXEC dbo.utd_CDSSlotMast 45124, 60484, 'Y', 'Y';

SET NOCOUNT ON;

·         Should always be the first statement following the BEGIN of every stored procedure, no exceptions.
·         If the client needs a row count return it through a result set or an output parameter.
·         This prevents “ rows affected” messages returned to the client for every statement that is run.

Pagination Pattern

·         Use @Top int=32, @Skip int=0 for the parameters
·         And ORDER BY OFFSET @Skip ROWS FETCH NEXT @Top ROWS ONLY syntax.
·         This pattern has no requirements for the existence of a unique key or assumptions about the unique key’s data type in the result set. 
·         This pattern also supports all manners of conditional sorting.
·         The only requirement is the ORDER BY must be unique, but this does not require the columns used to make the order unique to be returned.  The simplest approach is to add one or more primary key columns at the end of any conditionally ordered columns.
--Correct
CREATE PROCEDURE WebAPI.usp_Permissions_sel
(
  @Top int=32,
  @Skip bigint=0,
  @OrderBits smallint = 0
)
AS
BEGIN
  SET NOCOUNT ON;

  SELECT Name PermissionName, Alias, ActiveFlag
  FROM dbo.Permissions
  ORDER BY
    IIF(@OrderBits&1=1,Name,NULL),
    PermissionID
    OFFSET @Skip ROWS FETCH NEXT @Top ROWS ONLY;

END
go

--Incorrect
CREATE PROCEDURE WebAPI.usp_Permissions_sel
(
  @RecordCount int=32,
  @RecordSeed int=0
)
AS
BEGIN
  SET NOCOUNT ON;

  SELECT TOP (@RecordCount) PermissionID, Name PermissionName, Alias, ActiveFlag
  FROM dbo.Permissions
  WHERE PermissionID > @RecordSeed
  ORDER BY PermissionID;

END
go