Friday, October 19, 2012

SQL Server: Why We Should Avoid NOLOCK Table Hint in DELETE/UPDATE Queries

Recently, I was asked to review, already written stored procedures for optimization purpose. During this review process I have found that a group of developers is regularly committing a big mistake. This group of developers believes that table hint NOLOCK is used to execute queries quickly, as this hint will avoid placing any lock on target table records and it can you used in any query. Even they have applied this NOLOCK in DML statements.
WRONG
First thing, NOLOCK hint means, it will not take care of any lock (instead of placing lock). It will return data, that could be dirty (NOT YET COMMITTEED by other transactions). We can use this table hint to get results quickly when we are dead sure that dirty data is TOTALLY bearable.
In DELETE/UPDATE queries it should be totally avoided as it can produce junk results. Let’s prove.
In following example, we need to correct discount column of SalesOrderDetail, but according to discount provided in lookup table of SpecialOffer. Before we execute our update statement (Statement #2 in Transaction# 2), someone has accidently changed SpecialOffer, but good thing is that, he has not committed these changes yet. But as we have placed NOLOCK hint in our Statement #2 in Transaction# 2, it will change data according to dirty data, though, later on transaction#1 is rolledback.

SQL Server: Script to Fix ALLOW_PAGE_LOCKS Option for All indexes on All Databases

Recently we have found that our index defragmentation job is failing on a production server, due to REORGANIZE   failure of one of our index. SQL Server was unable to REORGANIZE this index because mistakenly we have an index with the ALLOW_PAGE_LOCKS options set to OFF

What is Page Lock Option?
According to BOL, If ALLOW_PAGE_LOCKS option is set to ON, it means Page locks are allowed when accessing the index. The Database Engine determines when page locks are used.


If
ALLOW_PAGE_LOCKS  option is set to off following query will return an error.
ALTER
INDEX IX_Address_AddressLine1_AddressLine2_City_StateProvinceID_PostalCode
ON
Person.Address REORGANIZE

The index "IX_Address_AddressLine1_AddressLine2_City_StateProvinceID_PostalCode" (partition 1) on table "Address" cannot be reorganized because page level locking is disabled.

You can correct it by simply updating ALLOW_PAGE_LOCKS option to ON with the help of following query
ALTER INDEX IX_Address_AddressLine1_AddressLine2_City_StateProvinceID_PostalCode
ON
  Person.Address
SET
(ALLOW_PAGE_LOCKS = ON);

To make sure to avoid this problem in futur for all datbases on a instance, Amna Asif has suggested following simple but efficient script to detect indexes with ALLOW_PAGE_LOCKS option set to OFF and automatically fix this problem for all of your databases on an instance.

/***********************************
Script By: Amna Asif
Purpose : To fix ALLOW_PAGE_LOCKS option on

                 all indexes of all databases on a particular instance
***********************************/

USE
MASTER
GO

SET
QUOTED_IDENTIFIER ON
GO

SET
ARITHABORT ON
GO

DECLARE @DBName NVARCHAR(255)
DECLARE @IndexCount int
DECLARE
@UpdateIndexQuery Varchar(500)
--- Table variable to hold intermediate result set

            DECLARE @IndexsInfo TABLE
            ( 
            RowNo int identity(1,1),
            DatabaseName varchar(100),
            TableName varchar(100),
            IndexName varchar(100)
            )
--- Cursor to work on each changeable index of each db on an instance

      DECLARE DatabaseList CURSOR 
                  FOR
                    SELECT Name
                    FROM sys.databases
                    WHERE state_desc = 'ONLINE'
                    AND is_read_only = 0
                    ORDER BY name
      OPEN DatabaseList
             FETCH NEXT FROM DatabaseList INTO @DBName
             WHILE @@FETCH_STATUS = 0
             BEGIN   
               INSERT INTO @IndexsInfo (DatabaseName,TableName,IndexName)
               EXEC( '  SELECT '''+@DBName+''' AS DatabaseName,TABL.NAME AS TableName,indx.NAME AS IndexName     
                              FROM '+@DBName+'.SYS.INDEXES indx
                              LEFT OUTER JOIN '+@DBName+'.SYS.TABLES TABL ON indx.[OBJECT_ID]=TABL.[OBJECT_ID]
                              WHERE ALLOW_PAGE_LOCKS = 0           -- where page lock option is not selected
                              AND indx.name NOT LIKE ''QUEUE%''  ' -- we need only user defined indices
                        )
            FETCH NEXT FROM DatabaseList INTO @DBName
             END
            CLOSE DatabaseList
            DEALLOCATE DatabaseList
-----Update allow_page_locks option for those indexes where we need 

 SET @IndexCount=(SELECT MAX(RowNo) FROM @IndexsInfo )
      WHILE @IndexCount >0  
      BEGIN
          SET @UpdateIndexQuery=( SELECT ' ALTER INDEX '+ IndexsInfo.IndexName +' ON ['+
            IndexsInfo.DatabaseName+'].dbo.['+IndexsInfo.TableName+']
            SET (
                  ALLOW_PAGE_LOCKS = ON
                  ) ; '
            FROM @IndexsInfo AS IndexsInfo
            WHERE IndexsInfo.RowNo=@IndexCount)
     
            EXEC(@UpdateIndexQuery)
        
           SET @IndexCount=@IndexCount-1
      END

SQL Server: How to Analyze Blocking and Deadlocking

In response to an early post Difference Between Locking, Blocking and Dead Locking , I have received few mails where I was asked “How to analyze this blocking and dead locking phenomenon”.
I like to use performance monitor counters to check the frequency of blocking and dead locking. You can find these counters by selecting SQL Server: Locks. Three counters under this group are very useful.
  1. Lock Timeouts/sec
  2. Lock Wait Time (ms)
  3. Deadlocks/sec
Out of these three counters first two are used to analyze blocking. Value for “Lock Timeouts/sec” should be zero (0) and “Lock Wait Time (ms)” must also be very low. If you are observing nonzero value for “Lock Timeouts/sec” and continuous high value for “Lock Wait Time (ms)”, then there is excessive blocking occurring. Your long running queries can cause this blocking. Use profiler or sys.dm_exec_query_stats to identify such culprit queries.
None zero values “Deadlocks/sec” counter is an indication of deadlocks. Value for this counter must always be zero.
We can also use following query
SELECT *
FROM    sys.dm_os_performance_counters
WHERE object_name = 'SQLServer:Locks' 
AND instance_name = '_Total'
AND counter_name IN ('Lock Waits/sec','Lock Wait Time (ms)','Number of Deadlocks/sec')
Do you have any better idea ? Please do share with us.

SQL Server: Difference Between Locking, Blocking and Dead Locking

Like ever, today’s article of Pinal Dave was interesting and informative. After, our mutual discussion between our DBAs and Developers on Pinal Dave topic of Concurrency and Isolation. I felt that most of us are intermingling three equally sounding words. Those are LOCKING, BLOCKING and DEAD LOCKING.
Lets try to revisit these concepts with some simple analogies.
LOCKING occurs when connection needs access to a piece of data in database and it’s necessary for SQL Server when managing multiple connections. Just assume an example of your garage, when you park your car in garage, basically you are locking the place of garage.

BLOCKING occurs when two connections need access to same piece of data concurrently and one connection is blocked because at a particular time, only one connection can have access. Just like, you stop (block) your car on a traffic signal because some other car or cars are using the crossing area.
DEAD LOCK occurs when one connection is blocked and waiting for a second to complete his work, but on other side, second connection is also waiting for first connection to release the lock. Just like, you need to cross the signal area but same time someone else from opposite side also want to cross the signal. Now, you need a way which other is holding and other need way where your car is.
That is why, one should be clear that locking is integral part of SQL Server to handle concurrency, blocking is bad when one connection/transaction is waiting unnecessary for a long time, and deadlocking is a phenomenon which should never occur.

SQL Server: Indexes for GROUP BY Columns, a Good Choice

Every one of us knows that index on the columns referred to in an ORDER BY clause are helpful for optimizer to organize the output fast because columns values are already in sorted form.
Same way, columns which are part of GROUP BY benefit from indexes on the corresponding columns as GROUP BY clause columns are sorted first because sorted column values allow the adjacent matching values to be grouped quickly.
Indexes are also helpful for aggregate functions like MIN and MAX. Indexes on such columns are helpful for optimizer which need table/clustered index scan to find MIN or MAX values for a given column.

SQL Server: Simple Method to Resolve All Indexes Fragmentation

Recently a blog reader asked about a script or stored procedure which can resolve all indexes fragmentation as manually exploring each index of database for its level of fragmentation and then rebuilding or re-indexing it, according to its requirement, is bit painful task.
Here is a store procedure which I like to use for this purpose. I normally execute this stored procedure through an automated job during off peak hours.

CREATE PROCEDURE dbo.Proc_IndexDefragmentation
AS
    DECLARE @DBName NVARCHAR(255),
        @TableName NVARCHAR(255),
        @SchemaName NVARCHAR(255),
        @IndexName NVARCHAR(255),
        @PctFragmentation DECIMAL

    DECLARE @Defrag NVARCHAR(MAX)

    IF EXISTS ( SELECT  *
                FROM    sys.objects
                WHERE   OBJECT_ID = OBJECT_ID(N'#Frag') )
        DROP TABLE #Frag

    CREATE TABLE #Frag
        (
          DBName NVARCHAR(255),
          TableName NVARCHAR(255),
          SchemaName NVARCHAR(255),
          IndexName NVARCHAR(255),
          AvgFragment DECIMAL
        )

    EXEC sp_msforeachdb 'INSERT INTO #Frag(
                  DBName,
            TableName,
            SchemaName,
            IndexName,
            AvgFragment
            )
            SELECT ''?'' AS DBName
            ,t.Name AS TableName
            ,sc.Name AS SchemaName
            ,i.name AS IndexName
            ,s.avg_fragmentation_in_percent
            FROM ?.sys.dm_db_index_physical_stats(DB_ID(''?''),NULL,NULL,NULL,''Sampled'') AS s
            JOIN ?.sys.indexes i
            ON s.Object_Id = i.Object_Id
            AND s.Index_id = i.Index_id
            JOIN ?.sys.tables t
            ON i.Object_Id = t.Object_Id
            JOIN ?.sys.schemas sc
            ON t.schema_id = sc.Schema_Id
            WHERE s.avg_fragmentation_in_percent > 20
            AND t.TYPE = ''U''
            AND s.page_count > 8
            ORDER BY TableName,IndexName'

    DECLARE cList CURSOR
        FOR SELECT  *
            FROM    #Frag

    OPEN cList
    FETCH NEXT FROM cList INTO @DBName, @TableName, @SchemaName, @IndexName,
        @PctFragmentation
    WHILE @@FETCH_STATUS = 0
        BEGIN
            IF @PctFragmentation BETWEEN 20.0 AND 40.0
                BEGIN
                    SET @Defrag = N'ALTER INDEX ' + @IndexName + ' ON '
                        + @DBName + '.' + @SchemaName + '.[' + @TableName
                        + '] REORGANIZE'
                    EXEC sp_executesql @Defrag
                    PRINT 'Reorganize index: ' + @DBName + '.' + @SchemaName
                        + '.' + @TableName + '.' + @IndexName
                END
            ELSE
                IF @PctFragmentation > 40.0
                    BEGIN
                        SET @DeFrag = N'ALTER INDEX ' + @IndexName + ' ON '
                            + @DBName + '.' + @SchemaName + '.[' + @TableName
                            + '] REBUILD'
                        EXEC sp_executesql @Defrag
                        PRINT 'Rebuild index: ' + @DBName + '.' + @SchemaName
                            + '.' + @TableName + '.' + @IndexName
                    END

            FETCH NEXT FROM cList INTO @DBName, @TableName, @SchemaName,
                @IndexName, @PctFragmentation

        END
    CLOSE cList
    DEALLOCATE cList

    DROP TABLE #Frag

I think one more cleanly written, Tara Kizar stored procedure for this auto defregmentation process, can also help you. You can find it at Tara Kizar Blog

SQL Server: How to Get Physical Path of Tables and Indexes

When database consists of multiple data files and objects (tables/indexes) are dispersed on these multiple data files. Common requirement is to get a list of objects (tables, indexes) along with their physical path.  Here is a simple query to accomplish this task.
SELECT  'table_name' = OBJECT_NAME(i.id),
        i.indid,
        'index_name' = i.name,
        i.groupid,
        'filegroup' = f.name,
        'file_name' = d.physical_name,
        'dataspace' = s.name
FROM    sys.sysindexes i,
        sys.filegroups f,
        sys.database_files d,
        sys.data_spaces s
WHERE   OBJECTPROPERTY(i.id, 'IsUserTable') = 1
        AND f.data_space_id = i.groupid
        AND f.data_space_id = d.data_space_id
        AND f.data_space_id = s.data_space_id
ORDER BY f.name,
        OBJECT_NAME(i.id),
        groupid

SQL Server: How to Determine All Indexes Fragmentation

On request of one of my blog reader, here is a script which I like to use to check internal and external fragmentation of all indexes of my database.
In this simple script I am currently using DETAILED option. You can obtain results with moderate accuracy but with increase in speed by using the SAMPLE option, which scans only 1 percent of the pages. For the most accuracy, use the DETAILED option, which scan all the pages in an index. But always keep in mind that if an index has fewer than 10,000 pages, what ever option you select, DETAILED option is used automatically.
Note: Please don’t forget to provide your own database name at YourDatabaseNameHere.
SELECT OBJECT_NAME(dt.object_id), 
si.name,
dt.avg_fragmentation_in_percent, 
dt.avg_page_space_used_in_percent
FROM
      (SELECT object_id, 
              index_id, 
              avg_fragmentation_in_percent, 
              avg_page_space_used_in_percent
            FROM sys.dm_db_index_physical_stats (DB_ID('YourDatabaseNameHere'), NULL, NULL, NULL, 'DETAILED')
            WHERE index_id <> 0) as dt --does not return information about heaps
      INNER JOIN sys.indexes si
            ON si.object_id = dt.object_id
                  AND si.index_id = dt.index_id
WHERE avg_fragmentation_in_percent >10 OR avg_page_space_used_in_percent <75
ORDER BY avg_fragmentation_in_percent DESC
/*avg_fragmentation_in_percent column is used to determine external fragmentation
External fragmentation is indicated when this value exceeds 10.
avg_page_space_used_in_percent column is used to determine internal fragmentation.
Internal fragmentation is indicated when this value falls below 75. */

SQL SERVER: 4 Possible Methods to Resolve Index Fragmentation

For OLTP databases, index fragmentation is a major problem for query performance. We have four methods to resolve this fragmentation problem. Let’s explore these methods one by one.

  1. Dropping and Re-creating Fragmented Indexes
One of the easiest ways to avoid this fragmentation problem is dropping and recreating the targeted index.
Benefits:
·         Major benefit of this method is that we can completely remove both internal and external fragmentation
          Drawbacks:
·     We can’t drop a unique index (clustered or non-clustered) used for primary key directly. First we have to delete all the foreign keys that reference this primary key. So it is most time consuming method.
·     If we drop clustered index we have to rebuild all non-clustered indexes on target table.
·    If we drop an index on production server, during drop and recreate time, query performance will become very low due to unavailability of proper index.
·    Other queries accessing same table can face BLOCKING problem.

2.          Using DROP_EXISTING Keyword
DROP_EXISTING keyword in CREATE INDEX statement drops and create existing index in single atomic step

CREATE INDEX YourIndexName ON TableName(ColumnName)
WITH (DROP_EXISTING = ON)
Benefits:
·         This method is helpful to avoid overhead of non-clustered index rebuilding when we drop a clustered index but still provide benefits of Drop & Create method.
          Drawbacks:
·     If column on which index is created is being used for primary or unique foreign key and somehow we omit UNIQUE keyword in CREATE statement, It will generate error.
·    Blocking of queries same to first method

3.         ALTER INDEX REBUILD
ALTER INDEX REBUILD rebuilds an index assigning fresh pages to reduce both internal and external fragmentation to a minimum and is considered most appropriate way to avoid fragmentation.

ALTER INDEX YourIndexName ON TableName REBUILD
Benefits:
·         Most useful method to remove internal and external fragmentation with out any ambiguity defined in first two methods.
·         Using ONLINE keyword in ALTER INDEX REBUILD query, we somehow can avoid query blocking problem but obviously process of rebuilding of an index will be slow.
·         With following single query you can re-build all the indexes of a table.

ALTER INDEX ALL ON TableName
          Drawbacks:
·     Other then blocking ALTER INDEX REBUILD has only one problem. You have to re-execute the ALTER query if somehow process of rebuild was interrupted.

4.    ALTER INDEX REORGANIZE
     ALTER INDEX REORGANIZE reduces the fragmentation without any rebuilding process. ALTER INDEX REORGANIZE reduces fragmentation through following steps
a.     Rearranges the existing leaf pages of index in logical order to reduce external fermentation
b.     And to reduce internal fermentation it compacts the rows with in the pages and removes resultant empty pages
c.      ALTER INDEX REORGANIZE works in steps and performs locking for a small period. If a page is found already locked by another query, it just leave the page as it and moves forward.

   ALTER INDEX YourIndexName ON TableName REORGANIZE

Benefits:
·         It can work without disturbing other queries running on server.
·         It preserves the work intermediately so if ALTER INDEX REORGANIZE query is interrupted work is never roll backed.

          Drawbacks:
·     It can’t reduce fragmentation effectively as compared to ALTER INDEX REBUILD
·    It can take more time then rebuild when index is highly fragmented

Try your best to avoid first method. Second one is good when you actually need to drop and recreate index, but still try to avoid using this for de-fregmentation process. Use ALTER INDEX REORGANIZE when fragmentation is up to 40 % (avg_fregmentation_in_percent value of dm_db_index_physical_stats) and if percentage of fragmentation increases from 40% you must use ALTER INDEX REBUILD

SQL SERVER: 9 Most Forbidden Things



  1. DBCC SHRINKDATABASE
Most of SQL Server developers and even some DBAs think that we can shrink size of a database by executing DBCC SHRINKDATABASE. You can just reacquire unused space of database and it’s never going to compress your database and change magically low size. Advantage is very very low as compared to performance loss. Why DBCC SHRINKDATABASE is so awful SQL Server Guru Pinal Dave has explained it here.
  1. SET AUTO_CREATE STATISTICS OFF
SQL Sever by default SET this option to ON and create statistics for all the columns used in join and filters, for even those columns on which no index exists. SQL Server is made quite intelligent to create statistics on required columns only and even drop statistics which are no more in use. So always keep AUTO_CREATE STATISTICS ON.
  1. RECOMPILE hint in Stored Procedures
You can provide RECOMPILE hint for stored procedure to recompile it every time it is executed. RECOMPILE hint for SPs is nothing but a performance overhead.
  1. Query hint to force some index usage
Index of your own choice can be forced to use for a query BUT never ever do this on production servers. Query Optimizer is intelligent enough to select, which index is better to use for a query. Let optimizer work.
  1. Heap table structure
Tables without any clustered index are called HEAP tables and records for these tables are not kept physically in order. Always create a clustered index on each of your database table because lot of hard work is required by SQL Server to perform any type of query on these heap tables.
  1. Clustered index on UNIQUEIDENTIFIER column
To types of data types are used to create a surrogate key (artificial key) i.e. INT (as identity column) and UNIQUEIDENTIFIER. Problems occur when clustered indexes are created on UNIQUEIDENTIFIER data type column. Value for UNIQUEIDENTIFIER are never sequential (I am not talking about Sequential GUID in Sql Server 2008) and always play a giant role in index fragmentation not only for clustered index but due to this wide key, non-clustered indexes need more space and maintenance time. Read more about this in early post UNIQUEIDENTIFIER column as primary key a worst choice.
  1. Index on BIT data type columns
Though, we are allowed to create index on columns with BIT data type. Yet creating an index on a BIT data type column by itself is not a big advantage since such a column can have only two unique values. Such indexes are just over head as we SQL server still need to maintain these indexes.
  1. COUNT(*) to check records existence
Never use COUNT (*) to check existence of record, instead use
IF EXISTS (SELECT 1 FROM YourtTable WHERE YourFilterCondition)

(Update: http://connectsql.blogspot.com/2011/01/sql-server-why-we-should-avoid-count.html)
  1. User Tables in System Databases
Always create user tables in your own user databases and never use system databases for this purpose.

Do you guys have more points which must be added in this list ? Share with us.