Tuesday, November 19, 2013

DATEDIFF function – A common performance problem

DATEDIFF is one of the most widely used built-in functions for calculating the difference between two dates. Microsoft states that DATEDIFF can be used in the select list, WHERE, HAVING, GROUP BY and ORDER BY clauses. Programmatically and logically this is correct; however, there is a catch when it is used in the WHERE clause. Wrapping a column in the DATEDIFF function often introduces a “non-sargable” predicate, which leads to poor performance, and clear guidance on this is hard to find for newer developers.

Usage patterns:
As a rule of thumb, applying a function to a column on the left-hand side of a WHERE predicate forces a table or index scan. So when the DATEDIFF function, or any other function, is wrapped around a key column, we will usually see performance issues. Some common patterns of the DATEDIFF function are as follows:

WHERE DATEDIFF(day, dJoinDate, @CurrentDate) >= 30
WHERE DATEDIFF(d, dDateofBirth, @dDate) = 0
WHERE a.LastName LIKE 'Jon%' AND DATEDIFF(d, a.dSalesDate, @dDate) = 0
WHERE DATEDIFF(mm, dDate, GETDATE()) >= 15
WHERE YEAR(a.dDate) = 2012

Issues observed:
Following are some of the issues that can typically be observed:
1.      Increased query response times.
2.      Table/index scans in execution plans.
3.      Short or long periods of blocking.
4.      Increased locking overhead.
5.      Unnecessary I/O activity and memory pressure.
6.      Parallel query plans and extra “sort” operations.

Sample scripts to understand the performance issues:
Let’s create a database and a table, and then populate the table with data to explore some of the performance issues that can arise from a non-sargable predicate.

/******************************************************
Create database and some relevant stuff
******************************************************/
set nocount on
create database TestDB
go

use TestDB
go

if object_id('tblLarge') is not null
    drop table tblLarge
go

create table tblLarge
    (
      xID int identity(1, 1) ,
      sName1 varchar(100) ,
      sName2 varchar(1000) ,
      sName3 varchar(400) ,
      sIdentifier char(100) ,
      dDOB datetime null ,
      nWage numeric(20, 2) ,
      sLicense varchar(25)
    )
go


/******************************************************
Add some records with non-blank dDOB
******************************************************/
set nocount on
insert  into tblLarge
        ( sName1 ,
          sName2 ,
          sName3 ,
          sIdentifier ,
          dDOB ,
          nWage ,
          sLicense
        )
values  ( left(cast(newid() as varchar(36)), rand() * 50) ,    -- sName1
          left(cast(newid() as varchar(36)), rand() * 60) ,    -- sName2
          left(cast(newid() as varchar(36)), rand() * 70) ,    -- sName3
          left(cast(newid() as varchar(36)), 2) ,              -- sIdentifier    
          dateadd(dd, -rand() * 20000, getdate()) ,            -- dDOB
          ( rand() * 1000 ) ,                                  -- nWage
          substring(cast(newid() as varchar(36)), 6, 7)        -- sLicense       
        )
go 100000


/******************************************************
Create indexes
******************************************************/
alter table dbo.tblLarge add constraint
PK_tblLarge primary key clustered
(
xID
) with( pad_index = off,
          fillfactor = 85,
          allow_row_locks = on,
          allow_page_locks = on)
go

create nonclustered index [IX_tblLarge_dDOB_sName1] on [dbo].[tblLarge]
(      [dDOB] asc,
[sName1] asc
) with (pad_index = off,
          allow_row_locks = on,
          allow_page_locks = on,
          fillfactor = 85)
go


Example #1: DATEDIFF: non-sargable predicate:
Let’s consider the following commonly used pattern of the DATEDIFF function:

declare @dDate as datetime
set @dDate = '2012-09-19'

-- #0: non-sargable search
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   datediff(d, dDOB, @dDate) = 0
order by dDOB


The above query results in an index scan, and below is the execution plan:


Optimizing the search:
The above query can be optimized in a couple of different ways, each resulting in a more efficient execution plan. Note that BETWEEN is inclusive at both ends, so version #1 below also matches rows falling exactly at midnight on the upper boundary; use >= and < instead if that distinction matters.

-- #1: sargable predicate
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB between '20120919' and '20120920'
order by dDOB

-- #2: sargable predicate
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB between cast(convert(varchar(12), @dDate, 112) + ' 00:00:00' as datetime)
             and     cast(convert(varchar(12), @dDate + 1, 112) + ' 00:00:00' as datetime)
order by dDOB

-- #3: sargable predicate
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB between convert(char(8), @dDate, 112)
             and     convert(char(8), @dDate + 1, 112)
order by dDOB


Following are the execution plans and cost comparisons:
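A quick way to compare the cost yourself is with SET STATISTICS IO: run the non-sargable query and a sargable version back to back and compare the logical reads reported in the Messages tab. The sketch below uses the >= and < form to avoid the inclusive upper bound of BETWEEN; exact numbers will vary with the randomly generated data.

set statistics io on

-- non-sargable: expect a scan of IX_tblLarge_dDOB_sName1 and a high logical-read count
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   datediff(d, dDOB, '20120919') = 0
order by dDOB

-- sargable: expect an index seek and only a handful of logical reads
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB >= '20120919'
        and dDOB < '20120920'
order by dDOB

set statistics io off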

Example #2: DATEDIFF: non-sargable predicate:
Consider the following non-sargable example:

declare @dDate as datetime
set @dDate = '2013-11-19'

select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   datediff(dd, dDOB, @dDate) <= 1
order by dDOB


To optimize the above query, we can move the date calculation off the column: instead of wrapping dDOB in DATEDIFF, we apply DATEADD to the variable on the right-hand side of the comparison.

declare @dDate as datetime
set @dDate = '2013-11-19'

select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB >= dateadd(dd, -1, @dDate)
order by dDOB

Following is the result of the optimization effort, which gives a better query response time.


Example #3: YEAR – non-sargable predicate:
This is an example of the YEAR function used on a datetime column, which results in an index scan; the query can be rewritten in a slightly different way.

-- non-sargable
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   year(dDOB) = 2010
order by dDOB

-- sargable
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB >= '20100101'
        and dDOB < '20110101'
order by dDOB

Execution plan and cost comparison:
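When rewriting the predicate is not an option (for example, when the YEAR() filter is generated by an application or framework), indexing a computed column is another way to get a seek. The sketch below is not part of the original example; dDOB_Year and IX_tblLarge_dDOB_Year are hypothetical names. Since YEAR() is deterministic, the computed column can be indexed, and the optimizer can usually match the year(dDOB) = 2010 expression to it.

-- add a computed column for the year portion of dDOB and index it (hypothetical names)
alter table dbo.tblLarge add dDOB_Year as year(dDOB)
go

create nonclustered index IX_tblLarge_dDOB_Year on dbo.tblLarge (dDOB_Year)
    include (sName1, dDOB)
go

-- the original predicate can now be satisfied with a seek on the computed-column index
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   year(dDOB) = 2010
order by dDOB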


Summary:
Writing efficient queries for an OLTP application requires careful consideration and an understanding of various techniques. Simply satisfying the business requirement is not enough; we also need to make sure each query performs well by removing non-sargable predicates.

Wednesday, November 13, 2013

“xp_delete_file”: A simple PowerShell Script alternative

There are numerous threads about the various issues with “xp_delete_file” when it is used in a SQL Server Maintenance Plan to remove old database backup (bak) or transaction log backup (trn) files from a disk folder. It is a built-in, undocumented extended stored procedure used internally by the Maintenance Plan Wizard. The procedure can also be executed manually in SSMS, for example:

declare @filedate datetime
set @filedate = getdate() - 5
execute master.dbo.xp_delete_file 0, 'd:\temp\', 'bak', @filedate, 1

Issues:
We often find that the maintenance task fails with the following error messages in the ERROR log and the SQL Agent job history, respectively. In addition to the messages, we may also see a mini-dump in the SQL Server log folder.

Error: 18002, Severity: 20, State: 1.
Exception happened when running extended stored procedure 'xp_delete_file' in the library 'xpstar.dll'. SQL Server is terminating process 73. Exception type: Win32 exception; Exception code: 0xc0000005.

Source: Maintenance Cleanup Task Execute SQL Task Description: Executing the query "EXECUTE master.dbo.xp_delete_file 0, N'd:\temp', N'trn', N'2010-01-21T13:00:00' " failed with the following error: "A severe error occurred on the current command. The results, if any, should be discarded. A severe error occurred on the current command. The results, if any, should ... The package execution fa... The step failed.

If we run “xp_delete_file” manually in SSMS, we may see the following error message:

Msg 0, Level 11, State 0, Line 0
A severe error occurred on the current command.  The results, if any, should be discarded.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command.  The results, if any, should be discarded.

Alternative to “xp_delete_file”:
As this functionality has some known issues and consequences, it is wise to use a PowerShell script as an alternative. Following are a few examples of how to remove older “bak” or “trn” files from a folder and its sub-folders. The same script can be used to delete any kind of file from disk.

Example One (based on number of days):
Remove database backup files with the extension “bak” that are more than 5 days old.

# target path
$TargetPath = "d:\temp\"

# delete files older than 5 days
$Days = 5

# extension of the files to delete
$Extension = "*.bak"
$CurrentDate = Get-Date
$LastWrite = $CurrentDate.AddDays(-$Days)

# get files in the target folder (and sub-folders) last written before the cutoff
$FilesToDelete = Get-ChildItem $TargetPath -Include $Extension -Recurse | Where-Object {$_.LastWriteTime -le $LastWrite}

foreach ($File in $FilesToDelete)
    {
    if ($File -ne $null)
        {
        Remove-Item $File.FullName | Out-Null
        }
    }


Example Two (based on number of hours):
Remove transaction log backup files with the extension “trn” that are more than 10 hours old.

# target path
$TargetPath = "d:\temp\"

# delete files older than 10 hours
$Hours = 10

# extension of the files to delete
$Extension = "*.trn"
$CurrentDate = Get-Date
$LastWrite = $CurrentDate.AddHours(-$Hours)

# get files in the target folder (and sub-folders) last written before the cutoff
$FilesToDelete = Get-ChildItem $TargetPath -Include $Extension -Recurse | Where-Object {$_.LastWriteTime -le $LastWrite}

foreach ($File in $FilesToDelete)
    {
    if ($File -ne $null)
        {
        Remove-Item $File.FullName | Out-Null
        }
    }


Using PowerShell script in SQL Agent Job (SQL 2008+):
Using a PowerShell script in a SQL Server Agent job is simple. Follow the steps described below:

1.      Create a new SQL Agent job, for example “Remove_older_BAK_files”.
2.      In the job step properties, select “PowerShell” as the step type (figure #1).
3.      Paste the PowerShell script. Don’t forget to adjust the path and day parameters to your needs.
4.      Save the job and then execute it.

If you want to use the above job in a Maintenance Plan, you can call it with a “SQL Server Agent Job Task” as shown below (figure #2).
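If you prefer to create the job from a script rather than through the SSMS dialogs, a minimal T-SQL sketch is below. The job name and step name are placeholders, and the @command body would be replaced with the PowerShell script from Example One.

use msdb
go

-- create the job and a single PowerShell step (names and command are placeholders)
exec dbo.sp_add_job @job_name = N'Remove_older_BAK_files'

exec dbo.sp_add_jobstep
    @job_name = N'Remove_older_BAK_files' ,
    @step_name = N'Delete old bak files' ,
    @subsystem = N'PowerShell' ,
    @command = N'# paste the PowerShell script from Example One here'

-- register the job against the local server so SQL Agent can run it
exec dbo.sp_add_jobserver @job_name = N'Remove_older_BAK_files', @server_name = N'(local)'
go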

Figure #1: SQL Agent Job with PowerShell Script:


 
Figure #2: Maintenance Plan with PowerShell Scripted Job:



Monday, November 4, 2013

SQL Server, CPU and Power Policy – save the civilization

From the dawn of time, power has been linked with human history: past, present and future. Homo sapiens has been one of the most successful organisms on Earth, and the mastery of fire fueled the creation of the modern world and set our species apart from the apes that came before us.

We produce and consume power every nanosecond of our daily lives. We are also concerned about wasting power, and we encourage others to reduce consumption so that we can save our planet and preserve it for future generations. This makes sense.

We need to think just as carefully about how power is used and consumed in a SQL Server OLTP implementation.

OLTP and Windows Power Policy:
In OLTP and other CPU-intensive applications where concurrency is high, we want to make sure the database server receives enough power to process each instruction without latency. Saving power in such cases is not an option, because throttling power directly affects the CPU, introducing CPU latency and increasing application response time.

Windows Power Policy:
Windows 2008 offers three power plans, with “Balanced” as the default, and many SysAdmins and DBAs never think to switch to “High performance” mode. As a result, overall performance can degrade dramatically in ways that are hard to diagnose by the usual means. According to research by several leading experts, “High performance” mode can provide a 10% to 30% overall performance improvement.

However, just enabling “High performance” mode does not guarantee that Windows will be able to draw full power uninterruptedly. To make this Windows setting effective, we also need to set the server BIOS power management to “OS Control” mode. Without this configuration, neither Windows nor ESX will behave as desired.

Virtualization:
The popular virtualization platform VMware also recommends using “OS Control” at the hardware BIOS level and configuring “High performance” mode in ESXi Power Management. The same configuration is recommended for Microsoft Hyper-V implementations.

Power and CPU correlation testing:
A tool called “Geekbench” can be used to test how power consumption affects CPU performance. It is available at http://www.primatelabs.com/geekbench/ and is widely used by industry experts as a CPU stress-testing tool.

Figure: HP power management
Figure: Windows 2008 power management
Figure: ESXi power management

References:
Degraded overall performance on Windows Server 2008 R2
http://support.microsoft.com/kb/2207548

Configuring Windows Server 2008 Power Parameters for Increased Power Efficiency
http://blogs.technet.com/b/winserverperformance/archive/2008/12/04/configuring-windows-server-2008-power-parameters-for-increased-power-efficiency.aspx

Host Power Management in VMware vSphere 5.5
http://www.vmware.com/resources/techresources/10205
 

Sunday, September 22, 2013

NUMA - Memory and MAXDOP settings

It is common practice to throw more hardware at application performance issues rather than fixing the issues themselves. For short-lived, non-mission-critical applications this makes sense, but for mission-critical applications the benefit of upgraded hardware does not last long, and fixing the application issues becomes more obvious and important. At the same time, tuning the server configuration is also necessary for the database server to operate flawlessly.

Modern hardware, such as NUMA-based server technology, has a tremendous capability to process application requests faster than the SMP architecture. Microsoft SQL Server is fully capable of using the NUMA architecture and taking advantage of it. Starting with SQL Server 2000 SP4, Microsoft has supported hardware NUMA, and support for the NUMA architecture has been enhanced in each release.

NUMA and Memory Setting:
In a NUMA-based system, the memory settings (min server memory and max server memory) play an important role. It is generally best practice to configure memory so that the allocated memory is distributed evenly across all NUMA nodes. This helps each NUMA node operate independently without demanding memory from other nodes. Accessing memory on another NUMA node is called “remote memory access”, and accessing memory on the same NUMA node is called “local memory access”; going to a different node for memory introduces latency. The NUMA layout that SQL Server sees can be inspected as shown below.
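A quick look at sys.dm_os_nodes shows how many NUMA nodes SQL Server detects and how schedulers and workers are spread across them. This is a minimal sketch; the exact columns available vary slightly between versions, and the dedicated admin connection node (node_state_desc = 'ONLINE DAC') is excluded because it does not serve regular workloads.

-- list the NUMA nodes visible to SQL Server, excluding the dedicated admin (DAC) node
select  node_id ,
        node_state_desc ,
        memory_node_id ,
        online_scheduler_count ,
        active_worker_count
from    sys.dm_os_nodes
where   node_state_desc <> 'ONLINE DAC'
order by node_id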

To get the best out of a NUMA system, the following settings are highly recommended:

1.      Lock Pages in Memory: The SQL Server service account needs the “Lock Pages in Memory” right in the Windows local security policy. This prevents Windows from paging out SQL Server memory.

2.      Max and Min Server Memory: Max and min server memory should be set to the same value, for two reasons:

(a)   It reduces the overhead (allocation and de-allocation) that SQL Server would otherwise incur by managing these values dynamically.
(b)   As the memory calculation is usually derived from “Max Server Memory”, the SQL Server engine has better values for allocating physical memory evenly to each NUMA node. This reduces the “foreign memory” that would otherwise be needed while processing data on a node.

3.      MAXDOP: For servers that have NUMA configured, MAXDOP should not exceed the number of CPUs assigned to each NUMA node. That means if a server has 4 NUMA nodes and each node consists of 4 CPUs, then MAXDOP should be 4 or less. This reduces the threading overhead that occurs when a single query spreads across multiple NUMA nodes. A configuration sketch covering the memory and MAXDOP settings follows this list.
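The memory and MAXDOP settings above can be applied with sp_configure. This is only a sketch: the 56 GB and 4-CPU-per-node figures are illustrative assumptions and should be replaced with values appropriate to the server.

-- enable advanced options so the memory and MAXDOP settings are visible
exec sp_configure 'show advanced options', 1
reconfigure
go

-- set min and max server memory to the same value (example: 56 GB, an assumed figure)
exec sp_configure 'min server memory (MB)', 57344
exec sp_configure 'max server memory (MB)', 57344

-- cap MAXDOP at the number of CPUs per NUMA node (example: 4)
exec sp_configure 'max degree of parallelism', 4
reconfigure
go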

Memory Allocation in each NUMA node:
To see how much memory each node has received, PerfMon or sys.dm_os_performance_counters can be used. Following is the buffer allocation from an 8-node NUMA system.

DMV Query:
select  counter_name ,
        cntr_value * 8 / 1024 node_memory_mb ,
        instance_name
from    sys.dm_os_performance_counters
where   [object_name] like '%Buffer Node%'
        and counter_name like 'Total Pages%'
order by instance_name
compute sum(cntr_value * 8 / 1024 )


select  counter_name ,
        cntr_value * 8 / 1024  total_buffer_mb ,
        instance_name
from    sys.dm_os_performance_counters
where   [object_name] like '%Buffer Manager%'
        and counter_name like 'Total Pages%'
order by instance_name




CPU Usage in each NUMA Node:


Some disadvantages:
Although the NUMA architecture increases processing power, some usage patterns introduce latch contention on servers with 32+ cores. In that case, the database schema design, such as indexing, needs to be reviewed. A detailed guideline can be found in Microsoft's technical paper “Diagnosing and Resolving Latch Contention on SQL Server”.

If the “Foreign Pages” counter stays high over time for one or more nodes, it usually means those nodes require more memory to perform a particular workload. While adding more memory might help, it is recommended to first see whether the queries can be optimized, along with index changes. The counter can be checked per node as shown below.
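As a companion to the buffer-node query above, a similar sketch can be used to watch the “Foreign pages” counter per node. The counter name is assumed to match the Buffer Node object on this version of SQL Server and may differ slightly between releases.

-- foreign (remote-node) buffer pages per NUMA node, in MB
select  instance_name ,
        cntr_value * 8 / 1024 as foreign_memory_mb
from    sys.dm_os_performance_counters
where   [object_name] like '%Buffer Node%'
        and counter_name like 'Foreign pages%'
order by instance_name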

Read More:
SQL Server and Large Pages Explained

Recommendations and guidelines for the "max degree of parallelism" configuration option in SQL Server

SQL Server, Buffer Node Object

Diagnosing and Resolving Latch Contention on SQL Server

How It Works: SQL Server (NUMA Local, Foreign and Away Memory Blocks)