
Tuesday, November 19, 2013

DATEDIFF function: a common performance problem

DATEDIFF is one of the most widely used built-in functions for calculating the difference between two dates. Microsoft's documentation says DATEDIFF can be used in the select list and in the WHERE, HAVING, GROUP BY and ORDER BY clauses. Programmatically and logically this is absolutely correct; however, there is a catch when it is used in the WHERE clause. The DATEDIFF function often introduces a “non-sargable” predicate, which leads to poor performance, and there is little good guidance on this for new developers.

Usage patterns:
As a rule of thumb, wrapping a column in a function on the left side of a WHERE-clause predicate forces a table or index scan. So when the DATEDIFF function, or any other function, is applied to a key column, we will obviously see performance issues. Some common patterns of DATEDIFF usage are as follows:

WHERE DATEDIFF(day, dJoinDate, @CurrentDate) >=30
WHERE DATEDIFF(d, dDateofBirth, @dDate) =0
WHERE a.LastName LIKE  'Jon%' AND DATEDIFF(d, a.dSalesDate, @dDate) =0
WHERE DATEDIFF(mm, dDate, GetDate()) >= 15
WHERE YEAR(a.dDate) =2012
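For illustration, the first pattern above can usually be rewritten so that the column stands alone on one side of the comparison; a sketch (the boundary behavior matches exactly when @CurrentDate carries no time portion, since DATEDIFF counts midnight crossings):

```sql
-- non-sargable: the function wraps the key column, forcing a scan
-- WHERE DATEDIFF(day, dJoinDate, @CurrentDate) >= 30

-- sargable: move the date arithmetic to the variable side so an index on dJoinDate can seek
WHERE dJoinDate <= DATEADD(day, -30, @CurrentDate)
```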

Issues observed:
Following are some definite issues which can be observed:
1.      Increased query response times.
2.      Table/index scans in execution plans.
3.      Short or long SQL blocking.
4.      Increased locking overhead.
5.      Unnecessary I/O activity and memory pressure.
6.      Unwanted parallel query plans and sort operations.

Sample scripts to understand the performance issues:
Let’s create a database and table, then populate the table with data to explore some of the performance issues that may arise from non-sargable predicates.

/******************************************************
Create database and some relevant stuff
******************************************************/
set nocount on
create database TestDB
go

use TestDB
go

if object_id('tblLarge') is not null
    drop table tblLarge
go

create table tblLarge
    (
      xID int identity(1, 1) ,
      sName1 varchar(100) ,
      sName2 varchar(1000) ,
      sName3 varchar(400) ,
      sIdentifier char(100) ,
      dDOB datetime null ,
      nWage numeric(20, 2) ,
      sLicense varchar(25)
    )
go


/******************************************************
Add some records with non-blank dDOB
******************************************************/
set nocount on
insert  into tblLarge
        ( sName1 ,
          sName2 ,
          sName3 ,
          sIdentifier ,
          dDOB ,
          nWage ,
          sLicense
        )
values  ( left(cast(newid() as varchar(36)), rand() * 50) ,    -- sName1
          left(cast(newid() as varchar(36)), rand() * 60) ,    -- sName2
          left(cast(newid() as varchar(36)), rand() * 70) ,    -- sName3
          left(cast(newid() as varchar(36)), 2) ,              -- sIdentifier    
          dateadd(dd, -rand() * 20000, getdate()) ,            -- dDOB
          ( rand() * 1000 ) ,                                  -- nWage
          substring(cast(newid() as varchar(36)), 6, 7)        -- sLicense       
        )
go 100000


/******************************************************
Create indexes
******************************************************/
alter table dbo.tblLarge add constraint
PK_tblLarge primary key clustered
(
xID
) with( pad_index = off,
          fillfactor = 85,
          allow_row_locks = on,
          allow_page_locks = on)
go

create nonclustered index [IX_tblLarge_dDOB_sName1] on [dbo].[tblLarge]
(      [dDOB] asc,
[sName1] asc
) with (pad_index = off,
          allow_row_locks = on,
          allow_page_locks = on,
          fillfactor = 85)
go


Example #1: DATEDIFF: non-sargable predicate:
Let’s consider the following commonly used patterns of the DATEDIFF function:

declare @dDate as datetime
set @dDate = '2012-09-19'

-- #0: non-sargable search
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   datediff(d, dDOB, @dDate) = 0
order by dDOB


The above query results in an index scan; below is the execution plan:


Optimizing the search:
The above query can be optimized in a couple of different ways, each resulting in an efficient execution plan:

-- #1: sargable predicate
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB >= '20120919' and dDOB < '20120920' -- half-open range; BETWEEN would also include rows at exactly midnight of the 20th
order by dDOB

-- #2: sargable predicate
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB >= cast(convert(varchar(12), @dDate, 112) + ' 00:00:00' as datetime)
        and dDOB < cast(convert(varchar(12), @dDate + 1, 112) + ' 00:00:00' as datetime)
order by dDOB

-- #3: sargable predicate
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB >= convert(char(8), @dDate, 112)
        and dDOB < convert(char(8), @dDate + 1, 112)
order by dDOB


Following are the execution plans and cost comparisons:

Example #2: DATEDIFF: non-sargable predicate:
Consider the following non-sargable example.

declare @dDate as datetime
set @dDate = '2013-11-19'

select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   datediff(dd, dDOB, @dDate) <= 1
order by dDOB


To optimize the above query we can move the date arithmetic from the column side of the predicate to the variable side:

declare @dDate as datetime
set @dDate = '2013-11-19'

select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB >= dateadd(dd, -1, @dDate)
order by dDOB

Following is the optimization effort, which results in better query response time. Note that the rewrite is exactly equivalent only when @dDate carries no time portion, as in this example.
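To see the I/O difference for yourself, the two variants can be compared with SET STATISTICS IO/TIME; a sketch against the tblLarge table built earlier:

```sql
declare @dDate as datetime
set @dDate = '2013-11-19'

set statistics io on
set statistics time on

-- non-sargable version: expect an index scan with many logical reads
select  xID, sName1, dDOB
from    tblLarge
where   datediff(dd, dDOB, @dDate) <= 1
order by dDOB

-- sargable version: expect an index seek with far fewer logical reads
select  xID, sName1, dDOB
from    tblLarge
where   dDOB >= dateadd(dd, -1, @dDate)
order by dDOB

set statistics io off
set statistics time off
```

The “logical reads” figures in the Messages tab make the scan-versus-seek difference easy to quantify.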


Example #3: YEAR: non-sargable predicate:
This is an example of the YEAR function used on a datetime column, which results in an index scan and can be re-written in a slightly different way.

-- non-sargable
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   year(dDOB) = 2010
order by dDOB

-- sargable
select  xID ,
        sName1 ,
        dDOB
from    tblLarge
where   dDOB >= '20100101'          -- unambiguous ISO format, independent of language settings
        and dDOB < '20110101'
order by dDOB

Execution plan and cost comparison:


Summary:
Writing efficient queries for an OLTP application requires careful consideration and an understanding of various techniques. Just satisfying the business requirement is not enough; we also need to make sure each query performs well by removing non-sargable predicates.

Wednesday, November 13, 2013

“xp_delete_file”: A simple PowerShell Script alternative

There are numerous threads about the various issues with “xp_delete_file” when it is used in a SQL Server Maintenance Plan to remove old database backup (bak) or transaction log backup (trn) files from disk. It is a built-in, undocumented extended stored procedure used internally by the Maintenance Plan Wizard. It can also be executed manually in SSMS, for example:

declare @filedate datetime
set @filedate = getdate() - 5
execute master.dbo.xp_delete_file 0, 'd:\temp\', 'bak', @filedate, 1

Issues:
We often find that the maintenance task fails with the following error messages in the ERROR log and the SQL Agent job history, respectively. In addition, we may also find a mini-dump in the SQL Server log folder.

Error: 18002, Severity: 20, State: 1.
Exception happened when running extended stored procedure 'xp_delete_file' in the library 'xpstar.dll'. SQL Server is terminating process 73. Exception type: Win32 exception; Exception code: 0xc0000005.

Source: Maintenance Cleanup Task Execute SQL Task Description: Executing the query "EXECUTE master.dbo.xp_delete_file 0, N'd:\temp', N'trn', N'2010-01-21T13:00:00' " failed with the following error: "A severe error occurred on the current command. The results, if any, should be discarded. A severe error occurred on the current command. The results, if any, should ... The package execution fa... The step failed.

If we run “xp_delete_file” manually in SSMS, we may see the following error message:

Msg 0, Level 11, State 0, Line 0
A severe error occurred on the current command.  The results, if any, should be discarded.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command.  The results, if any, should be discarded.

Alternative to “xp_delete_file”:
As this functionality has some known issues and consequences, it is wise to use a PowerShell script as an alternative. Following are a few examples of how to remove older “bak” or “trn” files from a folder and its sub-folders. The script can be adapted to delete any kind of file from disk.

Example One (based on number of days):
Remove database backup files with the extension “bak” which are more than 5 days old.

# target path
$TargetPath = "d:\temp\"

# files to delete more than 5 days
$Days = 5

# extension of the file to delete
$Extension = "*.bak"
$CurrentDate = Get-Date
$LastWrite = $CurrentDate.AddDays(-$Days)

# Get files older than the cut-off in the specified folder and its sub-folders
$FilesToDelete = Get-ChildItem $TargetPath -Include $Extension -Recurse | Where-Object {$_.LastWriteTime -le $LastWrite}

foreach ($File in $FilesToDelete)
    {
    if ($File -ne $null)
        {
        Remove-Item $File.FullName | Out-Null
        }
    }


Example Two (based on number of hours):
Remove transaction log backup files with the extension “trn” which are more than 10 hours old.

# target path
$TargetPath = "d:\temp\"

# files to delete more than 10 hours
$Hours = 10

# extension of the file to delete
$Extension = "*.trn"
$CurrentDate = Get-Date
$LastWrite = $CurrentDate.AddHours(-$Hours)

# Get files older than the cut-off in the specified folder and its sub-folders
$FilesToDelete = Get-ChildItem $TargetPath -Include $Extension -Recurse | Where-Object {$_.LastWriteTime -le $LastWrite}

foreach ($File in $FilesToDelete)
    {
    if ($File -ne $null)
        {
        Remove-Item $File.FullName | Out-Null
        }
    }


Using PowerShell script in SQL Agent Job (SQL 2008+):
Using a PowerShell script in a SQL Server Agent job is simple. Follow the steps described below:

1.      Create a new SQL Agent job, for example “Remove_older_BAK_files”.
2.      In the “Job Step properties”, select “PowerShell” as the step type (figure #1).
3.      Paste the PowerShell script. Don’t forget to adjust the path and day parameters to your needs.
4.      Save the job and then execute it.
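If you prefer to script the job rather than use the GUI, the documented msdb procedures can create the same job; a sketch, where the job name and the script body are placeholders to replace with your own:

```sql
use msdb
go

exec dbo.sp_add_job @job_name = N'Remove_older_BAK_files'

-- the PowerShell subsystem runs the step text as a PowerShell script
exec dbo.sp_add_jobstep
    @job_name = N'Remove_older_BAK_files',
    @step_name = N'Delete old backup files',
    @subsystem = N'PowerShell',
    @command = N'<paste the PowerShell script here>'

-- register the job on the local server so the agent will run it
exec dbo.sp_add_jobserver @job_name = N'Remove_older_BAK_files'
go
```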

If you want to use the above job in a Maintenance Plan, you can use “SQL Server Agent Job Task” as shown below (figure #2).

Figure #1: SQL Agent Job with PowerShell Script:


 
Figure #2: Maintenance Plan with PowerShell Scripted Job:



Monday, November 4, 2013

SQL Server, CPU and Power Policy – save the civilization

From the dawn of time, power has been linked with human history: past, present and future. Homo sapiens has been one of the most successful organisms on Earth, and the mastery of fire fueled the creation of the modern world, setting us apart from the ape species that came before.

We produce and consume power every nanosecond of our daily lives. We are also concerned about wasting it, and we encourage each other to reduce consumption so that we can preserve the planet for future generations. This makes sense.

We need to think just as carefully about how power is consumed in a SQL Server OLTP implementation.

OLTP and Windows Power Policy:
In OLTP and other CPU-intensive applications where concurrency is high, we want to make sure that the database server receives enough power to process each instruction without latency. Saving power is not an option in such cases, as reduced power consumption directly throttles the CPU, which adds CPU latency and increases application response time.

Windows Power Policy:
In Windows 2008 there are three power plans, with “Balanced” as the default, and many SysAdmins and DBAs never think to change it to high-performance mode. As a result, overall performance degrades dramatically in a way that is hard to diagnose by the usual means. According to research from several leading experts, “High Performance” mode can provide a 10% to 30% overall performance improvement.

However, just enabling “High Performance” mode does not guarantee that Windows will be able to draw power uninterruptedly. To make the Windows configuration effective, we also need to set the server BIOS power management to “OS Control” mode. Without this configuration, neither Windows nor ESXi will operate as desired.
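On Windows, the active plan can be checked and switched from an elevated command prompt with the built-in powercfg utility; the GUID below is the well-known identifier of the “High Performance” plan:

```
powercfg /list
powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c
```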

Virtualization:
The popular virtualization platform VMware also recommends using “OS Control” at the hardware BIOS level and configuring “High performance” mode in ESXi Power Management. The same configuration is recommended for Microsoft Hyper-V implementations.

Power and CPU correlation testing:
A tool known as “Geekbench” can be used to test how power consumption affects CPU performance. You can find this tool at http://www.primatelabs.com/geekbench/. Geekbench is widely used by many industry experts as a CPU stress-testing tool.

Figure: HP power management
Figure: Windows 2008 power management

 Figure: ESXi power management





References:
Degraded overall performance on Windows Server 2008 R2
http://support.microsoft.com/kb/2207548

Configuring Windows Server 2008 Power Parameters for Increased Power Efficiency
http://blogs.technet.com/b/winserverperformance/archive/2008/12/04/configuring-windows-server-2008-power-parameters-for-increased-power-efficiency.aspx

Host Power Management in VMware vSphere 5.5
http://www.vmware.com/resources/techresources/10205