Few people love tight code during development more than I, but I also have to support applications as they age. As the user base and application data grows performance almost always becomes a concern.
In that vein I figured I'd whip up a simple test over a sizable amount of data and compare the execution times of a Linq and a traditional approach to filtering a list.
First I needed something to build a list of:
public class Movie
{
public string Title { get; set; }
public TimeSpan RunTime { get; set; }
}
Then I filled a List
Now that I had some data it was time to test. The following is a Linq query I used to retrieve a list of movies longer than two hours and put them into a list named longMovies:
start = DateTime.Now;
List<Movie> longMovies = (from m in movies
where m.RunTime.Hours > 2
select m).ToList();
Console.WriteLine(DateTime.Now.Subtract(start).Milliseconds);
Three executions resulted in an average of 204 milliseconds on an AMD Turion X2, 2.1 Ghz (updated on 2009/11/15 to more modern hardware).
Now to compare it to a traditional iterative approach:
start = DateTime.Now;
longMovies = new List<Movie>();
foreach (Movie m in movies)
{
if (m.RunTime.Hours > 2)
longMovies.Add(m);
}
Console.WriteLine(DateTime.Now.Subtract(start).Milliseconds);
The average result after 3 executions: 132 milliseconds, about 2/3 of the Linq equivalent.
The moral of the story is that you have to be careful with Linq to object, at least with large sets. It may read nice and give you the "feel" of set-based operation, but under the hood the compiler's doing all sorts of expensive things like creating and calling delegates which it uses in Linq "Where" clauses. Your own conditionals will be far faster.

with the 2008 releases from microsoft powershell is really making its way into the mainstream. sql server 2008 ships with standard powershell as well as an add-in that provides an interesting degree of integration that will allow DBAs to write scripts that are far more generalized than T-SQL scripts.
getting ready to play with it is pretty strait forward. all you have to do is click the "Start PowerShell" context menu option which is relatively ubiquitous in SSMS 2008's object explorer, located on the "Tables" folder within a database for example.
essentially the schematic structure of database is layed out into a virtual filesystem with database objects represented as files and directories. for example every table is a directory with subdirectories for each column and each column having files for the more granular properties such as constraints.
the power comes into play when you realize that the virtual filesystem is composed of SMO objects as is illustrated below:
direct sql execution is also supported through the Invoke-SQL cmdlet as shown by the following example:
as you can imagine this barely scratches the surface. for more information I suggest checking out microsoft's documentation here

in order to test them out i created a simple table in a scratch database and filled it with data using DataDude.
CREATE TABLE [dbo].[Sales]( [Sale_ID] [int] IDENTITY NOT NULL, [Complete] [bit] NOT NULL, [Location] [int] NOT NULL, CONSTRAINT [PK_Sales] PRIMARY KEY CLUSTERED ( [Sale_ID] ASC )
the table was filled with 10000 rows with [Complete] = 1 and 100 rows with [Complete] = 0. Location was a random value between 1 and 5.
if i issue the following query:
SELECT [Location], COUNT(*) FROM [Sales] WHERE [Complete] = 0 GROUP BY [Location]to break out counts by location for incomplete sales the following execution plan is chosen because i have no index on the location column:
i could easily increase the efficiency of the query by creating the following filtered index:
CREATE INDEX IX_Location_ForComplete ON [Sales] ([Location]) WHERE [Complete] = 0resulting in:
in comparison, 25 reads resulted without the filtered index and 3 reads resulted with it. quite a significant improvement.
by now i bet you're asking, "couldn't we just create a composite index on [Sales] ([Complete], [Location]) and then queries for complete orders could benefit as well?" well, yes. you could, and it will result in a much larger index that results in higher read counts (5 in this case). also it could be very nasty when page splits occur as the database grows.
the truth is that there are situations where you may only issue certain queries against predictable portions of a table, such as incomplete orders (which are the vast minority of rows in the table in this case). why waste the space and IO on values you know will not be relevant? why waste the time splitting pages while indexing the 90+% of your table you're not interested in or doesn't change anymore?
filtered indexes are yet another interesting feature provided by SQL 2008 that have me eagerly awaiting RTM.

One of my favorite new features available in SQL 2008 is table valued parameters. For years SQL developers have been dying for this abilty and have been throwing some miserable duct tape at the problem.
Before sql 2005 it was common to see strings being passed into stored procedures and then parsed into temporary tables. This approach suffered several major disadvantages including being limited to 8000 bytes, requiring T-SQL parsing logic and little guarantee you would receive the parameter structured exactly as you desired.
SQL 2005 not only lifted the 8000 byte limitation by introducing (n)varchar(max) but also added extensive XML support which allowed structured data to be passed in as XML and later queried as if it were relational data. While very useful I was always a little bothered by having to rely on XML to perform what was often a purely relational task. After all sometimes you're merely passing in an in determinant-length list of integers; effectively an array. Resorting to a hierarchical data technology was a little like cutting down a sapling with a chainsaw.
Table valued parameters to the rescue! In the following example I'll demonstrate the basics of this feature by creating a stored procedure that returns a list of customers that are in a given list of types.
Lets' start out with a basic structure:
create table CustomerTypes (
Customer_Type_ID int identity(1, 1)
constraint PK_Customer_Type primary key clustered,
[Name] varchar(256) not null
)
go
create table Customers (
Customer_ID int identity(1, 1)
constraint PK_Customers primary key clustered,
Full_Name nvarchar(256),
Customer_Type_ID int
constraint FK_CustomerTypes foreign key
references CustomerTypes(Customer_Type_ID)
)
go
And we'll slap some data in those respective tables:
Customer_Type_ID Name ---------------- ---------------------- 1 Basic 2 Intermediate 3 Advanced 4 Professional 5 Enterprise Customer_ID Full_Name Customer_Type_ID ----------- -------------- ------------------------ 1 Jimmy Jones 1 2 Richard Rocket 2 3 Samantha Smith 3 4 Chris Candleson 4 5 Greg Gimble 5 7 Billiam Barlette 5Table valued parameters require a user-defined table type that they can be declared as. In our case we'll use:
create type Customer_Type_IDs as table ( Customer_Type_ID int primary key ) goWith our type in place we can write a stored procedure accepting of a variable of that type as an argument. Note the parameter must be passed in as read-only.
create procedure proc_Customers_ByType_Get ( @p_Customer_Type_IDs Customer_Type_IDs readonly ) as begin select C.Customer_ID,C. Full_Name, C.Customer_Type_ID from Customers C inner join @p_Customer_Type_IDs pCT on C.Customer_Type_ID = pCT.Customer_Type_ID inner join CustomerTypes CT on CT.Customer_Type_ID = pCT.Customer_Type_ID end goAlright, we're ready to test. All we have to do is declare a variable of our table type, fill it with some data and pass it into our stored procedure. Note that if the procedure is being called from .Net code a datatable can be used as the parameter value:
declare @CustomerIDs Customer_Type_IDs insert into @CustomerIDs select 4 union select 5 exec proc_Customers_ByType_Get @CustomerIDsWhich will return:
Customer_ID Full_Name Customer_Type_ID ----------- -------------- ------------------------ 4 Chris Candleson 4 5 Greg Gimble 5 7 Billiam Barlette 5
Easy as pie, eh? Clearly this feature allows database developers to replace error prone and misplaced methodologies with something clean, strongly typed and performance friendly.

One of the coolest new features of SQL Server 2008 (and there are many) is the MERGE statement. This construct has existed in oracle and DB2 for a while now but it's new to SQL Server. Effectively it allows you to merge two tables together, combining UPDATE, INSERTS and DELETEs into one nifty package.
For the purposes of demonstration I'm going to provide a simple example that merges a table variable into a real table by updating rows with matching primary keys with the table variable's values. It will also insert into the real table where there's no match.
Consider the following table:
CREATE TABLE [dbo].[Things]( [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY, [Name] [varchar](16) NOT NULL ) Go
With the following content
ID Name ----------- ---------------- 1 computer 2 box 3 keyboard 4 mouse 5 whiteboard 6 pencil 7 monitorThen I'm going to create a table variable with some rows with a matching primary key and some rows that don't match:
declare @OtherThings table (ID int, Name varchar(255)) insert into @OtherThings (ID, Name) values (6, 'pen') insert into @OtherThings (ID, Name) values (5, 'marker') insert into @OtherThings (ID, Name) values (NULL, 'notepad') insert into @OtherThings (ID, Name) values (NULL, 'lamp')
Our goal is to update the contents of the [Name] column for [ID] 5 to 'marker' and 6 to 'pen'. We also want two new rows to be created with [Name] values of 'notepad' and 'lamp'. We'll accomplish that with the following statement:
merge Things T /* Target Table */ using @OtherThings OT /* Source table */ on (OT.ID = T.ID) /* join expression, how to determine matches */ when matched then update set T.Name = OT.Name /* update all matching rows */ when not matched then insert (Name) values /* create rows that don't match */ (OT.Name);
POW, our [Things] table now looks like:
ID Name ----------- ---------------- 1 computer 2 box 3 keyboard 4 mouse 5 marker 6 pen 7 monitor 8 notepad 9 lamp
Pretty simple, right? Note that the source table can be a table expression, and match conditions can be more complicated such as:
when matched and T.Name like '%c%' then
Naturally, there's more to it than I've illustrated here. Check out BOL for more details.

Digg it
Reddit
Delicous
Facebook










