Josh

I'm a developer in Melbourne, Australia, and co-founder of Hello Code.

Published Tue 23 October 2007

← Home

Paging MSSQL results through ODBC: an epic tale

Now, if you're an open-source developer, you probably break into a sweat whenever working with Microsoft products is mentioned. Apache will always beat IIS and MySQL (or PostgreSQL) will reign supreme over SQL Server, despite those naysayers who worship at the temple of Steve Ballmer (I hear they chant "developers, developers, developers" and dance in religious ecstasy). And so it is with me. I knew that SQL Server had all those extras like stored procedures, user-defined functions and views — which MySQL didn't until recently — but I preferred the non-evil side of the force.

Turns out MySQL offers some little extras after all. Little things like the LIMIT clause, which allows you to define a range of results to be returned. This function is absolutely invaluable in paging data, but... in versions of SQL Server older than 2005, there's just nothing similar. The closest is the TOP clause, which returns the first X amount of rows, but does not allow you to specify a starting index.

What follows is a journey of strength and courage, of overcoming adversity, of sticking with SQL Server even when all seems hopeless. This is the story of attempting to page data returned from SQL Server 2000.

Attempt 1. Using nested queries to emulate LIMIT

Our first attempt was to directly emulate LIMIT syntax within the SQL query.

SELECT TOP a * FROM (
    SELECT TOP b *
        FROM table
        WHERE conditions
        ORDER BY table.field  ASC)
    ORDER by field DESC) 

Where a is the amount you need, and b is the amount + the starting index. Or something.

Which doesn't work because...

This is all well and good if you have, say, less than 1000 results in your table. After that you can literally hear the server's cogs grinding, as your page loads... and loads...

Our web app is going to have several hundred thousand after a year or so, so this one was out.

Attempt 2. Using stored procedures and temp tables

The Internet has many, many articles on this topic. I guess there are a lot of people gnashing teeth and crying out in SQL Server-ed pain. From its collective knowledge we chose a handy stored procedure which attempted to use an extra variable to track the result count, and return the correct amount.

DECLARE  @StartRow INT
DECLARE  @EndRow INT
SET      @StartRow = a
SET      @EndRow = b

CREATE TABLE #tables (RowNumber INT IDENTITY(1,1), fields)
INSERT INTO #tables (fields)
SELECT  fields
FROM    table
ORDER BY field

SELECT   *
FROM     #tables
WHERE    RowNumber > @StartRow AND RowNumber < @EndRow

DROP TABLE #tables
GO

Looking good, right? In this method we create a temporary table to hold a copy of the entire table, then return the paged set that we need.

Which doesn't work because...

Actually, the fault appears to be with PHP's ODBC functions. The procedure executes fine, but no rows are returned in the result. There is scant documentation of this phenomenon, but what little we found seems to confirm our findings — ODBC can't return the results of a stored procedure. Yay.

Also, the performance of duping an entire table for each query is just laughable.

Attempt 3. Using a user-defined function to return a table variable

Valiant and courageous, and also under pressure from the boss man, we struggled on up the SQL mountain. Precious jewels in the form of SQL rows exist at the top, but are guarded by snarling dogs and, inexpicably, a sweaty fat man.

The next approach we tried was using the old user-defined functions. These sound similar to a stored procedure, but have the advantage that they can be called from any old query, eg. SELECT * FROM FunkyFunc() and voila, results are returned.

With four gentlemen from Rolla assisting, we created a function like this:

CREATE FUNCTION PagingPleaseWork
(
   @startRowIndex int,
   @maximumRows int
)
RETURNS (@TempItems)

AS

DECLARE @TempItems TABLE (fields)

DECLARE @maxRow int

SET @maxRow = (@startRowIndex + @maximumRows)-1 

SET ROWCOUNT @maxRow

INSERT INTO @TempItems (fields)
SELECT   fields
FROM     table
ORDER BY field

SET ROWCOUNT @maximumRows

SELECT *
FROM   @TempItems t
WHERE  ID >= @startRowIndex

SET ROWCOUNT 0

GO

This is our final assault. Believing we are near the peak of Mt SQL, we launch a last assault on the rocky slopes. It seems so perfect. Execute a function within a query which returns the necessary rows. Genius.

Only it doesn't work.

As it turns out, some wag decided that functions like ROWCOUNT (which are used in this instance, strangely enough, to make sure the appropriate total number of rows is returned) are invalid within a function.

Great.

Attempt 4. Give up and buy SQL Server 2005

A-team retrieve the SQL
A happy ending.

Seriously, honestly, that's what we did. We begged the company to upgrade. And apparently the sound of a grown man crying was enough to sway them.

The SQL Server 2005 approach

Someone at MS wised up, and the ROW_NUMBER function now returns a sequential number per row, specifically for paging.

SELECT  *
FROM    (SELECT  ROW_NUMBER() OVER (ORDER BY ID DESC)
         FROM LOG)

WHERE  Row >= a AND Row <= b

And it only took them 18 years.

To top