I have read so much lately about how bad it is to use a recursive CTE to do… well pretty much anything. It came up in a discussion about creating sequential number, and not in a positive sort of way.
I wrote about recursive CTEs when they were first added to SQL Server, and have been a fan for doing breadth-first searching of a hierarchy/graph, but never even thought to use one to generate a set of numbers. As I kept hearing how bad it this method was, so I figured, let’s see just how terrible it is. And of course, I needed something to compare to, so I decided to try all of the ways I could think of. So I will generate varying numbers of rows with the following methods:
- Recursive CTE – A SQL construct that will basically query itself for multiple iterations.
- Good old fashioned WHILE loop – Just a while loop adding one for each iteration
- A cool mathy type generator using
CROSS JOINoperators – Something I have used before, that generates digits with cross joins of varying mathematical equations. - Using
ROW_NUMBER()on a very large set of rows withCROSS JOIN– Basically, generate a lot of rows that start returning data quickly, then use theROW_NUMBERfunction on the rows. GENERATE_SERIES– The built-in function in SQL Server 2022 and beyond that simply solved the problem in a very nice way indeed.
I will present the code in each section, and then I will execute these functions, providing a listing of the time it took to do 10, 100, 1000, 10000, 100000, and 100000 rows for each method.
Which will be faster? Read on.
The methods
In these sections, I will briefly describe the methods I will be using in my tests.
Recursive CTE
I had never tried this, but honestly, it is a really simple method. The basics are that you start with a starting value (which I will always start with 1 for these tests), and for each recursion, it will add one to the value and add it to a result set.
WITH BaseRows AS (
SELECT 1 AS value
UNION ALL
SELECT value + 1
FROM BaseRows
WHERE value BETWEEN 1 AND 9
)
SELECT Value
FROM BaseRows
OPTION (MAXRECURSION 0); --this makes the boundary data type based.
--so over 2 billion for the int type that
--that SELECT 1 as value would use.
All of the outputs for the code explanations will look like this:
------------- 1 2 3 4 5 6 7 8 9 10
For the performance tests, I will send the results to a temp table.
Good old-fashioned WHILE loop
This is the first algorithm we all used at one time. Just a loop and a counter. Not ideal with SQL, so we learned to create a table of numbers (also known as a sequence table or tally table).
CREATE TABLE #HoldValues (value int);
DECLARE @counter INT = 1
WHILE @counter <= 10
BEGIN
INSERT INTO #holdValues(value)
SELECT @counter
SET @counter = @counter + 1
END
SELECT * FROM #HoldValues;
### Using ROW_NUMBER() on a very large set of rows
I never personally used this method, but it is something that people do. For this, you find a table that has a good number of rows, typically a system view, like:
SELECT COUNT(*) FROM master.sys.indexes;
Then use:
SELECT COUNT(*)
FROM master.sys.indexes
CROSS JOIN master.sys.indexes as i2
CROSS JOIN master.sys.indexes as i3;
The output for this is: 10503459, which is good enough for our use, where I want to do at least 1 million values in the sequence. Then just do something like:
SELECT TOP 10 ROW_NUMBER() OVER (ORDER BY i2.object_id) as value
FROM master.sys.indexes
CROSS JOIN master.sys.indexes as i2
CROSS JOIN master.sys.indexes as i3;
This is something I saw more of before the advent of GENERATE_SERIES because it is very quick to write and works for even moderately large sets. Most of the time we don’t care about performance too much because we are loading a table with values. But when working ad hoc, sometimes you just don’t want to wait. Luckily, too, because there is no row filtering, it can start spooling results very fast.
This method and the next are really good, but you need to have control over the max number of rows you want to put out, and what the min and max are. They generally expect you start at 1 or zero, and if you want something else, you have to change the value with an offset. That offset won’t change the performance of the code, so not a big deal.
A cool mathy type generator
This code is something that someone smarter than me came up with. I have used it many times, and I understand how to use it, but it is more math than I like to think about 🙂
;WITH digits (I) AS (--set up a set of numbers from 0-9
SELECT I
FROM (VALUES (0),(1),(2),(3),(4),(5),
(6),(7),(8),(9)) AS digits (I))
--builds a set of data from from 0 to 999999
,Integers (I) AS (
SELECT D1.I + (10<em>D2.I) + (100</em>D3.I) +
(1000<em>D4.I) + (10000</em>D5.I) + (100000<em>D6.I) +
(1000000</em>D7.I)
FROM digits AS D1 CROSS JOIN digits AS D2
CROSS JOIN digits AS D3
CROSS JOIN digits AS D4
CROSS JOIN digits AS D5
CROSS JOIN digits AS D6
CROSS JOIN digits AS D7 )
--insert into table
SELECT I + 1 --starts with 0
FROM Integers
WHERE I + 1 <= 10;
The problem with this method is basically that you have to choose a size if you are writing it into code. And at these larger sizes, this can be pretty CPU-intensive. (My fans start cranking the second I run, even this 1-10 query because of it.
GENERATE_SERIES
And finally, the newest of them all, the built-in functionality. Which is fastest? That is coming up next.
SELECT value FROM GENERATE_SERIES(1,10);
This can’t be the fastest, can it? Generally speaking, the built-in tools aren’t always the best performers… but maybe this time?
The results
Using the code in the Appendix that follows this section, the tests were run on my test machine that you can see listed here as my Test Machine (NUC9i7QNX). I used SQL Server 2025 Enterprise RTM for my tests.
In these tests, here is the thing. None of these methods performed so poorly that loading a table of numbers with them is particularly terrible. All of them produced a set of a million rows in less than 12 seconds. But the absolute best performer turned out to be:
SectionName \ Rows 10 100 1000 10000 100000 1000000 Rows --------------------- ----- ------ ------ ------ ------- ------------ Recursive CTE 1 3 12 120 929 9292 While Loop 1 2 13 112 1116 11661 CROSS JOIN ROWNUMBER 59 63 95 97 397 3602 CROSS JOIN Digits 5550 5505 5573 5423 5615 5952 GENERATE_SERIES 2 1 2 12 103 1123 Times in microseconds.
GENERATE_SERIES! By quite some distance. Interesting to me was that the CROSS JOIN ROWNUMBER method was the second-best performer. I had expected that one to be much slower, honestly.
In any case, if you have a need to generate a series of integers, use GENERATE_SERIES. It is the fastest available method in SQL Server and is just plain easier. You can tailor your output to be what you need with parameters for start, stop, and a step value that can be int, bigint, or even numeric.
One thing to note, the only method that uses parallelism is the CROSS JOIN Digits method. And in order to create the million rows I want and use the same code, it cranked my machine hard. However, while it was over 5 seconds to generate 10 rows, it was only 5 seconds to generate a million rows. If I weren’t afraid my computer was going to melt into a puddle of silicone, I would see just how far I could go with it.
Appendix – The test code
This should work as is on any machine. It is set to create a table in your database that will hold the output and the times. It is also set to reset when you do 10 rows, so don’t forget that and lose your samples.. not that I did that or anything.
–Note: code isn’t well formatted.
SET NOCOUNT ON;
IF OBJECT_ID(‘dbo.HoldTimes’) is null
BEGIN
CREATE TABLE dbo.HoldTimes (
sectionName varchar(30),
starttime datetime2,
endtime datetime2,
sampleSize int,
DurationMilliseconds as (DATEDIFF(millisecond,starttime, endtime))
);
END;
DROP TABLE IF EXISTS #holdValues;
CREATE TABLE #holdValues (value INT NOT NULL PRIMARY KEY (value));
DECLARE @maxValue INT = 100000 –0000;
IF @maxValue = 10
TRUNCATE TABLE dbo.HoldTimes
PRINT ‘Recursive CTE’;
TRUNCATE TABLE #holdValues;
INSERT INTO dbo.HoldTimes (SectionName,StartTime,SampleSize)
VALUES (‘Recursive CTE’,SYSDATETIME(),@maxValue);
WITH BaseRows AS (
SELECT 1 AS value
UNION ALL
SELECT value + 1
FROM BaseRows
WHERE value BETWEEN 1 AND (@MaxValue – 1)
)
INSERT INTO #holdValues (value)
SELECT Value
FROM BaseRows
OPTION (MAXRECURSION 0);
UPDATE dbo.HoldTimes
SET EndTime = sysdatetime()
WHERE sectionName = ‘Recursive CTE’
AND sampleSize = @maxValue
IF (SELECT COUNT(*) from #holdValues) <> @MaxValue
THROW 50000,’ERROR Occurred’,1;
PRINT ‘While Loop’;
TRUNCATE TABLE #holdValues;
INSERT INTO dbo.HoldTimes (SectionName,StartTime,SampleSize)
VALUES (‘While Loop’,SYSDATETIME(),@maxValue);
DECLARE @counter INT = 1
WHILE @counter <= @maxvalue
BEGIN
INSERT INTO #holdValues(value)
SELECT @counter
SET @counter = @counter + 1
END
UPDATE dbo.HoldTimes
SET EndTime = sysdatetime()
WHERE sectionName = ‘While Loop’
and sampleSize = @maxValue
IF (SELECT COUNT(*) from #holdValues) <> @MaxValue
THROW 50000,’ERROR Occurred’,1;
PRINT ‘CROSS JOIN ROWNUMBER’;
TRUNCATE TABLE #holdValues
INSERT INTO dbo.HoldTimes (SectionName,StartTime,SampleSize)
VALUES (‘CROSS JOIN ROWNUMBER’,SYSDATETIME(),@maxValue);
INSERT INTO #holdValues(value)
SELECT TOP (@MaxValue) ROW_NUMBER() OVER (ORDER BY i2.object_id) as value
FROM master.sys.indexes
CROSS JOIN master.sys.indexes as i2
CROSS JOIN master.sys.indexes as i3;
UPDATE dbo.HoldTimes
SET EndTime = sysdatetime()
WHERE sectionName = ‘CROSS JOIN ROWNUMBER’
and sampleSize = @maxValue
IF (SELECT COUNT(*) from #holdValues) <> @MaxValue
THROW 50000,’ERROR Occurred’,1;
PRINT ‘CROSS JOIN Digits’;
TRUNCATE TABLE #holdValues
INSERT INTO dbo.HoldTimes (SectionName,StartTime,SampleSize)
VALUES (‘CROSS JOIN Digits’,SYSDATETIME(),@maxValue);
;WITH digits (I) AS (–set up a set of numbers from 0-9
SELECT I
FROM (VALUES (0),(1),(2),(3),(4),(5),
(6),(7),(8),(9)) AS digits (I))
–builds a set of data from from 0 to 999999
,Integers (I) AS (
SELECT D1.I + (10D2.I) + (100D3.I) +
(1000D4.I) + (10000D5.I) + (100000D6.I) +
(1000000D7.I) + (10000000*D8.I)
FROM digits AS D1 CROSS JOIN digits AS D2
CROSS JOIN digits AS D3
CROSS JOIN digits AS D4
CROSS JOIN digits AS D5
CROSS JOIN digits AS D6
CROSS JOIN digits AS D7
CROSS JOIN digits AS D8 )
–insert into table
INSERT INTO #holdValues (value)
SELECT I + 1
FROM Integers
WHERE I <= @MaxValue – 1;
UPDATE dbo.HoldTimes
SET EndTime = SYSDATETIME()
WHERE sectionName = ‘CROSS JOIN Digits’
AND sampleSize = @maxValue
IF (SELECT COUNT(*) FROM #holdValues) <> @MaxValue
THROW 50000,’ERROR Occurred’,1;
PRINT ‘GENERATE_SERIES’
TRUNCATE TABLE #holdValues;
INSERT INTO dbo.HoldTimes (SectionName,StartTime,SampleSize)
VALUES (‘GENERATE_SERIES’,SYSDATETIME(),@maxValue);
INSERT INTO #holdValues (value)
SELECT value
FROM GENERATE_SERIES(1,@MaxValue)
UPDATE dbo.HoldTimes
SET EndTime = SYSDATETIME()
WHERE sectionName = ‘GENERATE_SERIES’
AND sampleSize = @maxValue;
IF (SELECT COUNT(*) FROM #holdValues) <> @MaxValue
THROW 50000,’ERROR Occurred’,1;
PRINT ‘That” All’
SELECT SectionName,
MAX(CASE WHEN SampleSize = 10 THEN DurationMilliseconds ELSE -1 END) AS [10 Rows],
MAX(CASE WHEN SampleSize = 100 THEN DurationMilliseconds ELSE -1 END) AS [100 Rows],
MAX(CASE WHEN SampleSize = 1000 THEN DurationMilliseconds ELSE -1 END) AS [1000 Rows],
MAX(CASE WHEN SampleSize = 10000 THEN DurationMilliseconds ELSE -1 END) AS [10000 Rows],
MAX(CASE WHEN SampleSize = 100000 THEN DurationMilliseconds ELSE -1 END) AS [100000 Rows],
MAX(CASE WHEN SampleSize = 1000000 THEN DurationMilliseconds ELSE -1 END) AS [1000000 Rows]
FROM dbo.HoldTimes
GROUP BY SectionName




Leave a comment