Cool features in SQL Server I missed…DATE_BUCKET

I keep finding or hearing about very useful features that I just plain missed. I plan to read over all the “what’s new” entries for SQL Server (well, at least the Transact-SQL sections!) to see what else I have missed, and then look at other features that I haven’t used enough but that seem useful.

I heard about this feature in a LinkedIn post by Jovan Popovic a few weeks back.

And I instantly understood what he meant about how useful the DATE_BUCKET function (which arrived in SQL Server 2022) would be!

Using this function, you can group data easily into different time buckets, like year, month, and day (which are standard enough, of course), but also into buckets like 2 days, 3 weeks, etc. Now, I don’t feel like this should ever make you decide that you don’t need a date dimension for your warehouse, but it is great when you are just checking out data and want to play with different intervals in an easy manner.

The Basics

The DATE_BUCKET function is more or less a tool to round date values down to a lower boundary. Say you have the date 2025-12-30 and you want to put it into the December 2025 bucket; the function takes the following parameters:

DATE_BUCKET(<time period>,<time period count>,<date value>,[<origin date>])

So, for instance, if you want to create a relatively simple date bucket at the month level, you can use something like this (just showing one date value in this first example):

SELECT DATE_BUCKET(month,1,CAST('2025-12-30' AS DATE));

This returns:

------------
2025-12-01

As it would for any other valid date literal that started with '2025-12-', of course. For time values, you can do this down to the millisecond, too. Here I am putting data into 5-second buckets:

SELECT DATE_BUCKET(second,5,CAST('2025-12-30 3:45:16.145' AS DATETIME2(3)));

This returns:

2025-12-30 03:45:15.000

We will talk a bit more about this later in the blog, but for now, just realize that all groupings start at 1900-01-01 00:00:00.000, which was a Monday. This will come up a bit later as we start to do buckets where there is no obvious starting point, for example, 3-week buckets. Where does a 3-week bucket start?
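To make that concrete, here is a quick sketch (reusing the date from the first example) showing that a 3-week bucket snaps to a Monday counted in 3-week steps from the 1900-01-01 origin:

```sql
-- 3-week buckets are counted from the default origin, 1900-01-01 (a Monday)
SELECT DATE_BUCKET(week, 3, CAST('2025-12-30' AS date));
-- Returns 2025-12-22, the Monday starting the 3-week period that contains the date
```

So even though nothing about 2025-12-22 looks special, it is an exact multiple of 21 days from the origin.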

Side note: DATE_BUCKET does not accept string literals directly, probably for performance reasons, since it is clearly meant to work on lots of rows. This call: SELECT DATE_BUCKET(month,1,'2025-12-30'); will give you an error that states:

Msg 8116, Level 16, State 1, Line 16 Argument data type varchar is invalid for argument 3 of Date_Bucket function.

Examples

I know I come to blogs for the examples, so here we go. I have created a table with a row per minute to show how we can group them.

CREATE TABLE #ExampleData
(
ExampleDataId int NOT NULL,
TimeValue   datetime2(0) --data to the minute
);
INSERT INTO #ExampleData(ExampleDataId, TimeValue)
SELECT value AS ExampleDataId, DATEADD(MINUTE, value - 1, '2025-01-01')
FROM   GENERATE_SERIES(1, 525600);

In the examples, I will start at a very high level and work down to grouping by minutes.

Year and Month groupings

Just to get started, here is the count of all data in the table:

SELECT COUNT(*)
FROM   #ExampleData

This returns:

-----------
525600

Not to get overly introspective, but that doesn’t seem like a lot of minutes when you think about it.

Another way you can get the same answer is to do:

SELECT DATE_BUCKET(month,12,TimeValue) as DateGroup, COUNT(*) as GroupCount
FROM   #ExampleData
GROUP BY DATE_BUCKET(month,12,TimeValue);

This returns:

DateGroup                   GroupCount
--------------------------- -----------
2025-01-01 00:00:00         525600

So grouping at the 12-month bucket, you can see that everything is rounded down to the start of a 12-month period (this is analogous to grouping by year):

SELECT DATE_BUCKET(year,1,TimeValue) as DateGroup, COUNT(*) as GroupCount
FROM   #ExampleData
GROUP BY DATE_BUCKET(year,1,TimeValue);

Same output. Since this is sort of a rounding function, you can’t do (at least not that I have seen yet) any really complex groupings that don’t fit into the “normal” calendar. So you can’t do user-defined time ranges like the first week and a half of the month, then the rest of the month, or anything like that. For that, you still need to create a date table. But anything that can be expressed as equally sized buckets, based on an integer number of periods, works fine.

For example, if you try to use a fractional number:

SELECT DATE_BUCKET(year,.5,TimeValue) as DateGroup, COUNT(*) as GroupCount
FROM   #ExampleData
GROUP BY DATE_BUCKET(year,.5,TimeValue);

You get this somewhat confusing error message, since .5 is, after all, a positive value!

Msg 9834, Level 16, State 1, Line 75 Invalid bucket width value passed to date_bucket function. Only positive values are allowed.

Months

SELECT DATE_BUCKET(month,1,TimeValue) as DateGroup, COUNT(*) as GroupCount
FROM   #ExampleData
GROUP BY DATE_BUCKET(month,1,TimeValue)
ORDER BY DateGroup;

This returns:

DateGroup                   GroupCount
--------------------------- -----------
2025-01-01 00:00:00         44640
2025-02-01 00:00:00         40320
2025-03-01 00:00:00         44640
2025-04-01 00:00:00         43200
2025-05-01 00:00:00         44640
2025-06-01 00:00:00         43200
2025-07-01 00:00:00         44640
2025-08-01 00:00:00         44640
2025-09-01 00:00:00         43200
2025-10-01 00:00:00         44640
2025-11-01 00:00:00         43200
2025-12-01 00:00:00         44640

Note that, of course, the groups are not equal in rowcount/size, because not every month has the same number of days. Half years:

SELECT DATE_BUCKET(month,6,TimeValue) as DateGroup, COUNT(*) as GroupCount
FROM   #ExampleData
GROUP BY DATE_BUCKET(month,6,TimeValue)
ORDER BY DateGroup;

This returns:

DateGroup                   GroupCount
--------------------------- -----------
2025-01-01 00:00:00         260640
2025-07-01 00:00:00         264960

Quarters

SELECT DATE_BUCKET(month,3,TimeValue) as DateGroup, COUNT(*) as GroupCount
FROM   #ExampleData
GROUP BY DATE_BUCKET(month,3,TimeValue)
ORDER BY DateGroup;

Now we have 4 groups:

DateGroup                   GroupCount
--------------------------- -----------
2025-01-01 00:00:00         129600
2025-04-01 00:00:00         131040
2025-07-01 00:00:00         132480
2025-10-01 00:00:00         132480

Weeks

If you want to group by weeks, that is easy enough. As an example, some organizations work on a 13-month system, where there are 4 weeks per “month”.

SELECT DATE_BUCKET(Week,4,TimeValue) as DateGroup, COUNT(*) as GroupCount
FROM   #ExampleData
GROUP BY DATE_BUCKET(Week,4,TimeValue)
ORDER BY DateGroup;

Now we get 14 rows out, with partial groups on each end of the listing. As you can see, those have fewer rows, because the only data in the table is for 2025, and the rows early in the year group to a starting date in 2024:

DateGroup                   GroupCount
--------------------------- -----------
2024-12-16 00:00:00         17280
2025-01-13 00:00:00         40320
2025-02-10 00:00:00         40320
2025-03-10 00:00:00         40320
2025-04-07 00:00:00         40320
2025-05-05 00:00:00         40320
2025-06-02 00:00:00         40320
2025-06-30 00:00:00         40320
2025-07-28 00:00:00         40320
2025-08-25 00:00:00         40320
2025-09-22 00:00:00         40320
2025-10-20 00:00:00         40320
2025-11-17 00:00:00         40320
2025-12-15 00:00:00         24480

The starting point

One thing to note in the previous example is that Monday is the start of the week; as briefly mentioned earlier in the post, the starting bucket for weeks is Monday, 1900-01-01 (which also means years start on 01-01, and months do too).

What can be interesting is that this does not correspond to your @@DATEFIRST value; DATE_BUCKET ignores that setting, so if you want a different week start, you need to set it in the function call using another parameter. For example, on my server:

SELECT @@DATEFIRST;

This returns 7 (Sunday), not Monday. But you can see that the first grouping in the previous query was 2024-12-16, and that is a Monday:

WITH BaseRows AS (
  SELECT DATE_BUCKET(Week,4,TimeValue) as DateGroup, COUNT(*) as GroupCount
  FROM   #ExampleData
  GROUP BY DATE_BUCKET(Week,4,TimeValue)
)
SELECT *, DATENAME(weekday,DateGroup) as DayOfTheWeek
FROM   BaseRows
ORDER BY DateGroup;

This returns:

DateGroup                   GroupCount  DayOfTheWeek
--------------------------- ----------- ------------------------------
2024-12-16 00:00:00         17280       Monday
2025-01-13 00:00:00         40320       Monday
2025-02-10 00:00:00         40320       Monday
2025-03-10 00:00:00         40320       Monday
2025-04-07 00:00:00         40320       Monday
2025-05-05 00:00:00         40320       Monday
2025-06-02 00:00:00         40320       Monday
2025-06-30 00:00:00         40320       Monday
2025-07-28 00:00:00         40320       Monday
2025-08-25 00:00:00         40320       Monday
2025-09-22 00:00:00         40320       Monday
2025-10-20 00:00:00         40320       Monday
2025-11-17 00:00:00         40320       Monday
2025-12-15 00:00:00         24480       Monday

If you want to change that, you can set the origin date yourself (the optional fourth parameter briefly mentioned earlier), using a call like this: DATE_BUCKET(Week,4,TimeValue,cast('2025-01-01' as datetime2)). The origin will not accept a string literal either, which makes even less sense than for the date value parameter, but it is what it is:

WITH BaseRows AS (
  SELECT DATE_BUCKET(Week,4,TimeValue,cast('2025-01-01' as datetime2)) as DateGroup,
         COUNT(*) as GroupCount
  FROM   #ExampleData
  GROUP BY DATE_BUCKET(Week,4,TimeValue,cast('2025-01-01' as datetime2))
)
SELECT *, DATENAME(weekday,DateGroup) as DayOfTheWeek
FROM   BaseRows
ORDER BY DateGroup;

Now in the output, you will see that each group is set to Wednesday, which was the first day of 2025.

DateGroup                   GroupCount  DayOfTheWeek
--------------------------- ----------- ------------------------------
2025-01-01 00:00:00.0000000 40320       Wednesday
2025-01-29 00:00:00.0000000 40320       Wednesday
2025-02-26 00:00:00.0000000 40320       Wednesday
2025-03-26 00:00:00.0000000 40320       Wednesday
2025-04-23 00:00:00.0000000 40320       Wednesday
2025-05-21 00:00:00.0000000 40320       Wednesday
2025-06-18 00:00:00.0000000 40320       Wednesday
2025-07-16 00:00:00.0000000 40320       Wednesday
2025-08-13 00:00:00.0000000 40320       Wednesday
2025-09-10 00:00:00.0000000 40320       Wednesday
2025-10-08 00:00:00.0000000 40320       Wednesday
2025-11-05 00:00:00.0000000 40320       Wednesday
2025-12-03 00:00:00.0000000 40320       Wednesday
2025-12-31 00:00:00.0000000 1440        Wednesday

Very cool stuff, but perhaps a wee bit tricky too when you are first starting to work with it.

A brief example at a grain less than a day

Remember that our sample data is to the minute:

SELECT TOP 10 *
FROM   #ExampleData
ORDER BY TimeValue asc;

This returns:

ExampleDataId TimeValue
------------- ---------------------------
1             2025-01-01 00:00:00
2             2025-01-01 00:01:00
3             2025-01-01 00:02:00
4             2025-01-01 00:03:00
5             2025-01-01 00:04:00
6             2025-01-01 00:05:00
7             2025-01-01 00:06:00
8             2025-01-01 00:07:00
9             2025-01-01 00:08:00
10            2025-01-01 00:09:00

You can use the different time units for your data, even down to milliseconds, but for my quick example, I will do 10-minute time periods (still starting the grouping at the start of the year):

WITH BaseRows AS (
  SELECT DATE_BUCKET(Minute,10,TimeValue,cast('2025-01-01' as datetime2)) 
                                                    as   DateGroup, 
         COUNT(*) as GroupCount
  FROM   #ExampleData
  GROUP BY DATE_BUCKET(Minute,10,TimeValue,cast('2025-01-01' as datetime2))
)
SELECT TOP 10 *, DATENAME(weekday,DateGroup) as DayOfTheWeek
FROM   BaseRows
ORDER BY DateGroup;

The output here is:

DateGroup                   GroupCount  DayOfTheWeek
--------------------------- ----------- ------------------------------
2025-01-01 00:00:00.0000000 10          Wednesday
2025-01-01 00:10:00.0000000 10          Wednesday
2025-01-01 00:20:00.0000000 10          Wednesday
2025-01-01 00:30:00.0000000 10          Wednesday
2025-01-01 00:40:00.0000000 10          Wednesday
2025-01-01 00:50:00.0000000 10          Wednesday
2025-01-01 01:00:00.0000000 10          Wednesday
2025-01-01 01:10:00.0000000 10          Wednesday
2025-01-01 01:20:00.0000000 10          Wednesday
2025-01-01 01:30:00.0000000 10          Wednesday

And we can even shift the origin by a couple of minutes:

WITH BaseRows AS (
  SELECT DATE_BUCKET(Minute,10,TimeValue,cast('2025-01-01 00:02' 
                                         as datetime2)) as DateGroup, 
       COUNT(*) as GroupCount
  FROM   #ExampleData
  GROUP BY DATE_BUCKET(Minute,10,TimeValue,cast('2025-01-01 00:02' as datetime2))
)
SELECT TOP 10 *, DATENAME(weekday,DateGroup) as DayOfTheWeek
FROM   BaseRows
ORDER BY DateGroup;

Now the bucket boundaries are shifted by 2 minutes, and the first 2 minutes of the year fall into a group that started on Tuesday, 2024-12-31.


DateGroup                   GroupCount  DayOfTheWeek
--------------------------- ----------- ------------------------------
2024-12-31 23:52:00.0000000 2           Tuesday
2025-01-01 00:02:00.0000000 10          Wednesday
2025-01-01 00:12:00.0000000 10          Wednesday
2025-01-01 00:22:00.0000000 10          Wednesday
2025-01-01 00:32:00.0000000 10          Wednesday
2025-01-01 00:42:00.0000000 10          Wednesday
2025-01-01 00:52:00.0000000 10          Wednesday
2025-01-01 01:02:00.0000000 10          Wednesday
2025-01-01 01:12:00.0000000 10          Wednesday
2025-01-01 01:22:00.0000000 10          Wednesday

Summary

SQL Server 2022 added the DATE_BUCKET function, which lets you very easily group data into evenly sized date ranges without needing a special table or writing your own grouping expression. It is clearly syntactic sugar, in that you could do the same thing with existing functions, but it makes the job far quicker and easier.
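For comparison, here is a sketch of the classic pre-2022 idiom for the same month-level rounding, using DATEADD/DATEDIFF against the 0 date (1900-01-01, the same default origin DATE_BUCKET uses):

```sql
-- Old-school month truncation: count whole months since the 0 date (1900-01-01),
-- then add that many months back to the 0 date
SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, TimeValue), 0) AS DateGroup,
       COUNT(*) AS GroupCount
FROM   #ExampleData
GROUP BY DATEADD(MONTH, DATEDIFF(MONTH, 0, TimeValue), 0)
ORDER BY DateGroup;
```

This produces the same groups as DATE_BUCKET(month,1,TimeValue), just with more ceremony.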

You can group to centuries or milliseconds, depending on your need. Which is pretty cool.
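As a quick sketch of both extremes (there is no century datepart, so 100-year buckets stand in for centuries):

```sql
-- 100-year buckets, anchored to the default origin 1900-01-01; returns 2000-01-01
SELECT DATE_BUCKET(year, 100, CAST('2025-12-30' AS date));

-- 500-millisecond buckets; .145 rounds down to the .000 boundary
SELECT DATE_BUCKET(millisecond, 500, CAST('2025-12-30 03:45:16.145' AS datetime2(3)));
```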

4 responses to “Cool features in SQL Server I missed…DATE_BUCKET”

  1. paschott

    Curious about the performance differences for date_bucket vs the “old school” ways of doing it. I’d hope it would be similar and/or faster. I know over the years I’ve written a bunch of “group by year/month” or similar bits of code for rolling up to a monthly aggregate. 🙂


    1. Louis Davidson

      I haven’t done any performance testing yet. Just some queries to see what it would allow you to do.

      I can’t see that it could be worse than using datepart/datediff functions that can’t use indexes either.

      Date dimensions probably are better when you have date values, something I will play with sometime.


      1. paschott

        I’ve done date dimension with “INT” dates (yyyymmdd) before. Not pretty, but smaller than the alternatives in many cases and fast. I think when “DATE” was introduced as a datatype, I started using that. But always had to do some craziness around the rollups people wanted. Having a good Calendar/Date table helped a lot there – storing metadata around current month, year, day, holiday, weekend, fiscal details, and so on. Rolling up against that was a whole lot easier than some of the calculations and hoops we went through. 🙂


      2. Louis Davidson

        Yeah, we still have int dates in our data warehouse. When I test it I will use date values in my date table to keep it simple.

