T-SQL Tuesday #118 – Your Fantasy SQL Feature

Andy Levy

2019-09-10

Page content

It’s early September, which means it’s time for T-SQL Tuesday! This month’s topic comes from Kevin Chant (blog | twitter). Our mission, should we choose to accept it (click the image to see the original invite):

Recently I’ve had to submit suggestions to Microsoft about Azure DevOps and SQL Server.

I will admit a couple of the suggestions had certainly been in my head for a while. In fact, I wish I had suggested them sooner.

My invitation to yourselves this month is to write a post about a fantasy SQL Server feature you’ve got in mind.

It can be absolutely anything at all related to SQL Server. For example, it could be about a new SQL Server operator to improve queries or a new service that does something amazing.

I think anyone who has worked with SQL Server for a while has at least one of these. I posted a while back about my desire for Script Update Mode to be run in parallel, to speed startup with new Service Packs and Cumulative Updates (my latest CU took 4 1/2 hours). But I’m not going to recycle that suggestion.

Divide and Conquer

So, here’s my fantasy feature: parallel query execution. But it’s not what you think. SQL Server has had the ability to create a query plan that uses multiple parallel threads for as long as I can remember. We can control how far parallel a query will go and how expensive a query has to be before it’s considered for parallelization with max degree of parallelism and cost threshold for parallelism. That’s all well and good.

No, what I want is the the ability to execute the same query against multiple databases on a single instance simultaneously and have the results merged into a single result set.

I’ve…Seen This Before?

You may be saying to yourself “self! I thought I could do this already!” And you’d be correct, from a certain point of view. With Registered Servers in SQL Server Management Studio, we can execute one query against multiple instances, and the results are compiled in to a single result set in SSMS for you to view.

Starting a multi-instance query

Multi-instance query results

In Azure SQL DB, you can use Elastic Query but it requires some setup and from what I’ve gathered, your environment really has to be designed to handle it.

We Aren’t All-In with the Cloud Yet

But this isn’t an option today for the on-premises, “boxed” SQL Server product. And probably not with Managed Instances either (I admit that I haven’t looked into beyond a cursory search). What I want is the ability to specify a collection of databases on my instance and the query to be executed. All through T-SQL.

Here’s what I propose it looking like (and if implemented, I’m sure it’ll look completely different).

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14


WITH MyMultiDBQuery
AS (
	SELECT UserName
		,LastLogin
		,IsActive
	FROM SiteUsers
	)
SELECT *
FROM MyMultiDBQuery OVER (
		SELECT name AS DBName
		FROM sys.databases
		WHERE database_id > 4
			AND name NOT LIKE '%tools%'
		);

DBName	UserName	LastLogin	IsActive
Cust001	Andy	2019-09-01 11:23:00	1
Cust002	SQLHat	2019-07-21 17:05:00	0

A few ideas of possibilities this opens up:

Query tables across many databases and compile the results into a single temp table or report
Run the same data or schema change query against multiple databases
Copy data from one database into multiple databases at once
Query across multiple databases from the application tier (if you trust the everyone to properly secure and not abuse it, that is) in a single batch

Can’t I Do This Already?

Well…sort of. You can write a cursor to loop over a list of database names, and inside that cursor write dynamic SQL for your queries. But it won’t be parallel, and it’s easy to throw a syntax error or even SQL injection into the mix with dynamic SQL.

3rd-party tools like Red Gate’s Multi Script can run a query across multiple databases simultaneously. You could even pull together some PowerShell and run that in threads, either via jobs (Start-Job, Start-RSJob, Start-ThreadJob) or rolling your own with Runspaces. But both of these solutions run independent batches and don’t give you a single unified result set without some extra work like writing to a table (temp or otherwise).

Combo Platter

Since this proposal is “just” T-SQL, and with Registered Servers we can execute the same T-SQL against multiple instances concurrently, can we take this to the next level and run my proposed query construct against all the selected registered servers? I don’t see why not. The results would look something like this:

Server Name	DBName	UserName	LastLogin	IsActive
localhost\sql16	Cust001	Andy	2019-09-01 11:23:00	1
localhost\sql17	Cust002	SQLHat	2019-07-21 17:05:00	0

What Do You Think?

Let me know what you think in the comments below. Would this be a valuable feature for you? Do other multi-tenant, database-per-tenant environments face the same challenges that would be aided by this feature? Would there be value in allowing the user to set the number of concurrent queries this can execute (maybe up to MAXDOP as configured on the server)?