Regular Expressions to Generate Disable/Drop Index Scripts
Earlier this year, I embarked on a bit of a project to tidy up the indexes in a sizeable database. This database has over 900 tables, and there are quite a few indexes which I have long suspected don’t need to exist for a variety of reasons. Some indexes are redundant, others don’t get used, still others could be combined into a single index. By reducing the number of indexes, we can improve write performance in the database, and the size of the database itself. But we have to do it carefully.
Data Collection
Before doing anything with these indexes, I needed data about them. Enter Brent Ozar’s sp_BlitzIndex
(part of the First Responder Kit). Running this stored procedure with the @mode=4
parameter, I was able to collect usage statistics for the most interesting indexes in the database. These statistics do not persist across SQL Server instance restarts, so I ran sp_BlitzIndex
immediately before a restart, but after the instance had been running for long enough to cover “normal” activity. Then I saved it off to an Excel file for analysis. You can do this with a Powershell one-liner using dbatools
and ImportExcel
.
Invoke-DbaQuery -SqlInstance MyInstance -Query "exec sp_BlitzIndex @DatabaseName=MyDB,@Mode=4;" | Export-Excel -Path ~\Documents\IndexAnalysis.xlsx`;
What’s the Issue?
For this project, two of my biggest concerns are:
- Indexes that are written, but never read
- Indexes with a very low read:write ratio
These are indexes which are potentially causing a drain on my write performance while providing little to no value on reads. For example:
Reads: 5,692 Writes: 8,683,633 on: dbo.tablename.indexname (10)
We’re spending a lot of effort maintaining indexes like this, but rarely looking at them. These are good candidates for elimination.
But There Are So Many!
In my case, there were several dozen indexes that fit this description. Manually writing out all of the statements to disable and then drop them was going to be tedious, and I don’t enjoy writing text-processing formulas in Excel. However, I do enjoy a good regular expression and my favorite text editor, Notepad++, supports them in its find/replace function. So I copied and pasted the index stats seen above into Notepad++, and used the following find/replace:
Find:
^[\w\s\:,]+(dbo\.\w+)\.(\w+)\s\(\d+\)$
Replace:
alter index $2 on $1 disable;
Result:
alter index indexname on tablename disable;
Here’s the explanation of the regular expression:
- Starting at the beginning of the string
- Look for one or more word, whitespace,
:
and comma characters - Look for the string
dbo.
followed by one or more word characters. Save this collection of characters for use later. - Look for a
.
- Look for one or more word characters. Save this collection of characters for use later
- Look for one or more whitespace characters, then one or more numbers inside parens, ending at the end of the string or line.
The collections of characters being “saved for later” are called capture groups and can be referenced in Notepad++’s find/replace (and other regex engines) with $1
, $2
, and so on, sequentially.
We could do this right in Azure Data Studio, Visual Studio Code, or SSMS as well if we really wanted to.
I Thought You Were A PowerShell Stan, Andy!
Yes, we can do it with PowerShell too:
$string = "Reads: 5,692 Writes: 8,683,633 on: dbo.tablename.indexname (10)";
$regex = "^[\w\s\:, ]+(dbo\.\w+)\.(\w+)\s\(\d+\)$";
$MatchFound = $string -match $regex;
if ($MatchFound) {
"alter index $($Matches[2]) on $($Matches[1]) disable;";
}
When doing regular expression matches in PowerShell, the capture groups start at the position 1 of the $Matches
collection. Position 0, or $Matches[0]
, is the full string that was matched by the expression. You can also name the capture groups, but that’s a subject for another time.
If I were using PowerShell in my project, I’d loop over all of the results returned by sp_BlitzIndex
and process the applicable records one at a time in the loop, collecting all of the statements to output at the end.
Just Disabling the Indexes?
Yes! Rarely do I drop
an index immediately. I prefer to disable indexes that are candidates to be dropped and let things run for a while. This lets me monitor for any negative impacts and if something goes off the rails, I can rebuild
an index (which re-enables it) to get things back in line.
After a few weeks if I’m satisfied that everything is copacetic, I can drop
the indexes entirely. We can use the same regular expression for the “find”, and replace using the same capture groups.
drop index $2 on $1;
Wrap Up
All told, turning the output of sp_blitzIndex
into dozens of alter index
and drop index
statements via regular expression was much quicker for me than trying to parse it out in Excel, or doing multiple find/replace operations on plain strings. And it was a lot more fun, too. It may be different for you. But hopefully this is another item you can put into your toolkit and have at your disposal when the opportune moment arrives.