T-SQL Tuesday 177 - Managing Database Code

2024-08-12 (Last Modified: 2024-08-12)

T-SQL Tuesday is a monthly blog party hosted by a different community member each month. This month, Mala Mahadevan (blog) asks how we manage our database-related code.

Where do you keep your database code? Is it in a GIT-based repo, or just in the database the old-fashioned way?

Read on for the rest of the invitation, where Mala expands upon the question (and there is a lot to dig into).

This is a multi-part answer due to the fact that the various systems I work with have reached different levels of the Capability Maturity Model. I expect this is the case for many folks. Additionally, I work with at two different database platforms (MS SQL Server and MySQL) which were built upon by different groups with different tooling, varying levels of experience, and different amounts of planning devoted to managing their codebases and processes.

We self-host an instance of Gitlab, and anytime I refer to “git” below, I’m really referring to the whole git/Gitlab stack for repositories with managed version control. I’ll just use “git” as shorthand.

MS SQL Server

Every SQL Server database that’s been created since I started my current job has been built with Visual Studio Database Projects. The code is managed in git, and with one exception each database project has an accompanying PowerShell script that uses Publish-DbaDacPackage to deploy changes. The outlier uses a full CI/CD pipeline integrated with the rest of the corresponding application’s build process. For a variety of reasons (excuses, really), I don’t adhere super strictly to a rule of “everything goes into git first”, but I do keep git in sync with what’s deployed. I know I should, but I haven’t made the effort to make that pipeline automated enough to the point that I (and everyone else) use it.

But that brings me to the big database that predates me. The code behind that database is a combination of a code generator and migration scripts, all of which is in git and executed by the CI/CD pipeline after first being tested on a developer’s local copy of the empty database. I’d love to move this into Visual Studio but the current database has so many broken references and other errors flagged by Visual Studio that just getting current state to the point where it’s usable there is going to be a Herculean task.

MySQL

The management of MySQL database scripts is not quite so well-refined. Those databases have been evolved over decades and they were not set up to be managed in what one might call a modern way. No CI/CD. Most of the object definitions aren’t even in git. As we make changes, we put the “new/current” version of the objects we’re changing into git, but this will never reach 100% as there are huge swaths we’ll never touch again. We also write migration scripts to deploy those changes and put them into git, but they’re manually executed during deployments.

Other

I also have a lot of “utility” code kicking around. One-off SQL queries, queries I use to generate reports on an infrequent basis, PowerShell scripts and functions, things like that. One of the first things I did was to get my own repository set up for these so that I’d have a convenient backup and to easily share my work with others. Again, no CI/CD pipeline, but the repository is referenced frequently in my documentation, with direct links to the relevant scripts.