T-SQL Tuesday #136: Your Favorite (or Least Favorite) Data Type

Page content

T-SQL Tuesday is a monthly blog party hosted by a different community member each month, and this month Brent Ozar (blog | twitter) asks us to talk about data types.

T-SQL Tuesday Logo

Your mission: write a blog post about your favorite data type, and schedule it for next Tuesday, March 9.

Mad about money? Feverish about float? Tell us what you use your favorite data type for, and what people need to know before they get started using it. Data types have so many interesting edge cases, like when Aaron Bertrand taught us to use lower case for our data types.

In his original invitation title, Brent did open the possibility for us to write about our least-favorite type, so that’s the route I’m going.

Not Just One

My least favorite data type? It’s user-defined data types.

In this post I’m talking about scalar types, not CLR or table types. At first, these seem like a great idea.

  • I can create my own data type names!
  • I can standardize on the length and precision of various pieces of data in one place!
  • No piece of software is gonna tell me what I can do, I’m telling it what to do!

But there’s a catch. There are a few catches.

Prix Fixe

Here’s the thing - these types are really just aliases for native types in SQL Server, but more constrained. Constrain yourself to UDTs and you’ll have trouble right-sizing your fields. Let’s say you’ve got three data types for text data:

  • myShortString (varchar(10))
  • myString (varchar(256))
  • myBigString (varchar(8000))

These lengths are not helping anyone. You can’t store email addresses or names in myShortString. But myString is probably way too much for that data. You’re going to waste memory because of how SQL Server estimates memory grants and your indexes will be bloated. Maybe you just need to create more UDTs to cover these situations. But that just compounds the other problems, doesn’t it?

What’s My Name Again?

It’s long been said that naming things is one of the two hard things in computer science. When you create a UDT, you have to name it. If you want a meaningful name, that’s really hard! How much data can I store in a column of type myString? I have to look it up. What’s the difference between myShortString and myString? 🤷‍♂️

Set in Stone

Once you create a UDT, you can’t ALTER it. Let’s say, for argument’s sake, you’ve created a UDT for your alphanumeric customer IDs. Business is booming, and a couple years down the road, you realize you need to add a few characters to the length of the field to keep up. You can’t just change the definition of your UDT to accommodate. Instead, you have to do one of two things:

  1. Create a new UDT, then run an ALTER TABLE ALTER COLUMN everywhere the old UDT was used to use the new one, then DROP the old UDT if it’s no longer needed
  2. ALTER all of the columns using the UDT to be the standard type equivalent, then drop the UDT, re-create it with the same name, then ALTER all the columns back to the UDT (which appears to be how SQL Server Data Tools/sqlpackage.exe handles it)

Both of those are extra steps beyond using native types. And depending upon which avenue you choose, may lead to confusion.

Not the Whole Story

sp_help will happily tell you the name, length, precision, and scale of the columns using UDTs on a given table. But that still doesn’t tell you enough about the base type. It won’t differentiate between varchar and nvarchar. Which means you need to work through sys.types to get to the native type to find out what you’re really working with. Not exactly a chore but sp_help is just so darn convenient.

Same, But Different

You can create multiple UDTs with the same name in the same database, but with different definitions. How? By defining them in different schemas. But a given UDT isn’t restricted to being used in tables that schema. sp_help doesn’t give an indication of which schema the UDT on the table is coming from, so again you’re left scratching your head unless you query sys.types (gosh, this post is starting to sound like an indictment of sp_help, isn’t it?).

This is probably not a scenario that will come up often, but if it’s done it can cause a lot of confusion.

Crossing Databases

If you select * Into NewTable from TableWithUdt, the new table will use your UDT because they’re both in the same database. But what if you need to do a quick copy of a table into another database? select * into OtherDB.dbo.NewTable from TableWithUdt will create that new table using the UDT’s base type, whether a UDT of the same name/definition exists in the destination database or not. Whatever measure of clarity or control we thought we were getting from using a UDT has now disappeared.

Icing on the Cake

And then, on top of all of this - UDTs aren’t usable with temp tables at all.

I Feel Better Now

Whew. It feels great to get all that off my chest. I guess the main takeaway here is that I have some pretty strong reservations about using user-defined data types. Thanks to Brent for hosting T-SQL Tuesday this month and giving us a great topic to write about!