T Sql Tuesday 170: Learning From Abandoned Projects

2024-02-20 (Last Modified: 2024-05-11)

T-SQL Tuesday is a monthly blog party hosted by a different community member each month. I missed out on January 2024’s edition because I didn’t think I had anything good to talk about, but it suddenly hit me in February so…retroactive T-SQL Tuesday!

For January 2024, Reitse Eskens (blog | twitter) prompts us to talk about our learnings from abandoned or failed projects.

the main intent of this blog is to trigger your stories; what projects did you abandon but learn a lot from OR what’s your favourite learning from a failure.

A number of years ago, in an attempt to learn more about working with geospatial data in SQL Server, I started to build out a database with data that had practical application to one of my hobbies, and a dataset I was already familiar with. It seemed simple enough (that’s always how it starts, right?), but things stalled out.

I’ve been geocaching for over 20 years, and there’s a very popular application for managing data about geocaches called GSAK which is feature-rich and very customizable. Over more than a decade, I’ve been maintaining multiple databases with this application, totaling over 70,000 geocaches and their associated waypoints and logs. GSAK stores all of its data in a SQLite database.

Digging around in this database, I saw a few design decisions that I didn’t like. Combined with my desire to learn more about geospatial types in SQL Server, I hatched a plan. I’d “just” create my own schema in SQL Server, write up some queries and scripts to do an ETL from SQLite to SQL Server, and seed a database with all of this data.

I didn’t want to deal with building a SSIS project for this endeavour, so I went the “just code” route. I started out trying to work through SQL Server exclusively by creating Linked Servers to the SQLite databases and querying the tables directly. I struggled a bit just getting those created (which is why I wrote the post) but got it working. Sort of.

Some of my queries worked just fine, but querying tables with large amounts of text just didn’t work. I even wrote up a lengthy DBA.se post seeking assistance, to no avail. This took the wind out of my sails. For a “just messing around” project, I wasn’t sure it was worth this much trouble and I gave up.

There are a few more approaches I could and should have tried, but the failure to do it the way I wanted to really discouraged me. In highsight, I should have tried just extracting the data to CSV via the SQLite client, then importing that into SQL Server. Or maybe try a different language/environment, like PowerShell or Python. Along the way, I livestreamed a couple of my coding sessions, so I learned how to get that working with OBS.

But I might bring the project back. There seems to be a newer version of the SQLite ODBC drivers, so maybe it’s a bug that’s been fixed. Or I can take crack at the CSV export/import. Also, I’ve decided that 2024 is the year I need to learn PostgreSQL, so maybe I can make this work there “somehow.” And Postgres has geospatial extensions so that fits in nicely.

I did learn something in the process of creating the DSNs and Linked Servers. And I tried a lot of things trying to get OPENQUERY() to return the data so I could make use of it. If I use this project to learn Postgres, that’ll be a mountain of new things I’ll learn - even if it’s not 100% successful.