In the course of testing a major upgrade, one of my users in Accounting happened upon a problem with one of her scenarios. The web app we were working on had a habit of “locking up” on people when loading some pages; in most cases, it was because the server was pushing a huge HTML table to the client, and most web browsers struggle when faced with several megabytes of markup and thousands of rows in a single table. Digging into the source code for the page and SQL Profiler (yes, I know Extended Events are a thing), we were able to isolate the query.
The query started innocently enough.
SELECT TOP 301 Field1, Field2 FROM Ledger1
This system uses
TOP every now and then trying to limit the number of records it gets back (and the developers always seem to use the arbitrary
301 - I’m guessing some degree of cargo cult programming going on). I’d prefer a well-constructed
WHERE clause to limit the result set but beggars can’t be choosers when working with legacy vendor code.
What I found odd was that the
Ledger1 table didn’t get a lot of traffic - with the
WHERE clause in use (omitted here for brevity), you’d only get a handful of records, maybe a dozen at most. The query continued…
Now we’re onto something. In comparison to
Ledger2 is huge - at least one order of magnitude larger. And with that same
WHERE clause, you could easily pull back a couple hundred to a couple thousand records from
Where did the developers go wrong? In a query with
UNIONs, you really have multiple independent, distinct queries whose results are getting glued together before being sent back to the client. In this case, the
TOP 301 was only applied to the first subquery. I suspect that the developers' intent was to limit the entire result set to 301 records, but never tested with enough data in either table to know that this was working properly.
To properly limit the results of a
UNION query, we have to wrap the thing in parenthesis and treat it as a subquery.
While this is better, I’m not a huge fan of it because it’s still pulling a lot of data back from
Ledger2 that just gets thrown away. As I said above, I’d prefer to have the results constrained by the
WHERE clause as a well-designed index can help out.
In this case the user was refunding payments, and it’s very likely that payments made two years in the past wouldn’t normally be refunded. Perhaps a business policy could be implemented here to limit the impact on the system by allowing us to only look back 6-12 months for records, instead of searching the full account history. There isn’t always a purely technical solution; it can be beneficial to speak with the business and end user to ask “do you really need this, or can we change our processes so cover 99% of scenarios and handle the edge cases on an individual basis?”.