Anti-Simplification

Our anonymous submitter relates a tale of simplification gone bad. As this nightmare unfolds, imagine the scenario of a new developer coming aboard at this company. Imagine being the one who has to explain this setup to said newcomer.

Imagine being the newcomer who inherits it.

A "Storm P machine" - the Danish equivalent of a Rube Goldberg machine.

David's job should have been an easy one. His company's sales data was stored in a database, and every day the reporting system would query a SQL view to get the numbers for the daily key performance indicators (KPIs). Until the company's CTO, who was proudly self-taught, decided that SQL views are hard to maintain, and the system should get the data from one of those new-fangled APIs instead.

But how does one call an API? The reporting system didn't have that option, so the logical choice was Azure Data Factory to call the API, then output the data to a file that the reporting system could read. The only issue was that nobody on the team spoke Azure Data Factory, or for that matter SQL. But no problem, one of David's colleagues assured, they could do all the work in the best and most multifunctional language ever: C#.

But you can't just write C# in a data factory directly, that would be silly. What you can do is have the data factory pipeline call an Azure function, which calls a DLL that contains the bytecode from C#. Oh, and a scheduler outside of the data factory to run the pipeline. To read multiple tables, the pipeline calls a separate function for each table. Each function would be based on a separate source project in C#, with 3 classes each for the HTTP header, content, and response; and a separate factory class for each of the actual classes.

After all, each table had a different set of columns, so you can't just re-use classes for that.

There was one little issue: the reporting system required an XML file, whereas the API would export data in JSON. It would be silly to expect a data factory, of all things, to convert this. So the CTO's solution was to have another C# program (in a DLL called by a function from a pipeline from an external scheduler) that reads the JSON document saved by the earlier program, uses foreach to go over each element, then saves the result as XML. A distinct program for each table, of course, requiring distinct classes for header, content, response, and factories thereof.

Now here's the genius part: to the C# class representing the output data, David's colleague decided to attach one different object for each input table required. The data class would use reflection to iterate over the attached objects, and for each object, use a big switch block to decide which source file to read. This allows the data class to perform joins and calculations before saving to XML.

To make testing easier, each calculation would be a separate function call. For example, calculating a customer's age was a function taking struct CustomerWithBirthDate as input, use a foreach loop to copy all the data except replacing one field, and return a CustomerWithAge struct to pass to the next function. The code performed a bit slowly, but that was an issue for a later year.

So basically, the scheduler calls the data factory, which calls a set of Azure functions, which call a C# function, which calls a set of factory classes to call the API and write the data to a text file. Then, the second scheduler calls a data factory, which calls Azure functions, which call C#, which calls reflection to check attachment classes, which read the text files, then call a series of functions for each join or calculation, then call another set of factory classes to write the data to an XML file, then call the reporting system to update.

Easy as pie, right? So where David's job could have been maintaining a couple hundred lines of SQL views, he instead inherited some 50,000 lines of heavily-duplicated C# code, where adding a new table to the process would easily take a month.

Or as the song goes, Somebody Told Me the User Provider should use an Adaptor to Proxy the Query Factory Builder ...

[Advertisement] ProGet’s got you covered with security and access controls on your NuGet feeds. Learn more.

This post originally appeared on The Daily WTF.

Leave a Reply

Your email address will not be published. Required fields are marked *