Picking Your Consultants

Inilock started making locks back in the 1880s, and has always had a conservative approach to changing things about how locks work. But the world has moved on, and the pin-and-tumbler has given way to RFID card readers and electromagnets.

Since Inilock didn't have the internal expertise to build industrial locking systems for commercial customers, they did what any company would do: they hired highly paid consultants. The project started in 2018. These consultants went out and built a lock firmware platform, a server, and a homegrown TCP protocol to handle configuration and setup, handed it in late and over budget, cashed their checks, and by 2022 had vanished.

The system was not too bad. It was extremely bad. The server used for configuring locks would hang on the regular. That was bad enough, but even worse, the locks would also stop responding. As it turns out, "have you tried turning it on and off again" is not something your customers want to hear when they're locked out of the building. The product was so bad, and had consumed so many resources, that Inilock was facing an existential threat to the company.

They hadn't gained any new expertise in software development, so they hired another consulting firm, which is where Christian enters this story. His team was brought in to try and fix this disaster before Inilock locked their own doors for good.

The first thing Christian did was track down the source control server and start reading through the code. The server was written in C#, and while the project started in 2018, every choice that was made was frozen someplace circa 2008: it used ADO.NET for database access, instead of any of the more modern frameworks that .NET had added.

That was annoying, but more telling was the release process. There was no CI/CD. A developer pulled the code, ran "Build…" on their local machine, and then uploaded the binary to an FTP server. Another tool could be run on the target network to grab the binary and distribute it to all the locks on that network.

That gave Christian a sense of the overall care that went into the project, but it was when investigating the network protocol and how it was handled that he started to understand why using the software was such a terrible experience.

The configuration application wrote records to an MS-SQL database. Another service queried that database periodically. When the data changed, it would broadcast out the new configuration to all the locks via a homegrown TCP protocol. It would read the data from the database, convert it to XML, blast the XML across a TCP socket to each client, and then query the database again, in an endless loop. Well, mostly endless.

What the service didn't do was any sort of error handling. Oh, and it didn't insert any breaks in the stream, or give the client any way of knowing how much data it should be expecting.

Every client just had a buffer, scanned the buffer every 50ms, and assumed that everything in the buffer was a single message from the server. This was fine in a laboratory environment where there was very little traffic or latency on the network, but in a real network the clients would frequently check the buffer before the entire message had arrived, or end up with two messages in the buffer. There was no error handling on the clients either, so they'd just hang on the bad data.
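For contrast, here is a minimal Python sketch of the usual cure for this kind of problem: length-prefixed framing. This is not the consultants' protocol, nor whatever library Christian's team ended up using. The 4-byte header and the names `encode_frame` and `FrameDecoder` are assumptions made up for illustration.

```python
import struct

# Assumed wire format: a 4-byte big-endian length, followed by the payload.
HEADER = struct.Struct("!I")


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length so the receiver knows where it ends."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Accumulates raw bytes from a socket and yields only complete messages."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add newly received bytes; return every message that is now complete."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer, 0)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # Partial message: wait for more bytes instead of guessing.
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]  # Keep any bytes belonging to the next message.
        return messages
```

With something like this, a client that checks its buffer too early simply gets an empty list back, and two messages that arrive together come out as two separate items rather than one unparseable blob.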

And since the server wasn't doing any error handling, timeouts or any asynchronous messaging, it'd hang when enough of those connections went down.
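As a rough illustration of what was missing on the server side, the Python sketch below puts a timeout on each connection and catches failures per client, so one dead lock can't stall the whole loop. The function name `broadcast`, the two-second default, and the connect-per-send approach are all assumptions for the sake of the example, not a description of the real system.

```python
import socket


def broadcast(clients, payload, timeout_seconds=2.0):
    """Send payload to each (host, port) pair; return the addresses that failed.

    Hypothetical helper: every connection gets a timeout, and an error on one
    client is recorded rather than allowed to stop the remaining sends.
    """
    failed = []
    for address in clients:
        try:
            # The timeout covers both the connect and the subsequent send.
            with socket.create_connection(address, timeout=timeout_seconds) as conn:
                conn.sendall(payload)
        except OSError:
            # Covers refused connections, unreachable hosts, and timeouts.
            failed.append(address)
    return failed
```

This version is still sequential, so a long list of unreachable locks would slow it down, but it would always finish, and the caller gets a list of the locks that need a retry.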

Ripping out the messaging layer and replacing it with a fit-for-purpose 3rd-party library was a great deal of work, but that alone was enough to fix the reliability problems and boost performance of the system by 1000%. The architecture is still a ridiculous disaster, the UI is still a nightmare, there are still all sorts of bad choices and sources of crashes, and the only bits of the code base that have any automated tests are the ones which Christian's team touched. But it's gone from a massive trainwreck of toxic waste to a moderately sized trainwreck of toxic waste.

Whether it's enough to save Inilock remains to be seen, but when Christian cashes his checks for being a consultant, he at least knows he made things better.

This post originally appeared on The Daily WTF.