News Roundup: A Brillant Copilot

The story of the week in software development is GitHub's Copilot, which promises to throw machine learning at autocomplete for a "smarter" experience.

Notably, one of their examples highlights its ability to store currency values in a float. Or to generate nonsense. Or to output GPLed code that was in its training set.
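If you've somehow never been bitten by that first one: JavaScript numbers are IEEE 754 doubles, so here's a two-line refresher on why floats make bad currency:

    // Classic floating-point rounding, reproducible in any JavaScript console:
    console.log(0.10 + 0.20);           // 0.30000000000000004
    console.log(0.10 + 0.20 === 0.30);  // false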

That last one raises all sorts of questions about copyright law, the boundaries of what constitutes fair use and derivative works, and whether the GPL's virality can "infect" an ML model. These are questions I'm not qualified to answer, and they may not have good answers at the moment. And certainly, the answers that apply in the US may not apply elsewhere.

Besides, copyright law is boring. What's fun is that Copilot also spits up API keys, because it was trained on open source projects, and sometimes people mess up and commit their API keys into their source control repository. Oops.

And even their examples don't really constitute "good" code. Take their calculateDaysBetweenDates, straight from their website:

    function calculateDaysBetweenDates(date1, date2) {
        var oneDay = 24 * 60 * 60 * 1000;
        var date1InMillis = date1.getTime();
        var date2InMillis = date2.getTime();
        var days = Math.round(Math.abs(date2InMillis - date1InMillis) / oneDay);
        return days;
    }

Now, this code is fine for its stated task, because JavaScript has absolutely garbage date-handling, and developers are forced to do this themselves in the first place. But it's barely fine. It reads like a solution copy-pasted from Stack Overflow, and it fails the "single responsibility principle": the one method both calculates the difference and converts it to a unit of time (days, in this case). It's not WTF code, sure, but it's also not code that I'd give a thumbs up to in a code review, either.
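To be concrete, here's a minimal sketch of how those two responsibilities might be split; the names are mine, not Copilot's:

    // Hypothetical refactoring: separate computing the difference
    // from converting it to a unit.
    const MILLIS_PER_DAY = 24 * 60 * 60 * 1000;

    function millisBetweenDates(date1, date2) {
        return Math.abs(date2.getTime() - date1.getTime());
    }

    function millisToDays(millis) {
        return Math.round(millis / MILLIS_PER_DAY);
    }

    // Usage: millisToDays(millisBetweenDates(date1, date2));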

And it also ignores the right answer: use a date handling library, because, outside of the most limited cases, why on Earth would you write this code yourself?
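For instance, with a library like date-fns (my pick for illustration, not something Copilot suggests), the whole thing collapses to a single call:

    // Assuming date-fns is installed; differenceInDays returns
    // the number of full days between the two dates.
    const { differenceInDays } = require('date-fns');

    const days = Math.abs(differenceInDays(date2, date1));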

Or this code, also from their website:

    function nonAltImages() {
        const images = document.querySelectorAll('img');
        for (let i = 0; i < images.length; i++) {
            if (!images[i].hasAttribute('alt')) {
                images[i].style.border = '1px solid red';
            }
        }
    }

img:not([alt]) is a query selector that finds all the img tags that don't have an alt attribute. You could put that rule in your stylesheet instead of modifying the style property on each element directly. Though the :not pseudo-class isn't available in IE6, so maybe that makes my solution a non-starter.
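And even if you do want it in JavaScript, the selector can do the filtering for you. A minimal sketch of the same function:

    // Hypothetical rewrite: let the selector do the filtering.
    function nonAltImages() {
        for (const img of document.querySelectorAll('img:not([alt])')) {
            img.style.border = '1px solid red';
        }
    }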

I'm not picking on some tech-preview tool that's still working out the kinks here. A human being looked at these examples and decided they were a good way to demonstrate the power of their new tool. Presumably a group of people looked at this output and said, "Yeah, that's the stuff, that feels like magic." Which brings me to my real point.

Any ML system is only as good as its training data, and this leads to some seriously negative outcomes. We usually call this algorithmic bias, and we all know the examples. It's why voice assistants have a hard time with certain names or accents. It's why sentencing tools for law enforcement misclassify defendants. It's why facial recognition systems have a hard time with darker skin tones.

In the case of an ML tool that was trained on publicly available code, there's a blatantly obvious flaw in the training data: MOST CODE IS BAD.

Here at TDWTF, we try and curate the worst of the worst, because observing failure is often funny, and because we can learn from these mistakes. But also because this is us. We've all written bad code at some point, and we're all going to write bad code again. We tell ourselves we'll refactor, but we never do. We make choices that make sense now, but in six months a new feature breaks our design and we've gotta hack things together so we can ship.

Most of the code in the world is bad.

If you feed a big pile of open source code into an OpenAI model, the only thing you're doing is automating the generation of bad code, because most of the code you fed the system is bad. It's ironic that the biggest obstacle to automating programmers out of a job is that we are terrible at our jobs.

In any case, I hope someone scrapes TDWTF and trains a GPT-3 model off of it. We can then let the Paulas of the world retire.


