Now for AI’s Latest Trick: Writing Computer Code

It can take years to learn how to write computer code well. SourceAI, a Paris startup, thinks programming shouldn’t be such a big deal.

The company is fine-tuning a tool that uses artificial intelligence to write code based on a short text description of what the code should do. Tell the company’s tool to “multiply two numbers given by a user,” for example, and it will whip up a dozen or so lines in Python to do just that.
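
What such output looks like varies by tool, but for a prompt like that, a code generator typically produces something along these lines (an illustrative sketch, not actual SourceAI output):

```python
# Illustrative sketch of the kind of Python a code-generation tool might
# return for "multiply two numbers given by a user" -- not SourceAI's output.

def multiply(a: float, b: float) -> float:
    """Return the product of two numbers."""
    return a * b


if __name__ == "__main__":
    first = float(input("Enter the first number: "))
    second = float(input("Enter the second number: "))
    print(f"{first} x {second} = {multiply(first, second)}")
```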

SourceAI’s ambitions are a sign of a broader revolution in software development. Advances in machine learning have made it possible to automate a growing array of coding tasks, from auto-completing segments of code and fine-tuning algorithms to searching source code and locating pesky bugs.

Automating coding could change software development, but the limitations and blind spots of modern AI may introduce new problems. Machine-learning algorithms can behave unpredictably, and code generated by a machine might harbor harmful bugs unless it is scrutinized carefully.

SourceAI and similar programs aim to take advantage of GPT-3, a powerful AI language model announced in May 2020 by OpenAI, a San Francisco company focused on making fundamental advances in AI. The founders of SourceAI were among the first few hundred people to get access to GPT-3. OpenAI has not released the code for GPT-3, but it lets some users access the model through an API.
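
At the time, that access went through OpenAI's completion endpoint. A minimal sketch of such a request, assuming the legacy openai Python client and the "davinci" engine name of that era rather than anything SourceAI has confirmed using, looked roughly like this:

```python
# Sketch of a GPT-3 completion request circa 2020-2021 via the legacy openai
# Python client. The engine name, prompt, and parameters are illustrative
# assumptions, not SourceAI's actual configuration.
import openai

openai.api_key = "YOUR_API_KEY"  # keys were issued to approved beta users

response = openai.Completion.create(
    engine="davinci",
    prompt="# Python 3\n# Multiply two numbers given by the user\n",
    max_tokens=100,
    temperature=0,  # low temperature keeps the generated code more predictable
)

print(response["choices"][0]["text"])
```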

GPT-3 is an enormous artificial neural network trained on huge gobs of text scraped from the web. It does not grasp the meaning of that text, but it can capture patterns in language well enough to generate articles on a given subject, summarize a piece of writing succinctly, or answer questions about the contents of documents.

“While testing the tool, we realized that it could generate code,” says Furkan Bektes, SourceAI’s founder and CEO. “That’s when we had the idea to develop SourceAI.”

He wasn’t the first to notice the potential. Shortly after GPT-3 became available, one programmer showed that it could create custom web apps, including buttons, text input fields, and colors, by remixing snippets of code it had been fed. Another company, Debuild, plans to commercialize the technology.

SourceAI aims to let its users generate a wider range of programs in many different languages, thereby helping automate the creation of more software. “Developers will save time in coding, while people with no coding knowledge will also be able to develop applications,” Bektes says.

Another company, TabNine, used GPT-2, an earlier OpenAI language model whose code has been released, to build a tool that offers to auto-complete a line or a function when a developer starts typing.

Some software giants seem interested too. Microsoft invested $1 billion in OpenAI in 2019 and has agreed to license GPT-3. At Microsoft’s Build conference in May, Sam Altman, a cofounder of OpenAI, demonstrated how GPT-3 could auto-complete code for a developer. Microsoft declined to comment on how it might use AI in its software development tools.

Brendan Dolan-Gavitt, an assistant professor in the Computer Science and Engineering Department at NYU, says language models such as GPT-3 will most likely be used to help human programmers. Other products will use the models to “identify likely bugs in your code as you write it, by looking for things that are ‘surprising’ to the language model,” he says.
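
Dolan-Gavitt does not name a particular implementation, but one common way to get that kind of “surprise” signal is per-token surprisal, the negative log-probability a language model assigns to each token of the code. A minimal sketch using GPT-2, which OpenAI has released, through the Hugging Face transformers library (the model and library are my choices for illustration, not something any of these companies has confirmed using):

```python
# Flag tokens in a code snippet that a language model finds "surprising"
# (high negative log-probability). Uses GPT-2 purely for illustration.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

code = "def add(a, b):\n    return a - b\n"  # the "-" is the planted bug
inputs = tokenizer(code, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Score each token by the model's prediction from the preceding context.
shift_logits = logits[:, :-1, :]
shift_labels = inputs["input_ids"][:, 1:]
log_probs = torch.log_softmax(shift_logits, dim=-1)
surprisal = -log_probs.gather(2, shift_labels.unsqueeze(-1)).squeeze(-1)

tokens = tokenizer.convert_ids_to_tokens(shift_labels[0].tolist())
for tok, s in zip(tokens, surprisal[0].tolist()):
    flag = "  <-- surprising" if s > 6.0 else ""  # arbitrary demo threshold
    print(f"{tok!r:>12}  {s:5.2f}{flag}")
```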

Using AI to generate and analyze code can be problematic, however. In a paper posted online in March, researchers at MIT showed that an AI program trained to verify that code will run safely can be fooled into approving a harmful program by a few careful changes, such as substituting certain variables. Shashank Srikant, a PhD student involved with the work, says AI models should not be relied on too heavily. “Once these models go into production, things can get nasty pretty quickly,” he says.
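
The full attack in the paper is more involved, but its core move, a change that preserves what the code does while altering the tokens a model sees, such as renaming a variable, can be sketched in a few lines (a toy illustration, not the paper’s method):

```python
# Toy illustration of a semantics-preserving variable substitution: the
# program's behavior is unchanged, but the surface tokens a code-analysis
# model sees are different. Requires Python 3.9+ for ast.unparse.
import ast

src = "def scale(values, factor):\n    return [v * factor for v in values]\n"


class RenameVar(ast.NodeTransformer):
    """Rename one identifier everywhere it appears."""

    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        if node.arg == self.old:
            node.arg = self.new
        return node


tree = RenameVar("factor", "tmp0").visit(ast.parse(src))
print(ast.unparse(tree))  # same behavior, different tokens
```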

Dolan-Gavitt, the NYU professor, says the nature of the language models being used to generate coding tools also poses problems. “I think using language models directly would probably end up producing buggy and even insecure code,” he says. “After all, they’re trained on human-written code, which is very often buggy and insecure.”