- On March 13, startup Cognition announced the launch of Devin, an AI software engineer that outperformed earlier AI models on the SWE-bench coding benchmark, signalling a significant shift in software development.
- Devin demonstrates the ability to independently complete a wide range of software engineering tasks, from debugging to deploying, utilising its own suite of development tools.
- The founding team of Cognition, consisting of prodigies with impressive backgrounds in computer science and mathematics, aims to revolutionise the field by potentially replacing human software engineers with AI, sparking both optimism and concern within the tech community.
OUR TAKE
Cognition has so far secured $21 million in investment, led by Silicon Valley magnate Peter Thiel’s Founders Fund alongside other well-known investors, which suggests that Devin will be developed and updated even faster. With tech giants like Google and OpenAI also entering the field, competition in the realm of “AI programmers” is set to intensify.
— Chloe CHEN, BTW Media reporter
On March 13, the startup Cognition announced the launch of Devin, which it bills as the world’s first AI software engineer, claiming it will completely change the way humans build software. Devin posted a breakthrough result on the SWE-bench coding benchmark, well ahead of earlier AI models, demonstrating its ability to execute complex engineering tasks. The release has attracted widespread attention from developers.
Cognition, a small startup of just 10 people
Cognition, the company behind Devin, is reportedly a small startup of just 10 people that was established less than two months ago. Even so, Devin has achieved a striking 13.86% on SWE-bench. By comparison, Claude 2 scores 4.80%, while SWE-Llama-13b and GPT-4 manage 3.97% and 1.74%, respectively.
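To put these figures in perspective, the short Python sketch below computes how many times higher Devin’s reported score is than each of the other models quoted above. The percentages are the ones given in this article; the variable and function names are illustrative only.

```python
# SWE-bench scores (percent) as quoted in this article.
SCORES = {
    "Devin": 13.86,
    "Claude 2": 4.80,
    "SWE-Llama-13b": 3.97,
    "GPT-4": 1.74,
}


def ratio_to_leader(scores, leader="Devin"):
    """Return how many times higher the leader's score is than each other model's."""
    top = scores[leader]
    return {
        name: round(top / value, 1)
        for name, value in scores.items()
        if name != leader
    }


if __name__ == "__main__":
    for name, ratio in ratio_to_leader(SCORES).items():
        print(f"Devin vs {name}: {ratio}x")
```

On these numbers, Devin’s score is roughly 2.9 times that of Claude 2 and about 8 times that of GPT-4.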
In Cognition’s demonstration, Devin can quickly complete the basic work that programmers do every day, such as development, debugging, and deployment. Moreover, it has its own shell, code editor, and browser, among other common developer tools, all integrated into a sandbox computing environment, allowing Devin to call on them independently.
Faced with a natural language description of requirements from users, Devin can open the code editor, use the browser for debugging, then run and check the code, finally deploying to meet the user’s needs. Traditionally, these tasks were completed by programmers, but now AI can take over.
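The workflow described above, in which an agent calls a shell, a code editor and a browser inside one sandbox, can be pictured with a toy Python sketch. Cognition has not published Devin’s implementation, so everything here is an assumption made for illustration: the `Sandbox` class, its tool names and the fixed plan are hypothetical, and the tools only record what they were asked to do rather than doing any real work.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Sandbox:
    """Hypothetical sandbox that exposes developer tools by name."""
    log: List[str] = field(default_factory=list)

    def _record(self, tool: str, arg: str) -> str:
        # Stand-in for real tool execution: just log the requested action.
        entry = f"{tool}: {arg}"
        self.log.append(entry)
        return entry

    def tools(self) -> Dict[str, Callable[[str], str]]:
        return {
            "editor": lambda arg: self._record("editor", arg),
            "shell": lambda arg: self._record("shell", arg),
            "browser": lambda arg: self._record("browser", arg),
        }


def make_plan(request: str) -> List[Tuple[str, str]]:
    """Hypothetical fixed plan; a real agent would derive steps from the request."""
    return [
        ("editor", f"write code for: {request}"),
        ("shell", "run the code and its tests"),
        ("browser", "check the running app"),
        ("shell", "deploy"),
    ]


def run_agent(request: str, sandbox: Sandbox) -> List[str]:
    """Dispatch each planned step to the matching tool and return the action log."""
    tools = sandbox.tools()
    for tool_name, arg in make_plan(request):
        tools[tool_name](arg)
    return sandbox.log


if __name__ == "__main__":
    for line in run_agent("build a to-do list website", Sandbox()):
        print(line)
```

The sketch is only meant to convey the shape of the loop: a plan of steps, each dispatched to a tool in a shared environment, leaving a log that a human can review.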
The developers showed an example in which Devin independently fixed a logarithm calculation bug in SymPy, the Python algebra system. In the demo, Devin sets up the code environment, reproduces the error, and then writes and tests the fix on its own.
Another demo involved the well-known challenge of creating AI-generated art with content hidden within a background image, a task that would traditionally require software engineers to spend time learning new tools. Devin picked up this unfamiliar technology by reading blog posts, then ran ControlNet on Modal to fulfil the request.
Devin can also work independently with today’s most popular large-model techniques. Given nothing more than a link to a GitHub research repository, it can autonomously fine-tune a large language model. Given a real job posted on Upwork, it can write and debug code for a computer vision model, and then write a report for the user with the results on sample data.
Devin is the next-generation software development assistant
We see that Cognition describes Devin as the next-generation software development assistant, not just offering coding suggestions and automating some tasks, but capable of independently completing entire software projects. This means Devin is different from other AI programming tools; it has greater autonomy and is more comprehensive in its programming capabilities.
Although Cognition seems to be a small company without even a fixed office at the time of Devin’s release, looking at its founding team, we see another story of genius entrepreneurship.
Team of geniuses behind Devin
Cognition’s founder and CEO, Scott Wu, took part in a live televised mathematics competition 14 years ago, where he showed his talent by dominating the field.
The co-founder and CTO, Steven Hao, graduated from MIT with a degree in computer science and previously worked at the data annotation unicorn Scale AI as one of the company’s top engineers.
Another co-founder and Chief Product Officer, Walden Yan, studied computer science and economics at Harvard University, engaged in cryptography and machine learning research with MIT PRIMES, and was a finalist in the Wharton Business School high school investment competition in North America.
Notably, the founders of Cognition were all gold medalists in the International Olympiad in Informatics (IOI), an annual international informatics competition for individual competitors, to which each country sends a maximum of four contestants.
Given that they stood out in the highly competitive field of computer science in the United States and went on to win gold medals, calling Cognition’s founding team a “team of geniuses” does not seem to be an exaggeration.
Will human software engineers be replaced?
However, some people have reacted pessimistically to the emergence of Devin and similar tools, suggesting that Scott Wu, himself a software engineer, is launching a generative artificial intelligence tool whose ultimate goal is to replace human software engineers. Computer scientist Silas Alberti stated, “This doesn’t seem like an assistant for writing code, but more like a real worker doing their own job.”
Former Tesla AI director Andrej Karpathy said, “Automating software engineering currently seems similar to automating driving.”
The parallel shows in how the development process has progressed: first, humans wrote code by hand; then GitHub Copilot began auto-completing a few lines; after that, ChatGPT started writing whole blocks of code; and now Devin has emerged.
He believes that automating software engineering will come to involve coordinating the many tools developers use together to write code, such as terminals, browsers and code editors, with humans responsible for supervision and gradually moving to higher-level work.