US-based startup Cognition has unveiled an AI-powered tool, Devin, which it calls the “world’s first fully autonomous AI software engineer”. According to the company, Devin can solve engineering tasks through the use of its own shell, code editor, and web browser. Here is everything you need to know about Devin.
Today we're excited to introduce Devin, the first AI software engineer. Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs on Upwork.
— Cognition (@cognition_labs) March 12, 2024
Cognition said it has equipped the AI-powered Devin with a shell, a code editor and a separate web browser. In a demonstration shown by the company, Devin uses the browser to pull up API (application programming interface) documentation and read how to plug into each of the APIs. An API provides a way for two or more computer programs to communicate with each other. When the AI agent runs into an error, it automatically adds a debugging print statement to the main code within the code editor interface and reruns the code.
On its YouTube channel, the company has demonstrated various use cases for the AI agent. These include building and deploying apps, finding and fixing bugs in codebases, and even fine-tuning AI models.
Devin: Is it accurate?
Cognition said it has tested Devin on SWE-bench, a benchmark that tasks agents with resolving real-world issues found in open-source projects on GitHub. According to the company, Devin correctly resolves 13.86 per cent of the issues end-to-end. For comparison, the GPT-4 AI model, when tested on the benchmark, was able to resolve 1.74 per cent of the issues. The previous best score was held by Anthropic’s Claude 2 model, which resolved 4.80 per cent of the issues.
In addition, the company said that the AI agent achieved this without being assisted in finding the relevant files in the repository.
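To put the quoted scores in proportion, the short Python sketch below divides Devin's reported resolution rate by each of the other two figures. It uses only the percentages cited above; the dictionary and the helper name `improvement_factor` are illustrative and not part of any SWE-bench tooling.

```python
# Share of SWE-bench issues resolved end-to-end (per cent), as quoted above.
rates = {"Devin": 13.86, "Claude 2": 4.80, "GPT-4": 1.74}


def improvement_factor(new: float, old: float) -> float:
    """Return how many times larger the new rate is than the old one."""
    return new / old


for name in ("Claude 2", "GPT-4"):
    factor = improvement_factor(rates["Devin"], rates[name])
    print(f"Devin vs {name}: {factor:.1f}x")
```

On these figures, Devin's reported rate is roughly 2.9 times that of Claude 2 and about 8 times that of GPT-4, although it still leaves most of the issues unresolved.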
Devin: Is it really first of its kind?
Microsoft offers AI-powered developer tools, including GitHub Copilot, which is essentially a code completion tool. GitHub Copilot incorporates assistive features for programmers that let them turn prompts into runnable code. The AI assistant also autocompletes chunks of code and can translate code between programming languages. However, it cannot complete coding tasks end-to-end on its own without human intervention, which, according to Cognition, Devin can do.
Devin: How to avail its services
Devin is currently available in early access to individuals who wish to use the AI agent for engineering work. Customers can raise a request with the company on its website to get early access to the AI-powered coding agent.