Facebook-owner Meta published an artificial intelligence (AI) model on Wednesday that can pick out individual objects from within an image, along with a dataset of image annotations that it said was the largest ever of its kind.
The company’s research division said that its Segment Anything Model, or SAM, could identify objects in images and videos even in cases where it had not encountered those items in its training.
With SAM, users can select objects by clicking on them or by writing text prompts. In one demonstration, writing the word ‘cat’ prompted the tool to draw boxes around each of several cats in a photo.
Meta has teased several features that deploy the type of generative AI popularised by ChatGPT, which creates brand-new content rather than simply identifying or categorising data as other AI does, although it has not yet released a product.
Examples include a tool that spins up surrealist videos from text prompts and another that generates children’s book illustrations from prose.
Meta already uses technology similar to SAM internally for activities like tagging photos, moderating prohibited content and determining which posts to recommend to users of Facebook and Instagram.
Separately, a New York start-up called Runway AI generated a short video of a tranquil river in a forest in less than two minutes, working from a short text description of the scene.
Runway, which plans to open its service to a small group of testers this week, is one of several firms building AI technology that will soon let people generate videos simply by typing several words into a box on a computer screen.
Google bets on speed
Google released new details about the supercomputers it uses to train its AI models, saying the systems are both faster and more power-efficient than comparable systems from Nvidia. Google has designed its own custom chip, the Tensor Processing Unit (TPU), which is now in its fourth generation.
Google published a paper detailing how it has strung more than 4,000 of the chips together into a supercomputer using its own custom-developed optical switches.