Researchers believe that household robots should take advantage of their mobility and their relatively static environments to make object recognition easier, by imaging objects from multiple perspectives before making judgements about their identity.
Matching up the objects depicted in the different images, however, poses its own computational challenges.
Researchers at the Massachusetts Institute of Technology (MIT) show that a system using an off-the-shelf algorithm to aggregate different perspectives can recognise four times as many objects as one that uses a single perspective, while reducing the number of misidentifications.
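To give a sense of what aggregating perspectives means in practice, here is a minimal sketch, not the researchers' actual system: each view produces a confidence score for every candidate object class, and the scores are pooled across views before a final decision is made. The numbers and the pooling rule below are illustrative assumptions.

```python
import numpy as np

def aggregate_views(per_view_scores: np.ndarray) -> int:
    """Combine single-view recognition scores from multiple viewpoints.

    per_view_scores: array of shape (n_views, n_classes), where each row
    holds a single-view classifier's confidence for every candidate class.
    Returns the index of the class favoured once all views are pooled.
    """
    # Average log-scores across views: equivalent to treating the views
    # as (naively) independent observations of the same object.
    pooled = np.log(per_view_scores + 1e-9).mean(axis=0)
    return int(np.argmax(pooled))

# Three noisy views of the same object; no single view is decisive,
# but pooling them makes class 1 the clear winner.
scores = np.array([
    [0.40, 0.45, 0.15],
    [0.30, 0.50, 0.20],
    [0.45, 0.40, 0.15],
])
print(aggregate_views(scores))  # -> 1
```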
Lawson Wong, lead author on a new research paper, and his thesis adviser, Leslie Kaelbling, considered scenarios in which they had 20 to 30 different images of household objects clustered together on a table.
In several of the scenarios, the clusters included multiple instances of the same object, closely packed together, which makes the task of matching different perspectives more difficult.
The first algorithm they tried was developed for tracking systems such as radar, which must also determine whether objects imaged at different times are in fact the same.
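Such tracking systems solve what is known as a data-association problem: deciding which detection in one frame corresponds to which in the next. The sketch below is a deliberately simplified stand-in for that machinery, matching detections across two frames by optimal assignment on pairwise distances; the positions and the use of scipy's linear_sum_assignment are illustrative assumptions, not details from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(prev_positions: np.ndarray, curr_positions: np.ndarray):
    """Match detections across two frames by minimising total displacement.

    Returns (prev_index, curr_index) pairs chosen so that the summed
    Euclidean distance between matched detections is minimal.
    """
    # Pairwise distance matrix between every old and new detection.
    cost = np.linalg.norm(
        prev_positions[:, None, :] - curr_positions[None, :, :], axis=-1
    )
    rows, cols = linear_sum_assignment(cost)
    return list(zip(rows.tolist(), cols.tolist()))

# Two objects observed in successive frames, slightly moved and reordered.
prev = np.array([[0.0, 0.0], [1.0, 0.0]])
curr = np.array([[1.05, 0.02], [0.03, -0.01]])
print(associate(prev, curr))  # -> [(0, 1), (1, 0)]
```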
That approach worked, but explicitly tracking every possible correspondence hypothesis quickly becomes intractable as objects and images accumulate. In hopes of arriving at a more efficient algorithm, the researchers adopted a different approach.
Their algorithm does not discard any of the hypotheses it generates across successive images, but neither does it attempt to canvass them all; instead, it samples from them.
Since there is significant overlap between different hypotheses, an adequate number of samples will generally yield consensus on the correspondences between the objects in any two successive images.
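A toy version of that idea might look like the following: repeatedly sample plausible one-to-one matchings between the objects in two images, then take each object's most frequent partner as the consensus. The cost matrix, the softmax sampler, and every parameter here are hypothetical; the paper's actual method operates over richer hypotheses spanning many images.

```python
import numpy as np

def sample_consensus_matching(cost: np.ndarray, n_samples: int = 200,
                              temperature: float = 0.1, seed: int = 0):
    """Sample correspondence hypotheses and return the consensus match.

    cost[i, j] is the dissimilarity between object i in one image and
    object j in the next. Each sample greedily builds a one-to-one
    matching, picking partners at random with probability decreasing
    in cost; the per-object majority across samples is the consensus.
    """
    rng = np.random.default_rng(seed)
    n, m = cost.shape
    votes = np.zeros((n, m), dtype=int)
    for _ in range(n_samples):
        available = list(range(m))
        for i in range(n):
            if not available:
                break
            # Softmax over the still-unmatched candidates for object i.
            w = np.exp(-cost[i, available] / temperature)
            j = rng.choice(available, p=w / w.sum())
            votes[i, j] += 1
            available.remove(j)
    # Consensus: the partner each object was matched to most often.
    return votes.argmax(axis=1)

cost = np.array([[0.1, 0.9, 0.8],
                 [0.9, 0.2, 0.7],
                 [0.8, 0.7, 0.1]])
print(sample_consensus_matching(cost))  # -> [0 1 2]
```

Because individual samples mostly agree wherever the evidence is strong, the majority vote stabilises after relatively few samples, which is what makes this cheaper than enumerating every hypothesis.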
The research was published in the International Journal of Robotics Research.