New computer better than humans at cataloguing science

Press Trust of India, Washington
Last Updated : Dec 02 2014 | 4:00 PM IST
A new computer system is better than scientists at the complex task of extracting data from scientific publications and placing it in a database that catalogues the results of thousands of individual studies.
"We demonstrated that the system was no worse than people on all the things we measured, and it was better in some categories," said Christopher Re, who guided the software development for the project while at the University of Wisconsin-Madison.
The development marks a milestone in the quest to rapidly and precisely summarise, collate and index the vast output of scientists around the globe, said first author Shanan Peters, a professor of geoscience at UW-Madison.
Peters and colleagues set up the faceoff between PaleoDeepDive, their new machine reading system, and the human scientists who had manually entered data into the Paleobiology Database.
The knowledge produced by paleontologists is fragmented into hundreds of thousands of publications.
Yet many research questions require what Peters calls a "synthetic approach: For example, how many species were on the planet at any given time?"

Teaming up with Re, now at Stanford University, and UW-Madison computer sciences professor Miron Livny, the group built on the DeepDive machine reading system and the HTCondor distributed job management system to create PaleoDeepDive.
"Getting started required a million hours of computer time," said Peters.
PaleoDeepDive mimics the human activities needed to assemble the Paleobiology Database.
"We extracted the same data from the same documents and put it into the exact same structure as the human researchers, allowing us to rigorously evaluate the quality of our system, and the humans," Peters said.
Computers often have trouble deciphering even simple-sounding statements, Re said.
"Information that was manually entered into the Paleobiology Database by humans cannot be assessed or enhanced without going back to the library and re-examining original documents. Our machine system, on the other hand, can extend and improve results essentially on the fly as new information is added," said Re.
The study was published in the journal PLOS ONE.

First Published: Dec 02 2014 | 4:00 PM IST
