Researchers in the Decentralized Information Group (DIG) at Massachusetts Institute of Technology's Computer Science and Artificial Intelligence Laboratory (CSAIL) are developing a protocol they call "HTTP with Accountability," or HTTPA, which will automatically monitor the transmission of private data and allow the data owner to examine how it's being used.
At the IEEE's Conference on Privacy, Security and Trust next month in Toronto, Oshani Seneviratne, an MIT graduate student in electrical engineering and computer science, and Lalana Kagal, a principal research scientist at CSAIL, will present a paper that gives an overview of HTTPA.
Remote access to a Web server would be controlled much the way it is now, through passwords and encryption.
But every time the server transmitted a piece of sensitive data, it would also send a description of the restrictions on the data's use.
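In rough terms, that exchange might look like the following Python sketch, in which the Usage-Restrictions header, the policy URI and the record contents are illustrative stand-ins rather than anything the published protocol specifies:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical HTTPA-style handler: alongside the sensitive record,
    # the response carries a machine-readable pointer to the restrictions
    # governing the data's use. Header name and policy URI are illustrative.
    class HTTPARecordHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            record = b'{"patient": "12345", "allergies": ["penicillin"]}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            # Pointer to the usage restrictions attached to this resource
            self.send_header("Usage-Restrictions",
                             "https://example.org/policies/no-redistribution")
            self.send_header("Content-Length", str(len(record)))
            self.end_headers()
            self.wfile.write(record)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), HTTPARecordHandler).serve_forever()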
An HTTPA-compliant programme also incurs certain responsibilities if it reuses data supplied by another HTTPA-compliant source, researchers said.
Suppose, for instance, that a consulting specialist in a network of physicians wishes to access data created by a patient's primary-care physician, and suppose that she wishes to augment the data with her own notes. In that case, her system would have to log its use of the data and link the new, augmented record back to the original, so that the chain of derivations remains auditable.
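One way to picture that responsibility (the record structure, field names and URIs below are invented for illustration, not taken from the HTTPA design) is that the specialist's system would mint an identifier for the augmented record and keep a pointer back to its source:

    import uuid

    def create_derived_record(source_uri, source_restrictions, new_notes):
        """Illustrative only: wrap reused data in a new record that
        remembers where it came from and what restrictions it inherited."""
        return {
            "uri": f"https://specialist.example.org/records/{uuid.uuid4()}",
            "derived_from": source_uri,  # provenance link to the source data
            "inherited_restrictions": source_restrictions,
            "notes": new_notes,
        }

    record = create_derived_record(
        "https://primarycare.example.org/records/12345",
        ["no-redistribution", "treatment-use-only"],
        "Patient reports improvement after adjusting dosage.",
    )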
The network of servers that maintains the transaction logs is where the heavy lifting happens. When the data owner requests an audit, the servers work through the chain of derivations, identifying all the people who have accessed the data and what they've done with it.
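A minimal sketch of that audit step, assuming each log entry notes who accessed a resource and, where relevant, which new resource was derived from it (the log format here is invented for illustration):

    def audit(resource_uri, logs):
        """Walk the chain of derivations for a resource, collecting every
        access and following links to records derived from it. 'logs' maps
        a resource URI to entries like {"accessed_by": ..., "action": ...,
        "derived_uri": ...}; the shape is illustrative."""
        trail = []
        for entry in logs.get(resource_uri, []):
            trail.append((resource_uri, entry["accessed_by"], entry["action"]))
            derived = entry.get("derived_uri")
            if derived:
                trail.extend(audit(derived, logs))  # follow the derivation chain
        return trail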
Seneviratne uses a technology known as distributed hash tables - the technology at the heart of peer-to-peer networks like BitTorrent - to distribute the transaction logs among the servers.
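In spirit, that means the server responsible for a given resource's log is determined by hashing the resource's identifier, so no single machine has to hold everything. A toy stand-in for such a lookup, not Seneviratne's implementation, might look like this:

    import hashlib

    # Hypothetical pool of log servers; the names are placeholders.
    SERVERS = [f"log-node-{i}.example.org" for i in range(300)]

    def responsible_server(resource_uri):
        """Toy stand-in for a distributed hash table lookup: hash the
        resource URI and map it onto one of the log servers."""
        digest = hashlib.sha1(resource_uri.encode("utf-8")).hexdigest()
        return SERVERS[int(digest, 16) % len(SERVERS)]

    print(responsible_server("https://primarycare.example.org/records/12345"))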
She then simulated a set of transactions - pharmacy visits, referrals to specialists, use of anonymised data for research purposes, and the like - that a group of volunteers reported as having occurred over the course of a year.
Seneviratne used 300 servers on the experimental network PlanetLab to store the transaction logs; in experiments, the system efficiently tracked down data stored across the network and reconstructed the chains of derivation necessary to audit the propagation of data across multiple providers.