DataShare: Decentralized Privacy-Preserving Search Engine for Investigative Journalists

By Kasra Edalatnejad, EPFL


Investigative journalists collect large numbers of digital documents during their investigations. Many of these documents contain sensitive information, so revealing possession of them could endanger reporters, their stories, and their sources. As a result, even though these documents could greatly benefit other journalists’ work, many are used only for a single, local investigation.
We present DataShare, a decentralized and privacy-preserving global search system that enables journalists worldwide to find documents via a dedicated network of peers. This work stems from the need of the International Consortium of Investigative Journalists (ICIJ) to secure the search and discovery platform it is developing.
DataShare combines well-known anonymous credentials and anonymous communication primitives with a novel multi-set private set intersection protocol (MS-PSI) into a decentralized peer-to-peer private document search engine. MS-PSI enables efficient search across many collections at once while leaking no more information than state-of-the-art PSI protocols would. By significantly reducing the computation and communication cost of performing intersections, MS-PSI enables DataShare to scale to thousands of users and millions of documents.
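
For readers unfamiliar with private set intersection, the following is a minimal sketch of a textbook Diffie-Hellman-style PSI between one querier and one document owner. It is not the MS-PSI protocol presented in the talk, which the abstract does not specify; it only illustrates the kind of primitive MS-PSI builds on. All names (`hash_to_group`, `blind`, `private_intersection`) are hypothetical, the modulus is a toy choice made for brevity and is not vetted for this use, and a real deployment would also shuffle the owner's values and run the two parties on separate machines.

```python
import hashlib
import secrets

# Toy modulus (the prime 2**255 - 19). Assumption for illustration only.
P = 2**255 - 19


def hash_to_group(keyword: str) -> int:
    """Map a keyword to a nonzero integer modulo P."""
    digest = hashlib.sha256(keyword.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % P or 1


def new_secret() -> int:
    """Pick a random secret exponent in [2, P - 2]."""
    return secrets.randbelow(P - 3) + 2


def blind(items, secret):
    """Hash each item and raise it to the party's secret exponent."""
    return [pow(hash_to_group(x), secret, P) for x in items]


def reblind(blinded, secret):
    """Raise already-blinded values to a second secret exponent."""
    return [pow(v, secret, P) for v in blinded]


def private_intersection(query_terms, collection, querier_secret, owner_secret):
    # Querier -> owner: H(q)^a for each query term.
    sent = blind(query_terms, querier_secret)
    # Owner -> querier: (H(q)^a)^b in the same order, plus H(d)^b for its own items.
    double_blinded_query = reblind(sent, owner_secret)
    owner_blinded = blind(collection, owner_secret)
    # Querier: compute (H(d)^b)^a and compare; equal values mean equal keywords.
    owner_double = set(reblind(owner_blinded, querier_secret))
    return {q for q, v in zip(query_terms, double_blinded_query) if v in owner_double}
```

Because exponentiation commutes, both sides arrive at the same doubly blinded value for a shared keyword, while neither side sees the other's raw keywords. Repeating this exchange separately for every collection is the cost that, according to the abstract, MS-PSI is designed to reduce.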

Wednesday, July 3rd 2019 @16:15, room BC 410