Large-Scale Privacy-Preserving Matching and Search

TitleLarge-Scale Privacy-Preserving Matching and Search
Publication TypeThesis
Year of Publication2016
AuthorsRiazi, S. M.
Academic DepartmentElectrical and Computer Engineering
DegreeMaster of Science
Number of Pages123
Date Published07/2017
UniversityRice University

The past few decades have witnessed considerable efforts for achieving a Privacy-Preserving Computing (PPC) scheme where the input data of each engaging party is not revealed to any other party. PPC, in fact, prevents any data leakage or invasion of privacy. While there exist provably secure solutions, they su↵er from high communication overheads and therefore they cannot scale for large-scale datasets. This thesis proposes several novel techniques to overcome this limitation and opens a new door for real-world applications. In this thesis, we focus on two of the most important branches in this area which are matching and search. In the process of matching, two groups of individuals are interested to be matched with each other given certain rules and based on their preference list while keeping all preference lists private. In the process of search, a user holding a query wants to find the most similar profile (with a specific similarity metric) in a bank of profiles while keeping both query and bank private. Our approach is based on a well-known protocol called Garbled Circuit (GC) protocol which can securely evaluate any function of choice. Previous works su↵er from immense overheads and lack of scalability. In this thesis, we introduce a new set of techniques that makes the GC-based protocols more ecient and make them scalable. We also study a new paradigm which is based on Randomized Embeddings to develop a new framework for a faster privacy-preserving search that can deliver unprecedented speed up. We have developed a framework that introduces a privacy/accuracy trade-off and can process search query for Millions of users in real-time, an infeasible task prior to this work. Proof-of-concept evaluations on large-scale datasets prove the practicality and scalability of our approach. For the matching case, we evaluate our proposed solution on a bank of one million di↵erent genomes. Privacy-preserving stable matching is performed on a set size as big as National Residency Matching Program (NRMP). The last experiment is performed on the largest MovieLens dataset which contains hundreds of thousands di↵erent user profiles that is used to recommend a movie to a user based on the most similar profile without compromising the privacy of user and profiles in the dataset. 


Thesis.pdf7.45 MB


Theme by Danetsoft and Danang Probo Sayekti inspired by Maksimer