Thoth: Practical data flow protection in a search engine
Online data retrieval services such as commercial search engines, online social networks, and trading and sharing sites process large volumes of data of different origins and types. Each data item indexed by a search engine, such as online social network (OSN) data, personal email, corporate documents, or public web documents, has its own usage policy. For example, email is private, OSN data and blogs may be limited to friends, and corporate documents may be restricted to employees. Furthermore, providers may have to filter certain data items in order to comply with local laws and court orders. Ensuring compliance with all applicable policies in such a complex system, however, is labor-intensive and error-prone. In Thoth, we explore the problem of providing a practical safety net for policy compliance in a search engine.