MrLazy: Lazy runtime label propagation for MapReduce
Published version
Peer-reviewed
Abstract
Organisations are starting to publish datasets containing potentially sensitive information in the Cloud; hence it is important that there is a clear audit trail showing that the parties involved respect data-sharing laws and policies. Information Flow Control (IFC) has been proposed as a solution. However, fine-grained IFC faces deployment challenges and runtime overheads that have so far limited its wide adoption. In this paper we present MrLazy, a system that practically addresses some of these issues for MapReduce. Within a single trust domain, we relax the need to check policies continuously. Instead, we rely on lineage (information about the origin of a piece of data) as a mechanism to apply policies retrospectively and on demand. We show that MrLazy imposes manageable temporal and spatial overheads while enabling fine-grained data regulation.
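
The idea of deferring policy checks and recovering labels from lineage can be illustrated with a toy example. The following is a minimal sketch, not MrLazy's implementation: it assumes a word-count style job, string record ids, and set-valued labels, and all names in it (run_job, labels_for, may_release, the "medical" label) are hypothetical. The job records only which input records contributed to each output; labels are derived from that lineage when an output is about to be released.

```python
from collections import defaultdict


def run_job(records, map_fn):
    """Run a toy map/reduce job and return (outputs, lineage).

    lineage maps each output key to the ids of the input records that
    contributed to it. No labels are consulted while the job runs.
    """
    groups = defaultdict(list)
    lineage = defaultdict(set)
    for rec_id, value in records.items():
        for key, out in map_fn(value):
            groups[key].append(out)
            lineage[key].add(rec_id)
    # Reduce step (assumed here to be a simple sum per key).
    outputs = {key: sum(vals) for key, vals in groups.items()}
    return outputs, dict(lineage)


def labels_for(key, lineage, input_labels):
    """Derive an output's labels lazily as the union of its inputs' labels."""
    labels = set()
    for rec_id in lineage.get(key, ()):
        labels |= input_labels.get(rec_id, set())
    return labels


def may_release(key, lineage, input_labels, clearance):
    """Policy check applied only when an output is about to be released."""
    return labels_for(key, lineage, input_labels) <= clearance


def word_count_map(line):
    """Emit (word, 1) for every whitespace-separated word."""
    return [(word, 1) for word in line.split()]


# Hypothetical input records and their labels.
records = {"r1": "alice flu", "r2": "bob flu", "r3": "public notice"}
input_labels = {"r1": {"medical"}, "r2": {"medical"}, "r3": set()}

outputs, lineage = run_job(records, word_count_map)
```

In this sketch, may_release("flu", lineage, input_labels, set()) is refused because both contributing records carry the "medical" label, while the same call for "notice" is allowed. The cost of label handling is paid only for outputs that are actually inspected, which is the trade-off the abstract describes.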