Data Quality in Analytics: Key Problems Arising from the Repurposing of Manufacturing Data
Abstract
Data repurposing is the use of data for a decision or task entirely different from the one for which it was originally collected. This is common in data analytics, where data captured by a business as part of its normal operations is later used by data scientists to derive business insight. However, when data is collected for its primary purpose, some consideration is given to ensuring that its quality is “fit for purpose”. Because repurposing, by definition, applies the data to a different purpose, the original quality levels may not be suitable for the secondary one. Drawing on interviews with several manufacturers, this paper describes examples of repurposing in manufacturing, how manufacturing organisations repurpose data, the data quality problems that arise specifically from repurposing, and how those problems are currently addressed. From these results we present a framework that manufacturers can use to help identify and mitigate the issues that arise when attempting to repurpose data.
Description
This is the author accepted manuscript. It was first presented at the 20th Annual MIT International Conference on Information Quality.