Moving beyond the empty cell: The threat of decontextualized healthcare data.
Published version
Peer-reviewed
Abstract
Missing, inaccurate, or poorly documented data in healthcare is often treated as a technical problem to be statistically resolved via imputation, deletion, or modeling assumptions about randomness. However, such inaccuracies relate to far more complex socioeconomic and geopolitical issues and are not simply "errors of data entry" to be ameliorated with statistical modeling techniques. We argue that what is really missing or inaccurate is the context in which the data is collected, and that only by understanding this context can we begin to prevent artificial intelligence (AI) from amplifying misleading, decontextualized data. We critically examine how traditional modeling methods fail to account for the factors that influence what data gets recorded, and for whom. We show how AI systems trained on decontextualized data reinforce health inequities at scale. Finally, we review recent literature on context-aware approaches to understanding data that incorporate metadata, social determinants of health, fairness constraints, and participatory governance to build more ethical and representative systems. Our analysis urges the AI and healthcare communities to move beyond the traditional emphasis on statistical convenience, toward socially grounded and interdisciplinary strategies for handling decontextualized data.
Description
Acknowledgements: AI-assisted tools were used to improve the clarity and coherence of certain sections of this manuscript. Their contribution was limited to language refinement, and care was taken to ensure that the research content and data interpretation were not distorted.
Journal ISSN
2767-3170

