
Crowdsourcing the General Public for Large Scale Molecular Pathology Studies in Cancer.



Candido Dos Reis, Francisco J 
Lynn, Stuart 
Ali, H Raza 
Eccles, Diana 
Hanby, Andrew 


BACKGROUND: Citizen science, scientific research conducted by non-specialists, has the potential to facilitate biomedical research using available large-scale data; however, validating the results is challenging. Cell Slider is a citizen science project that shares images of tumors with the general public, enabling them to score tumor markers independently through an internet-based interface.

METHODS: From October 2012 to June 2014, 98,293 Citizen Scientists accessed the Cell Slider web page and scored 180,172 sub-images derived from images of 12,326 tissue microarray cores labeled for estrogen receptor (ER). We evaluated the accuracy of the Citizen Scientists' ER classification, and the association between ER status and prognosis, by comparing their test performance against that of trained pathologists.

FINDINGS: The area under the ROC curve was 0.95 (95% CI 0.94 to 0.96) for cancer cell identification and 0.97 (95% CI 0.96 to 0.97) for ER status. ER-positive tumors scored by Citizen Scientists were associated with survival in a similar way to those scored by trained pathologists. Survival probability at 15 years was 0.78 (95% CI 0.76 to 0.80) for ER-positive and 0.72 (95% CI 0.68 to 0.77) for ER-negative tumors based on Citizen Scientists' classification. Based on pathologist classification, survival probability was 0.79 (95% CI 0.77 to 0.81) for ER-positive and 0.71 (95% CI 0.67 to 0.74) for ER-negative tumors. For ER scored by Citizen Scientists, the hazard ratio for death was 0.26 (95% CI 0.18 to 0.37) at diagnosis and became greater than one after 6.5 years of follow-up. For ER scored by pathologists, it was 0.24 (95% CI 0.18 to 0.33) at diagnosis, increasing thereafter to one after 6.7 (95% CI 4.1 to 10.9) years of follow-up.

INTERPRETATION: Crowdsourcing of the general public to classify cancer pathology data for research is viable, engages the public and provides accurate ER data. Crowdsourced classification of research data may offer a valid solution to problems of throughput requiring human input.
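The area under the ROC curve reported in the findings can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that calculation, not the study's own analysis code. It assumes a hypothetical per-core crowd score between 0 and 1 and a binary pathologist call as the reference; the function name `roc_auc` and all data values are illustrative and do not come from the study.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs
    in which the positive case has the higher score; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study):
# 1 = ER-positive according to the pathologist, 0 = ER-negative.
pathologist_er = [1, 1, 1, 0, 0, 1, 0, 1]
# Assumed crowd score per core, e.g. the share of volunteers calling it positive.
crowd_score = [0.9, 0.8, 0.4, 0.3, 0.5, 0.7, 0.1, 0.6]

auc = roc_auc(pathologist_er, crowd_score)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be the usual choice for data on the scale of thousands of cores.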



Breast cancer; Citizen science; Crowd science; Crowdsourcing; Breast Neoplasms; Female; Humans; Kaplan-Meier Estimate; Pathology, Molecular; Proportional Hazards Models; ROC Curve; Receptors, Estrogen


Elsevier BV
Cancer Research UK (None)
Cell Slider is supported by funding from Cancer Research UK. The individual studies were supported by grants from: Cancer Research UK (C490/A10124, C490/A16561), Dutch Cancer Society (NKI 2007–3839 and 2009–4363), the NIHR Biomedical Research Centre at the University of Cambridge, Yorkshire Cancer Research (S295, S299, S305PA), Red Temática de Investigación Cooperativa en Cáncer, Fondo de Investigación Sanitario (PI11/00923 and PI081120). The Human Genotyping-CEGEN Unit (CNIO) is supported by the Instituto de Salud Carlos III and the ESTHER study was supported by the Baden Württemberg Ministry of Science, Research and Arts.