The carbon footprint of bioinformatics

Accepted version
Peer-reviewed

Type

Article

Authors

Grealey, Jason 
Lannelongue, Loïc 
Saw, Woei-Yuh 
Marten, Jonathan 
Meric, Guillaume 

Abstract

Bioinformatic research relies on large-scale computational infrastructures which have a non-zero carbon footprint. So far, no study has quantified the environmental costs of bioinformatic tools and commonly run analyses. In this study, we estimate the bioinformatic carbon footprint (in kilograms of CO₂ equivalent units, kgCO₂e) using the freely available Green Algorithms calculator (www.green-algorithms.org). We assess (i) bioinformatic approaches in genome-wide association studies (GWAS), RNA sequencing, genome assembly, metagenomics, phylogenetics and molecular simulations, as well as (ii) computation strategies, such as parallelisation, CPU (central processing unit) vs. GPU (graphics processing unit), cloud vs. local computing infrastructure and geography. In particular, for GWAS, we found that biobank-scale analyses emitted substantial kgCO₂e and that simple software upgrades could make GWAS greener, e.g. upgrading from BOLT-LMM v1 to v2.3 reduced carbon footprint by 73%. Switching from the average data centre to a more efficient one can reduce carbon footprint by ~34%. Memory over-allocation can be a substantial contributor to an algorithm's carbon footprint. The use of faster processors or greater parallelisation reduces run time but can lead to a greater, sometimes substantially greater, carbon footprint. Finally, we provide guidance on how researchers can reduce power consumption and minimise kgCO₂e. Overall, this work elucidates the carbon footprint of common analyses in bioinformatics and provides solutions which empower a move toward greener research.
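
The factors named in the abstract (run time, number of cores, allocated memory, data centre efficiency and geography) can be combined in a simple power-times-time estimate. The Python sketch below is illustrative only: it assumes a generic model in which energy is run time multiplied by the power drawn by cores and memory, scaled by a data centre overhead factor (PUE), and then converted to kgCO₂e with a regional carbon intensity. The function name, parameter names and every numeric value are hypothetical and are not taken from the article or from the Green Algorithms calculator.

```python
def footprint_kgco2e(runtime_h, n_cores, watts_per_core, core_usage,
                     memory_gb, watts_per_gb, pue, carbon_intensity_g_per_kwh):
    """Rough carbon footprint estimate (kgCO2e) for one compute job.

    All arguments are assumptions supplied by the caller:
      runtime_h                  wall-clock run time in hours
      n_cores, watts_per_core    number of cores and power draw per core (W)
      core_usage                 fraction of core capacity in use (0 to 1)
      memory_gb, watts_per_gb    allocated memory (GB) and power per GB (W)
      pue                        data centre overhead factor (>= 1)
      carbon_intensity_g_per_kwh gCO2e emitted per kWh in the region
    """
    # Power drawn by the cores and by the *allocated* memory, in watts.
    power_w = n_cores * watts_per_core * core_usage + memory_gb * watts_per_gb
    # Energy in kWh, including the data centre overhead.
    energy_kwh = runtime_h * power_w * pue / 1000.0
    # Convert grams of CO2e to kilograms.
    return energy_kwh * carbon_intensity_g_per_kwh / 1000.0


# Hypothetical job: 10 h on 8 cores with 64 GB of memory allocated.
baseline = footprint_kgco2e(10, 8, 10.0, 1.0, 64, 0.5, 1.5, 400.0)

# Same job with memory allocation halved, and in a more efficient data centre.
less_memory = footprint_kgco2e(10, 8, 10.0, 1.0, 32, 0.5, 1.5, 400.0)
better_pue = footprint_kgco2e(10, 8, 10.0, 1.0, 64, 0.5, 1.2, 400.0)

print(f"baseline: {baseline:.3f} kgCO2e")
print(f"less memory: {less_memory:.3f} kgCO2e")
print(f"better PUE: {better_pue:.3f} kgCO2e")
```

Under this model, memory contributes according to how much is allocated rather than how much is used, which is why over-allocation raises the estimate, and a lower overhead factor scales the whole estimate down proportionally.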

Description

Keywords

31 Biological Sciences, 3105 Genetics, Genetics, Human Genome, 12 Responsible Consumption and Production

Journal Title

Molecular Biology and Evolution

Conference Name

Journal ISSN

0737-4038

Volume Title

Publisher

Oxford University Press (OUP)

Sponsorship

Medical Research Council (MR/L003120/1)
British Heart Foundation (None)
British Heart Foundation (RG/18/13/33946)