
Structured Mining and Analysis of Public Energy Data to Evaluate, Improve and Assist the Execution of the UK’s Built Environment Energy Efficiency Policies





Yuan, Mingda 


The UK has committed to reducing its carbon emissions by 80% relative to 1990 levels by 2050. The government has introduced various built environment energy efficiency policies in the hope of meeting the 2050 target. Alongside these policies, the government has also collected and published high-coverage, high-quality and high-resolution public data to monitor, target and aid carbon reduction and energy efficiency policies. Despite significant improvements in the availability of data and information, we believe much of the public data is still under-utilised. Structured mining and analysis that apply data mining and statistical algorithms to gain a deeper understanding of the built environment are therefore needed. In this thesis, we explore new structured mining and analysis frameworks and procedures to tackle problems in built environment energy efficiency based on the available public datasets. In this way, we evaluate, improve and assist the application of the UK’s built environment energy efficiency policies. We demonstrate our exploration in three studies that found unforeseen patterns in public data and provided insights into the energy efficiency of the built environment. The first study proposed a two-step framework, demonstrated by analysing domestic gas consumption, to bridge the resolution differences between data and policy executive bodies. Different clustering algorithms (i.e. Gaussian Mixture Model and Hierarchical Clustering) were selected to suit the objective of each step. We showed that the same gas consumption-related variable could have different relationships, both qualitatively and quantitatively, in different clusters of small areas. We also grouped the executive bodies to help them collaborate better and fine-tune the execution of the built environment energy efficiency policies.
We combined statistical analysis and data mining in the second study to analyse the reliability of non-domestic EPC ratings. Locally weighted regression (LOWESS) analysis revealed inconsistency and human factors in the non-domestic EPC ratings. Buildings whose condition remains unchanged can be rated approximately 10 points better on average when their initial rating is above the minimum requirement in the regulation. Clustering analysis of EPC recommendation changes, together with association rule mining, further supported these findings and identified practical energy efficiency improvement strategies that people have already adopted naturally. The third study opened the discussion of the relationship between local economic development and built environment energy efficiency. The geographical units created by complex network community detection, called high-growth business community areas in our study, are shown to have improved their non-domestic buildings to a better average from a lower starting position than other areas. Further clustering of the communities into groups shows that different development paths also tend to be linked with different built environment energy efficiency improvement patterns.





Choudhary, Ruchi


Built Environment, Clustering, Data Mining, Energy Efficiency, EPC, Machine Learning, Public Data


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge