dsSynthetic: synthetic data generation for the DataSHIELD federated analysis system.
Publication Date
2022-06-27
Journal Title
BMC Res Notes
ISSN
1756-0500
Publisher
Springer Science and Business Media LLC
Volume
15
Issue
1
Language
en
Type
Other
This Version
VoR
Citation
Banerjee, S., & Bishop, T. R. (2022). dsSynthetic: synthetic data generation for the DataSHIELD federated analysis system. [Other]. https://doi.org/10.1186/s13104-022-06111-2
Abstract
OBJECTIVE: Platforms such as DataSHIELD allow users to analyse sensitive data remotely, without having full access to the detailed data items (federated analysis). While this feature helps to overcome difficulties with data sharing, it can make it challenging to write code without full visibility of the data. One solution is to generate realistic, non-disclosive synthetic data that can be transferred to the analyst so they can perfect their code without the access limitation. When this process is complete, they can run the code on the real data.
RESULTS: We have created a package in DataSHIELD (dsSynthetic) which allows generation of realistic synthetic data, building on existing packages. In our paper and accompanying tutorial we demonstrate how the use of synthetic data generated with our package can help DataSHIELD users with tasks such as writing analysis scripts and harmonising data to common scales and measures.
Keywords
Research Note, Synthetic data, Data harmonisation, Federated analysis
Sponsorship
European Commission Horizon 2020 (H2020) Societal Challenges (824989)
Identifiers
s13104-022-06111-2, 6111
External DOI: https://doi.org/10.1186/s13104-022-06111-2
This record's DOI: https://doi.org/10.17863/CAM.85945
Rights
Licence:
http://creativecommons.org/licenses/by/4.0/