Data integration between Swedish national clinical health registries and biobanks using an availability system
Spjuth O, Heikkinen J, Litton J-E, Palmgren J, Krestyaninova M.
Data integration between Swedish national clinical health registries and biobanks using an availability system.
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 8574 LNBI, 32-40. (2014). DOI: 10.1007/978-3-319-08590-6_3
Linking biobank data, such as molecular profiles, with clinical phenotypes is of great importance in epidemiological and predictive studies. A comprehensive overview of various data sources that can be combined in order to power up a study is a key factor in the design. Clinical data stored in health registries and biobank data in research projects are commonly provisioned in different database systems and governed by separate organizations, making the integration process challenging and hampering biomedical investigations. We here describe the integration of data on prostate cancer from a clinical health registry with data from a biobank, and its provisioning in the SAIL availability system. We demonstrate the implications of using the actual raw data, data transformed to availability data, and availability data which has been subjected to anonymization techniques to reduce the risk of re-identification. Our results show that an availability system such as SAIL with integrated clinical and biobank data can be a valuable tool for planning new studies and finding interesting subsets to investigate further. We also show that an availability system can deliver useful insights even when the data has been subjected to anonymization techniques. © 2014 Springer International Publishing Switzerland.