Welcome to the Hospital District of Helsinki and Uusimaa (HUS) hematological subdatalake data catalogue!



The HUS datalake is a Microsoft Azure-based medical database managed by the HUS Data Administration Department and supported by TietoEvry and other IT companies. Clinical data are integrated daily from multiple electronic health registries, pseudonymized, and distributed to data-secure Acamedic analytical environments in accordance with data and research permits.

In total, the datalake contains:
* >3.5 million patients
* >36 million EMR records
* >80 million EMR notes
* >780 million lab test results
* >21 million imaging studies

The hematological analytical environment was initiated around 2019 as part of the Cleverhealth eCare-4-me multidisciplinary team led by Prof. Kimmo Porkka. There are 93577 patients with a hematological diagnosis and 14638 patients with a hematological cancer (ICD-10 C81-C99). The last full data update was on 2023-05-01. We aim to build clinically meaningful software that saves physicians time, provides additional data to support clinical decision-making, and makes research more efficient.

We have developed several pioneering solutions to reach these goals. For instance, text-formatted variables (e.g., cytogenetics, pathology, cytomorphology, flow cytometry, qPCR, TCR) are mined into a structured format using a custom pipeline by Otso Brummer & Oscar Brück.


Oscar Brück
May 2023
Helsinki, Finland