Genome researchers in the Netherlands are working in close cooperation in the field of omics data (such as genomic and metabolomic data). To ensure that these omics data can be shared easily and securely, a pilot involving E-LAN network technology is currently underway. This has led to the development of a shared network environment which is separate from the Internet: the UMC Research LAN. In this pilot, we are also testing infrastructure for authentication and authorisation that would enable researchers from different institutions to grant each other secure access to their data sets and computing power.
Need for collaboration
Collaboration between genome researchers working with omics data is crucial: if significant patterns are to be identified in data, large samples are required, especially now that the entire genome can be analysed. “Gathering all the data yourself is too expensive and simply impossible,” explains Marian Beekman. Beekman works at Leiden University Medical Center (LUMC) and is work package coordinator at BBMRI-NL (Biobanking and BioMolecular Resources Research Infrastructure Netherlands), the national infrastructure for biobanks. To facilitate collaboration, BBMRI-NL is currently building a single virtual platform, part of which is federated and part of which is centralised within SURF. This omics data platform is a virtual ‘data house’ containing different types of data sets that enable researchers from different UMCs to perform analyses.
Sharing data between UMCs
“Within the BBMRI, researchers work on large data sets that they are unable to analyse in full at a single location,” says LUMC researcher Jeroen Laros. An obvious solution is the ability to transfer data quickly and securely between the various UMCs and to share computing power. “This is already happening between the UMCG (University Medical Center Groningen) and the LUMC. In the long term, we want to share data and computing power with other UMCs as well,” says Laros.
Special network infrastructure
Combining and sharing such large data sets requires a special network infrastructure. The pilot involving E-LAN network technology has produced one: the UMC Research LAN, a shared network environment that is separate from the Internet. This is effectively a national ‘local’ network for UMCs (see Figure 1). It virtually brings together data and computing clusters from different UMCs and SURF in a single location. This allows researchers to share and analyse data within a protected network environment that is optimised for research purposes. As SURFsara is also connected to the UMC Research LAN, researchers can easily obtain more processing power from SURFsara if there is insufficient capacity within their own UMC.
Access to data
These partnerships, which go beyond institutional boundaries, also require suitable infrastructure for authentication and authorisation. “It should be easy to define who has access to the data and what people can do with the data,” says Marian Beekman. A pilot involving COmanage and a proxy component is currently underway in this area. This will enable researchers to log in using their institution account. This benefits both the researcher, who uses their own trusted account, and the owner of the shared data set or processing power, who knows exactly who the user is because the user has authenticated through an account verified by the institution. The lead researcher can use COmanage to create groups, invite researchers and assign roles that determine who can do what. The aim is that researchers from all over the Netherlands will have access to this virtual data house. Ultimately, though, researchers involved in international collaborative projects also need to be able to access these data.
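The idea of groups, invitations and roles can be illustrated with a minimal sketch. This is not COmanage's API or data model: the role names, permissions, account identifiers and the `Group` class are all hypothetical, chosen only to show how "who can do what" might be expressed once users are identified by their institution account.

```python
from dataclasses import dataclass, field

# Hypothetical roles and permissions for illustration; not taken from COmanage.
ROLE_PERMISSIONS = {
    "lead": {"read", "analyse", "invite"},
    "analyst": {"read", "analyse"},
    "viewer": {"read"},
}


@dataclass
class Group:
    """A collaboration group tied to one shared data set (hypothetical model)."""
    name: str
    dataset: str
    # Maps an institution account identifier to a role name.
    members: dict = field(default_factory=dict)

    def is_allowed(self, account: str, action: str) -> bool:
        """Return True if the account is a member whose role permits the action."""
        role = self.members.get(account)
        return role is not None and action in ROLE_PERMISSIONS[role]

    def invite(self, inviter: str, account: str, role: str) -> None:
        """Add a member; only members whose role includes 'invite' may do so."""
        if not self.is_allowed(inviter, "invite"):
            raise PermissionError(f"{inviter} may not invite members")
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self.members[account] = role


# Example accounts are made up; the lead researcher starts as the only member.
group = Group("omics-pilot", "example-dataset",
              members={"lead@lumc.example": "lead"})
group.invite("lead@lumc.example", "researcher@umcg.example", "analyst")

print(group.is_allowed("researcher@umcg.example", "analyse"))  # True
print(group.is_allowed("researcher@umcg.example", "invite"))   # False
print(group.is_allowed("stranger@other.example", "read"))      # False
```

In this sketch, access decisions depend only on a verified account identifier and the role assigned within the group, which mirrors the point that the data owner knows exactly who each user is.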
Collaboration on customised medicine
Customised medicine is the overarching objective behind these developments, i.e. medicine that is predictive, personalised, preventive and participative. As well as close collaboration between researchers, this requires ICT infrastructure that guarantees mutual trust and that responds to the security requirements and needs of the researchers, e.g. being able to share data sets and processing capacity quickly. Without suitable ICT infrastructure, researchers cannot collaborate effectively with one another and may resort to solutions that would not pass a security test.
- The UMC Research LAN pilot is being implemented under the Data4lifesciences programme.
- Read the blog “A safe, protected network environment for UMCs”.