Highlighted Projects

HASTE: Hierarchical Analysis of Spatial and TEmporal image data

From intelligent data acquisition via smart data-management to confident predictions

The HASTE project takes a hierarchical approach to acquisition, analysis, and interpretation of image data. We develop computationally efficient measurements for data description, confidence-driven machine learning for determination of interestingness, and a theory and framework to apply intelligent spatial and temporal information hierarchies, distributing data to computational resources and storage options based on low-level image features.
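The idea of routing data by a low-level image feature can be sketched as follows. This is a minimal illustration under stated assumptions, not the HASTE implementation: the feature (pixel variance), the tier names, the thresholds, and the functions `interestingness` and `route` are all hypothetical.

```python
from statistics import pvariance

# Hypothetical tiers and thresholds, chosen only for illustration.
# Each entry is (minimum interestingness score, destination name).
TIERS = [
    (0.66, "fast-storage"),
    (0.33, "standard-storage"),
    (0.0, "archive"),
]


def interestingness(pixels, max_variance=0.25):
    """Map a flat list of pixel intensities in [0, 1] to a score in [0, 1].

    Population variance stands in for a cheap low-level feature here;
    0.25 is the largest variance possible for values in [0, 1].
    """
    if not pixels:
        return 0.0
    return min(pvariance(pixels) / max_variance, 1.0)


def route(pixels):
    """Return the first tier whose threshold the image's score reaches."""
    score = interestingness(pixels)
    for threshold, tier in TIERS:
        if score >= threshold:
            return tier
    return TIERS[-1][1]
```

In this sketch a uniform image scores 0 and goes to the cheapest tier, while a high-contrast image is sent to the fastest one. A real system would presumably replace the variance with learned, confidence-aware measures.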

HASTE is a collaboration between the Wählby lab (PI) and the Hellander lab (co-PI), both at the Department of Information Technology, Uppsala University; the Spjuth lab (co-PI) at the Department of Pharmaceutical Biosciences, Uppsala University; the Nilsson lab at the Department of Biochemistry and Biophysics, Stockholm University and SciLifeLab; Vironova AB; and AstraZeneca AB.

Project website: http://haste.research.it.uu.se/

Large-scale Predictive Modelling in Drug Discovery

This project aims to develop computational methods, tools and predictive models to aid the drug discovery process on large data sets. Methods include ligand-based and structure-based approaches such as QSAR (machine learning) and docking, with applications including prediction of drug safety, toxicology, interactions, target profiles and secondary pharmacology. To analyze large-scale data we use high-performance computing, cloud computing resources, and data analytics platforms such as Apache Hadoop and Apache Spark. We also use and develop scientific workflow systems such as Luigi and BPipe to automate and streamline analysis. The work is carried out in collaboration with AstraZeneca R&D, Maastricht University NL, and Karolinska Institutet. We aim to make models and tools available from the Bioclipse workbench. We are also founding partners of the OpenTox association (www.opentox.org) and associated partners of the consortia OpenPhacts (www.openphacts.org) and e-nanomapper (http://www.enanomapper.net).

Figure: Data is extracted from various data sources, and we use high-performance computing, cloud computing, workflows and big data frameworks to train predictive models, which are published in the Bioclipse workbench for easy and user-friendly access with graphical interpretations.

Prediction of metabolism

This project aims to develop methods for predicting site-of-metabolism and metabolites based on chemical structure. Using data mining techniques we have developed the tool MetaPrint2D for site-of-metabolism prediction. The project aims to improve these models and also to predict putative metabolites. The work is carried out in close collaboration with AstraZeneca R&D, and models and tools are available from the Bioclipse workbench.

Figure: Prediction of site-of-metabolism with the MetaPrint2D method in Bioclipse.

OpenRiskNet EU-H2020 project

OpenRiskNet is a 3-year EU Horizon 2020 project, starting on December 1st 2016, that will develop and deploy an integrated, secure, permanent, service-driven and sustainable infrastructure for data management, data sharing, processing, analysis, information mining and modelling, as well as workflow development and sharing, visualisation and reporting. It will serve communities in the areas of toxicology, risk assessment and chemical, pharmaceutical, cosmetic and nanomaterial product development, including safe-by-design aspects at an early stage. This e-infrastructure will support all aspects of risk assessment mentioned above by allowing for the integration of all toxicology-related data sources and for the implementation and execution of processing and analysis pipelines.

OpenRiskNet will address the challenges arising from the fragmentation of the data and the insufficient harmonization of user guidance by creating application programming interfaces (APIs) including technical and semantic interoperability layers, containerizing the databases and computational tools, and integrating the microservices into virtual environments (VEs) allowing for deployment of personal and multi-tenant instances of this flexible, secure and high-performance e-infrastructure.

Ola Spjuth leads WP2: “Interoperability, Deployment and Security”.

PhenoMeNal EU-H2020 project

PhenoMeNal is a 3-year EU Horizon 2020 project, starting on September 1st 2015, that will develop a standardised e-infrastructure for analysing medical metabolic phenotype data. This comprises development of standards for data exchange, pipelines, computational frameworks and resources for the processing, analysis and information mining of the massive amount of medical molecular phenotyping and genotyping data that will be generated by metabolomics applications now entering research and the clinic.

Ola Spjuth leads Work Package 5: “Maintenance and Operation of PhenoMeNal grid/cloud e-Infrastructure”.

Project website: http://phenomenal-h2020.eu

Translational bioinformatics

Translational bioinformatics is defined as: “The development of storage, analytic, and interpretive methods to optimize the transformation of increasingly voluminous biomedical data into proactive, predictive, preventative, and participatory health”. Our group's research focuses on translating massively parallel sequencing into clinical use through automated bioinformatics analysis, informatics solutions, and reporting systems. Projects include long-read amplicon sequencing of chronic myeloid leukemia (CML), TP53, and multi-drug resistant bacteria. We are also part of the joint SeRC-eSSENCE flagship project “e-Science for Cancer Prevention and Control” (eCPC). Collaborators include the National Genomics Infrastructure (NGI), Uppsala Academic Hospital, and Karolinska Institutet.

Figure: Screenshot of the system we developed for translating long-read amplicon sequencing into a decision aid for chronic myeloid leukemia (CML), showing mutation frequencies in the Philadelphia chromosome.

e-Science for Cancer Prevention and Control

The SeRC flagship project e-Science for Cancer Prevention and Control (eCPC) will set up a modular system for prediction of cancer initiation and progression. It will be based on computational models that integrate data from different sources, including molecular (e.g. genomic, proteomic), environmental and lifestyle factors. By superimposing screening and prevention strategies on the models, reductions in over-treatment, morbidity, mortality and cost can be quantified.

Ola Spjuth leads WP1 (data management and integration) and is also a member of the management group.

eCPC Website: http://ecpc.e-science.se

RDFIO

Read more »