I am an artificial intelligence and machine learning (AI/ML) researcher based in Australia, working at the intersection of applied mathematics, machine learning, deep learning, and applications in scientific and industry domains. My PhD research addresses the challenges of applying AI/ML in domains like health and biology, and my career also spans international experience in AI/ML consulting and startups.

CV and Consulting Services

RESEARCH

Addressing challenges in applying AI and ML

My PhD research aims to bridge the gap between advances in AI and real-world impact in domains like health and biology.

Learn More About My Research

My recent projects

Hierarchical models based on similarity of causal mechanisms

Given a dataset of related tasks, it is often beneficial to pool learning across tasks using techniques such as meta-learning and multi-task learning. This preprint studies an important setting in which tasks are generated by different causal models, as occurs in medical and biological data. It presents a probabilistic machine learning framework for predictive modelling in this setting that utilises the similarity structure of the tasks.

Learn More

Published in

Preprint, under review

Technical keywords

Bayesian hierarchical models, causality, out-of-domain generalisation, meta-learning, Bayesian deep learning, robust ML

Modeling disease risk in families with graph neural networks

Electronic health records (EHRs) spanning multiple generations offer a new way to examine health trends in families. In collaboration with the Institute of Molecular Medicine Finland, an AI system was developed to analyze a network of EHR data from over 7 million patients. The findings demonstrate that a geometric deep learning approach is beneficial for modeling the shared genetic, environmental, and lifestyle factors influencing disease risk in families.

Learn More

Published in

Machine Learning for Healthcare Conference (MLHC), PMLR 2023

Technical keywords

graph neural networks, geometric deep learning, deep learning for time series data, electronic health records, genetics, familial factors of disease


Synthetic data for genetics research

HAPNEST is a new software tool that efficiently generates large synthetic datasets that closely mimic real genetic and phenotypic data. This work was carried out with the Europe-wide INTERVENE consortium, enabling researchers to test new computational methods for polygenic risk scoring across diverse ancestry groups while protecting sensitive health information. The software and a synthetic dataset of 6.8 million common variants and nine phenotypes for over 1 million individuals have been made publicly available.

Learn More

Published in

Bioinformatics

Technical keywords

computational biology, statistical genetics, simulation-based inference, generative modeling, polygenic risk scoring

Working in collaboration with

BLOG

Get to know more about me.

Learn More

© 2024 Sophie Wharrie
