I am an artificial intelligence and machine learning (AI/ML) researcher based in Australia, working at the intersection of applied mathematics, machine learning, deep learning, and applications in scientific and industry domains. My PhD research addresses the challenges of applying AI/ML in domains like health and biology, and my career also spans international experience in AI/ML consulting and startups.
CV and Consulting Services
RESEARCH
Addressing challenges in applying AI and ML
My PhD research aims to bridge the gap between advances in AI and realising impact in domains like health and biology
Learn More About My Research
MY RECENT PROJECTS
Hierarchical models based on similarity of causal mechanisms
Given a dataset of related tasks, it is often beneficial to pool learning across tasks using techniques such as meta-learning and multi-task learning. This preprint studies an important setting where tasks are generated by different causal models, which occurs in medical/biological data. A probabilistic machine learning framework is presented for predictive modelling in this setting, which utilises the similarity structure of the tasks.
Learn More
Published in
Preprint, Under review
Technical keywords
Bayesian hierarchical models, causality, out-of-domain generalisation, meta-learning, Bayesian deep learning, robust ML
Modeling disease risk in families with graph neural networks
Electronic health records (EHRs) spanning multiple generations offer a new way to examine health trends in families. In collaboration with the Institute of Molecular Medicine Finland, an AI system was developed to analyze a network of EHR data from over 7 million patients. The findings demonstrate that a geometric deep learning approach is beneficial for modeling the shared genetic, environmental, and lifestyle factors influencing disease risk in families.
Learn More
Published in
Machine Learning for Healthcare Conference (MLHC), PMLR 2023
Technical keywords
graph neural networks, geometric deep learning, deep learning for time series data, electronic health records, genetics, familial factors of disease
Synthetic data for genetics research
HAPNEST is a new software tool that efficiently generates large synthetic datasets that closely mimic real genetic and phenotypic data. This work was carried out with the Europe-wide INTERVENE consortium, enabling researchers to test new computational methods for polygenic risk scoring across diverse ancestry groups while protecting sensitive health information. The software and a synthetic dataset of 6.8 million common variants and nine phenotypes for over 1 million individuals have been made publicly available.
Learn More
Published in
Bioinformatics
Technical keywords
computational biology, statistical genetics, simulation-based inference, generative modeling, polygenic risk scoring
BLOG
Get to know more about me.
Learn More
© 2024 Sophie Wharrie