Pınar Demetçi

I am a postdoctoral researcher at the Eric and Wendy Schmidt Center of the Broad Institute of MIT and Harvard. My research is at the intersection of computer science, molecular biology, and statistics. I develop algorithms and machine learning models to integrate multi-modal genomic data, with the goal of understanding cellular dynamics in health and disease and assisting cell state engineering. I received my Ph.D. in computer science and computational biology from Brown University, under the advisement of Ritambhara Singh, Ph.D. (primary advisor) and Sorin Istrail, Ph.D. My doctoral dissertation was on optimal transport algorithms for integrated analysis of single-cell multi-omics data.

Since January 2024, I have co-chaired the Models, Inference, and Algorithms (MIA) seminar series at the Broad, where we host hybrid (in-person and online) talks by computational scientists in (bio)medicine. You can check out our upcoming schedule and past talks.

Apart from research, I have had the opportunity to work on various fun projects that can be found here. In my free time, I enjoy board games with friends, swimming, hiking, playing with my cat, and practicing the ukulele and violin.
Feel free to e-mail me if you’d like to talk.

Research Interests

Methodology
Representation learning
Optimal transport
Manifold learning
Bayesian statistics and inference
Variable selection
Deep learning
Causality
Graph algorithms

Application areas
Regulatory genomics
Functional genomics
Single-cell sequencing & imaging
Multi-omics
3D genome
Precision medicine
Structural biology

Education

2018 - 2023	Ph.D. in Computer Science and Computational Biology (3.90/4.00); M.Sc. in Computer Science (4.00/4.00); Brown University (Providence, RI)
2013 - 2017	B.Sc. in Engineering (with concentration in Bioengineering) (3.67/4.00) Olin College of Engineering (Needham, MA)
2008 - 2013	TEVITOL High School with IB Diploma (Gebze, Turkey)

Professional Experience

July 2023 - Present	Broad Institute of MIT and Harvard, EWSC Postdoctoral Fellow (Cambridge, MA)
June 2022 - Aug 2022	Microsoft Research, Research Intern (Redmond, WA)
June 2020 - Sep 2020	Microsoft Research, Research Intern (Redmond, WA)
June 2017 - Aug 2018	Massachusetts Institute of Technology, Research Support Associate (Cambridge, MA)
Jan 2016 - Oct 2016	Design That Matters, Student Engineer (Salem, MA)
Jan 2015 - Dec 2015	Daktari Diagnostics, Student Engineer (Cambridge, MA)

Research Publications Google Scholar

2024

Breaking isometric ties and introducing priors in Gromov-Wasserstein distances
P. Demetci, Q.H. Tran, I. Redko, R. Singh
Proceedings of the 27th International Conference on Artificial Intelligence and Statistics
(AISTATS 2024)
Proceedings of Machine Learning Research (PMLR) -- to appear
[11] [abstract] [paper] [code]

2023

Unbalanced CO-Optimal Transport
Q.H. Tran, H. Janati, N. Courty, R. Flamary, I. Redko, P. Demetci, R. Singh
Proceedings of the 37th AAAI Conference on Artificial Intelligence
(AAAI 2023)
[10] [abstract] [paper] [code]

Mammalian olfactory cortex neurons retain molecular signatures of ancestral cell types
S. Zeppilli, A. Ortega Gurrola, P. Demetci, DH. Brann, R. Attey, N. Zilkha, T. Kimchi, SR. Datta, R. Singh, MA. Tosches, A. Crombach, A. Fleischmann
bioRxiv (Under review at Nature Neuroscience)
[9] [abstract] [paper] [code]

2022

Unsupervised Integration of Single-Cell Multi-omics Datasets with Disproportionate Cell-Type Representation
P. Demetci, R. Santorella, B. Sandstede, R. Singh
Proceedings of the 26th Annual Intl. Conference on Research in Computational Molecular Biology
(RECOMB 2022)
Springer Nature Lecture Notes in Bioinformatics (2022) pp 3-19
[8] [abstract] [paper] [code]

2021

SCOT: Single-cell multi-omics integration with optimal transport
P. Demetci, R. Santorella, B. Sandstede, W. Stafford Noble and Ritambhara Singh#
*Equal Contribution, #Corresponding Author
Journal of Computational Biology, 2021
[7] [abstract] [paper] [code] [tutorial]

Recent advances in sequencing technologies have allowed us to capture various aspects of the genome at single-cell resolution. However, with the exception of a few co-assaying technologies, it is not possible to simultaneously apply different sequencing assays on the same single cell. In this scenario, computational integration of multi-omic measurements is crucial to enable joint analyses. This integration task is particularly challenging due to the lack of sample-wise or feature-wise correspondences. We present Single-Cell alignment with Optimal Transport (SCOT), an unsupervised algorithm that uses Gromov-Wasserstein optimal transport to align single-cell multi-omics datasets. SCOT performs on par with the current state-of-the-art unsupervised alignment methods, is faster, and requires tuning of fewer hyperparameters. More importantly, SCOT uses a self-tuning heuristic to guide hyperparameter selection based on Gromov-Wasserstein distance. Thus, in the fully unsupervised setting, SCOT aligns single-cell datasets better than the existing methods without requiring any orthogonal correspondence information.

Gromov-Wasserstein Optimal Transport to Align Single-Cell Multi-Omics Data
P. Demetci, R. Santorella, B. Sandstede, W. Stafford Noble and Ritambhara Singh#
*Equal Contribution, #Corresponding Author
International Conference on Research in Computational Molecular Biology
(RECOMB 2021)
[6] [abstract] [paper] [code] [tutorial]
ICML WCB Best Poster Award

Data integration of single-cell measurements is critical for our understanding of cell development and disease, but the lack of correspondence between different types of single-cell measurements makes such efforts challenging. Several unsupervised algorithms are capable of aligning heterogeneous types of single-cell measurements in a shared space, enabling the creation of mappings between single cells in different data modalities. We present Single-Cell alignment using Optimal Transport (SCOT), an unsupervised learning algorithm that uses Gromov-Wasserstein-based optimal transport to align single-cell multi-omics datasets. SCOT calculates a probabilistic coupling matrix that matches cells across two datasets. The optimization uses k-nearest neighbor graphs, thus preserving the local geometry of the data. We use the resulting coupling matrix to project one single-cell dataset onto another via a barycentric projection. We compare the alignment performance of SCOT with state-of-the-art algorithms on three simulated and two real datasets. Our results demonstrate that SCOT yields results that are comparable in quality to those of competing methods, but SCOT is significantly faster and requires tuning fewer hyperparameters. The code is available at https://github.com/rsinghlab/SCOT
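
The barycentric projection step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a coupling matrix has already been computed by some optimal transport solver; the function name `barycentric_projection` and the toy numbers are hypothetical and are not taken from the SCOT code base. Each source cell is mapped to the coupling-weighted average of the target cells.

```python
def barycentric_projection(coupling, target):
    """Project each source cell into the target space.

    coupling: n_source x n_target list of non-negative weights
              (assumed to come from an optimal transport solver).
    target:   n_target x d list of target-cell coordinates.
    Returns an n_source x d list; row i is the coupling-weighted
    mean of the target rows.
    """
    dim = len(target[0])
    projected = []
    for weights in coupling:
        mass = sum(weights)
        if mass <= 0:
            raise ValueError("each source cell needs positive coupling mass")
        row = [
            sum(w * y[k] for w, y in zip(weights, target)) / mass
            for k in range(dim)
        ]
        projected.append(row)
    return projected


# Toy example (made-up numbers): 2 source cells, 3 target cells in 2-D.
coupling = [
    [0.25, 0.25, 0.0],  # source cell 0 is split evenly between targets 0 and 1
    [0.0, 0.0, 0.5],    # source cell 1 is matched entirely to target 2
]
target = [[0.0, 0.0], [2.0, 4.0], [10.0, 10.0]]
projected = barycentric_projection(coupling, target)
print(projected)
```

In this toy case the first source cell lands midway between the first two target cells and the second lands exactly on the third target cell. Dividing by the row mass keeps the projection a proper weighted average even when the rows of the coupling do not sum to one.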

Multi-scale Inference of Genetic Trait Architecture using Biologically Annotated Neural Networks
P. Demetci, W. Cheng, Gregory Darnell, Xiang Zhou, Sohini Ramachandran, Lorin Crawford#
PLOS Genetics, 2021
[5] [abstract] [paper]

In this article, we present Biologically Annotated Neural Networks (BANNs), a nonlinear probabilistic framework for association mapping in genome-wide association (GWA) studies. BANNs are feedforward models with partially connected architectures that are based on biological annotations. This setup yields a fully interpretable neural network where the input layer encodes SNP-level effects, and the hidden layer models the aggregated effects among SNP-sets. We treat the weights and connections of the network as random variables with prior distributions that reflect how genetic effects manifest at different genomic scales. The BANNs software uses variational inference to provide posterior summaries which allow researchers to simultaneously perform (i) mapping with SNPs and (ii) enrichment analyses with SNP-sets on complex traits. Through simulations, we show that our method improves upon state-of-the-art association mapping and enrichment approaches across a wide range of genetic architectures. We then further illustrate the benefits of BANNs by analyzing real GWA data assayed in approximately 2,000 heterogeneous stock of mice from the Wellcome Trust Centre for Human Genetics and approximately 7,000 individuals from the Framingham Heart Study. Lastly, using a random subset of individuals of European ancestry from the UK Biobank, we show that BANNs is able to replicate known associations in high- and low-density lipoprotein cholesterol content.
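
The partially connected architecture described in the abstract above can be sketched as a masked forward pass. This is only an illustration under stated assumptions: the function name `snp_set_layer`, the ReLU activation, and all numbers are hypothetical, and the sketch uses fixed point weights, whereas the abstract describes weights treated as random variables and fitted with variational inference.

```python
def snp_set_layer(genotypes, mask, weights, biases):
    """Hidden activations for one individual, one unit per SNP-set.

    genotypes: length-p list of SNP values for one individual.
    mask:      p x g 0/1 annotation matrix; mask[j][s] == 1 means
               SNP j is annotated as belonging to SNP-set s.
    weights:   length-p list of SNP-level weights (fixed here for illustration).
    biases:    length-g list of SNP-set biases.
    """
    hidden = []
    for s, bias in enumerate(biases):
        # Only SNPs annotated to set s contribute to hidden unit s.
        total = bias + sum(
            mask[j][s] * weights[j] * x for j, x in enumerate(genotypes)
        )
        hidden.append(max(0.0, total))  # ReLU is an assumed activation
    return hidden


# Toy example (made-up numbers): 4 SNPs grouped into 2 SNP-sets.
genotypes = [0, 1, 2, 1]
mask = [
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
]
weights = [0.5, -1.0, 0.25, 0.5]
biases = [0.0, 0.5]
hidden = snp_set_layer(genotypes, mask, weights, biases)
print(hidden)
```

The mask is what makes the network interpretable in this sketch: each hidden unit depends only on the SNPs assigned to its SNP-set, so a unit's activation can be read as an aggregated effect for that set.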

2020

Unsupervised Manifold Alignment for Single-Cell Multi-Omics Data
R. Singh#, P. Demetci, G. Bonora, V. Ramani, C. Lee, H. Fang, Z. Duan, X. Deng, J. Shendure, C. Disteche and W. Stafford Noble#
#Corresponding Authors
Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
(ACM BCB 2020)
[4] [abstract] [paper] [code]

Combinatorial and statistical prediction of gene expression from haplotype sequence
B. Alpay*, P. Demetci*, Sorin Istrail, and Derek Aguiar#
*Equal Contribution, #Corresponding Author
Bioinformatics (Oxford University Press), vol. 36, Supplement_1, pp. i194-i202, 2020
Proceedings of the 27th International Conference on Intelligent Systems for Molecular Biology
(ISMB 2020)
[3] [abstract] [paper] [code]

Genome-wide association studies (GWAS) have discovered thousands of significant genetic effects on disease phenotypes. By considering gene expression as the intermediary between genotype and disease phenotype, eQTL studies have interpreted many of these variants by their regulatory effects on gene expression. However, there remains a considerable gap between genotype-to-gene expression association and genotype-to-gene expression prediction. Accurate prediction of gene expression enables gene-based association studies to be performed post-hoc for existing GWAS, reduces multiple testing burden, and can prioritize genes for subsequent experimental investigation. In this work, we develop gene expression prediction methods that relax the independence and additivity assumptions between genetic markers. First, we consider gene expression prediction from a regression perspective and develop the HAPLEXR algorithm which combines haplotype clusterings with allelic dosages. Second, we introduce the new gene expression classification problem, which focuses on identifying expression groups rather than continuous measurements; we formalize the selection of an appropriate number of expression groups using the principle of maximum entropy. Third, we develop the HAPLEXD algorithm that models haplotype sharing with a modified suffix tree data structure and computes expression groups by spectral clustering. In both models, we penalize model complexity by prioritizing genetic clusters that indicate significant effects on expression. We compare HAPLEXR and HAPLEXD with three state-of-the-art expression prediction methods and two novel logistic regression approaches across five GTEx v8 tissues. HAPLEXD exhibits significantly higher classification accuracy overall; HAPLEXR shows higher prediction accuracy on approximately half of the genes tested and the largest number of best predicted genes ($r^2>0.1$) among all methods. We show that variant and haplotype features selected by HAPLEXR are smaller in size than those of competing methods (and thus more interpretable) and are significantly enriched in functional annotations related to gene regulation. These results demonstrate the importance of explicitly modelling non-dosage dependent and intragenic epistatic effects when predicting expression.

2019

Rapid accumulation of motility-activating mutations in resting liquid culture of Escherichia coli
D. Parker*, P. Demetci*, and G.W. Li#
*Equal Contribution, #Corresponding Author
Journal of Bacteriology, 2019
[2] [abstract] [paper]

2016

Internalization and externalization in the classroom: How do they emerge and why is it important?
P. Demetci, C. Nichols, Y. V. Zastavker, J. D. Stolk, A. Dillon, M. D. Gross
IEEE Frontiers in Education
(FIE 2016)
[1] [abstract] [paper]

Awards and Honors

2023	EWSC Postdoctoral Fellowship (up to three years)
2023	Harvard/DFCI Data Science Postdoctoral Fellowship (turned down)
2022	Selected to the Rising Stars in EECS cohort (by UT Austin)
2022	RECOMB Travel Fellowship
2020	ICML WCB Fellowship
2020	Microsoft Research Ph.D. Fellowship Nominee (by Brown CCMB)
2016	Meritorious Winner: 2016 MCM/ICM Interdisciplinary Contest in Mathematical Modeling
2015-2017	Olin Alumni Scholarship (towards expenses beyond tuition)
2013-2017	Sunlin Chou International Scholarship (50% tuition)
2013-2017	Olin Merit Scholarship (50% tuition)
2013	Honorable Mention (Instrumentation): 21st Intl. Competition of First Step to Nobel Prize in Physics (by the Polish Academy of Sciences)
2013	First Place in Physics: 22nd MEF International Research Projects Contest (Turkiye)

Last updated on 23 Jan 2024. Template based on github.com/bamos/bamos.github.io