Hello and welcome to my website! 🎉 🙌 😃
I am a statistician and data scientist. I have deep expertise in solving data problems that arise from complex systems, ranging from biological networks to commercial and enterprise data pipelines. I draw on a wide range of skills in data science, algorithmic research, engineering, technical leadership, and product management. My work is defined by the pursuit of innovation and excellence.
I completed my PhD in Structural Biology (Biophysics) at Stanford University, along with a Master’s degree in Public Policy. I earned my BA in Biochemistry and Statistics at the University of California, Berkeley.
Let's explore the possibilities together!
My academic journey started at the University of California, Berkeley, where I studied Statistics and Biochemistry. I performed undergraduate research on the mechanism of DNA replication in Prof. John Kuriyan’s lab with Dr. Brian Kelch. There I mastered the foundations of biochemical and biophysical methods for experimental design, project management, and data collection. I also applied computational biology methods to organize, process, analyze, and interpret data in order to model complex biological systems.
To further my passion for unraveling biological complexities, I pursued the intersection of biology and statistics at Stanford University, where I completed a PhD in Structural Biology (Biophysics). Under the guidance of renowned advisors including Prof. Wing Wong, Prof. Garry Nolan, and Prof. Michael Levitt, I completed my doctoral research on cellular states and biological systems. My work shed new light on dynamical gene networks, contributed to our understanding of the biochemical and genetic factors in bone development and aging, and pioneered new algorithms for high-dimensional single-cell datasets, with a statistical emphasis on data batch effects and cross-sample mapping of cell populations.
In my postdoctoral work at the Stanford Blood Center with Prof. Ed Engleman, I developed computational techniques and applied them to high-dimensional tissue images to understand the side effects of immunotherapy strategies.
To expand my knowledge as a statistician, I moved into the tech industry to learn how to solve business problems at scale by launching reliable machine learning capabilities. I joined Uber Technologies, a fast-moving, leading startup at the time, to learn high-quality, industry-standard data science and engineering practices from some of the most experienced professionals in the field.
At Uber, I optimized processes behind various business products as part of a small and agile data science team. I developed and built Uber’s first statistical data anomaly detection service to scale the company’s data quality monitoring. In addition, I developed Uber’s data-driven business outage detection and mitigation framework. I also worked with the internal hardware and infrastructure teams to model the company’s computing needs, validate hardware purchasing plans with scenario forecasting, and support the transition of data storage from on-premise data centers to the cloud. In total, these infrastructure projects improved the company’s cost efficiency by more than $50 million.
After building Uber’s data quality monitoring service, I wanted to develop the same product for all companies. No software company at the time offered an enterprise tool for data monitoring; most companies relied on building data monitoring in-house, which is very expensive. I joined Bigeye Data as the third employee and first data scientist to pursue the vision of enterprise data observability software for all. I developed and built machine learning and AI automation in the Bigeye software. Bigeye is used by organizations in many industries. Past and present users include Zoom, Instacart, Confluent, and the intelligence community through a partnership with In-Q-Tel.
I will continue to use my deep expertise in statistics and the sciences to contribute to groundbreaking advancements. I am committed to making a significant impact in information technology, the biosciences, and policy through data-driven innovations.