Hi! I am

Nikhil Bisht

I am a Data Scientist

About Me

I am a PhD Candidate in Physics with a focus on applied data science and machine learning. My work involves building and evaluating models on large-scale datasets, quantifying uncertainty, and translating complex quantitative problems into actionable insights. I am currently pursuing industry data science roles.

Physics Simulations

Stochastic 3D Multi-Field Systems

HPC / ETL

Distributed Pipelines

Deep Learning

ConvGRU Architectures

Analysis

Forecasting Turbulence

My work so far bridges the gap between theoretical physics and production-grade software engineering. In my PhD research, I build end-to-end ML pipelines for terabyte-scale 3D simulations, optimize code for HPC environments, and develop novel architectures to solve complex spatiotemporal problems. Whether it's forecasting molecular cloud collapse or optimizing distributed ETL pipelines, I thrive where data is massive, noisy, and physically constrained.

For my current work, I developed a multi-field, fully 3D ConvGRU-UNet architecture to predict turbulent flows in star-forming regions. The model improved predictive accuracy by 28% while enforcing statistical and physical constraints to ensure model validity. Learn more about the project here.

Fig 1. My Hybrid ConvGRU-UNet Architecture

Detailed Model Architecture Diagram

My Timeline so far

Nov 2022 - Present

Data Scientist, Graduate Research Fellow

Dept. of Physics, Florida State University

Architecting 3D spatiotemporal forecasting models (ConvGRU-UNet). Engineered distributed ETL pipelines for HDF5/Parquet data and reduced training time from weeks to days via multi-GPU DDP.

PyTorch • HPC • 3D Vision

Jul 2020 - Aug 2022

Applied Physics & ML Engineer

BITS Pilani, Goa

Designed Bayesian Neural Networks (BNNs) for multi-target probabilistic regression, successfully modeling high-dimensional non-linear relationships with quantified uncertainty intervals.

Python • Nuclear Astrophysics • Bayesian Statistics

Mar 2018 - Jul 2022

Lead Engineer (Computer Science Vertical)

Project Radio Telescope

Founded and led a cross-functional engineering team of 10+ to build a full-stack instrumentation facility. Developed Python-based automated signal processing pipelines for real-time spectral analysis and noise reduction.

Arduino • Astronomy • Python

Aug 2021 - Jul 2022

Data Analyst

Max Planck Institute for Astronomy, Heidelberg

Investigated a statistically significant sample of Milky Way/M31 analogs in TNG50, one of a set of large cosmological magnetohydrodynamic galaxy simulations, to quantify stellar radial migration.

Python • Time Series Analysis • Inferential Statistics

Mar 2019 - Jul 2021

Coordinator

Physics Association, BITS Pilani, Goa

Planned the logistics for meetings, conferences, seminars, lectures, and workshops. Served as the central point of contact and liaison between students, faculty and staff.

Communication • Organizational Management

Mar 2018 - Apr 2019

Secretary

Students for Exploration & Development of Space, Celestia

Worked towards promoting space exploration through projects, conferences, and career development for students.

Scientific Communication • Project Management • Organizational


Applied Data Science & Modeling Projects

SimSuite

Forecasting Complex System Evolution with Machine Learning

Built and evaluated deep learning models on large-scale simulation data to forecast the evolution of complex systems under uncertainty.

Project Importance: Scalable model evaluation, uncertainty-aware forecasting, and working with multi-terabyte, high-dimensional datasets.

Read more here

Python • PyTorch • Large-Scale Data • Parquet

SHAP Summary

Panhandle Health Network: XGBoost-Based Appointment No-Show Predictor

End-to-end predictive system (XGBoost/Streamlit) for healthcare appointments that identifies high-risk patients and optimizes intervention thresholds to project ~$193K in revenue recovery.

Project Importance: Translating model performance into financial ROI, optimizing operational decision-making, and deploying interpretable AI tools for non-technical stakeholders.

Read more here

Python • XGBoost • Streamlit • SHAP • Cost-Benefit Analysis

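To illustrate what optimizing an intervention threshold can look like, here is a minimal sketch and not the project's actual code. It assumes each outreach has a fixed cost and each correctly flagged no-show recovers a fixed amount. The function names, the dollar values, and the toy data are all hypothetical.

```python
def expected_net_value(y_true, y_prob, threshold,
                       recovered_per_catch, cost_per_outreach):
    """Net value of contacting every patient whose predicted risk >= threshold."""
    net = 0.0
    for actual_no_show, risk in zip(y_true, y_prob):
        if risk >= threshold:
            # Every flagged patient costs one outreach (assumed flat cost).
            net -= cost_per_outreach
            if actual_no_show:
                # Assumed value recovered when outreach reaches a true no-show.
                net += recovered_per_catch
    return net


def best_threshold(y_true, y_prob, recovered_per_catch, cost_per_outreach):
    """Scan thresholds 0.01..0.99 and return the (threshold, net value) maximizing net value."""
    candidates = [i / 100 for i in range(1, 100)]
    best = max(
        candidates,
        key=lambda t: expected_net_value(
            y_true, y_prob, t, recovered_per_catch, cost_per_outreach
        ),
    )
    return best, expected_net_value(
        y_true, y_prob, best, recovered_per_catch, cost_per_outreach
    )


# Toy validation set: 1 = patient missed the appointment (hypothetical data).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.35, 0.6, 0.2, 0.1]

threshold, net = best_threshold(
    y_true, y_prob, recovered_per_catch=100.0, cost_per_outreach=30.0
)
print(f"threshold={threshold:.2f}, net value=${net:.0f}")
```

Choosing the threshold by net value rather than by accuracy keeps the decision tied to the cost-benefit framing: a lower threshold catches more no-shows but pays for more unnecessary outreach.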
BNNViz

Probabilistic Regression and Uncertainty Quantification

Applied Bayesian machine learning methods to perform multi-target regression on noisy, high-dimensional data.

Project Importance: Uncertainty quantification, model interpretability, and statistical reasoning for decision-making under uncertainty.

Read more here

Python • Bayesian Modeling • Probabilistic ML

TNG50galaxy

Statistical Analysis of Large-Scale Simulation Populations

Performed statistical analysis across hundreds of simulated systems to quantify population-level trends and variability.

Project Importance: Large-sample analysis, hypothesis testing, and extracting insights from noisy observational data.

Inferential Statistics • Data Analysis • Python

Technical Arsenal

An interactive map of my production capabilities.

Something caught your eye?

Start the conversation!