Senior Data Scientist · Bay Area, CA

Building intelligent
data systems that
drive decisions.

AI-native analytics engineer with 4+ years shipping production ML pipelines, clinical NLP, and self-serve analytics platforms — currently at Johnson & Johnson.

4+ Years Experience
3 Companies
18% Stock-out Reduction at J&J
40% Faster Time-to-Insight
01

About

I'm a Senior Data Scientist who lives at the intersection of machine learning engineering, clinical data, and scalable analytics infrastructure. My work at Johnson & Johnson involves building AI-driven forecasting systems for patient adherence, deploying BERT-based NLP over clinical notes, and delivering BI dashboards that translate model outputs into decisions executives can act on.

Before J&J, I led demand forecasting and inventory optimization at Orion Technolab — end-to-end, from ETL pipelines through LSTM models to production APIs. I hold an MS in Data Analytics from San Jose State University (Dec 2024) and am actively publishing research on privacy-preserving ML for healthcare.

I'm a daily user of AI coding tools (Claude, ChatGPT, Cursor) and am upskilling on RAG architectures, agentic frameworks, and LLM pipelines — looking for roles where clinical domain expertise meets the frontier of GenAI.

Location: Bay Area, CA (Los Altos / Palo Alto)
Phone: 408-581-4706
Research: Google Scholar
Visa: H-1B sponsorship required

HIGH5 Strengths

1 Coach
2 Self-Believer
3 Storyteller
4 Commander
5 Deliverer

Independently certified · Oct 2025. Coach + Storyteller + Deliverer = someone who builds reliable systems and communicates their impact clearly to any audience.

02

Experience

Feb 2025 — Present
Johnson & Johnson
USA
Data Analyst / Senior Data Scientist
Built production ML models for clinical risk prediction and patient outcome analysis using Python, TensorFlow, and scikit-learn. Deployed BERT + Hugging Face NLP pipelines over unstructured clinical notes and regulatory documents, reducing manual review by 30%. Owned end-to-end data governance with GitHub CI/CD, Docker, and Kubernetes — GxP-aligned. Designed AWS QuickSight, Tableau, and Power BI dashboards tracking operational KPIs and model performance for cross-functional teams including Product, Operations, and Research.
↑15% prediction accuracy · ↓30% manual review · 30% faster decisions · 3× dev speed via AI tools · GxP / regulatory compliant
Feb 2024 — Jun 2024
Quantiphi
Massachusetts, USA
Data Analyst
Built scalable ETL pipelines and data transformation workflows using Python (Pandas, NumPy), SQL, and Apache Airflow. Developed self-serve Tableau dashboards for customer support and operational metrics, eliminating 50+ monthly ad-hoc requests. Conducted A/B testing and statistical analysis to evaluate RNN/LSTM forecasting models for time-series predictions.
35% pipeline efficiency · 22% forecast accuracy · Eliminated 50+ monthly ad-hoc requests
May 2020 — Dec 2022
Orion Technolab
India
Data Analyst / Data Scientist
Led end-to-end demand forecasting and inventory optimization — from ETL (Kafka, Airflow, SQL) through LSTM/ARIMA modeling to production REST APIs. Built NLP pipelines with SpaCy and NLTK for sentiment analysis over 1M+ customer reviews. Trained CNN and GAN models for product image defect detection. Deployed all systems with Flask/FastAPI, Docker, CI/CD, and CloudWatch monitoring.
20% forecast accuracy · ↓15% stockouts · 28% defect detection · 35% faster deployments · 40% less data prep time
03

Key Projects

🧬
Healthcare AI
Patient Adherence & Demand Forecasting
AI-driven system to predict patient adherence and optimize therapy demand across multiple regions for J&J. Integrated 250M+ structured records and 30M+ clinical text entries. Built BERT NLP for adverse event detection and deployed via AWS SageMaker with drift monitoring.
18% Stock-out reduction
40% Faster insights
Python · TensorFlow · BERT · AWS Glue · Snowflake · SageMaker · Tableau · Power BI
📦
Retail AI
Intelligent Demand Forecasting & Inventory Optimization
End-to-end system replacing manual spreadsheet forecasting for retail clients. Built automated ETL, compared ARIMA/Prophet/XGBoost/LSTM models, and integrated NLP sentiment features from 1M+ customer reviews. Deployed as REST APIs with serverless infrastructure and live monitoring.
20% Forecast accuracy
15% Stockout reduction
LSTM · XGBoost · SpaCy · AWS Glue · FastAPI · Docker · CloudWatch
🔬
Research
Healthcare ML Research & Publication
Active peer-reviewed research on privacy-preserving ML for behavioral health signal detection (JMIR AI) and LSTM-GPT-4 integration for biomedical signal classification. Demonstrates frontier-level clinical AI work beyond production engineering.
2+ Publications
JMIR Target journal
GPT-4 · LSTM · Privacy-ML · Biomedical NLP · Signal Classification
🏭
Computer Vision
Product Quality Defect Detection
CNN and GAN-based visual inspection system to classify manufacturing defects. Used transfer learning (ResNet, EfficientNet) and GAN-generated synthetic samples to overcome limited defective image data. Deployed as a lightweight inference API for QC teams.
28% Detection improvement
Real-time API inference
CNN · GAN · ResNet · TensorFlow · Flask · Docker
04

Skills & Stack

Languages
SQL (Expert) · Python · R · PySpark · Pandas / NumPy
ML & Deep Learning
TensorFlow · PyTorch · Scikit-learn · Keras · XGBoost · LSTM · MLflow
NLP & LLMs
BERT / Hugging Face Transformers · SpaCy / NLTK · GPT-4 · RAG · Agentic frameworks
Cloud & Data Platforms
AWS S3 / Glue · Redshift · SageMaker · Lambda · Snowflake · Databricks · BigQuery · Azure
Data Engineering
dbt · Apache Airflow · Apache Kafka · Spark · ETL / ELT · Data Modeling
BI & Visualization
Tableau · Power BI · AWS QuickSight · Looker · Sigma · Amplitude
MLOps & DevOps
Docker · Kubernetes · CI/CD · GitHub Actions · FastAPI · Flask · CloudWatch
Databases
MySQL · PostgreSQL · MongoDB · Redis · Redshift
05

Education & Certifications

MS in Data Analytics
San Jose State University, CA
Dec 2024
BE in Computer Science
Anna University, Chennai
May 2021

Certifications

Microsoft Fabric Analytics Engineer Associate
AWS Data Engineer Associate
Azure Fundamentals

Research

JMIR AI · Active
Privacy-preserving ML for behavioral health signal detection
Biomedical Signal Classification
LSTM + GPT-4 integration for biomedical signal classification

Let's build something
meaningful together.

Open to Data Analyst, Data Scientist, Analytics AI Engineer, and ML Engineering roles. H-1B sponsorship required.