Machine Learning Engineer · CMU

Collin Shen.

Building intelligent systems at Carnegie Mellon.
Predictive modeling, NLP, and real-time ML.

01

About Me

CMU grad student, ML engineer, content creator — and whatever else I feel like being.

Hey, I'm Collin — a Machine Learning grad student at Carnegie Mellon University originally from Houston, TX. I finished my B.S. in Computer Science at Stevens Institute of Technology in just 3 years, and now I'm pursuing my M.S. at CMU with a focus on ML and AI systems. During the day I'm building models and pipelines; outside of that I'm usually at the gym, watching something, or creating content online.

My Interests

Favorite TV Shows

Coming soon...

Music & Artists

Coming soon...

Hobbies

Coming soon...

Idols

Coming soon...

02

My Coding Career

Machine Learning Engineer focused on predictive modeling, NLP, and scalable systems. Currently pursuing my M.S. at Carnegie Mellon University.

Timeline

Completed degree in 3 years (accelerated). Dean's List ×6.

Minor in Data Science. Relevant coursework: Machine/Deep Learning, Generative AI, LLMs, Embedded Systems, Operating Systems, Cloud Computing, Concurrent Programming, Optimization.

Dean's List ×6 · 3-Year Accelerated · Data Science Minor

Master of Computer Engineering with a focus in Machine Learning.

Machine Learning · Deep Learning · Generative AI · LLMs · Embedded Systems · Cloud
  • Developed web-based ticket management system with tree visualization showing organizational hierarchy and real-time employee workload
  • Developed random forest model trained on historical ticket data to predict resolution time based on severity, type, and employee workload
  • Integrated internal ticket database with Django/PostgreSQL backend, enabling predictive assignment and tracking of software defect resolution
Django · PostgreSQL · Random Forest · Python · Data Visualization
  • Trained XGBoost gradient-boosted model to predict high-dimensional locomotive system outputs, enabling scalable offline inference and design exploration
  • Built feature engineering and cross-validation pipeline using Python, scikit-learn, and Pandas to ensure model robustness across configurations
  • Automated prediction of simulation outputs across multiple configurations, reducing engineering time while preserving model accuracy
XGBoost · Python · scikit-learn · Pandas · Predictive Modeling

Upcoming ML Dev.

Skills

Machine Learning

Gradient Boosting · Transformers / BERT · Fine-tuning / LoRA · NLP · Supervised Learning · Surrogate Modeling · Prompt Engineering · Feature Engineering

Systems & Software

Python · C / C++ · SQL · Linux Kernel Programming · Concurrency · Embedded Systems · Django

Tools & Frameworks

PyTorch · scikit-learn · HuggingFace · NumPy / Pandas · Docker · PostgreSQL · Git · Jupyter

Research & Projects

Multimodal Text-to-Audio Generation

Carnegie Mellon University

Fine-tuned Meta MusicGen with LoRA adapters to support prompts that combine genre and environmental audio, using 3,000+ paired MusicCaps/FSD50K examples. Built the training pipeline and feature preprocessing to blend musical and ambient audio for controllable multimodal generation.

PyTorch · LoRA · MusicGen · Generative AI

LLMs in Healthcare

Carnegie Mellon University

Built a privacy-preserving de-identification pipeline that uses fine-tuned BERT-based NER to pseudonymize sensitive patient attributes. Processed 45,000+ clinical notes (MIMIC-III) for secure data sharing and downstream ML training while maintaining HIPAA compliance.

BERT · NER · HuggingFace · NLP · HIPAA

Embedded Real-Time Systems

Carnegie Mellon University

Implemented kernel-level modules and synchronization primitives to coordinate periodic tasks under real-time constraints for deterministic edge ML systems. Designed and tested concurrency mechanisms and periodic scheduling for predictable embedded task execution.

C · Linux Kernel · RTOS · Concurrency
03

My Content

Behind the code there's a whole other life. Follow along.

Weight Loss Journey 🏃

Documenting my journey publicly — raw, honest, no filter.

Instagram

Follow Me

Current Series

Ongoing

Weight Loss Journey

Weekly updates — workouts, meals, progress photos, and honest reflections. Posted every Sunday.

Follow along

04

My Contact

Whether you're a recruiter, a follower, or just want to say hi, reach out.