Data · 2026
Harvard Resume for Data Scientists & ML Engineers
Data science hiring splits into two tracks: applied (drive a business metric) and research (publish, train models, contribute to OSS). The Harvard format works for both. This recipe shows how to pivot the same skeleton (Education first, Experience with quantified impact, then projects or publications) toward either track.
What recruiters look for
- Modelling depth (architectures, frameworks, eval methodology)
- Causal/inferential rigor (A/B test design, confounders handled)
- Productionisation experience (not just notebook output)
- Publications, OSS contributions, Kaggle medals (for research-track)
- Business outcomes attributable to your models (for applied-track)
Required sections, in this order
Track selection
- Applied track: lead Experience with business-metric bullets; Publications optional
- Research track: surface a Publications section between Education and Experience; use a longer Skills section with frameworks listed
Skills section content
- Languages: Python, R, SQL, Scala (only what you're conversant in)
- ML frameworks: PyTorch, JAX, scikit-learn, XGBoost
- Infrastructure: Spark, Airflow, dbt, Databricks, Snowflake
- Cloud + MLOps: AWS/GCP, MLflow, Weights & Biases, BentoML
Sample in Harvard format

Strong vs weak bullets
Weak: Built a recommendation model for the marketplace
Strong: Built and shipped a two-tower recommendation model (PyTorch, BigQuery embeddings) replacing a collaborative-filtering baseline; A/B tested over 6 weeks across 8M users; +14.2% click-through and +$3.7M monthly GMV
Why it works: The bullet names the architecture (two-tower), the data infra (BigQuery embeddings), what it replaced, the A/B duration and sample, and dual metrics (CTR + GMV), so a senior reviewer can infer full-cycle ownership.
Weak: Authored a paper on transformer efficiency
Strong: Co-authored 'Efficient Sparse Attention for Long-Context Transformers' (NeurIPS 2025 main); reduced inference cost 38% at comparable accuracy on 4 standard benchmarks; cited by 12 papers in first 6 months
Why it works: The bullet gives the venue (NeurIPS main), the specific contribution (sparse attention), a measurable improvement (-38% cost), and downstream impact (citations).
Weak: Improved A/B testing infrastructure
Strong: Redesigned the experimentation platform's sequential testing module to support 4 simultaneous treatments per surface; cut median experiment duration from 28 to 11 days; adopted by 12 product pods
Why it works: The bullet names what changed (sequential testing for multi-treatment), the measurable speed-up (28 → 11 days), and adoption (12 pods).
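The difference between the weak and strong bullets above is largely the presence of numbers. As a rough self-check, the sketch below counts numeric tokens in a bullet. The function names, the regular expression, and the threshold of two metrics are assumptions made for illustration, not a rule that recruiters apply.

```python
import re

# Heuristic (an assumption of this sketch): a number with an optional sign,
# currency symbol, decimal part, and %, K, M, or B suffix counts as one piece
# of quantified evidence.
NUMBER = re.compile(r"[+-]?\$?\d+(?:[.,]\d+)*\s?(?:%|[KMB]\b)?")


def count_metrics(bullet: str) -> int:
    """Return how many numeric tokens appear in a résumé bullet."""
    return len(NUMBER.findall(bullet))


def is_quantified(bullet: str, minimum: int = 2) -> bool:
    """True when the bullet carries at least `minimum` numeric tokens."""
    return count_metrics(bullet) >= minimum


weak = "Built a recommendation model for the marketplace"
strong = (
    "Built and shipped a two-tower recommendation model (PyTorch, BigQuery "
    "embeddings) replacing a collaborative-filtering baseline; A/B tested over "
    "6 weeks across 8M users; +14.2% click-through and +$3.7M monthly GMV"
)

# Print the metric count and the pass/fail verdict for each bullet.
for label, bullet in (("weak", weak), ("strong", strong)):
    print(label, count_metrics(bullet), is_quantified(bullet))
```

The check is deliberately crude: it also counts years such as a conference date, and it cannot tell whether a number is meaningful. Treat a failing bullet as a prompt to add a metric, not a passing bullet as proof of quality.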
Mistakes specific to this role
- Listing every ML algorithm from a textbook. Pick 5-8 you've actually deployed.
- Omitting business outcomes for applied roles. Models don't matter without a metric they moved.
- Hiding Kaggle medals — if you have a gold or 2+ silvers, surface them under Awards.
- Overweighting an undergrad coursework section over your real Experience. Coursework is for new grads only.
Frequently asked
- Should I include LeetCode for DS roles?
- Most DS roles don't filter on LeetCode the way engineering roles do. Top-tier employers (Google Research, Anthropic) may quiz you in interviews, but it's not a résumé filter. Skip it unless your rating is in the top 1%.
- Where do Kaggle competitions go?
- Under Awards if you medalled in a Featured competition, under Projects otherwise. Don't list every competition you entered.
- How do I show I can productionise, not just prototype?
- Lead at least one bullet per role with a deploy verb (shipped, rolled out, productionised) + an uptime, latency, or throughput metric.