Data Scientist / ML Engineer Hub
Staff Data Scientist / ML Engineer Guide for Tech Companies (2026)
In short
A staff data scientist or ML engineer (8–12 years of experience) at a tech company in 2026 owns ML strategy and architecture for an org — not a team. FAANG-tier total comp clusters at $600k–$900k at L6/E6/IC6; AI-labs (Anthropic Staff MTS, OpenAI Staff MTS) sit at $1M–$2.5M+ on heavy equity, with public levels.fyi reports exceeding $4M during peak vesting cycles. Staff is where 'multiplier' becomes the entire job — your time goes to architecture decisions that affect 10+ engineers, technical strategy that affects company outcomes, and senior engineers who get promoted under your sponsorship. The work is no longer measured by what you build; it's measured by what your org ships because of you.
Key takeaways
- FAANG-tier staff DS / MLE total comp $600k–$900k at L6/E6/IC6 per levels.fyi 2026; Meta E6 DS $620k–$900k (levels.fyi/companies/facebook), Google L6 MLE $650k–$950k (levels.fyi/companies/google), Anthropic Staff MTS $1M–$2.5M+ (anthropic.com/careers + levels.fyi public reports).
- Staff scope expands beyond a domain to multi-team or org-level work: a foundation-model strategy across multiple recommendation surfaces, an eval-platform that other teams adopt, a distributed-training infrastructure used by the whole ML org.
- Strategic-articulation skill is non-negotiable at staff: write the one-pager that ties ML investment to company-strategic outcomes, brief the VP-of-Engineering and CTO without a manager mediating, name what's right and what's wrong about the org's ML roadmap.
- Mentorship is multiplied at staff: 2–3 senior engineers level up under your sponsorship per year; their promotion cases name your scoping and review as load-bearing. Without this, you do not reach principal.
- The 'staff-vs-management' fork is real at staff. Some engineers at this level move to engineering-management; others stay IC and move toward principal. Both paths exist at FAANG and AI-labs, with comp converging at the senior-staff / principal / senior-manager band.
What staff DS / MLEs actually do
Staff is the level where the work becomes nearly entirely about leverage. The senior signal — 'multi-project ownership' — is amplified at staff into 'multi-team or org-level architecture ownership.' Four behaviors define staff in 2026:
- Architecture for the org, not the team. A staff MLE at Netflix owns the recommendation foundation-model strategy across three recommendation surfaces (homepage, search, collections), each with its own senior MLE as the project lead. A staff MLE at Anthropic owns the eval-harness infrastructure that the entire research-engineering org uses. The architecture decisions you make are felt by 10+ engineers and 6+ months of roadmap.
- Strategic articulation at the executive level. Staff engineers write the one-pagers that brief the VP-of-Engineering and the CTO. They name what the org's next ML investment should be in terms of company-strategic outcomes — not just engineering metrics. They are quoted in board-level materials. They are invited into the technical-strategy meetings where investment decisions are made.
- Senior-engineer mentorship. Staff engineers mentor seniors, not juniors. The signal: 2–3 senior engineers per year level up to staff under your sponsorship; their promotion cases name your scoping and review as load-bearing. Without this multiplier, you do not reach principal — the bar is unambiguous at every FAANG and AI-lab.
- Cross-org influence. Other orgs come to you for ML-architecture decisions because your domain depth is recognized company-wide. You're invited into adjacent-team architecture reviews. Your technical documents are circulated org-wide. You give tech-talks at internal conferences. Your name shows up on patent applications, conference papers, or company engineering-blog posts.
What staff IS NOT at most large tech companies: 'tech-lead-manager' (TLM). TLM is a separate track — typically a senior engineer who's transitioning to people management while still doing some IC work. Staff IC is fully IC; the multiplier is technical, not management.
A worked staff-level project: foundation-model strategy at scale
The setup: a staff MLE at a streaming-platform company drives a foundation-model strategy across four recommendation surfaces over a 12-month roadmap:
- Q1: Strategy and architecture. Existing state: each of four recommendation surfaces (homepage, search, collections, kids-mode) has its own model and its own training pipeline. ML cost per surface is high; quality compounds slowly because each team iterates independently. Staff engineer's scope: design a unified foundation-model layer that all four surfaces can build on, while preserving each surface's ability to specialize. Three architectural options sketched: (a) shared embedding model, surface-specific heads, (b) shared embedding + surface-specific fine-tunes, (c) separate models with a shared eval-harness only. Recommendation: (a) for first phase, (b) as the year-2 target. One-pager circulated to VP-of-Engineering and CTO; sign-off received.
- Q2: Foundation-model layer. Build the shared embedding model: a fine-tuned Llama-4-8B with multi-task training across all four surfaces, optimized for embedding quality (recall@1000) rather than completion. Eval-harness designed: a 50k-example held-out set with surface-specific evals. The staff engineer scopes six sub-projects across the platform, infra, and surface teams: (1) the multi-task training pipeline (senior MLE on the platform team), (2) the embedding-serving infrastructure (senior infra-eng), (3)–(6) the four surface-specific evals (a senior MLE on each surface team). The staff engineer reviews each design doc, leaves dense feedback, and unblocks technical decisions across team boundaries.
- Q3: Surface-by-surface migration. Homepage and search migrate to the shared embedding first (lower-risk surfaces). A/B tests run; both surfaces show retention lifts between +1.4% and +2.1%, statistically significant at the 95% level. Collections and kids-mode migrate in Q4. The staff engineer writes the migration runbook, the rollback playbook, and the cost-impact one-pager. Two senior MLEs on the project go up for staff promotion in the next cycle; their promotion cases name the staff engineer's scoping and review as load-bearing.
- Q4: Year-2 roadmap and write-up. Tech-talk at the company's monthly all-hands. Internal blog post. Public conference talk at NeurIPS Industry track or similar. Staff engineer writes the year-2 strategy: surface-specific fine-tunes built on the shared embedding, with a stretch goal of unified ranking across surfaces. Two of the senior MLEs become tech-leads on year-2 sub-projects.
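Option (a) from the Q1 sketch, a shared embedding trunk with surface-specific heads, has a simple shape. Below is a toy numpy forward pass; the dimensions, surface names, and random weights are all illustrative, not anything from an actual production system:

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_EMB = 64, 16
SURFACES = ["homepage", "search", "collections", "kids"]

# Shared trunk: one projection that every surface builds on.
W_shared = rng.normal(scale=0.1, size=(D_IN, D_EMB))
# Surface-specific heads: small per-surface adapters on top of the trunk.
heads = {s: rng.normal(scale=0.1, size=(D_EMB, D_EMB)) for s in SURFACES}

def embed(x, surface):
    """Shared-trunk embedding specialized by a per-surface head."""
    h = np.tanh(x @ W_shared)   # shared representation, improved by every surface's data
    e = h @ heads[surface]      # surface-specific specialization
    # Unit-norm so downstream retrieval can use plain dot-product similarity.
    return e / np.linalg.norm(e, axis=-1, keepdims=True)

x = rng.normal(size=(5, D_IN))
emb = embed(x, "homepage")
print(emb.shape)  # (5, 16)
```

The leverage argument lives in the shapes: multi-task gradients flow through the one shared `W_shared`, while each surface keeps a head it can iterate on independently.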
What made this staff scope: the engineer designed an architecture that affected 10+ engineers and 12+ months of roadmap, articulated the strategic outcome at the VP / CTO level, mentored two senior MLEs through promotion, and unblocked decisions across team boundaries. The same problem at senior level would have been one engineer leading the foundation-model layer for a single surface, with a staff engineer overseeing.
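The Q3 significance claim corresponds to a standard two-proportion z-interval on the retention difference. A minimal sketch; the user counts and rates below are invented for illustration, not the example's actual traffic:

```python
import math

def lift_ci(conv_c, n_c, conv_t, n_t, z=1.96):
    """95% CI for the absolute difference in retention rates between
    treatment and control (normal approximation to two proportions)."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    diff = p_t - p_c
    return diff - z * se, diff + z * se

# Hypothetical counts: 2M users per arm, 60.0% vs 61.0% retained.
lo, hi = lift_ci(1_200_000, 2_000_000, 1_220_000, 2_000_000)
print(f"lift: {lo:+.4%} to {hi:+.4%}")  # interval excluding 0 => significant at 95%
```

At these traffic volumes the standard error is tiny, which is why a ~1-point absolute lift clears significance comfortably; the same lift on 20k users per arm would not.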
The staff interview: what gets tested
Staff interview rounds at FAANG-tier companies and AI-labs in 2026 are weighted heavily toward architecture and leadership and lightly toward coding. Typical loop: 1 phone screen + 5–7 onsite rounds (1 ML coding — light; 2 ML system design — heavy; 1–2 cross-functional / leadership / 'tell me about a time you led a multi-team initiative'; 1 stats / eval / research-fluency; 1 hiring-committee or team-match). Staff-specific weighting:
- Architecture-level system design. 'Design a [foundation-model platform / multi-tenant ML serving layer / company-wide eval infrastructure / RLHF training pipeline] for a [company-shaped scenario at FAANG / AI-lab scale].' The bar: 60-min round where you scope, design, articulate trade-offs, and defend against an experienced staff/principal interviewer. Hello Interview's staff-level ML system design walkthroughs (hellointerview.com/learn) cover the canonical rubric.
- Leadership and cross-team coordination. 'Tell me about a project you led that involved 10+ engineers across multiple teams.' 'Walk me through a technical decision you made that was unpopular, and how you navigated it.' 'Describe a time you mentored a senior engineer through a difficult promotion case.' Staff candidates without these stories fail this round at every FAANG and every AI-lab.
- Strategic articulation. 'How would you propose your team invest the next 4 quarters of ML headcount?' 'What's wrong with the way most companies do RLHF / experimentation / model evaluation today?' Staff candidates are graded on the quality of their opinion, not just the absence of bad opinions.
- Research / domain depth. 'What's the most important paper in your domain in the last 12 months and why?' 'Where is the field of [recommendations / LLM eval / multi-modal foundation models] going in the next 24 months?' Staff candidates are expected to articulate technical strategy, not just execute on it.
The staff interview is the level where 'cannot articulate trade-offs at architectural depth' is an immediate disqualifier. Coming in with three rehearsed stories is not enough; the interviewer will probe one or two layers down on each. The candidates who clear staff interviews tend to be the ones who've actually done staff-shape work (multi-team architecture leadership) at their current company, and can speak from lived experience.
Compensation: the real bands at staff
Total comp at staff FAANG-tier and AI-labs in 2026 (US, per levels.fyi):
| Company | Level | Base | Total comp |
|---|---|---|---|
| Meta DS | E6 | $240k–$300k | $620k–$900k |
| Google MLE | L6 | $260k–$330k | $650k–$950k |
| Netflix MLE | L6 | $550k–$700k | $700k–$1.1M (single-band) |
| Anthropic Staff MTS | staff | $450k–$600k | $1M–$2.5M+ |
| OpenAI Staff MTS | staff | $500k–$700k | $1.4M–$3.5M+ (heavy PPU) |
| Databricks MLE | L6 | $300k–$390k | $650k–$1.1M |
| Scale AI | staff MLE | $370k–$470k | $800k–$1.5M+ |
The structural fact at staff: a staff MTS at Anthropic or OpenAI commonly clears $2M+ in equity-heavy total comp. OpenAI's PPU grants have produced reported staff-MTS total comp in the $3M–$5M range during peak vesting, per public levels.fyi reports. A risk-adjusted comparison has to account for AI-lab equity concentration versus FAANG diversification across years and stock cycles. The senior-staff band at most companies adds another $200k–$400k of total comp on top of the staff band.
Frequently asked questions
- Should I take the engineering-manager fork at staff or stay IC?
- The decision is real at staff. The two tracks pay similarly at most companies (senior manager ~ staff IC; director ~ principal IC). The work shape is fundamentally different: engineering management is people leadership, hiring, performance, and headcount strategy; IC staff is technical leadership, architecture, and mentorship via technical review. The right question: what energizes you on a Friday afternoon when no one is watching — fixing a thorny ML bug, or coaching a struggling engineer through a hard quarter? Pick the track that matches; trying to do both is the failure mode.
- How important is publishing at staff at AI labs?
- Effectively required for research-track AI-lab staff MTS. Anthropic, OpenAI, DeepMind, and Cohere all expect staff MTS on the research track to publish at NeurIPS / ICML / ICLR / ACL or to author public engineering-blog posts of equivalent technical depth. The pattern: 1–2 published papers per year, frequently as senior author with junior research engineers as first authors. Publication is part of the multiplier — it scales the engineer's impact beyond the team.
- What's the difference between staff and principal?
- Scope and time-horizon. Staff owns architecture for an org over a 12–18 month horizon. Principal owns architecture for the company over a 24–48 month horizon. Principal engineers brief the C-suite directly on technical strategy; their decisions affect the company's competitive position. Staff is the level where you're recognized as a technical leader; principal is the level where you're recognized as a technical leader whose judgment defines what the company does. Promotion takes 2–5 years from staff at most large tech companies.
- Do I need to be famous in the ML community to reach staff?
- Helpful, not required. External visibility (Twitter, conference talks, technical blog posts, open-source contributions) is a senior-staff differentiator at AI-labs and at companies where ML is core to the product (Anthropic, OpenAI, Databricks, Scale AI). At FAANG production-ML, internal impact (lift on the north-star metric, multi-team architecture leadership) is the dominant signal; external visibility is a tiebreaker.
- How much should I be coding at staff?
- Less than at senior, but not zero. The benchmark at most large tech companies: 30–50% of a staff engineer's time is hands-on technical work — prototyping, code-review, architecture documents with code sketches, debugging hard production issues. The other 50–70% is meetings, mentorship, technical strategy, and writing. Staff engineers who don't code at all stall — they lose technical credibility with the engineers they're meant to multiply.
- What's the right ML tooling investment at staff?
- Build leverage, not vanity. Staff engineers who invest in shared eval harnesses, internal feature stores, training-pipeline infrastructure, or experiment-tracking platforms multiply the entire org's velocity. Vanity investments (a custom training framework that no other team adopts) waste the staff engineer's time. The signal: 6 months after you ship the infrastructure, are 3+ teams using it without you in the loop? If yes, it's leverage. If no, it's vanity.
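A shared eval harness of the kind described above often starts as little more than an embedding recall@k metric over a held-out set. A minimal numpy sketch, with toy shapes and synthetic data standing in for a real 50k-example set:

```python
import numpy as np

def recall_at_k(query_emb, item_emb, true_item_ids, k=1000):
    """Fraction of queries whose held-out positive item lands in the
    top-k candidates by dot-product similarity."""
    scores = query_emb @ item_emb.T                            # (n_queries, n_items)
    topk = np.argpartition(-scores, kth=k - 1, axis=1)[:, :k]  # unordered top-k per query
    hits = np.any(topk == np.asarray(true_item_ids)[:, None], axis=1)
    return float(hits.mean())

# Synthetic stand-in: 100 queries against 5,000 candidate items.
rng = np.random.default_rng(0)
queries = rng.normal(size=(100, 16))
items = rng.normal(size=(5000, 16))
positives = rng.integers(0, 5000, size=100)
print(recall_at_k(queries, items, positives, k=1000))
```

The leverage test from the answer above applies directly: the harness earns its keep when surface teams run it against their own embeddings without the author in the loop.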
Sources
- levels.fyi — staff DS / MLE comp comparison.
- Chip Huyen — Designing Machine Learning Systems / ML interviews book.
- Stanford CS329S — Machine Learning Systems Design (canonical staff-level reference).
- Susan Athey — Stanford / NBER causal-inference research (staff+ analytics-DS reference).
- Anthropic Research — staff MTS publications and methodology.
- OpenAI Research — staff MTS publications and methodology.
- Google DeepMind — staff research-engineer publications.
About the author. Blake Crosley founded ResumeGeni and writes about data science, machine learning, hiring technology, and ATS optimization. More writing at blakecrosley.com.