Security Engineer at Anthropic (2026): Levels, Comp, Culture, Interview
In short
A Security Engineer at Anthropic in 2026 protects Claude; the model weights, the training pipeline, and the API and Claude.ai surfaces; against adversaries that include state actors and prompt-injection-as-attack-vector. The work spans classical product security, Trust and Safety, and the Frontier Red Team adversarial-research function. The Responsible Scaling Policy on anthropic.com defines deployment-time safety thresholds (ASL-2, ASL-3) Security owns. Compensation belongs on levels.fyi per-company filters; the role is heavily research-shaped.
Key takeaways
- Security Engineering at Anthropic spans three surfaces classical product-security teams do not all touch in one job: AppSec and infrastructure security on Claude.ai, the API, and Claude Code; Trust and Safety security against abuse and prompt-injection attacks; and frontier-AI safety security including model-weight protection and training-pipeline integrity.
- The Frontier Red Team is a publicly named, research-publishing function; papers and posts on anthropic.com/research evaluate jailbreaks, prompt-injection chains, and agentic-misuse risks against Claude itself. Security Engineers on or adjacent to this surface produce written research, not just internal-only findings.
- Model weights are the highest-stakes asset Anthropic protects; the threat model includes nation-state actors interested in frontier model extraction and sophisticated insider risk. Anthropic publishes its security posture in the Trust Center on anthropic.com and in the Responsible Scaling Policy.
- The Responsible Scaling Policy at anthropic.com/responsible-scaling defines AI Safety Level (ASL) thresholds; ASL-2 is the current operating level for deployed Claude models with ASL-3 evaluations gating future more-capable releases. Security Engineers own the deployment-time evaluations that ASL transitions depend on.
- Interviews lean toward AI-safety-aware AppSec rather than pure red-team or pure SRE-security: an AI safety round (NIST AI Risk Management Framework, OWASP LLM Top 10, prompt-injection threat modeling), an AppSec round, a distributed-systems coding round in Python and TypeScript, and a behavioral round with a frontier-AI-specific scenario such as model-misuse incident response.
- Compensation belongs on levels.fyi per-company filters at levels.fyi/companies/anthropic; single-number claims for senior Security Engineer total comp are unreliable. Anthropic anchors the upper band of frontier-AI lab comp; equity is a meaningful component for a private company at this stage.
- BLS projects 29 percent employment growth for Information Security Analysts (SOC 15-1212) from 2024 to 2034 with about 16,000 annual openings and a May 2024 median of $124,910; Anthropic compensation sits well above this national baseline given the frontier-AI lab positioning and equity.
Three Security Engineer surfaces at Anthropic in 2026
Anthropic publishes Security Engineer roles on anthropic.com/jobs across three broad surface areas, often blended within a single role description rather than partitioned cleanly the way they are at older tech companies.
Trust and Safety Security. Protecting the Claude.ai consumer product, the API, and Claude Code against abuse, account-takeover, platform-misuse, and prompt-injection-as-attack-vector; the surface where an adversarial prompt is a security event rather than a product-quality issue. This is closest to classical AppSec at a SaaS company, but with the Trust-and-Safety AI-misuse threat model layered on top.
Frontier Red Team. A publicly named adversarial-research function. The Frontier Red Team evaluates Claude itself against jailbreaks, prompt-injection chains, and agentic-misuse risks, and publishes findings on anthropic.com/research. Security Engineers on or adjacent to this surface produce written research; the deliverable looks more like a paper than a Jira ticket.
Frontier-AI safety security. Model-weight protection, training-pipeline integrity, and supply-chain protection for models. The threat model is unusual; state actors interested in frontier model weights, sophisticated insider risk with model-extraction motives, and supply-chain attacks on training infrastructure. The Trust Center publishes a portion of the security posture publicly.
The Responsible Scaling Policy and Security Engineering
The Responsible Scaling Policy (RSP) at anthropic.com/responsible-scaling is Anthropic's public commitment to specific evaluation thresholds before deploying more-capable models. The policy defines AI Safety Levels (ASL); ASL-2 is the current operating level for deployed Claude models, with ASL-3 evaluations gating future releases that could meaningfully uplift capability in catastrophic-risk domains.
Security Engineering is one of the disciplines that owns the deployment-time evaluations RSP transitions depend on: the cybersecurity uplift evaluations that test whether a frontier model materially advances offensive-cyber capability, the biosecurity and CBRN-uplift evaluations Anthropic runs in partnership with external experts, and the model-weight protection regime that gates higher ASL levels. Model cards published on anthropic.com/news at each Claude release summarize a portion of these evaluations publicly.
The interview-loop signal here is fluency in the NIST AI Risk Management Framework; csrc.nist.gov/projects/ai-risk-management-framework publishes the AI RMF and the companion AI RMF Generative AI Profile (NIST AI 600-1). The OWASP LLM Top 10 at owasp.org is the canonical AppSec-flavored vocabulary for LLM-application-layer risk: Prompt Injection (LLM01), Sensitive Information Disclosure (LLM02), Supply Chain (LLM03), and so on through LLM10. Senior+ candidates speak both fluently.
Interview loop shape; what to expect
The Anthropic Security Engineer interview loop in 2026 looks AI-safety-aware AppSec rather than pure red-team or pure SRE-security. Public role pages on anthropic.com/jobs filtered to Security describe loops with these recurring rounds.
AI safety round. Prompt-injection threat modeling against an agentic Claude deployment (MCP server, tool-use, browser-using agent). Expect questions grounded in OWASP LLM Top 10; for example, a take-home or whiteboard scenario where an untrusted document is loaded into context and the candidate must enumerate the prompt-injection attack surface, the data-exfiltration pathways, and the mitigations that hold up versus the ones that look like security theater. NIST AI RMF Map / Measure / Manage / Govern functions are the structuring framework.
AppSec round. A classical application-security interview against an Anthropic-shaped service: an authenticated API endpoint, a sandboxed code-execution environment, a multi-tenant data flow. OWASP Top 10 (Broken Access Control, Injection, Insecure Design, Identification and Authentication Failures) is the vocabulary; OWASP ASVS is the verification rubric senior+ candidates are expected to know exists.
Distributed-systems coding round. Python and TypeScript heavy. The exercise is production-shaped; write the input-validation layer, write the authorization middleware, write the safe-deserialization parser. Code quality matters; the rubric is closer to a software-engineering coding round than to a CTF.
Behavioral round with a frontier-AI-specific scenario. Model-misuse incident response, RSP-evaluation friction with a product team, prioritizing weight-protection investment versus product-security investment under finite headcount. The rubric tests judgment; there is no clean answer; the bar is reasoning that takes the frontier-AI threat model seriously without abandoning shipping discipline.
Compensation, level structure, and how to apply
Compensation. The accurate anchor is levels.fyi/companies/anthropic with the role filter applied. Single-number claims for senior Security Engineer total comp at Anthropic are unreliable; the per-company filter is the only honest source. Anthropic anchors the upper band of frontier-AI lab compensation, and as a private company at this stage equity is a meaningful component of total comp; the equity grant structure is not public and varies by level and offer.
The broader-industry baseline from BLS Occupational Outlook Handbook; Information Security Analysts (SOC 15-1212) reports a May 2024 median of $124,910 with 29 percent projected employment growth from 2024 to 2034 and about 16,000 annual openings. Anthropic compensation sits well above this national baseline.
Level structure. Public role pages on anthropic.com/jobs do not publish a fixed level ladder the way Google's L3-L8 system is documented externally. Roles are titled along Member of Technical Staff, Senior Member of Technical Staff, and Manager-of-Technical-Staff lines on the engineering side, with Security Engineer titling common on the security-specific roles. Specific level-name claims beyond what current public role pages show would be speculation.
How to apply. The hiring surface is anthropic.com/jobs filtered to Security. Roles rotate; a recurring posting cadence covers Trust and Safety Security, Security Engineering on the Infrastructure side, and the Frontier Red Team. Public artifacts that strengthen an application: OWASP LLM Top 10 fluency, NIST AI RMF familiarity, prompt-injection or jailbreak research published under your own name, and CVE or bug-bounty work that demonstrates real adversarial craft.
What is genuinely different about Security at a frontier-AI lab
The threat model is unusual in three ways most tech-company SecEng roles do not encounter together. Naming them honestly is the senior-bar signal at interview.
First, model weights are the asset. Most product-security programs protect customer data; Anthropic also protects the model weights themselves as a top-tier asset, with a threat model that includes nation-state actors. The Trust Center at trust.anthropic.com publishes a portion of the security posture; the underlying weight-protection regime ratchets at higher ASL levels per the Responsible Scaling Policy.
Second, research is part of the job. The Frontier Red Team publishes papers on anthropic.com/research. Even outside the Frontier Red Team, Security Engineers at Anthropic produce written artifacts at an unusually high rate; model cards, RSP evaluation summaries, security disclosures, prompt-injection research. Comfort with technical writing is part of the senior bar.
Third, the AI-safety frame is real, not decorative. The OWASP LLM Top 10 and the NIST AI Risk Management Framework are working vocabulary, not check-the-box references. Senior+ candidates can talk through prompt-injection threat modeling on an agentic deployment, evaluate a jailbreak research paper on its merits, and reason about RSP-evaluation scope without dismissing the framing as marketing.
Frequently asked questions
- What is the Responsible Scaling Policy and does it actually shape engineering work?
- Yes. The Responsible Scaling Policy at anthropic.com/responsible-scaling-policy is Anthropic's public commitment to specific evaluation thresholds before deploying more-capable models. It defines AI Safety Levels; ASL-2 is the current operating level for deployed Claude models, with ASL-3 evaluations gating future releases that could meaningfully uplift capability in catastrophic-risk domains (cybersecurity uplift, biosecurity / CBRN, autonomy). Security Engineers own a portion of the deployment-time evaluations and the model-weight protection regime that gate ASL transitions; the work shows up in real engineering decisions, not just policy documents.
- What is the Frontier Red Team and is it different from a normal red team?
- The Frontier Red Team is Anthropic's publicly named adversarial-research function focused on Claude itself. Findings are published on anthropic.com/research as papers and research posts. Unlike a classical corporate red team; which exercises an organization's defenses against simulated adversaries; the Frontier Red Team evaluates the model: jailbreaks, prompt-injection chains, agentic-misuse risks, capability evaluations relevant to RSP thresholds. The deliverable looks more like a research paper than a pentest report. Security Engineers on or adjacent to this surface should expect to write publicly.
- How prompt-injection-aware does the AppSec interview round actually get?
- Quite. The OWASP LLM Top 10 at owasp.org is the canonical reference; Prompt Injection (LLM01) is the first item and it dominates real attacks against agentic deployments. Expect a scenario where an untrusted document is loaded into a Claude context window with tool-use enabled, and you must enumerate the attack surface (data exfiltration via tool call, lateral movement via MCP server, indirect prompt injection from a retrieved web page) plus the mitigations that hold up under adversarial pressure versus the ones that look like security theater. The NIST AI RMF Map / Measure / Manage / Govern functions are a useful structuring framework.
- How does Security Engineering at Anthropic differ from Security Engineering at OpenAI or Google DeepMind?
- All three labs share the frontier-AI threat model; model weights as a top-tier asset, prompt-injection-as-attack-vector, agentic-misuse risk; but Anthropic publishes more of its security posture than peers. The Responsible Scaling Policy is public; the Trust Center is public; the Frontier Red Team publishes papers on anthropic.com/research; model cards on anthropic.com/news document evaluations. The day-to-day Security Engineer work is similar in shape to peer labs; the writing-and-research surface is more public-facing.
- What is the Trust Center and what does it actually publish?
- The Trust Center at trust.anthropic.com is Anthropic's public security and compliance portal. It publishes a portion of the security posture; the compliance attestations, data-handling policies, and security controls customers and partners need to evaluate the platform. It is not the full internal security architecture, but it is a meaningful artifact for a company at Anthropic's stage and it reflects the publish-by-default disposition the Responsible Scaling Policy and Frontier Red Team research output also reflect.
- How much does AI safety research background actually matter for the interview loop?
- It matters more than at a classical tech company and less than for a research-scientist role. The interview loop tests AI-safety-aware AppSec, not AI safety theory. Senior+ candidates should be fluent in the OWASP LLM Top 10 vocabulary, NIST AI RMF structure, and prompt-injection threat modeling on agentic deployments; not in the academic alignment-research literature. Public artifacts that signal real depth: prompt-injection or jailbreak research published under your own name, OWASP LLM Top 10 contributions, CVE work on AI-platform infrastructure.
- What does levels.fyi say about Security Engineer comp at Anthropic?
- The accurate anchor is levels.fyi/companies/anthropic with the Security Engineer role filter applied. Single-number claims for senior Security Engineer total comp at Anthropic are unreliable because compensation varies materially by level, equity grant date and refresh cadence, location, and individual offer; the per-company filter on levels.fyi is the only honest source. Anthropic anchors the upper band of frontier-AI lab comp, and as a private company at this stage equity is a meaningful component of total comp.
- What classical certifications still help for a Security Engineer role at Anthropic?
- Less than at a compliance-heavy enterprise. The bar is engineering-quality work plus AI-safety-aware judgment, not a stack of certifications. The certifications that retain real signal are OSCP for offensive depth, AWS / GCP security specialty for cloud-security depth, and CISSP for senior+ candidates whose ladder included a security-leadership chapter. None of these substitute for OWASP LLM Top 10 fluency, prompt-injection threat-modeling fluency, NIST AI RMF familiarity, and a public artifact that demonstrates real adversarial craft.
- Is Anthropic actively hiring Security Engineers in 2026?
- Yes; anthropic.com/jobs filtered to Security publishes recurring postings across Trust and Safety Security, Security Engineering on the Infrastructure side, and the Frontier Red Team. The structural demand is strong: BLS projects 29 percent employment growth for Information Security Analysts (SOC 15-1212) from 2024 to 2034 with about 16,000 annual openings, and the frontier-AI lab category specifically is in a hiring expansion cycle as ASL-3 evaluations and weight-protection regimes ramp. Cadence varies; check anthropic.com/jobs directly rather than third-party aggregators.
Sources
- Anthropic; Careers (filter: Security)
- Anthropic Responsible Scaling Policy (ASL framework)
- Anthropic Research; Frontier Red Team and safety publications
- Anthropic News; Claude release model cards
- Anthropic Trust Center; public security posture
- OWASP LLM Top 10 for Large Language Model Applications
- OWASP Top 10; 2021 (current canonical version)
- NIST AI Risk Management Framework (AI RMF + Generative AI Profile)
- MITRE ATT&CK; Adversary Tactics and Techniques
- levels.fyi; Anthropic compensation per role and level
- BLS Occupational Outlook Handbook; Information Security Analysts (SOC 15-1212)
About the author. Blake Crosley founded ResumeGeni and writes about security engineering, hiring technology, and ATS optimization. More writing at blakecrosley.com.