Security and Authentication for Backend Engineers (2026)
In short
Backend security in 2026 is layered. The senior bar: validate JWTs by pinning the algorithm and verifying iss, aud, and exp before trusting any claim; use OAuth 2.1 with PKCE for every flow including confidential clients; rate-limit at the edge with token-bucket so one client cannot starve the fleet; rotate secrets on a schedule and revoke on suspicion; require mTLS between internal services; pass user input only through parameterized queries; treat the OWASP Top 10 2024 as a continuous audit.
Key takeaways
- JWT validation has three classic failure modes: accepting alg=none, key confusion between HS256 and RS256, and skipping iss / aud / exp checks. The OWASP JWT cheat sheet (cheatsheetseries.owasp.org) is the canonical reference; pin the expected algorithm in code and reject every token that does not match.
- OAuth 2.1 (oauth.net/2.1/) folds the best-practice guidance from RFC 8252 and RFC 6749 errata into a single profile: PKCE is required for every authorization-code flow including confidential clients, the implicit grant is removed, the resource owner password credentials grant is removed, and refresh tokens must be sender-constrained or rotated on use.
- Rate limiting uses token-bucket for bursty workloads (allows short spikes up to bucket size, then steady refill) and leaky-bucket for smoothing (constant outflow regardless of input shape). The AWS Builders' Library essays on throttling and shuffle-sharding are the canonical engineering reference.
- Secret rotation is a discipline, not a feature. Long-lived static secrets are a single point of failure; rotate on a schedule (90 days for service credentials, 24 hours for short-lived tokens) and on suspicion (any unexpected access pattern). Use a secret manager (AWS Secrets Manager, HashiCorp Vault, Doppler) with per-secret IAM and full audit logs.
- mTLS between internal services proves both ends of the connection. The client presents a certificate signed by the trusted internal CA; the server verifies it before accepting the request. Pair mTLS with SPIFFE / SPIRE (or a service mesh like Istio / Linkerd) for automatic certificate rotation.
- SQL injection is still in the OWASP Top 10 in 2026 because string interpolation into queries still happens. Parameterized queries (psycopg's %s placeholders, SQLAlchemy's text() with bindparams, asyncpg's $1 placeholders) are the only correct answer; ORMs help, but only if you never reach for raw() with user input.
- Defense in depth means every layer assumes the layer above failed. Validate input at the edge AND in the service AND at the database. Authenticate the user AND the device AND the service. Log everything an auditor will ask for. CISA (cisa.gov/topics/cyber-threats-and-advisories) publishes the threat landscape that tells you which layer to harden next.
Authentication patterns: JWT, OAuth 2.1, mTLS
Backend authentication in 2026 has three primary patterns, each with a clear use case:
- JWT bearer tokens for stateless service-to-service or short-lived API access. The token carries claims (sub, iss, aud, exp) and is signed with HS256 (shared secret) or RS256 / ES256 (asymmetric). The server verifies the signature on every request and trusts the claims if and only if the signature checks out. RFC 7519 (datatracker.ietf.org/doc/html/rfc7519) is the canonical specification.
- OAuth 2.1 with PKCE for delegated authorization where a third-party client acts on behalf of a user. PKCE (RFC 7636, datatracker.ietf.org/doc/html/rfc7636) closes the authorization-code interception attack: the client generates a random code_verifier, sends a SHA-256 hash (code_challenge) with the authorization request, and proves possession of the verifier when exchanging the code for tokens. OAuth 2.1 makes PKCE mandatory for every authorization-code flow, including confidential clients with client secrets.
- mTLS for service-to-service authentication inside the trust boundary. The client presents a certificate signed by the internal CA; the server verifies the chain and the certificate's subject before accepting the request. mTLS is what lets you say "every request hitting this internal service is from a known service" without trusting network position.
The architecture decision is which pattern fits the use case: JWTs for stateless APIs where the issuer and verifier share trust; OAuth 2.1 for any flow where a user delegates access to a third party; mTLS for internal service-to-service traffic. The patterns compose — a service mesh can require mTLS at the transport layer AND a JWT in the application header, with the JWT sub claim bound to the certificate subject.
The mistake to avoid: treating any single pattern as universal. JWTs are not a session-management primitive (you cannot revoke a JWT before it expires without a denylist that defeats the stateless property). OAuth 2.1's authorization-code flow is not a backend-to-backend auth protocol (that job belongs to the client_credentials grant, which OAuth 2.1 retains, with narrowly scoped tokens). mTLS does not authenticate the human user (only the calling service).
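The PKCE step is mechanical enough to sketch. Assuming a client that generates its own verifier (the function name and the 32-byte entropy choice are illustrative; RFC 7636 requires a verifier of 43-128 characters and a base64url-encoded SHA-256 challenge):

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Generate a PKCE code_verifier and its S256 code_challenge."""
    # code_verifier: high-entropy random string from the unreserved set
    # (RFC 7636 section 4.1); 32 random bytes -> 43 base64url chars.
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    # code_challenge = BASE64URL(SHA256(code_verifier)), unpadded
    # (RFC 7636 section 4.2, the mandatory S256 method).
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

The client sends the challenge with the authorization request and the verifier with the token exchange; the authorization server recomputes the hash and rejects the exchange on mismatch, so an intercepted code alone is useless.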
JWT validation pitfalls and how to avoid them
Three classic JWT validation bugs ship to production every year. Each one is a complete authentication bypass.
Pitfall 1: accepting alg=none. The JWT spec includes a "none" algorithm that produces an unsigned token. A library that respects the alg header without an allowlist will accept a forged token with alg=none and an empty signature. The fix is to pin the expected algorithm in code and reject every other value.
Pitfall 2: HS256 / RS256 key confusion. If your verifier accepts both HS256 (symmetric) and RS256 (asymmetric), an attacker can sign a token with HS256 using your RS256 public key as the HMAC secret. The verifier loads the public key as the HMAC key and the signature checks out. The fix is to pin one algorithm per issuer.
Pitfall 3: trusting claims without verifying iss, aud, and exp. A signature-valid token from a different issuer or with an expired exp is still cryptographically valid. The fix is to verify every standard claim before trusting the token body.
Here is a correct Python implementation using PyJWT:
```python
import jwt
from jwt import PyJWKClient
from typing import Any

EXPECTED_ISS = "https://auth.example.com/"
EXPECTED_AUD = "https://api.example.com/"
EXPECTED_ALG = "RS256"
JWKS_URL = "https://auth.example.com/.well-known/jwks.json"

_jwks_client = PyJWKClient(JWKS_URL, cache_keys=True, lifespan=3600)

def verify_jwt(token: str) -> dict[str, Any]:
    """Verify a JWT and return its claims, or raise."""
    # 1. Resolve the signing key from the JWKS endpoint by kid.
    signing_key = _jwks_client.get_signing_key_from_jwt(token).key
    # 2. Decode with explicit algorithm allowlist + audience + issuer.
    #    PyJWT enforces exp, nbf, iat by default; we add iss + aud here.
    claims = jwt.decode(
        token,
        signing_key,
        algorithms=[EXPECTED_ALG],  # pin algorithm; rejects alg=none, HS256
        audience=EXPECTED_AUD,      # rejects tokens for other audiences
        issuer=EXPECTED_ISS,        # rejects tokens from other issuers
        options={
            "require": ["exp", "iat", "iss", "aud", "sub"],
            "verify_signature": True,
            "verify_exp": True,
            "verify_iss": True,
            "verify_aud": True,
        },
        leeway=30,  # 30s clock-skew tolerance
    )
    return claims
```
What this code gets right: the algorithm is pinned (no alg=none, no HS256 / RS256 confusion); iss and aud are checked by the library; exp is required; the JWKS client caches keys with a finite lifespan (so key rotation propagates without restart). The OWASP JWT cheat sheet (cheatsheetseries.owasp.org/cheatsheets/JSON_Web_Token_for_Java_Cheat_Sheet.html) covers the same checks for the Java ecosystem.
What still needs explicit handling: revocation. JWTs are stateless by design, so a stolen token is valid until exp. Mitigations: short exp (5-15 minutes for access tokens), refresh-token rotation with reuse detection, a denylist for known-bad jti values, or sender-constrained tokens (DPoP / mTLS-bound access tokens) so possession alone is not enough.
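One of those mitigations, a jti denylist, fits in a few lines. This is an in-memory illustration only (the class name is invented; a production denylist would live in Redis with per-key TTLs so every verifier instance sees revocations):

```python
import time

class JtiDenylist:
    """In-memory jti denylist sketch. Entries expire with the token they
    revoke, so the map stays bounded by the access-token lifetime."""

    def __init__(self) -> None:
        self._revoked: dict[str, float] = {}  # jti -> token exp (unix seconds)

    def revoke(self, jti: str, exp: float) -> None:
        # Keep the entry only until the token would have expired anyway.
        self._revoked[jti] = exp

    def is_revoked(self, jti: str) -> bool:
        now = time.time()
        # Lazily drop entries whose tokens are past exp.
        self._revoked = {j: e for j, e in self._revoked.items() if e > now}
        return jti in self._revoked
```

The check runs after signature verification, which is why short exp values matter: the denylist only has to remember a few minutes of revocations, not an unbounded history.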
Rate limiting at scale: token bucket vs leaky bucket
Rate limiting protects the service from abuse and protects every other tenant from a single noisy neighbor. Two algorithms cover almost every use case.
Token bucket allows bursts up to the bucket size and then refills at a steady rate. It is the right algorithm when the workload is naturally bursty (a user uploads a batch of documents, then idles for an hour) and you want to permit the burst without configuring a high steady-state rate.
Leaky bucket drains at a constant rate regardless of input shape. Requests that arrive faster than the drain rate queue (or are rejected if the queue is full). It is the right algorithm when the downstream system has a hard throughput ceiling and the input must be smoothed before it hits that ceiling.
Most production rate limiters are token-bucket because the burst tolerance maps to user expectations. Here is an in-process token-bucket implementation in Python (the same algorithm runs server-side as a Lua script when the bucket state lives in Redis):
```python
import asyncio
import time
from dataclasses import dataclass, field

@dataclass
class TokenBucket:
    capacity: int          # max tokens (burst size)
    refill_per_sec: float  # tokens added per second
    tokens: float = field(init=False)
    updated_at: float = field(default_factory=time.monotonic)
    _lock: asyncio.Lock = field(default_factory=asyncio.Lock)

    def __post_init__(self) -> None:
        self.tokens = float(self.capacity)

    async def take(self, cost: int = 1) -> bool:
        """Atomically refill, then attempt to consume `cost` tokens."""
        async with self._lock:
            now = time.monotonic()
            elapsed = max(0.0, now - self.updated_at)
            self.tokens = min(
                float(self.capacity),
                self.tokens + elapsed * self.refill_per_sec,
            )
            self.updated_at = now
            if self.tokens < cost:
                return False
            self.tokens -= cost
            return True
```
What this code gets right: the read-modify-write is atomic (asyncio.Lock for in-process; Lua for the Redis-backed version); refill is computed from elapsed time using monotonic clock (so the algorithm survives clock drift); cost is parameterized so an expensive endpoint can charge more than a cheap one; the bucket starts full so a fresh client gets the burst allowance.
What still needs handling at scale: shuffle-sharding (so a noisy tenant cannot starve the rate limiter's Redis cluster), 429 responses with Retry-After headers (clients should back off for the advertised interval, not on a fixed schedule), and per-user + per-IP + per-endpoint composition (a single key is rarely sufficient; most production systems compose three or four limiters per request). The AWS Builders' Library (aws.amazon.com/builders-library/) has the canonical essays on rate-limiter design at scale, including "Avoiding insurmountable queue backlogs" and "Workload isolation using shuffle-sharding".
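For contrast with the token bucket, here is a leaky-bucket sketch in the same style, treating the bucket as a meter with constant outflow (the class and field names are illustrative, not from any library):

```python
import time
from dataclasses import dataclass, field

@dataclass
class LeakyBucket:
    """Leaky bucket as a meter: requests add 'water'; the bucket drains at
    a constant rate; a request is rejected if it would overflow capacity."""
    capacity: float       # max queued work before rejecting
    drain_per_sec: float  # constant outflow, regardless of input shape
    level: float = 0.0
    updated_at: float = field(default_factory=time.monotonic)

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Drain at the constant rate since the last call (monotonic clock,
        # same rationale as the token bucket above).
        self.level = max(0.0, self.level - (now - self.updated_at) * self.drain_per_sec)
        self.updated_at = now
        if self.level + cost > self.capacity:
            return False
        self.level += cost
        return True
```

Note the inversion relative to the token bucket: here a full bucket means "reject", and the drain rate is the hard throughput ceiling you are protecting, which is why this shape suits smoothing in front of a fixed-rate downstream.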
Defense in depth: from query to runtime
Defense in depth means every layer assumes the layer above it failed. The OWASP Top 10 2024 (owasp.org/Top10) is the empirical evidence for which failures actually ship to production. The senior bar is treating each entry as a continuous audit, not a one-time fix.
A03:2021 Injection remains in the Top 10 because string interpolation into queries still happens. The only correct answer is parameterized queries. Compare:
```python
# UNSAFE — string interpolation, classic SQL injection.
# A user_id of "1; DROP TABLE users--" runs the DROP.
query = f"SELECT * FROM users WHERE id = {user_id}"
await conn.execute(query)

# SAFE — parameterized query with asyncpg. The driver sends the
# SQL and parameters separately; the database never parses the
# user input as SQL.
query = "SELECT * FROM users WHERE id = $1"
row = await conn.fetchrow(query, user_id)

# SAFE — SQLAlchemy 2.0 with bound parameters.
from sqlalchemy import text

stmt = text("SELECT * FROM users WHERE id = :uid")
row = (await session.execute(stmt, {"uid": user_id})).first()
```
The unsafe form concatenates user input into the SQL string before it leaves the application. The safe forms send the SQL and the parameters as separate values; the database driver binds the parameter into a typed slot in a prepared statement, and the database engine never parses the value as SQL. ORMs (SQLAlchemy, Django ORM, Prisma) parameterize by default — the failure mode is reaching for raw() with user input or building dynamic SQL with f-strings.
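That failure mode can be caught mechanically. A real lint rule would walk the AST; this regex sketch only illustrates the shape of the check (the pattern and function name are assumptions, not a published rule):

```python
import re

# Flag f-strings or string concatenation passed to the common raw-SQL
# escape hatches (SQLAlchemy text(), Django raw(), Prisma $queryRawUnsafe).
# Deliberately coarse: a CI gate that forces a human look, not a proof.
UNSAFE_RAW_SQL = re.compile(
    r"""(?:\btext|\braw|\$queryRawUnsafe)\s*\(\s*(?:f["']|["'`][^"'`]*["'`]\s*\+)"""
)

def flags_unsafe_sql(line: str) -> bool:
    """Return True if a source line looks like dynamic SQL in an escape hatch."""
    return bool(UNSAFE_RAW_SQL.search(line))
```

Bound parameters pass the check; interpolated and concatenated SQL trips it, which is exactly the boundary the review checklist should enforce.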
The other Top 10 categories that hit backend code most often:
- A01 Broken Access Control — every endpoint must check authorization, not just authentication. A logged-in user is not authorized to read every record. Test with deliberate IDOR (insecure direct object reference) probes in CI.
- A02 Cryptographic Failures — TLS 1.2+ everywhere, modern cipher suites, no bespoke crypto. Use libsodium or the platform crypto library; never implement AES or HMAC by hand.
- A05 Security Misconfiguration — disable debug mode in production, strip stack traces from error responses, set security headers (Content-Security-Policy, Strict-Transport-Security, X-Frame-Options), and make the default deny.
- A07 Identification and Authentication Failures — rate-limit login endpoints, hash passwords with argon2id or bcrypt (never SHA-256), enforce MFA for admin paths, and rotate session tokens on privilege change.
- A09 Security Logging and Monitoring Failures — log every authentication event, every authorization failure, and every administrative action. Ship logs to a system the application cannot tamper with. CISA (cisa.gov/topics/cyber-threats-and-advisories) publishes the threat landscape that tells you which signals matter this quarter.
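As a concrete instance of the A05 header guidance, here is a framework-neutral sketch that merges a baseline header set into a response without overriding what the handler already chose (the helper name and the exact header values are illustrative defaults, not a standard):

```python
# Baseline response headers for the A05 hardening items named above.
# Values here are common starting points; tune CSP per application.
SECURITY_HEADERS: dict[str, str] = {
    "Strict-Transport-Security": "max-age=63072000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
}

def apply_security_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `headers` with the baseline set filled in.
    Handler-set values win, so an endpoint can tighten (or knowingly
    relax) a header without the middleware silently clobbering it."""
    merged = dict(SECURITY_HEADERS)
    merged.update(headers)
    return merged
```

In practice this runs as middleware at the edge, which satisfies the "make the default deny" rule: an endpoint gets the hardened headers unless it explicitly opts out.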
The discipline that ties this together: every change to an authentication, authorization, or data-access path goes through a security review with a written threat model. The reviewer asks three questions — what does an attacker gain if they break this? what is the simplest attack? what is the detection? — and the change does not ship until each has an answer.
Frequently asked questions
- Is JWT a session token?
- No. A JWT is a signed claim, not a session. Sessions need server-side state to support revocation; JWTs are stateless and valid until exp. If you need revocation (logout, password change, compromise response) before exp, you need either a server-side session, a JWT denylist (which sacrifices the stateless property), or sender-constrained tokens like DPoP. Treating a JWT as a session token without those mitigations is one of the most common authentication bugs in production.
- Why is PKCE required even for confidential clients in OAuth 2.1?
- OAuth 2.0 originally required PKCE only for public clients (mobile, SPA) because the authorization-code interception attack required a hostile app on the same device. OAuth 2.1 (oauth.net/2.1/) extends the requirement to confidential clients because additional attack surfaces emerged: mix-up attacks, code-injection through faulty redirects, and authorization-code leaks via Referer headers or browser history. PKCE closes those even when the client has a secret. The change is mandatory for new deployments.
- Token bucket or leaky bucket — which should I default to?
- Default to token bucket. Real-world workloads are bursty; users perform several actions in quick succession, then idle. A token bucket permits the burst without forcing a high steady-state rate that would let a single client dominate. Leaky bucket is the right answer when the downstream system has a hard throughput ceiling (a third-party API at 100 req/sec) and bursty input would cause cascading 429s. The two compose: token-bucket at the edge for fairness, leaky-bucket in front of the rate-limited downstream for smoothing.
- How long should an access token live?
- Five to fifteen minutes for access tokens, paired with longer-lived refresh tokens that rotate on use. Short access tokens limit the blast radius if a token is exfiltrated; refresh-token rotation lets the server detect reuse (an old refresh token presented after a new one was issued is evidence of compromise). Long-lived access tokens (24 hours, 7 days) are the wrong default because they cannot be revoked before expiry without a denylist, and the denylist defeats the stateless reason you chose JWTs.
- Do I need mTLS if I already have a service mesh?
- A service mesh like Istio or Linkerd provides mTLS as the default transport encryption between services in the mesh, which largely answers the question. The senior bar is verifying that mTLS is actually enforced (not just available), that certificates rotate automatically (SPIFFE / SPIRE handles this), and that the mesh policy denies plaintext traffic. mTLS without enforcement is theater. Outside a mesh, you implement mTLS at the load balancer or in the application; the mesh just makes it cheaper.
- ORMs prevent SQL injection — do I still need to think about it?
- Yes. ORMs parameterize by default, but the failure mode is the escape hatch: SQLAlchemy text() with f-string interpolation, Django raw() with user input, Prisma $queryRawUnsafe. Every codebase eventually has a query the ORM cannot express, and that is when the unsafe pattern shows up. The discipline is a lint rule that flags string concatenation into raw() / text() / queryRawUnsafe and a code-review checklist that requires parameter bindings for every dynamic value. The OWASP Top 10 2024 still lists injection because this happens.
- What is the right cadence for secret rotation?
- Service credentials (database passwords, API keys to internal systems) rotate every 90 days as a baseline; sensitive secrets (signing keys, encryption keys) rotate every 30-60 days; short-lived tokens (access tokens, session tokens) rotate every 5-15 minutes. The schedule is the floor; rotation also fires on suspicion (any unexpected access pattern) and on personnel change (anyone with access to the secret leaves the team). Rotation is cheap if you build for it from day one and expensive if you bolt it on later.
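Those cadences reduce to a floor check plus an override for suspicion. The intervals below restate the answer above (with 45 days standing in for the 30-60 day signing-key range); the function and mapping names are illustrative:

```python
from datetime import datetime, timedelta, timezone

# Rotation floors per secret kind, restating the cadence above.
# The kind names are illustrative labels, not from any secret manager.
ROTATION_FLOOR: dict[str, timedelta] = {
    "service_credential": timedelta(days=90),
    "signing_key": timedelta(days=45),
    "access_token": timedelta(minutes=15),
}

def rotation_due(kind: str, issued_at: datetime, suspicious: bool = False) -> bool:
    """The schedule is the floor; suspicion always triggers rotation."""
    if suspicious:
        return True
    return datetime.now(timezone.utc) - issued_at >= ROTATION_FLOOR[kind]
```

A scheduled job evaluates this per secret and calls the secret manager's rotation API when it returns True; the `suspicious` flag is how anomaly detection and offboarding feed the same path.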
- How do I handle authorization separately from authentication?
- Authentication answers who you are; authorization answers what you can do. They live at different layers. Authentication runs once at the edge (validate the JWT or session cookie, attach the principal to the request context). Authorization runs at the service or resource boundary on every operation (check whether this principal can read this record, mutate this resource, hit this admin endpoint). The OWASP A01 Broken Access Control category is the largest in the 2021/2024 Top 10 because services routinely authenticate but skip the authorization check on a specific record.
- Should I use HS256 or RS256 for JWTs?
- RS256 (or ES256) for almost every case. With RS256, the issuer holds the private signing key and verifiers fetch the public key from a JWKS endpoint; the verifier never sees the signing key. With HS256, the signing key is the verification key, so every verifier holds material that can issue tokens — a single compromised verifier compromises the issuer. HS256 is acceptable only when the issuer and verifier are the same service. For any cross-service or cross-organization flow, use asymmetric signatures and pin the algorithm in the verifier code.
- What is the simplest defense-in-depth checklist for a new backend service?
- Enforce TLS 1.2+ at the edge; authenticate every request and authorize every operation; parameterize every query; rate-limit every public endpoint; rotate every credential on a schedule and on suspicion; log every authentication, authorization, and administrative event; ship logs to a tamper-resistant store; deny by default in every config. Audit against the OWASP Top 10 every quarter and the CISA advisories every week. Each item is one line; the discipline is doing all of them, because an attacker only needs one to be missing.
Sources
- OWASP Top 10 — empirical list of the most critical web-application security risks (2024 edition)
- OAuth 2.1 — consolidated specification with PKCE-mandatory and best-current-practice profile
- RFC 7519 — JSON Web Token (JWT) canonical specification
- RFC 7636 — Proof Key for Code Exchange (PKCE) by OAuth Public Clients
- Amazon Builders' Library — production-grade engineering essays on throttling, shuffle-sharding, and queue management
- OWASP JSON Web Token Cheat Sheet — canonical JWT validation checklist
- CISA Cyber Threats and Advisories — current threat landscape and known-exploited vulnerabilities catalog
About the author. Blake Crosley founded ResumeGeni and writes about backend engineering, hiring technology, and ATS optimization. More writing at blakecrosley.com.