SRE at Honeycomb (2026): Observability-2.0 Home, Interview, Comp, Stack
In short
Honeycomb is the ~200-person observability-2.0 company co-founded by Charity Majors (CTO; charity.wtf) and Christine Yen (CEO), with Liz Fong-Jones as the most-publicly-visible production-engineering and DevRel voice. The SRE-equivalent role at Honeycomb is 'production engineering' or 'platform engineering' — Honeycomb explicitly subscribes to the position (articulated in Charity Majors's writing) that observability and operability are owned by the engineers building the service, not by a separate ops org. The stack: their own product (Honeycomb is heavily dogfooded), Kafka as the event-storage substrate, Go services on AWS, and Retriever as the bespoke columnar query engine. Comp bands are not publicly transparent in 2026 — Honeycomb is private, levels.fyi has sparse data, and the company has not published salary ranges the way Stripe or large public companies have.
Key takeaways
- Honeycomb is the canonical observability-2.0 company in 2026: high-cardinality wide events, single-source-of-truth structured data, event-driven debugging rather than dashboard-driven monitoring. Charity Majors's writing at charity.wtf and the 'Observability Engineering' book (Majors, Fong-Jones, Miranda; O'Reilly; free PDF at honeycomb.io/observability-engineering) are the canonical references.
- The role title is 'production engineer' or 'platform engineer', not 'SRE'. Honeycomb's engineering culture explicitly rejects the operator/developer split — the engineers who build a service own its production behavior. Charity Majors has written extensively on this (charity.wtf, including the 'Operations: A New Hope' series and 'On Call Shouldn't Suck'); the operational consequence is that production-engineering roles are deeply embedded in product teams rather than centralized in an SRE org.
- Liz Fong-Jones (DevRel; field CTO) is one of the most influential production-engineering and SRE voices of the 2020s, and her presence at Honeycomb is part of why the company punches above its weight on engineering culture. She has spoken extensively (SREcon, QCon, USENIX) on incident response, error budgets, the four golden signals, and the production-engineering posture; her published talks are a useful pre-interview read.
- Honeycomb's stack is small and opinionated: Go services, Kafka as the event-storage and processing substrate, Retriever (their bespoke columnar storage engine, written in Go) for the high-cardinality query workload, AWS for infrastructure, and Honeycomb itself for observability. They dogfood heavily — engineering posts at honeycomb.io/blog regularly cover 'how we run Honeycomb on Honeycomb'.
- Comp at Honeycomb is not publicly transparent in 2026. The company is private, levels.fyi has sparse self-reports for Honeycomb specifically, and Honeycomb has not published salary bands the way Stripe or larger public companies have. Honest framing: total comp at the senior production-engineer tier is reported (in scattered Glassdoor and levels.fyi entries) to be competitive with mid-tier startups but materially below FAANG and frontier-lab tiers.
- Honeycomb is remote-first and has been since founding (pre-pandemic). The company's published values include explicit support for distributed work; engineering hiring is open to candidates across US time zones, with periodic on-sites. The remote-first posture is a genuine cultural feature rather than a 2020s adaptation.
- The interview process is reputed (per candidate reports on Glassdoor and discussion threads on Hacker News and Reddit r/sre) to be strong on production-engineering judgment: incident-response scenarios, real debugging exercises, system-design with explicit attention to failure modes and observability. Less algorithmic than FAANG, more 'show me how you think about a service in production'.
SRE at Honeycomb in 2026: observability-2.0 home
Honeycomb is one of the most consequential engineering-culture companies of the 2020s relative to its size. ~200 employees in 2026, private, observability-product-focused — and the company's intellectual influence on how SRE / production engineering is practiced industry-wide is materially larger than its headcount would suggest. Three structural facts shape what 'SRE' actually means at Honeycomb:
- The observability-2.0 thesis. Honeycomb's product position — and the surrounding intellectual posture — is that the traditional 'three pillars' framing (logs, metrics, traces) is wrong. The right primitive is the arbitrarily-wide structured event: every request, every job, every operation emits a single rich event with high-cardinality attributes (user ID, request ID, customer ID, version, region, feature-flag state). Aggregations are derived from events at query time rather than precomputed at write time. Charity Majors's writing at charity.wtf has been articulating this position since 2017; the 'Observability Engineering' book (Majors, Fong-Jones, Miranda; O'Reilly 2022, free PDF at honeycomb.io/observability-engineering) is the canonical book-length statement. The 'observability-101' framing on the Honeycomb blog (honeycomb.io/blog/observability-101) is the introductory read.
- The role title is 'production engineer', not 'SRE'. Honeycomb's job postings (honeycomb.io/careers) historically use 'production engineer' or 'platform engineer' rather than 'site reliability engineer'. The distinction is intentional and ideological. Charity Majors has written explicitly (charity.wtf, multiple posts including 'Operations: A New Hope' and 'On Call Shouldn't Suck') against the operator/developer split that the SRE title can imply. The Honeycomb posture: engineers who build a service own its production behavior, including on-call. Production engineers exist to provide platforms and tooling that make this ownership tractable, not to be the operators of services they didn't build.
- Dogfooding as cultural infrastructure. Honeycomb runs Honeycomb on Honeycomb. The engineering blog (honeycomb.io/blog) is dense with 'how we ran a query that surfaced this incident in 4 minutes' and 'what our own SLOs look like' posts. The cultural consequence: production-engineering candidates who join Honeycomb arrive at a company where the product is the observability substrate, the engineers use it constantly, and the feedback loop between product and platform is unusually tight. Liz Fong-Jones's external talks at SREcon, QCon, and USENIX repeatedly draw on Honeycomb's own internal experience as the case study.
The reading list for a production-engineering candidate at Honeycomb: the 'Observability Engineering' book (free PDF at honeycomb.io/observability-engineering); Charity Majors's archive at charity.wtf, especially the operations-and-on-call series; Liz Fong-Jones's published SREcon talks (USENIX archive); and the Honeycomb engineering blog (honeycomb.io/blog) — the recent posts on Kafka tuning, Retriever performance, and incident postmortems are the most directly relevant.
Interview process — strong on production-engineering judgment
What is externally known about the production-engineering interview at Honeycomb (drawn from candidate reports on Glassdoor, discussion threads on Hacker News and r/sre, and the published values on honeycomb.io/careers) — keeping in mind that Honeycomb is small enough that interview shape varies more by hiring manager than at FAANG scale:
- Recruiter screen (30 min). Standard logistics, role context, motivation. Honeycomb's recruiters are reported to be unusually candid about the company's stage and trade-offs.
- Hiring-manager / technical screen (45–60 min). A conversation about a past production incident: walk through what happened, what you saw, how you debugged, what you'd change. The interviewer is grading on depth of operational reasoning — can you describe the failure mode in terms of observable system behavior? Can you articulate the trade-offs you made under time pressure?
- On-site / virtual loop (4–5 rounds, 60–90 min each):
- Debugging round. A real-feeling scenario — sometimes presented as an actual Honeycomb dataset, sometimes as a structured walk-through — where the candidate has to form hypotheses, query observability data, and converge on a root cause. The skills tested: hypothesis discipline, query-formation literacy, and the ability to distinguish symptoms from causes. LeetCode grinding does not prepare you for this round; Cindy Sridharan's 'Distributed Systems Observability' (an O'Reilly report) and the Honeycomb 'Observability Engineering' book do.
- System-design round. A production-system design problem with explicit attention to failure modes, observability, and operability. Honeycomb-distinctive expectations: name the failure modes; describe how you'd know the system was unhealthy; what your SLOs would be; what the on-call runbook entries would look like. A candidate who designs a beautiful system but cannot describe how they'd debug it at 3 a.m. has missed the point.
- Culture / values round. Honeycomb's published values (honeycomb.io/careers) include explicit positions on inclusion, on remote work, and on engineering ethics. The round is structured around past scenarios — how have you handled a difficult cross-team disagreement, how have you supported on-call sustainability for your team, how have you incorporated feedback as a senior engineer. Liz Fong-Jones has spoken publicly (at SREcon and USENIX) about Honeycomb's interview process specifically valuing candidates who can articulate the human and ethical dimensions of production engineering.
- Coding round (sometimes). Lighter than at FAANG: often a small Go program or a real-shaped debugging exercise rather than an algorithmic puzzle. Go fluency is appreciated but not strictly required for hire — the company supports candidates ramping on Go from another systems language.
What candidates report as Honeycomb-distinctive in the loop: the unusually-high weight on production-engineering judgment, the lower weight on algorithmic problem-solving, the explicit attention to incident-response skill, and the cultural fit screen on the values published at honeycomb.io/careers. Candidates who are strong on LeetCode but thin on real production experience are less competitive at Honeycomb than at FAANG; candidates who are average algorithmically but have lived through real production failures and can articulate what they learned are unusually competitive.
Compensation by level
Honest framing first: Honeycomb is a private ~200-person company, and compensation transparency is materially lower than at Stripe (which publishes per-posting US salary ranges per pay-transparency laws) or public FAANG (where levels.fyi self-reports are dense). What is and isn't publicly knowable in 2026:
- levels.fyi data is sparse. Per levels.fyi/companies/honeycomb/salaries/software-engineer, the page exists but the self-report count is small enough (single-digit-to-low-double-digit range) that level-by-level bands are not statistically reliable. The aggregate signal is that Honeycomb compensation is in the mid-tier-startup range — competitive with other Series-D-and-later venture-backed companies but materially below FAANG and frontier-lab tiers.
- Honeycomb does not publish bands publicly. The company's careers page (honeycomb.io/careers) describes role responsibilities and culture but does not include compensation ranges in the public posting (with the caveat that US pay-transparency laws may require ranges on individual postings in covered states; check the specific posting).
- Honest empty space. A responsible 2026 statement of compensation at Honeycomb is: 'mid-tier-startup compensation, materially below FAANG, with the equity value gated on company outcome — Honeycomb is private with no announced IPO timeline; equity is illiquid until a liquidity event.' Specific dollar bands at the production-engineer / senior-production-engineer / staff tiers are not publicly verifiable in 2026 in a way that would be honest to publish here.
What to do as a candidate: ask the recruiter for the specific band on the role early in the process; request the equity-grant size and the company's most recent preferred-share valuation (see Crunchbase and TechCrunch coverage of Honeycomb's latest disclosed round); and use levels.fyi's general SWE benchmarks (levels.fyi/t/software-engineer) to compare against the broader market. The 'levels.fyi for late-stage-private startups' coverage in the Pragmatic Engineer's archive is the right context-read.
The non-cash compensation, for candidates who weight it: Honeycomb has been remote-first since founding, has explicit and well-documented engineering values (charity.wtf has written at length about how the founders chose what to build and how to build it), and offers proximity to Charity Majors and Liz Fong-Jones — both of whom have shaped how production engineering is practiced industry-wide. For candidates who weight intellectual environment and craft heavily, Honeycomb's offer is unusually high on these axes relative to the cash component.
Tech stack: their own product (Honeycomb) + Kafka + Go services
Honeycomb's stack is small, opinionated, and unusually well-documented externally — the engineering blog (honeycomb.io/blog) regularly covers internal architecture decisions, incident postmortems, and Retriever (their bespoke query engine) tuning. The headline components in 2026:
- Go for services. The bulk of Honeycomb's backend is Go. The engineering-blog archive has years of posts on Go-specific decisions — error handling, profiling, garbage-collection tuning, the migration of specific subsystems to Go from earlier languages. Candidates joining production engineering are expected to ramp on Go quickly if they aren't already fluent; the Go community's conventions ('Effective Go', Dave Cheney's blog, the Go team's own blog at go.dev/blog) are part of the cultural reference.
- Kafka as the event-storage and processing substrate. Wide events arrive at Honeycomb's ingest, are written to Kafka, and are then consumed by indexing, storage, and query subsystems. The engineering blog has covered Kafka-specific architectural decisions in depth: partition design, retention policies, the migration to MSK and back, lessons-learned posts. A production-engineering candidate at Honeycomb should expect to operate, debug, and reason about Kafka at scale.
- Retriever — bespoke columnar storage and query engine. Retriever is Honeycomb's internal columnar storage and query engine, designed specifically for the high-cardinality wide-event workload that observability-2.0 implies. The engineering blog has multiple posts on Retriever architecture, query optimization, and the trade-offs the team has made (honeycomb.io/blog tag: Retriever). A production engineer at Honeycomb may not work directly on Retriever (depending on team), but the system's existence and behavior are part of the cultural literacy expected of senior engineers.
- AWS for infrastructure. Honeycomb runs on AWS. EC2, S3, MSK (Kafka), and the standard AWS-native primitives. Multi-region considerations, AWS-specific failure modes, and the cost trade-offs of high-cardinality storage on S3-class object storage are all part of the production-engineering surface.
- Honeycomb itself for observability. The dogfooding posture is real and load-bearing. Honeycomb engineers use Honeycomb to debug Honeycomb, to set SLOs on Honeycomb, and to run incident response on Honeycomb. The product's high-cardinality query model and structured-event primitive are how the team thinks about its own production behavior, not a separately-grafted observability tool.
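The ingest path described above (events in, Kafka in the middle, indexing and query consumers out) rests on partition design, and the routing step can be sketched without a Kafka client: hash the partition key, mod the partition count. Kafka's default partitioner uses murmur2; the FNV hash and dataset keys below are assumptions that keep the sketch dependency-free, and real ingest code would use a client library (e.g. sarama or franz-go) against MSK.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor routes an event key to one of n partitions the way a Kafka
// producer's key-based partitioner does: hash the key, mod the partition
// count. Same key, same partition — which is what gives per-key ordering.
func partitionFor(key string, numPartitions int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32()) % numPartitions
}

func main() {
	const partitions = 8
	// Events for the same (hypothetical) dataset always land on the same
	// partition, so one consumer sees that dataset's events in order.
	for _, dataset := range []string{"api-prod", "api-prod", "retriever", "ingest"} {
		fmt.Printf("dataset %-10s -> partition %d\n", dataset, partitionFor(dataset, partitions))
	}
}
```

The operational questions a candidate should be ready for follow directly from this shape: what happens when one key is hot, what a partition count change does to ordering, and how consumer lag on one partition shows up in the product.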
What this means for a production-engineering candidate: Go fluency (or a credible plan to ramp), Kafka literacy (or a strong systems-engineering background that ramps to Kafka quickly), comfort with AWS at scale, and — distinctively — a willingness to use observability-2.0 primitives natively rather than reaching for the dashboard-and-metrics framing that the rest of the industry defaults to. The engineering blog's recent posts on 'how we debugged X using a single Honeycomb query' are a useful read for the cultural transition; candidates who arrive thinking in dashboards-and-pre-aggregations rather than wide-events-and-query-time-aggregation report a learning curve in the first months.
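The wide-events-and-query-time-aggregation posture can be shown in a few lines: keep the raw events, derive the aggregate only when someone asks. A toy Go sketch of a nearest-rank p99 grouped by a high-cardinality field — nothing here is Retriever's actual implementation, which is columnar and distributed, but the shape of the computation is the point:

```go
package main

import (
	"fmt"
	"sort"
)

// Event is a pared-down wide event; real events carry many more fields.
type Event struct {
	CustomerID string
	DurationMS float64
}

// p99ByCustomer derives an aggregate from raw events at query time: group by
// a high-cardinality field, then compute the percentile per group. Nothing
// is precomputed at write time.
func p99ByCustomer(events []Event) map[string]float64 {
	groups := map[string][]float64{}
	for _, ev := range events {
		groups[ev.CustomerID] = append(groups[ev.CustomerID], ev.DurationMS)
	}
	out := map[string]float64{}
	for cust, ds := range groups {
		sort.Float64s(ds)
		// nearest-rank percentile: index ceil(0.99 * n) - 1
		idx := (99*len(ds)+99)/100 - 1
		out[cust] = ds[idx]
	}
	return out
}

func main() {
	events := []Event{
		{"cust-991", 80}, {"cust-991", 95}, {"cust-991", 1200},
		{"cust-007", 15}, {"cust-007", 22},
	}
	res := p99ByCustomer(events)
	custs := make([]string, 0, len(res))
	for c := range res {
		custs = append(custs, c)
	}
	sort.Strings(custs) // deterministic print order
	for _, c := range custs {
		fmt.Printf("%s p99=%.0fms\n", c, res[c])
	}
}
```

Because the group-by key can be anything on the event — a customer, a build, a feature flag — the question "which customer is eating the p99" is answerable after the fact, which is exactly what pre-aggregated metrics forfeit.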
The Honeycomb engineering blog at honeycomb.io/blog is the authoritative external read on the stack. Sort by tag for the deepest treatment of any specific component.
Frequently asked questions
- Is Honeycomb hiring production engineers in 2026?
- The careers page at honeycomb.io/careers is the authoritative source. Honeycomb is a ~200-person private company; hiring tempo varies with funding and product cycles, but production-engineering and platform-engineering roles are recurring postings. Roles are reported to be remote-first across US time zones with periodic on-sites. Check the careers page directly for current openings.
- What is the difference between an SRE role at Honeycomb and an SRE role at Google or Meta?
- Three differences. (1) Honeycomb does not split developers from operators — production engineers exist to platform-and-tool the engineers who build services, not to operate services they didn't build. Charity Majors's writing at charity.wtf is the canonical articulation. (2) The cultural primitive is the wide structured event, not the dashboard or the metric. Production-engineering candidates are expected to think in events-and-queries, not in pre-aggregated metrics. (3) The scale is materially smaller than FAANG — Honeycomb's traffic is large but the engineering team is ~50–80 engineers, not thousands. The trade-offs (operational scope per engineer, pace of architectural change, proximity to founders) reflect the size.
- How load-bearing is the 'observability-2.0' framing in the interview?
- Materially load-bearing. Honeycomb's intellectual position on observability-2.0 — wide structured events, query-time aggregation, single-source-of-truth structured data — is part of the company's identity, and candidates who arrive thinking primarily in dashboards-and-pre-aggregated-metrics will need to recalibrate. Reading the 'Observability Engineering' book (free PDF at honeycomb.io/observability-engineering) and Charity Majors's writing at charity.wtf before the interview is genuinely expected by the loop, not just nice-to-have.
- Is Charity Majors involved in hiring decisions?
- Charity Majors is Honeycomb's CTO. At ~200-person scale, the CTO is involved in senior-engineering hires (staff and above) and in setting the cultural standard for hiring more broadly. Specific involvement varies by role and team. Candidates interviewing at the staff and principal tiers should expect to interact with Charity at some point in the loop; candidates at the senior tier may or may not, depending on team.
- How does Honeycomb's remote-first culture work in practice?
- Honeycomb has been remote-first since founding (pre-pandemic), so the operational machinery — async-first communication, written documentation, distributed on-call — is mature rather than retrofitted. The engineering blog and Charity Majors's writing have covered the remote-first posture in detail. Candidates moving from in-person-default companies should expect a real cultural recalibration; candidates already accustomed to remote-first work (GitLab, Buffer, Automattic alumni) report the transition as low-friction.
- What is on-call like at Honeycomb?
- Production engineers and software engineers share on-call for the systems they build and operate. Charity Majors has written extensively at charity.wtf on the philosophy ('On Call Shouldn't Suck' is canonical) — the position is that on-call should be sustainable, that engineers should own production behavior of services they wrote, and that pages should reflect real customer-facing problems rather than infrastructure noise. Liz Fong-Jones's SREcon talks have covered on-call sustainability in depth. The operational reality at Honeycomb (per engineering-blog and external talks) is closer to this stated ideal than at most companies, with the standard caveat that any small company has incident-cluster periods.
- What happens to Honeycomb compensation if there's a liquidity event?
- Honeycomb is private with no announced IPO timeline as of 2026-04-29. Equity grants are illiquid until a liquidity event (IPO or acquisition). The compensation framing for candidates: assume the equity component is gated on company outcome over a 5–10 year horizon, value the cash component on its own merits, and weight the non-cash compensation (remote-first, intellectual environment, proximity to Charity Majors and Liz Fong-Jones) according to your own preferences. Honest empty space: the specific equity-value distribution at outcome is not knowable; candidates should not weight Honeycomb equity at the high end of late-stage-startup outcome distributions without a specific reason to.
- Is Liz Fong-Jones still at Honeycomb in 2026?
- Liz Fong-Jones has been Honeycomb's field CTO / DevRel principal voice for years; she is one of the most-publicly-visible production-engineering voices of the 2020s, with extensive talks at SREcon, QCon, USENIX, and elsewhere. As of 2026-04-29 the public signals (her own writing, conference appearances, and Honeycomb's published material) indicate she remains at the company. Verify directly via her public profile and Honeycomb's careers/about pages before relying on this.
Sources
- Honeycomb Engineering Blog — recurring posts on stack, architecture, incident response, dogfooding.
- Honeycomb — 'Observability 101' (introductory framing of the observability-2.0 thesis).
- Honeycomb Careers — production engineer / platform engineer role descriptions and published values.
- Charity Majors (CTO, Honeycomb) — charity.wtf archive, including the operations-and-on-call series.
- 'Observability Engineering' (Majors, Fong-Jones, Miranda; O'Reilly 2022; free PDF). Canonical book-length statement of observability-2.0.
- levels.fyi — Honeycomb Software Engineer compensation page (sparse data; treat as directional, not authoritative).
- levels.fyi — Software Engineer benchmark page (use as cross-company comparison anchor).
About the author. Blake Crosley founded ResumeGeni and writes about site reliability engineering, hiring technology, and ATS optimization. More writing at blakecrosley.com.