AI Operations Engineer 2
Expedia Group brands power global travel for everyone, everywhere. We design cutting-edge tech to make travel smoother and more memorable, and we create groundbreaking solutions for our partners. Our diverse, vibrant, and welcoming community is essential in driving our success.
Why Join Us?
To shape the future of travel, people must come first. Guided by our Values and Leadership Agreements, we foster an open culture where everyone belongs and differences are celebrated, and we know that when one of us wins, we all win.
We provide a full benefits package, including exciting travel perks, generous time off, parental leave, a flexible work model (with some pretty cool offices), and career development resources, all to fuel our employees' passion for travel and ensure a rewarding career journey. We’re building a more open world. Join us.
Role summary
We are building a new AI Ops Engineering team focused on automation, reliability, and operational excellence for one of the world’s largest e‑commerce and online travel platforms. This role focuses on operating, scaling, and improving AI‑enabled services and platforms in a cloud‑first, automation‑driven AWS environment. You will apply DevOps, SRE, and platform engineering principles to ensure AI‑driven systems are reliable, observable, secure, and easy to operate in production. Rather than building models, this role emphasizes operational ownership, infrastructure as code, automation, and safe integration of AI‑powered capabilities into real‑world systems.
In this role, you will
Operate, monitor, and optimize AI‑driven production systems and related services to ensure reliability, availability, and performance within defined SLAs.
Implement and maintain automation, tooling, and runbooks that simplify AI system operations, including deployment, change management, incident response, and recovery.
Collaborate closely with software engineers, platform teams, data and ML teams, and product partners to support AI‑powered workloads in production, including APIs, model‑serving platforms, background processing, and data pipelines.
Apply system design, API design, and data modeling principles to improve the robustness, observability, and maintainability of AI‑related services and platforms.
Safely integrate and operate AI/ML‑enabled solutions that improve outcomes, ensuring appropriate controls, monitoring, and guardrails for quality, compliance, and security.
Use metrics, logs, and data‑driven insights to continuously refine operational processes, reduce toil, and enhance the scalability and resilience of AI operations across multiple domains.
Minimum Qualifications
Relevant technical degree or equivalent practical experience in computer science, engineering, information systems, or a closely related field.
Professional experience operating or supporting production services or platforms, including responsibility for incident handling, on‑call participation, and continuous improvement of operational health.
Strong background in DevOps, SRE, or automation-focused operations.
Proven experience automating operational processes and managing infrastructure as code.
Solid scripting or programming skills, preferably in Python.
Strong understanding of cloud infrastructure, with AWS experience required.
Practical exposure to AI‑driven systems, tools, or workflows in production, such as model‑serving platforms, AI‑enabled features, or automated decisioning systems.
Preferred Qualifications
Experience operating AI or ML workloads at scale, including model deployment, versioning, rollout strategies, and monitoring model performance and system behavior in production.
Demonstrated ability to design and implement robust operational architectures for AI‑driven services, including observability, resilience patterns, and capacity planning.
Proven track record of driving operational excellence for complex, business‑critical systems, including post‑incident reviews, automation of manual tasks, and reliability‑focused improvements.
Familiarity with AI‑driven systems, tools, or workflows and applying AI/ML concepts to real‑world products, including safely integrating and operating AI/ML‑enabled solutions that improve outcomes.
Experience collaborating with engineering and data/ML teams to influence system design, APIs, and data models for better operability of AI‑powered features and services.
Accommodation requests
If you need assistance with any part of the application or recruiting process due to a disability, or other physical or mental health conditions, please reach out to our Recruiting Accommodations Team through the Accommodation Request.
We are proud to be named a Best Place to Work on Glassdoor in 2024 and to be recognized for our award-winning culture by organizations like Forbes, TIME, Disability:IN, and others.
Expedia Group's family of brands includes: Brand Expedia®, Hotels.com®, Expedia® Partner Solutions, Vrbo®, trivago®, Orbitz®, Travelocity®, Hotwire®, Wotif®, ebookers®, CheapTickets®, Expedia Group™ Media Solutions, Expedia Local Expert®, CarRentals.com™, and Expedia Cruises™. © 2024 Expedia, Inc. All rights reserved. Trademarks and logos are the property of their respective owners. CST: 2029030-50
Employment opportunities and job offers at Expedia Group will always come from Expedia Group’s Talent Acquisition and hiring teams. Never provide sensitive, personal information to someone unless you’re confident who the recipient is. Expedia Group does not extend job offers via email or any other messaging tools to individuals with whom we have not made prior contact. Our email domain is @expediagroup.com. The official website to find and apply for job openings at Expedia Group is careers.expediagroup.com/jobs.
Expedia is committed to creating an inclusive work environment with a diverse workforce. All qualified applicants will receive consideration for employment without regard to race, religion, gender, sexual orientation, national origin, disability or age.