Summary
Join a team at the forefront of ML infrastructure and generative AI, where data and model workflows come together to enable the next generation of intelligent experiences on Apple products and services. We build robust systems that connect scalable data pipelines with advanced ML workflows, accelerating the development of real-world AI applications. Our work spans the full ML lifecycle, from experimentation to deployment, and you’ll play a key role in shaping how AI models are built, optimized, and scaled. We develop a platform for ML data and features that powers advanced GenAI applications. This includes embeddings (generation, evaluation, ANN search, multimodal support), AI Ops, efficient inference, and a modern feature platform designed to streamline experimentation and drive innovation. We’re looking for engineers and researchers passionate about generative models, data-centric ML, and intelligent systems across diverse real-world use cases. With the autonomy to experiment, the scale to make an impact, and the support to take ideas from prototype to production, you’ll work alongside a world-class team to build intelligent, flexible systems that make ML development faster, more reliable, and more creative.
Description
The Apple Cloud AI Platform team enables Apple's next generation of intelligent products by giving Apple's ML engineers and researchers the data systems and large-scale compute they need to build and ship models at Apple's bar for quality and privacy.
Minimum Qualifications
Strong foundation in machine learning, with hands-on experience across the end-to-end ML workflow, including data preparation, pipeline development, experimentation, evaluation, and deployment
Expertise in building and running large-scale distributed systems
Familiarity with modern generative techniques (e.g., transformers, diffusion, retrieval-augmented generation)
Proven experience building and delivering data and machine learning infrastructure in real-world production environments
Familiarity with fine-tuning workflows, model optimization, and preparing models for scalable inference
Familiarity with generative AI and its applications in accelerating and enhancing machine learning workflows
Experience configuring, deploying, and troubleshooting large-scale production environments
Experience in designing, building, and maintaining scalable, highly available systems that prioritize ease of use
Extensive programming experience in Java, Python, or Go
Strong collaboration and communication (verbal and written) skills
Comfortable navigating ambiguity and evolving technical landscapes, especially in fast-moving areas
B.S., M.S., or Ph.D. in Computer Science, Computer Engineering, or equivalent practical experience
Preferred Qualifications
Experience in any of the following is preferred:
Proficiency with one or more modern ML frameworks (PyTorch, JAX, or TensorFlow), particularly the data loading and dataset access layer
Columnar and lakehouse formats: Parquet, Iceberg, Delta, or Lance
Distributed data loading frameworks for ML: Ray Data, NVIDIA DALI, WebDataset, or Mosaic StreamingDataset
Performance engineering for I/O-bound workloads: Arrow, zero-copy, memory mapping, async I/O
High-throughput object storage access patterns at GPU scale
Data lineage and governance systems (DataHub, OpenLineage, Unity Catalog, or equivalent)
Contributions to or operational experience with Spark, Daft, Polars, or DuckDB internals
Containerization and orchestration technologies (Docker, Kubernetes)