Computer Vision / Machine Learning Engineer (Video Generation)
Beijing
February 25, 2026
Summary
If you are passionate about advancing video generation, building state-of-the-art models that synthesize high-quality and controllable video, and optimizing them for on-device deployment, Apple is the right place for you. We are looking for engineers who combine deep technical expertise, creativity, and systems thinking to push the boundaries of video AI.
Description
As part of Apple’s Video Engineering organization, you will develop models and infrastructure for video generation and understanding across Apple products. You will work on cutting-edge generative techniques, from diffusion and transformer-based models to frame interpolation and temporal modeling, while ensuring models run efficiently on iPhone, iPad, and Vision Pro. You will collaborate with research scientists, framework engineers, and cross-functional teams to design, train, optimize, and deploy scalable video generation systems.
Minimum Qualifications
M.S. or Ph.D. in Computer Science, Electrical Engineering, or a related field, with a focus on computer vision or machine learning.
Strong experience in one or more of: generative video modeling, video prediction, temporal modeling, or frame interpolation.
Proficiency in deep learning frameworks (PyTorch, JAX) and programming languages (Python, C++).
Experience with large-scale training pipelines and deploying models in real-world systems.
Strong written and verbal communication skills.
Preferred Qualifications
Publications in top-tier conferences (CVPR, ECCV, ICCV, NeurIPS, ICLR).
Experience with multi-modal video or text-video generation.
Familiarity with optimizing generative models for mobile/embedded devices.
Understanding of temporal consistency, controllable generation, and efficient infrastructure for large-scale video modeling.
Passion for building scalable, high-quality systems in cross-functional teams.