AI Evaluation Engineer (DevOps / Infra)
Review and validate AI benchmark tasks involving infrastructure, Docker, CI/CD, and deployment. Verify Dockerfiles, test scripts, and environment setup. Debug container build failures and flaky pipelines. Assess task quality for reproducibility and determinism.