Agentic AI Benchmarking and Evaluation Engineer
As a member of our team, you will learn the latest technologies in machine learning and emerging applications, participate in each step of the development process, and drive quality improvements. You will port AI/ML solutions to various platforms and optimize their performance on multiple hardware accelerators (such as CPU, GPU, and NPU). You will collaborate with members of the software teams to plan a comprehensive evaluation approach, and work with a small team of engineers to implement the evaluation strategy: deciding key performance indicators (KPIs), developing automation, and performing qualitative tests. You will be responsible for developing a deep understanding of the ML algorithms developed in the project. You will apply tools and algorithms to a wide variety of use cases, develop best practices, provide detailed analysis, extend research, and identify areas for further optimization. You will also perform in-depth benchmarking and model evaluation to ensure AI systems meet safety standards.
Exceptional software development skills, including strong analytical and debugging skills. Strong understanding of machine learning fundamentals. Understanding of generative AI and its usage in various applications. Experience with LLM, LVM, and LMM models and other NN architectures. Proficiency in designing, implementing, and training ML algorithms in high-level languages and frameworks (PyTorch and TensorFlow). Excellent interpersonal, written, and oral communication skills. MS or PhD in Computer Science/Engineering with 2+ years of professional or equivalent experience. 2 years of proven experience in software development for GenAI, machine learning, or high-performance computing, with strong programming skills in Python and software design. Familiarity with AI agent frameworks (such as LangChain, LlamaIndex, and AutoGen).
Experience with machine learning accelerators, optimizing algorithms for hardware acceleration cores, and working with heterogeneous or parallel computing systems. Experience driving cross-functional projects, including close collaboration with AI application teams to translate product needs into evaluation frameworks. Experience designing and developing generalized AI solutions, including RAG systems, that enhance user capabilities with our AI accelerators. Experience with Android, Linux, or other embedded systems. Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field.