Associate Data Scientist

India-Remote | Remote | February 27, 2026 | Full Time

Role Overview

We are seeking an Associate Data Scientist to support AI/ML engineering efforts by preparing, validating, and structuring data for LLM-driven systems. This is a hands-on role focused on real-world data processing, pipeline support, and model evaluation.

Key Responsibilities

  • Process and clean structured and unstructured data for AI/ML pipelines.

  • Prepare training-ready datasets for LLM fine-tuning and evaluation workflows.

  • Support RAG and NL→SQL systems through data preparation and validation.

  • Perform data quality checks and ensure completeness and consistency.

  • Assist in building and maintaining data pipelines and APIs (e.g., FastAPI).

  • Collaborate with engineering teams to troubleshoot and optimize data workflows.

Required Skills

  • 1–3 years of experience in data processing or data-focused roles.

  • Strong Python skills with experience in data libraries (Pandas, NumPy, Scikit-learn).

  • Experience supporting LLM workflows (fine-tuning, prompt engineering, evaluation).

  • Familiarity with both structured data (SQL) and unstructured text data.

  • Understanding of data preparation for AI/ML systems.

Nice to Have

  • Exposure to RAG pipelines, embeddings, or evaluation metrics.

  • Experience with ML frameworks (PyTorch/TensorFlow) and Docker-based workflows.

  • Experience with CI/CD pipelines for ML systems.

  • Familiarity with vector databases (e.g., Chroma) and reranking techniques.

  • Research exposure to Transformer-based architectures.
