Data Scientist

Hyderabad, Gurugram | April 15, 2026 | Full Time

The Team: As a member of the EDO, Collection Platforms & AI Cognitive Engineering team, you will build GenAI-driven and ML-powered products and capabilities that power natural language understanding, data extraction, information retrieval, and data sourcing solutions for S&P Global. You will define AI strategy, mentor others, and drive production-ready AI products and pipelines while leading by example in a highly engaging work environment. You will work in a truly global team that encourages thoughtful risk-taking and self-initiative.


What’s in it for you:

• Be a part of a global company and build solutions at enterprise scale

• Lead and grow a highly skilled, hands-on technical team (including mentoring junior data scientists)

• Contribute to solving high-complexity, high-impact problems end-to-end

• Architect and oversee production-ready pipelines from ideation to deployment


Responsibilities:

• Define the AI roadmap, tooling choices, and best practices for model building, prompt engineering, fine-tuning, and vector retrieval systems

• Architect, develop and deploy large-scale ML and GenAI-powered products and pipelines

• Own all stages of the data science project lifecycle, including:

  • Identifying and scoping high-value data science and AI opportunities
  • Partnering with business leaders, domain experts, and end-users to gather requirements and align on success metrics
  • Evaluating, interpreting, and communicating results to executive stakeholders
  • Leading exploratory data analysis, proof-of-concepts, model benchmarking, and validation experiments for both ML and GenAI approaches
  • Establishing and enforcing coding standards, performing code reviews, and optimizing data science workflows
  • Driving deployment, monitoring, and scaling strategies for models in production (including both ML and GenAI services)
  • Mentoring and guiding junior data scientists; fostering a culture of continuous learning and innovation
  • Managing stakeholders across functions to ensure alignment and timely delivery

Technical Requirements:

• Hands-on experience with large language models (e.g., OpenAI, Anthropic, Llama), prompt engineering, fine-tuning/customization, and embedding-based retrieval

• Expert proficiency in Python (NumPy, pandas, spaCy, scikit-learn, PyTorch/TensorFlow 2, Hugging Face Transformers)

• Deep understanding of ML & Deep Learning models, including architectures for NLP (e.g., transformers), GNNs, and multimodal systems

• Strong grasp of statistics, probability, and the mathematics underpinning modern AI

• Ability to survey and synthesize current AI/ML research, with a track record of applying new methods in production

• Proven experience with at least one end-to-end GenAI or advanced NLP project: custom NER, table extraction via LLMs, Q&A systems, summarization pipelines, OCR integrations, or GNN solutions

• Familiarity with orchestration and deployment tools: Docker, Airflow, Kubernetes, Redis, Flask/Django/FastAPI, PySpark, SQL, R-Shiny/Dash/Streamlit

• Openness to evaluating and adopting emerging technologies and programming languages as needed


Good to have:

• Master’s or Ph.D. in Computer Science, Statistics, Mathematics, or related field (minimum Bachelor’s)

• 6+ years of relevant experience in Data Science/AI, with at least 2 years in a leadership or technical lead role

• Prior experience in the Economics/Financial industry, especially with market-intelligence or risk analytics products

• Public contributions or demos on GitHub, Kaggle, Stack Overflow, technical blogs, or publications
