Data Science Researcher - AI Platform

Noida April 11, 2026 Full Time

Role: Data Science Researcher - AI Platform

Location: Remote
Work timing: IST Hours

Responsibilities:

Design and execute end-to-end research studies on healthcare claims data, from hypothesis formation and experimental design through statistical analysis and delivery of findings, with clear documentation of methodology and limitations so that results are reproducible and trustworthy.
Perform exploratory data analysis (EDA) using Python, pandas, and numpy across EDI 837/835 claim datasets, surfacing patterns in denial rates, payer behaviors, and claim adjudication outcomes that inform both product strategy and model development.
Build, evaluate, and iterate on machine learning models using scikit-learn, selecting metrics (precision, recall, F1, AUC-ROC) that reflect the real cost of errors in a healthcare billing context, and documenting assumptions and failure modes alongside results.
Translate research findings into clear, stakeholder-ready outputs — communicating uncertainty honestly, leading with the answer, and distinguishing what the data shows from what it cannot yet tell us.
Collaborate with data engineers to ensure the data pipelines feeding your research are correctly scoped.
Hand off validated findings to data and backend engineers with the specificity they need to operationalize models into production services on the Ailevate platform.
Version experiments and maintain a record of what was tried, what failed, and why — treating dead ends as part of the research record, not noise to be discarded.

Must-have Skills:

5+ years of applied data science or quantitative research experience, with a track record of translating ambiguous business questions into structured, reproducible studies.
Strong working knowledge of statistical thinking: hypothesis testing, confidence intervals, distributions, and the difference between correlation and causation.
Proficiency in Python data science tooling: pandas, numpy, scikit-learn, Jupyter notebooks, matplotlib or seaborn.
Demonstrated ability to select and justify evaluation metrics for the problem at hand — not just the default metric.
Experience working with structured, real-world datasets that have quality issues, missing values, and domain-specific constraints.
Clear written communication: able to present findings to both technical teammates and non-technical stakeholders, with uncertainty quantified rather than hidden.

Preferred Skills:

Experience in healthcare, health tech, or insurance — particularly with claims data, EDI formats (837/835), or Revenue Cycle Management workflows.
Familiarity with causal inference techniques (difference-in-differences, propensity score matching, instrumental variables).
Experience with experiment tracking practices: logging hyperparameters, feature lists, and data versions alongside model outputs.
Background in interpretable or explainable ML — experience making model behavior legible to domain experts who are not data scientists.
Graduate-level training in statistics, mathematics, or a quantitative field (not required; demonstrable capability matters more than credential).

Share your resume at [email protected]