PhD Position F/M Foundation Models and Natural Language Interaction for Human-Robot Collaboration
Contract type: Fixed-term contract (CDD)
Required degree: Master's or equivalent (Bac+5)
Role: PhD student (Doctorant)
Context and assets of the position
The HUCEBOT team is dedicated to advancing algorithms for human-centered robots: robots that are not working autonomously in isolation, but that instead react, interact, collaborate, and assist humans. To do so, these robots need to intertwine a multi-contact whole-body controller, a digital simulation of the interacting humans, and machine learning models to predict and respond to human movements and intentions. In a crescendo of complexity, the team tackles scenarios that involve collaboration with cobots, assistance with exoskeletons, and collaboration with humanoid robots. The application domains span from industrial robotics to space teleoperation.
The main robots of the team are the Tiago++ bimanual mobile manipulator, the Unitree G1 humanoid, and the Talos humanoid robot. The team also works with Franka cobots and exoskeletons.
The team currently consists of about 25 members, including permanent researchers, PhD students and post-doctoral researchers.
Serena Ivaldi, head of HUCEBOT, holds the chair in Robotics and AI of the Cluster IA ENACT project (https://cluster-ia-enact.ai/), which is funding this PhD thesis. Within the chair, she aims to advance research on natural language interaction to assist humans in different scenarios of collaboration with robots, where safety is paramount. The ambition is to create a foundation model that bridges natural language commands into interpretable commands for the robot, leading to robot actions that are contextualized and intrinsically safe.
Assigned mission
Most work on VLMs/LLMs for robotics has focused on generating sequences of actions and plans from high-level goals, offline, targeting only autonomous robots isolated from humans. A critical limitation to deploying VLMs/LLMs on robots that collaborate with humans is the difficulty of using them online, in a human-in-the-loop scenario, to generate suitable motions and "safe" robot policies.
Here, we use VLMs/LLMs to generate a robot's motions online in collaborative scenarios where safety is critical: active exoskeletons and mobile manipulators assisting humans in object manipulation. The human vocally commands the robot interactively, online, to control the generation of its motion at the low level: start, stop, direct, and change its low-level parametrization (e.g., compliant behavior, velocity, maximal torque assistance).
Extending these paradigms, as well as comparing with and fine-tuning existing Vision-Language-Action models (VLAs), is also considered, as this is part of the team's ongoing research.
The first objective is to design the robot's controller with the natural language interaction feature in mind: the human's commands, corrections and Approximate Numerical Expressions must be translated into meaningful quantities, coherent with the physics of the problem. What do "faster", "a bit higher", "little to the right", and "more assistance" mean?
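To make the first objective concrete, here is a minimal sketch of how relative commands and Approximate Numerical Expressions could be grounded into controller parameters. The phrase-to-scale mapping, parameter names, and safety bounds are illustrative assumptions for this posting, not the team's actual controller design.

```python
# Hypothetical sketch: grounding Approximate Numerical Expressions (ANEs)
# into controller parameters. All mappings and bounds are assumed values.
from dataclasses import dataclass

@dataclass
class ControllerParams:
    velocity: float       # end-effector speed, m/s
    stiffness: float      # impedance stiffness, N/m
    assist_torque: float  # maximal torque assistance, Nm

# Each relative phrase scales one parameter by an assumed factor.
ANE_MODIFIERS = {
    "faster":          ("velocity", 1.25),
    "slower":          ("velocity", 0.80),
    "more assistance": ("assist_torque", 1.20),
    "less assistance": ("assist_torque", 0.80),
    "softer":          ("stiffness", 0.70),
}

# Hard safety bounds: scaling never leaves these ranges.
SAFE_BOUNDS = {
    "velocity": (0.0, 0.5),
    "stiffness": (50.0, 500.0),
    "assist_torque": (0.0, 10.0),
}

def apply_command(params: ControllerParams, phrase: str) -> ControllerParams:
    """Scale the parameter targeted by the phrase, clamped to its safe range."""
    if phrase not in ANE_MODIFIERS:
        raise ValueError(f"unknown command: {phrase}")
    name, scale = ANE_MODIFIERS[phrase]
    lo, hi = SAFE_BOUNDS[name]
    setattr(params, name, min(max(getattr(params, name) * scale, lo), hi))
    return params
```

The point of the sketch is that every vocal correction maps to a bounded, physically meaningful update, so the command stream can never drive the controller outside its safe envelope.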
The second objective is to design new multimodal models fusing VLM/LLMs and multimodal pipelines to predict the human's intent and minimize the need for corrections. Natural language instructions may be incomplete or unclear, but cameras and microphones (or other sensors) could provide sufficient contextual information to generate an appropriate motion. For example, "take that" could be easily translated into "grasp the bottle", if it is the only item in front of the robot. "Move a bit to the right" needs clarifications, but also estimation of physical quantities that are context dependent.
The third objective is to detect emergency commands, leveraging both LLMs and audio processing models for nonverbal communication, and to generate suitable reactive robot behaviors. Humans are often unable to speak clearly when they interact with a robot: sometimes fear takes over and they do not speak at all, or they mumble or scream when they could simply say a clear "stop". Detecting emergency commands is critical for deploying robots in the real world. For example, "Watch out!" and "Attention!" are difficult to translate into precise motions, and require one-shot evaluations because of the urgent nature of the command.
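A minimal sketch of the fusion idea behind this objective: trigger a safe stop either on an emergency keyword or on a scream-like nonverbal cue, so that a mumbled or wordless reaction still stops the robot. The keyword list and the loudness threshold are placeholder assumptions, standing in for the LLM and audio models mentioned above.

```python
# Hypothetical sketch: one-shot emergency detection fusing a keyword check
# (verbal channel) with a loudness cue (nonverbal channel).
EMERGENCY_WORDS = {"stop", "watch out", "attention"}

def is_emergency(transcript: str, audio_rms: float,
                 rms_threshold: float = 0.8) -> bool:
    """True if the utterance contains an emergency word OR the audio energy
    spikes above the assumed scream threshold (no clear word needed)."""
    text = transcript.lower()
    if any(word in text for word in EMERGENCY_WORDS):
        return True
    return audio_rms >= rms_threshold
```

Because the decision is a single OR over two channels, it can run in one shot at every audio frame, which matches the urgency requirement stated above.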
The PhD student will carry out research on the aforementioned objectives, and will benefit from our collaboration with E. Zibetti (Paris 8, SHS), an expert in Approximate Numerical Expressions in psychology, and D. Sadigh (Stanford University), a leading researcher in LLMs for robot actions.
Real-world demonstrations with real robots and real humans interacting with the robots are mandatory in this PhD.
Main activities
Implement, test and develop novel algorithms for real robots that use language models and foundation models. Write papers and present them at conferences. Write, test, validate and document the associated software. Experiments with real robots are mandatory.
The PhD student will also be involved in the activities organized by the Cluster-AI project ENACT, which may include dissemination actions, meetings and presentations to relevant stakeholders (Europe, France, industry, etc.).
Skills
Good skills in Python (PyTorch). Ideally, prior experience with LLMs, VLMs and foundation models.
Good knowledge of robotics.
Languages: English (English is the official language of the team and many members do not speak French).
Proactivity, curiosity, daily communication and the ability to work in a team are fundamental.
Benefits
- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
Remuneration
€2300 gross/month
General information
- Theme/Domain: Robotics and intelligent environments; Software engineering (BAP E)
- City: Villers-lès-Nancy
- Inria centre: Centre Inria de l'Université de Lorraine
- Desired start date: 2026-09-01
- Contract duration: 3 years
- Application deadline: 2026-04-18
Note: applications must be submitted online on the Inria website. Processing of applications sent through any other channel is not guaranteed.
Instructions for applying
Defence and security:
This position may be assigned to a restricted-access area (ZRR), as defined in Decree No. 2011-1425 on the protection of the nation's scientific and technical potential (PPST). Authorization to access such an area is granted by the head of the establishment, following a favourable ministerial opinion, as defined in the order of 3 July 2012 relating to the PPST. An unfavourable ministerial opinion for a position assigned to a ZRR would result in the cancellation of the recruitment.
Recruitment policy:
As part of its diversity policy, all Inria positions are accessible to people with disabilities.
Contacts
- Inria team: HUCEBOT
- Thesis supervisor: Serena Ivaldi / [email protected]
The essentials for success
The ideal candidate is fascinated by the recent developments in artificial intelligence and robotics, especially foundation models, LLMs, VLMs and OpenVLA. They want to experiment with these new techniques, develop their skills, and work with state-of-the-art robots.
IMPORTANT: candidates must upload their CV, motivation letter and all documents listed in this page: https://team.inria.fr/hucebot/job-offers/
Applications that do not contain these documents will not be considered.
About Inria
Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 215 agile project teams, generally run jointly with academic partners, involve more than 3,900 scientists in meeting the challenges of digital technology, often at the interface with other disciplines. The institute draws on a wide range of talent across more than 40 different professions. 900 research and innovation support staff contribute to the emergence and growth of scientific and entrepreneurial projects with worldwide impact. Inria works with many companies and has supported the creation of more than 200 start-ups. In this way, the institute strives to meet the challenges of the digital transformation of science, society and the economy.