NIFTY: Neural Object Interaction Fields
for Guided Human Motion Synthesis

Nilesh Kulkarni1,2   Davis Rempe3   Kyle Genova2   Abhijit Kundu2
Justin Johnson1   David Fouhey1   Leonidas Guibas2,4

University of Michigan1   Google2   NVIDIA3   Stanford University4
arXiv, 2023


We address the problem of generating realistic 3D motions of humans interacting with objects in a scene. Our key idea is to create a neural interaction field attached to a specific object, which outputs the distance to the valid interaction manifold given a human pose as input. This interaction field guides the sampling of an object-conditioned human motion diffusion model, so as to encourage plausible contacts and affordance semantics. To support interactions with scarcely available data, we propose an automated synthetic data pipeline. For this, we seed a pre-trained motion model, which has priors for the basics of human movement, with interaction-specific anchor poses extracted from limited motion capture data. Using our guided diffusion model trained on generated synthetic data, we synthesize realistic motions for sitting and lifting with several objects, outperforming alternative approaches in terms of motion quality and successful action completion. We call our framework NIFTY: Neural Interaction Fields for Trajectory sYnthesis.

Motion generation with Interaction Field Guidance

Our motion synthesis model consists of a diffusion model and an object interaction field that guides the diffusion model's outputs during sampling. The diffusion model is conditioned on the initial human pose, the object geometry, and the object position. The object interaction field takes the last pose of the generated motion as input, and guidance pushes that pose towards the valid interaction manifold.
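To make the guidance idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration and not the NIFTY implementation: `toy_interaction_field` is a hand-written distance to a unit sphere standing in for the learned field, `toy_denoise_step` is a placeholder for the conditioned diffusion model, and the gradient is taken by finite differences rather than by backpropagation. It only shows the shape of the loop: denoise, evaluate the field on the last pose, and step that pose down the field's gradient.

```python
import math
import random


def toy_interaction_field(pose):
    """Hypothetical stand-in for a learned field: distance of a pose vector to the unit sphere."""
    norm = math.sqrt(sum(v * v for v in pose))
    return abs(norm - 1.0)


def numerical_grad(field, pose, eps=1e-5):
    """Central finite-difference gradient of `field` with respect to `pose`."""
    grad = []
    for i in range(len(pose)):
        hi = list(pose)
        lo = list(pose)
        hi[i] += eps
        lo[i] -= eps
        grad.append((field(hi) - field(lo)) / (2 * eps))
    return grad


def toy_denoise_step(motion, init_pose, rng, noise_scale):
    """Placeholder for one reverse-diffusion step: blend each frame toward init_pose, then add noise."""
    return [
        [0.8 * v + 0.2 * p + rng.gauss(0.0, noise_scale) for v, p in zip(frame, init_pose)]
        for frame in motion
    ]


def guided_sample(init_pose, num_frames, num_steps, guidance_scale, seed=0):
    """Run the toy sampler, nudging the last pose toward the toy interaction manifold at every step."""
    rng = random.Random(seed)
    dim = len(init_pose)
    # Start from pure noise, as a diffusion sampler would.
    motion = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_frames)]
    for step in range(num_steps, 0, -1):
        noise_scale = 0.1 * (step - 1) / num_steps  # no noise on the final step
        motion = toy_denoise_step(motion, init_pose, rng, noise_scale)
        # Guidance: move the last pose against the gradient of the field.
        grad = numerical_grad(toy_interaction_field, motion[-1])
        motion[-1] = [v - guidance_scale * g for v, g in zip(motion[-1], grad)]
    return motion


if __name__ == "__main__":
    guided = guided_sample([2.0, 0.0, 0.0], num_frames=8, num_steps=50, guidance_scale=0.2)
    unguided = guided_sample([2.0, 0.0, 0.0], num_frames=8, num_steps=50, guidance_scale=0.0)
    print("field value, guided:  ", toy_interaction_field(guided[-1]))
    print("field value, unguided:", toy_interaction_field(unguided[-1]))
```

In this toy setup the unguided sampler settles near the conditioning pose, which lies off the sphere, while the guided run ends with its last pose much closer to it. A real system would presumably differentiate the field through the model's prediction instead of editing the last frame directly; that simplification is made here only to keep the sketch short.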

Synthetic Data Generation

Given a final interaction pose of a person sitting on a chair / table, we use a pre-trained motion model to predict the preceding motion. Our generation follows a tree-like branching strategy, which lets us scalably create more data for a given interaction.
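The branching strategy can be sketched as follows. This is a hedged illustration under stated assumptions, not the pipeline used in the paper: `toy_predict_past` is a hypothetical random-walk placeholder for the pre-trained motion model, poses are plain lists of floats, and the `depth` and `branching` parameters are invented for the example. The point is only that every sampled past segment is itself extended further back several times, so one anchor pose yields many distinct motions that all end in the same interaction.

```python
import random


def toy_predict_past(pose, rng, segment_len=4, step=0.1):
    """Placeholder for a pre-trained motion model: sample poses that precede `pose`, earliest first."""
    segment = []
    current = list(pose)
    for _ in range(segment_len):
        # Walk backward in time with a small random perturbation.
        current = [v + rng.gauss(0.0, step) for v in current]
        segment.append(current)
    segment.reverse()  # reorder so the earliest pose comes first
    return segment


def grow_motions(anchor_pose, depth, branching, seed=0):
    """Expand one anchor pose into branching**depth motions that all end at the anchor."""
    rng = random.Random(seed)
    # Each motion is a list of poses; initially only the anchor itself.
    motions = [[list(anchor_pose)]]
    for _ in range(depth):
        expanded = []
        for motion in motions:
            for _ in range(branching):
                # Sample a new past for the earliest pose of this motion and prepend it.
                past = toy_predict_past(motion[0], rng)
                expanded.append(past + motion)
        motions = expanded
    return motions


if __name__ == "__main__":
    motions = grow_motions([0.0, 0.0, 0.0], depth=3, branching=2)
    print(len(motions), "motions of", len(motions[0]), "poses each")
```

The number of motions grows as `branching ** depth`, which is what makes this kind of generation scale from a small set of anchor poses.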

Motion Generation Results with NIFTY

Sitting on a chair

Lifting a stool

For additional results, visit this site.

Qualitative Comparisons

We conduct a user study to evaluate the quality of the generated motions and to compare against baseline methods.

We compare against two baselines: (a) Conditional VAE and (b) Conditional MDM. NIFTY's outputs are consistently preferred over the outputs of both methods. Notably, when compared against the synthetic training data, NIFTY is equally preferred.
Sitting Interactions

Lifting Interactions


NIFTY: Neural Object Interaction Fields
for Guided Human Motion Synthesis

Kulkarni N., Rempe D., Genova K., Kundu A.,
Johnson J., Fouhey D., Guibas L.

arXiv, 2023



We express our gratitude to our colleagues for the fantastic project discussions and feedback provided at different stages. We have organized them by institution (in alphabetical order).

This work was partly done when NK was interning at Google Research. DR was supported by the NVIDIA Graduate Fellowship.