DynaMem

Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation

Videos

DynaMem in action

Here are sample trials from 3 lab environments and 2 home environments.

Method

Illustration of DynaMem

We maintain a feature point cloud as the robot's memory. When the robot receives a new RGB-D observation of the environment, it adds the newly observed objects to memory and removes points that no longer exist.
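
As a rough picture of how such a memory could be maintained, here is a minimal Python sketch. It assumes a voxel-hashed point store; the names (VoxelMemory, add_observation, remove_stale) and the depth-based removal test are illustrative assumptions, not DynaMem's actual API.

# Minimal sketch of a dynamic feature point-cloud memory (illustrative, not
# DynaMem's actual API). Each occupied voxel stores a 3D point and a semantic
# feature; new observations add points, and points that the current depth
# image "sees through" are removed as no longer present.
import numpy as np

class VoxelMemory:
    def __init__(self, voxel_size=0.05):
        self.voxel_size = voxel_size
        self.voxels = {}  # voxel index (tuple) -> (xyz, feature)

    def _key(self, xyz):
        return tuple(np.floor(xyz / self.voxel_size).astype(int))

    def add_observation(self, points_xyz, features):
        """Insert newly observed 3D points with their semantic features."""
        for p, f in zip(points_xyz, features):
            self.voxels[self._key(p)] = (p, f)

    def remove_stale(self, depth, intrinsics, cam_pose, margin=0.1):
        """Drop stored points that should be visible but are no longer there,
        i.e. the measured depth at their pixel is farther than the point."""
        T = np.linalg.inv(cam_pose)           # world -> camera
        fx, fy, cx, cy = intrinsics
        h, w = depth.shape
        stale = []
        for key, (p, _) in self.voxels.items():
            pc = T[:3, :3] @ p + T[:3, 3]     # point in camera frame
            if pc[2] <= 0:                     # behind the camera
                continue
            u = int(fx * pc[0] / pc[2] + cx)
            v = int(fy * pc[1] / pc[2] + cy)
            if 0 <= u < w and 0 <= v < h and depth[v, u] > pc[2] + margin:
                stale.append(key)              # camera sees past the point
        for key in stale:
            del self.voxels[key]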

To ground an object of interest described by a text query, the robot locates the memory point most similar to the query, along with the last image in which that point was observed. If the text is grounded in that image, or the point has a high similarity score with the text, the point is taken as the location of the object of interest.
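
The sketch below illustrates this grounding step under simplifying assumptions: memory features and the text query are CLIP-style unit-normalized embeddings, and detect is a hypothetical open-vocabulary detector callback; the threshold value is arbitrary.

# Sketch of query grounding (illustrative). The most similar memory point and
# the last frame that observed it decide whether the query is grounded.
import numpy as np

def ground_query(query_text, text_feature, memory_points, memory_features,
                 last_images, detect, sim_threshold=0.25):
    """Return the 3D location of the queried object, or None if ungrounded.

    memory_features: (N, D) unit-normalized semantic features per point.
    last_images[i]: the most recent RGB frame in which point i was observed.
    detect(image, text): hypothetical detector returning True if the text
    is grounded in the image.
    """
    sims = memory_features @ text_feature        # cosine similarity per point
    best = int(np.argmax(sims))
    if detect(last_images[best], query_text) or sims[best] > sim_threshold:
        return memory_points[best]               # grounded: target location
    return None                                  # not grounded: explore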

If the text is grounded in the environment, the robot navigates to the target object; otherwise, the robot memory is projected into a value map and the robot explores the environment based on that value map.
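
As a rough sketch of the exploration fallback, the snippet below projects per-point text similarities into a top-down 2D grid and picks the highest-value cell as the next goal. The grid resolution, origin, and scoring rule are assumptions for illustration, not the paper's exact formulation.

# Sketch of projecting the 3D memory into a 2D value map for exploration.
import numpy as np

def build_value_map(memory_points, similarities, grid_size=(200, 200),
                    cell=0.1, origin=(-10.0, -10.0)):
    """Max-pool per-point text similarity into a top-down 2D grid."""
    value_map = np.zeros(grid_size)
    for p, s in zip(memory_points, similarities):
        i = int((p[0] - origin[0]) / cell)
        j = int((p[1] - origin[1]) / cell)
        if 0 <= i < grid_size[0] and 0 <= j < grid_size[1]:
            value_map[i, j] = max(value_map[i, j], s)
    return value_map

def pick_exploration_goal(value_map, cell=0.1, origin=(-10.0, -10.0)):
    """Choose the cell with the highest value as the next exploration goal."""
    i, j = np.unravel_index(np.argmax(value_map), value_map.shape)
    return origin[0] + (i + 0.5) * cell, origin[1] + (j + 0.5) * cell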

Evaluation

Performance and failure analysis of DynaMem and baselines

We evaluate DynaMem in 3 different environments, with 10 queries in each environment. We select OK-Robot (with a prescanned static robot memory) and Gemini (used following the pipeline proposed in OpenEQA) as baselines.

We find that both DynaMem and the Gemini-based mLLM baseline achieve a total success rate of 70%. This is a significant improvement over the OK-Robot system, which has a total success rate of 30%. Notably, DynaMem is particularly adept at handling dynamic objects in the environment: only 6.7% of trials failed because our system could not navigate to such dynamic objects in the scene.

Paper

DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation

@article{liu2024dynamem,
  title={DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation},
  author={Liu, Peiqi and Guo, Zhanqiu and Warke, Mohit and Chintala, Soumith and Shafiullah, Nur Muhammad Mahi and Pinto, Lerrel},
  journal={arXiv preprint arXiv:2411.04999},
  year={2024}
}

Code

Get the code on GitHub.