Project Overview
Project repository: InternManip-Eval
This is a standalone evaluation framework that I extracted and refactored from the InternManip project.
It is a universal, user-friendly, and efficient evaluation framework designed for embodied manipulation tasks across multiple benchmarks. Currently, it supports the Calvin, SimplerEnv, and GenManip benchmarks, and also includes ARX LIFT2 real-robot control (from the IROS offline competition, although the robot service startup dependencies are not included).
Highlights
- Unified
  - Integrates evaluation pipelines for different benchmarks into a single framework.
  - Easily extensible to support additional benchmarks.
- Easy to Use
  - All evaluation settings are centralized in a single configuration file.
  - One-command installation of Python dependencies.
  - Adopts a client-server (C-S) evaluation architecture that decouples agents from environments, allowing each side to maintain its own dependency stack. Don't be intimidated by the C-S setup: the codebase starts the client and server automatically out of the box, and it also supports manual deployment (particularly useful when deploying agents on different nodes or assigning custom ports to agent servers).
- Efficient
  - Supports distributed evaluation for faster runs, while hiding the underlying implementation details from users.
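To make the client-server decoupling described under "Easy to Use" more concrete, here is a minimal sketch using only the Python standard library. It is not the framework's actual API: `toy_policy`, `serve_agent`, `run_episode`, and `evaluate` are hypothetical names, the message format is invented for illustration, and a thread stands in for what would normally be a separate agent process with its own dependency stack.

```python
import threading
from multiprocessing.connection import Client, Listener


def toy_policy(observation):
    """Hypothetical stand-in for a real agent: maps an observation to an action."""
    return {"action": [2 * x for x in observation["state"]]}


def serve_agent(listener):
    """Agent side: answer observation messages until the client sends None."""
    with listener.accept() as conn:
        while True:
            observation = conn.recv()
            if observation is None:  # sentinel: evaluation finished
                break
            conn.send(toy_policy(observation))


def run_episode(address, num_steps=3):
    """Environment side: send observations and collect the returned actions."""
    actions = []
    with Client(address) as conn:
        for step in range(num_steps):
            # Invented observation format, purely for illustration.
            conn.send({"state": [step, step + 1]})
            actions.append(conn.recv()["action"])
        conn.send(None)  # tell the agent server to stop
    return actions


def evaluate():
    """Start the agent server automatically, then run one episode against it."""
    # Port 0 lets the OS pick a free port; a manual deployment would use a fixed one.
    with Listener(("localhost", 0)) as listener:
        server = threading.Thread(target=serve_agent, args=(listener,), daemon=True)
        server.start()
        actions = run_episode(listener.address)
        server.join(timeout=5)
    return actions


if __name__ == "__main__":
    print(evaluate())
```

Because the environment side only exchanges messages with the agent, the two sides never import each other's code. In a manual deployment along these lines, `serve_agent` would run on another node behind a fixed port, and `run_episode` would be given that address.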
Framework Structure