InternManip_Eval

Project Overview

Project repository: InternManip-Eval

This is a standalone evaluation framework that I extracted and refactored from the InternManip project.

It is a universal, user-friendly, and efficient evaluation framework for embodied manipulation tasks across multiple benchmarks. It currently supports the Calvin, SimplerEnv, and GenManip benchmarks, and also includes real-robot control for the ARX LIFT2 (from the IROS offline competition; the dependencies needed to start the robot service are not included).


Highlights

  • Unified

    • Integrates evaluation pipelines for different benchmarks into a single framework.
    • Easily extensible to support additional benchmarks.
  • Easy to Use

    • All evaluation settings are centralized in a single configuration file.
    • One-command installation of Python dependencies.
    • Adopts a client-server (C-S) evaluation architecture that decouples agents from environments, so each side can maintain its own dependency stack. Don't be intimidated by the C-S setup: the codebase starts client-server evaluation automatically out of the box. Manual deployment is also supported, which is particularly useful when deploying agents on different nodes or assigning custom ports to agent servers.
  • Efficient

    • Supports distributed evaluation for faster runs, while hiding the underlying implementation details from users.
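
To make the client-server idea above more concrete, here is a minimal sketch of how an agent and an environment can talk over a local connection while living in separate code paths. It is not the framework's actual protocol: `serve_agent`, `run_episode`, `dummy_policy`, and `evaluate` are hypothetical names chosen for illustration, and the message format (plain dicts, with `None` as a stop signal) is an assumption. It uses only the Python standard library.

```python
import threading
from multiprocessing.connection import Client, Listener


def serve_agent(listener, policy):
    """Agent side: answer each observation with an action until told to stop."""
    with listener.accept() as conn:
        while True:
            message = conn.recv()
            if message is None:  # assumed sentinel: evaluation finished
                break
            conn.send(policy(message))


def run_episode(address, observations):
    """Environment side: send observations and collect the returned actions."""
    actions = []
    with Client(address) as conn:
        for obs in observations:
            conn.send(obs)
            actions.append(conn.recv())
        conn.send(None)  # tell the agent server to shut down
    return actions


def dummy_policy(obs):
    # Hypothetical stand-in for a real model: a zero action tagged with the step.
    return {"action": [0.0, 0.0, 0.0], "step": obs["step"]}


def evaluate(num_steps=3):
    # Port 0 lets the OS pick a free port; a fixed, custom port works the same way.
    listener = Listener(("localhost", 0))
    server = threading.Thread(target=serve_agent, args=(listener, dummy_policy))
    server.start()
    observations = [{"step": i} for i in range(num_steps)]
    actions = run_episode(listener.address, observations)
    server.join()
    listener.close()
    return actions


if __name__ == "__main__":
    for action in evaluate():
        print(action)
```

In this sketch the agent server runs in a thread for convenience; in a manual deployment the same two functions could run in separate processes or on separate nodes, each with its own dependencies, as long as both sides agree on the address and message format.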

Framework Structure

InternManip-Eval Framework Architecture