Navigation Imitation Learning API

Contents

  • Import
  • Data Collection
    • Collect Data
    • Import External
  • Training
    • Train BC Policy
  • Inference
    • Load BC Policy
  • Evaluation
    • Evaluate BC Policy

The Imitation Learning API provides functions for collecting demonstrations, training Behavior Cloning (BC) policies, and evaluating them.
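
The typical workflow chains these functions: collect_data (or import_external) produces a demonstration dataset, train_bc fits a policy to it, load_bc_policy restores a trained checkpoint for inference, and evaluate measures the resulting navigation performance.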

Import

import urbanverse as uv

Data Collection

Collect Data

uv.navigation.il.collect_data(
    scene_paths: list[str],
    robot_type: str,
    output_dir: str,
    control_mode: str = "teleop_keyboard",
    max_episodes: int = 20,
    episode_length: int = 300,
    goal_sampling: str = "random",
) -> str

Collect expert demonstrations by manually controlling the robot in UrbanVerse scenes.

Parameters:

  • scene_paths (list[str]): List of USD scene file paths. Demonstrations will be collected across these scenes.

  • robot_type (str): Robot embodiment identifier. Supported values:

    • "coco_wheeled": COCO wheeled delivery robot
    • "nvidia_carter": NVIDIA Carter holonomic robot
    • "turtlebot3": TurtleBot3 differential drive
    • "unitree_go2": Unitree Go2 quadruped
    • "anymal": ANYmal quadruped
    • "unitree_g1": Unitree G1 humanoid
    • "booster_t1": Booster T1 humanoid

  • output_dir (str): Directory path where demonstration data will be saved. Will be created if it doesn’t exist.

  • control_mode (str, optional): Teleoperation interface. Default: "teleop_keyboard". Options:

    • "teleop_keyboard": Keyboard controls
    • "teleop_joystick": Joystick interface
    • "teleop_vr": VR headset and controllers (Meta Oculus Quest 3)

  • max_episodes (int, optional): Maximum number of episodes to collect. Default: 20

  • episode_length (int, optional): Maximum steps per episode. Default: 300

  • goal_sampling (str, optional): Goal position sampling strategy. Default: "random". Options:

    • "random": Random goals from drivable regions
    • "curriculum": Gradually increasing goal distances
    • "fixed": Fixed goal positions

Returns:

  • str: Path to the created demonstration dataset directory

Example:

demo_dir = uv.navigation.il.collect_data(
    scene_paths=[
        "/path/to/UrbanVerse-160/CapeTown_0001/scene.usd",
        "/path/to/UrbanVerse-160/Tokyo_0002/scene.usd",
    ],
    robot_type="coco_wheeled",
    output_dir="demos/capetown_tokyo",
    control_mode="teleop_keyboard",
    max_episodes=30,
)
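
The returned demo_dir can be passed directly to train_bc (see Training below):

checkpoint_path = uv.navigation.il.train_bc(
    demo_dir=demo_dir,
    robot_type="coco_wheeled",
    output_dir="outputs/bc_coco_policy",
)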

Import External

uv.navigation.il.import_external(
    demo_source: str,
    output_dir: str,
    mapping_cfg: dict,
) -> str

Convert external demonstration data into UrbanVerse’s standard BC dataset format.

Parameters:

  • demo_source (str): Path to external demonstration data file or directory

  • output_dir (str): Directory where converted demonstrations will be saved

  • mapping_cfg (dict): Configuration dictionary specifying data format and field mappings. Required keys depend on the source format.

    For ROS bag format:

    • "format": "rosbag"
    • "topic_obs": ROS topic name for observations (e.g., "/camera/rgb/image_raw")
    • "topic_act": ROS topic name for actions (e.g., "/cmd_vel")
    • "topic_goal": ROS topic name for goal positions (e.g., "/move_base_simple/goal")
    • "coordinate_transform": Coordinate transformation configuration
    • "temporal_sync": Temporal synchronization parameters

Returns:

  • str: Path to the created demonstration dataset directory

Example:

demo_dir = uv.navigation.il.import_external(
    demo_source="/path/to/real_world_demos.bag",
    output_dir="demos/imported_real_world",
    mapping_cfg={
        "format": "rosbag",
        "topic_obs": "/front_camera/image_raw",
        "topic_act": "/mobile_base/cmd_vel",
        "topic_goal": "/navigation/goal",
        "coordinate_transform": {
            "type": "ros_to_urbanverse",
            "frame_id": "map",
        },
    },
)
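
The example above omits the "temporal_sync" entry. The sketch below shows where it would go; the sub-keys used ("method", "max_offset_s") are illustrative assumptions, not documented names, so consult the Demonstration Dataset Format page for the exact schema:

mapping_cfg = {
    "format": "rosbag",
    "topic_obs": "/front_camera/image_raw",
    "topic_act": "/mobile_base/cmd_vel",
    "topic_goal": "/navigation/goal",
    # Hypothetical synchronization sub-keys, for illustration only:
    # align each action to the nearest observation within 50 ms.
    "temporal_sync": {"method": "nearest", "max_offset_s": 0.05},
}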

Training

Train BC Policy

uv.navigation.il.train_bc(
    demo_dir: str,
    robot_type: str,
    output_dir: str,
    train_cfg: dict | None = None,
) -> str

Train a behavior cloning policy from demonstration data.

Parameters:

  • demo_dir (str): Path to demonstration dataset directory (created by collect_data or import_external)

  • robot_type (str): Robot embodiment identifier (must match the robot type used for data collection)

  • output_dir (str): Directory where training outputs will be saved (checkpoints, logs, etc.)

  • train_cfg (dict, optional): Training configuration dictionary. If None, uses the default configuration. Common keys:

    • "learning_rate": Learning rate (default: 1e-4)
    • "batch_size": Batch size (default: 256)
    • "train_epochs": Number of training epochs (default: 50-100)
    • "weight_decay": L2 weight decay (default: 1e-5)
    • "policy_architecture": Policy architecture configuration
    • "loss": Loss function configuration

Returns:

  • str: Path to the best model checkpoint file (typically output_dir/checkpoints/best.pt)

Training Outputs:

The training process creates the following files in output_dir:

  • checkpoints/best.pt: Best model checkpoint (lowest validation loss)

  • checkpoints/latest.pt: Latest model checkpoint

  • logs/training.log: Training log file

  • logs/tensorboard/: TensorBoard event files for visualization

  • config.yaml: Saved training configuration
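
Training curves can be monitored while training runs by pointing TensorBoard at the event files, for example with tensorboard --logdir outputs/bc_coco_policy/logs/tensorboard.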

Example:

checkpoint_path = uv.navigation.il.train_bc(
    demo_dir="demos/my_teleop_collection",
    robot_type="coco_wheeled",
    output_dir="outputs/bc_coco_policy",
    train_cfg={
        "learning_rate": 1e-4,
        "batch_size": 256,
        "train_epochs": 50,
    },
)

print(f"Trained policy saved to: {checkpoint_path}")

Inference

Load BC Policy

uv.navigation.il.load_bc_policy(
    checkpoint_path: str,
    robot_type: str,
    device: str = "cuda",
) -> Callable

Load a trained BC policy for inference.

Parameters:

  • checkpoint_path (str): Path to model checkpoint file (.pt file)

  • robot_type (str): Robot embodiment identifier (must match training robot type)

  • device (str, optional): Device to run inference on. Default: "cuda". Options: "cuda", "cpu"

Returns:

  • Callable: Policy function that takes observations and returns actions

Policy Function Signature:

def policy(observation: dict | np.ndarray) -> np.ndarray:
    """
    Predict action from observation.

    Args:
        observation: Observation dict or array containing:
            - "rgb": RGB image array
            - "goal_vector": Goal vector array
            - Optional robot state

    Returns:
        action: Predicted action array
    """

Example:

policy = uv.navigation.il.load_bc_policy(
    checkpoint_path="outputs/bc_coco_policy/checkpoints/best.pt",
    robot_type="coco_wheeled",
)

# Use the policy for closed-loop inference (env is an environment
# handle created elsewhere in your simulation setup)
obs = env.get_observation()
action = policy(obs)
env.step(action)
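
For a quick smoke test without a running simulator, the observation can also be constructed by hand. The shapes below (a 480x640 RGB frame and a 2-D goal vector) are assumptions for illustration; match them to your robot's actual observation configuration:

import numpy as np

# Hypothetical observation shapes, for illustration only.
obs = {
    "rgb": np.zeros((480, 640, 3), dtype=np.uint8),          # blank camera frame
    "goal_vector": np.array([5.0, -2.0], dtype=np.float32),  # goal offset in meters
}
action = policy(obs)
print(action.shape)  # action dimensionality depends on robot_type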

Evaluation

Evaluate BC Policy

uv.navigation.il.evaluate(
    policy: Callable,
    scene_paths: list[str],
    robot_type: str,
    num_episodes: int = 50,
    max_episode_steps: int = 300,
) -> dict

Evaluate a BC policy on UrbanVerse scenes and compute navigation metrics.

Parameters:

  • policy (Callable): BC policy function (loaded via load_bc_policy)

  • scene_paths (list[str]): List of USD scene file paths to evaluate on

  • robot_type (str): Robot embodiment identifier

  • num_episodes (int, optional): Number of evaluation episodes. Default: 50

  • max_episode_steps (int, optional): Maximum steps per episode. Default: 300

Returns:

  • dict: Dictionary containing evaluation metrics:

    • "SR" (float): Success Rate (fraction of episodes reaching goal)
    • "RC" (float): Route Completion (average fraction of path completed)
    • "CT" (float): Collision Times (average number of collisions per episode)
    • "DTG" (float): Distance-to-Goal (average final distance to goal in meters)
    • "episode_lengths" (list): List of episode lengths
    • "outcomes" (list): List of episode outcomes ("success", "collision", "timeout")

Evaluation Metrics:

  • Success Rate (SR): Fraction of episodes where the robot successfully reached the goal

  • Route Completion (RC): Average fraction of the planned route completed before termination

  • Collision Times (CT): Average number of collisions per episode (lower is better)

  • Distance-to-Goal (DTG): Average final distance to goal for unsuccessful episodes (in meters)

Example:

policy = uv.navigation.il.load_bc_policy(
    checkpoint_path="outputs/bc_coco_policy/checkpoints/best.pt",
    robot_type="coco_wheeled",
)

results = uv.navigation.il.evaluate(
    policy=policy,
    scene_paths=[
        "/path/to/CraftBench/scene_001/scene.usd",
        "/path/to/CraftBench/scene_002/scene.usd",
    ],
    robot_type="coco_wheeled",
    num_episodes=100,
)

print(f"Success Rate: {results['SR']:.2%}")
print(f"Route Completion: {results['RC']:.2%}")
print(f"Average Collisions: {results['CT']:.2f}")
print(f"Average Distance to Goal: {results['DTG']:.2f} m")
