Use UrbanVerse-100K with APIs#

UrbanVerse-100K is a standalone, Python-friendly dataset of 102,530 metric-scale urban 3D assets, organized into 8 top-level, 59 mid-level, and 667 leaf-level categories. In addition, it provides 288 ground materials and 306 HDRI sky maps for realistic simulation.

Each 3D object asset is annotated with 33+ semantic, physical, and affordance attributes (e.g., mass, friction, traversability, material composition, affordances such as drivable or openable).

This section shows how to:

  • explore the dataset structure and category statistics,

  • load categories, UIDs, and annotations,

  • filter objects by attributes,

  • download object meshes, ground materials, and sky maps.

Install the UrbanVerse-100K API as:

pip install urbanverse-100k

and import it in Python as:

import urbanverse_100k as uvk

Interactive Dataset Overview (Sunburst Plot)#

UrbanVerse-100K ships with an interactive sunburst visualization summarizing the distribution of assets across the category hierarchy (l1, l2, l3), ground materials, and sky maps. The HTML file is bundled inside the package at:

  • assets/urbanverse_distribution.html

You can open it from a Jupyter notebook:

from IPython.display import IFrame

html_path = uvk.get_distribution_html()  # returns the absolute path to the HTML file
IFrame(src=html_path, width=900, height=600)

Or open it in your default browser:

import webbrowser

html_path = uvk.get_distribution_html()
webbrowser.open(f"file://{html_path}")

This is a convenient way to visually inspect the category hierarchy before using the Python API.

Quick Exploration of Categories and Components#

At a high level, UrbanVerse-100K is organized as follows:

| Component | Description | Quantity | Format |
| --- | --- | --- | --- |
| 3D Object Assets | Metric-scale 3D object models spanning hundreds of urban categories. | 102,530 | .glb |
| Object Thumbnails | Canonical-view thumbnail for each 3D object asset. | 102,530 | .png |
| Multi-View Object Renders | Four standardized renders per asset at 0°, 90°, 180°, 270°. | 4 × 102,530 | .png |
| Ground Materials (PBR) | 4K photorealistic PBR ground materials for road, sidewalk, and terrain. | 288 | .mdl |
| Ground Material Thumbnails | Preview thumbnail for each ground material. | 288 | .png |
| HDRI Sky Maps | High-resolution 4K HDRI domes for realistic, full-environment lighting. | 306 | .hdr |
| HDRI Thumbnails | Thumbnail previews for each HDRI sky map. | 306 | .png |
| Per-Object Annotations | One annotation file per asset, containing semantic, physical, and affordance attributes. | 102,530 | .json |
| Master Annotation File | Global index describing dataset statistics and the l1/l2/l3 hierarchy. | 1 | urbanverse_annotation.json |

You can query the global statistics and hierarchy as:

about, stats, hierarchy = uvk.load_dataset_metadata()

print(about["version"])          # e.g., "1.0"
print(stats["number_of_assets"]) # e.g., 102,530
print(stats["number_of_classes_l1"], stats["number_of_classes_l2"], stats["number_of_classes_l3"])

# First few hierarchical entries (l1, l2, l3)
for row in hierarchy[:5]:
    print(row["l1"], ">", row["l2"], ">", row["l3"])

Loading Categories#

UrbanVerse-100K uses a three-level semantic hierarchy:

  • l1: top-level (e.g., "building", "road", "street user", "urban object", "amenity", "nature")

  • l2: mid-level (e.g., "commercial building", "transportation amenity", "vegetation")

  • l3: leaf-level, fine-grained categories (e.g., "traffic light", "trash bin", "apartment building", "electric car", "delivery robot")

API:

uvk.load_categories(
    level: str = "leaf",  # "top", "mid", or "leaf"
) -> list[str]

uvk.load_category_stats() -> list[dict]

Examples:

import urbanverse_100k as uvk

# All leaf-level categories (667 in v1.0)
leaf_categories = uvk.load_categories(level="leaf")
print("Leaf categories:", len(leaf_categories))
print(leaf_categories[:10])

# Coarser mid-level and top-level groups
mid_categories = uvk.load_categories(level="mid")
top_categories = uvk.load_categories(level="top")

print("Top-level categories:", top_categories)

To inspect counts per category:

stats = uvk.load_category_stats()
for row in stats[:5]:
    # Example: "street user > robot > delivery robot  :  42"
    print(f'{row["l1"]} > {row["l2"]} > {row["l3"]} : {row["count"]}')
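
If you prefer a tabular view, the per-category stats load directly into a pandas DataFrame (a minimal sketch; pandas is an external dependency, not required by the API):

import pandas as pd
import urbanverse_100k as uvk

stats = uvk.load_category_stats()  # list of dicts with keys "l1", "l2", "l3", "count"
df = pd.DataFrame(stats)

# Aggregate asset counts at the top (l1) and mid (l2) levels
print(df.groupby("l1")["count"].sum().sort_values(ascending=False))
print(df.groupby(["l1", "l2"])["count"].sum().head(10))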

Loading UIDs#

Each asset in UrbanVerse-100K is identified by a unique UID. We distinguish between:

  • object UIDs for 3D assets (e.g., an "electric car" or "traffic light"),

  • ground material IDs (e.g., "Asphalt016", "PavingStones040"),

  • sky map IDs (e.g., "urban_street_01", "autumn_field_puresky").

APIs:

uvk.load_object_uids(
    top_category: str | None = None,   # maps to l1, e.g., "street user"
    mid_category: str | None = None,   # maps to l2, e.g., "robot"
    leaf_category: str | None = None,  # maps to l3, e.g., "delivery robot"
    limit: int | None = None,
) -> list[str]

uvk.load_ground_ids(
    subset: str | None = None,  # "road", "sidewalk", or None
) -> list[str]

uvk.load_sky_ids() -> list[str]

Examples:

import urbanverse_100k as uvk

# 1) All object UIDs in the dataset
all_object_uids = uvk.load_object_uids()
print("Total objects:", len(all_object_uids))

# 2) Only "street user" assets (top category)
street_user_uids = uvk.load_object_uids(top_category="street user")
print("Street users:", len(street_user_uids))

# 3) Only "traffic light" assets (leaf category), capped at 20
tl_uids = uvk.load_object_uids(leaf_category="traffic light", limit=20)
print("Traffic lights (sample):", tl_uids[:5])

# 4) Ground material IDs (all, road-only, sidewalk-only)
all_ground_ids = uvk.load_ground_ids()
road_ground_ids = uvk.load_ground_ids(subset="road")
sidewalk_ground_ids = uvk.load_ground_ids(subset="sidewalk")

print("Ground materials:", len(all_ground_ids))
print("Road materials (e.g., 'Asphalt016', 'Road004'):", road_ground_ids[:5])
print("Sidewalk materials (e.g., 'PavingStones040', 'wet_arc_cobble'):", sidewalk_ground_ids[:5])

# 5) HDRI sky map IDs
sky_ids = uvk.load_sky_ids()
print("Sky maps:", len(sky_ids))
print("Examples:", sky_ids[:5])  # e.g., "urban_street_01", "stuttgart_suburbs", "venice_sunset"

Loading Annotations#

Object annotations expose the full set of semantic, physical, and affordance attributes per asset.

API:

uvk.load_object_annotations(
    uids: list[str] | None = None,
) -> dict[str, dict]

  • If uids is provided, only those objects are returned.

  • If uids=None, annotations for all objects are returned (and cached locally after first download).

Example: inspect a subset of "electric car" assets:

import urbanverse_100k as uvk

# Sample a few "electric car" assets
car_uids = uvk.load_object_uids(leaf_category="electric car", limit=3)

annotations = uvk.load_object_annotations(car_uids)
print("Loaded annotations for:", list(annotations.keys()))

uid = car_uids[0]
ann = annotations[uid]

# Print a few key attributes
print("UID:", ann["uid"])
print("CLASS_NAME:", ann["CLASS_NAME"])               # e.g., "electric car"
print("mass (kg):", ann["mass"])                     # e.g., 3000
print("friction:", ann["friction_coefficient"])      # e.g., 0.9
print("traversability:", ann["traversability"])      # e.g., "obstacle"
print("affordances:", ann["affordances"])            # e.g., ["drivable", "openable", ...]
print("materials:", ann["materials"])                # e.g., ["steel", "glass", "rubber", ...]
print("license:", ann["license_info"]["license"])    # e.g., "by"

Per-object Annotation Example in UrbanVerse-100K#

Each object in UrbanVerse-100K is described by 33+ semantic, physical, material, affordance, and metadata attributes. To help users understand the full structure, we provide a real annotation example below.

The following JSON block is an actual per-object annotation from UrbanVerse-100K (for UID 5a87daef2ee3489dba8b173290029513 — an electric car, a Tesla Cybertruck):

{
  "description_long": "This electric car features a sharply angular, geometric design with ...",
  "description": "A matte dark gray, angular electric pickup with ...",
  "description_view_0": "Front view shows a wide, flat hood ...",
  "description_view_1": "Left side view highlights ...",
  "description_view_2": "Rear view displays ...",
  "description_view_3": "Right side view mirrors ...",

  "category": "electric car",
  "height": 1.9,
  "max_dimension": 5.7,

  "materials": ["steel", "glass", "rubber", "plastic", "air"],
  "materials_composition": [0.7, 0.15, 0.08, 0.05, 0.02],

  "mass": 3000,
  "receptacle": false,
  "frontView": 0,
  "quality": 7,
  "movable": true,
  "required_force": 6000,
  "walkable": false,
  "enterable": true,

  "affordances": [
    "drivable", "openable", "closable",
    "pressable", "toggleable"
  ],

  "support_surface": true,
  "interactive_parts": [
    "door", "wheel", "window",
    "headlight", "taillight",
    "trunk", "charging port", "mirror"
  ],

  "traversability": "obstacle",
  "traversable_by": [],

  "colors": ["dark gray", "black", "red", "orange"],
  "colors_composition": [0.85, 0.1, 0.03, 0.02],

  "surface_hardness": "hard",
  "surface_roughness": 0.18,
  "surface_finish": "matte",
  "reflectivity": 0.18,
  "index_of_refraction": 1.52,
  "youngs_modulus": 200000,
  "friction_coefficient": 0.9,
  "bounciness": 0.05,
  "recommended_clearance": 1.5,

  "asset_composition_type": "single",

  "attribute_car_manufacturer": "Tesla",
  "attribute_car_model": "Cybertruck",
  "attribute_charging_port_location": "left rear quarter panel",
  "attribute_license_plate_design": "none",
  "attribute_badging_or_emblem": [],

  "uid": "5a87daef2ee3489dba8b173290029513",

  "near_synsets": {
    "electric.n.01": 0.5254,
    "pickup.n.01": 0.5421,
    "technical.n.01": 0.5443,
    "car_window.n.01": 0.5600,
    "bumper_car.n.01": 0.5702,
    "car.n.01": -1000.0,
    "car.n.02": -1000.0
  },

  "synset": "pickup.n.01",
  "wn_version": "oewn:2022",

  "annotation_info": {
    "vision_llm": "gpt-4.1-2025-04-14",
    "text_llm": "gpt-4.1-2025-04-14"
  },

  "license_info": {
    "license": "by",
    "uri": "AnonymousForDoubleBlindReview",
    "creator_username": "AnonymousForDoubleBlindReview",
    "creator_display_name": "AnonymousForDoubleBlindReview",
    "creator_profile_url": "AnonymousForDoubleBlindReview"
  },

  "filename": "5a87daef2ee3489dba8b173290029513.glb",
  "CLASS_NAME": "electric car",
  "foldername": null,

  "hshift": -90.0,
  "length": 6.12585,
  "width": 2.41919,
  "is_building": false
}

Full List of Annotation Fields#

Below is a structured overview of all annotation fields found in UrbanVerse-100K, grouped by functionality.

1. Descriptions (LLM-generated)

  • description_long

  • description

  • description_view_0 … description_view_3

2. Semantic Category

  • category (same as CLASS_NAME)

  • synset (WordNet synset)

  • near_synsets (top-scoring related synsets)

  • wn_version

3. Geometry & Dimensions

  • height

  • length

  • width

  • max_dimension

  • scale

  • z_axis_scale

  • pose_z_rot_angle

  • hshift

4. Materials

  • materials (e.g., steel, glass)

  • materials_composition (fractions)

  • colors (semantic)

  • colors_composition

5. Physical Properties

  • mass

  • friction_coefficient

  • bounciness

  • surface_roughness

  • surface_hardness

  • surface_finish

  • reflectivity

  • index_of_refraction

  • youngs_modulus

  • max_dimension

6. Affordances & Interactions

  • movable

  • interactive_parts (doors, wheels, trunk, …)

  • affordances (drivable, openable, pressable, …)

  • enterable

  • walkable

  • support_surface

7. Traversability

  • traversability (e.g., obstacle)

  • traversable_by (list of agent types)

8. Object Metadata

  • uid

  • filename

  • foldername (if grouped)

  • quality (asset quality score)

  • asset_composition_type

9. Additional Object-Specific Attributes (e.g., for the "electric car" category)

  • attribute_car_manufacturer

  • attribute_car_model

  • attribute_charging_port_location

  • attribute_license_plate_design

  • attribute_badging_or_emblem

10. Annotation Metadata

  • annotation_info (LLM models used)

11. Licensing Information

  • license_info:

    • license (e.g., by)

    • creator_username

    • creator_display_name

    • creator_profile_url

    • uri
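
To see which of these fields are actually populated for a given asset, you can dump the keys of a single annotation (a minimal sketch using the API calls introduced above; the exact key set may vary slightly across assets and categories):

import urbanverse_100k as uvk

uids = uvk.load_object_uids(leaf_category="electric car", limit=1)
ann = uvk.load_object_annotations(uids)[uids[0]]

# Print every annotation field with a short preview of its value
for key in sorted(ann):
    print(f"{key:35s} {str(ann[key])[:60]}")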

Filtering by Attributes#

With annotations loaded, you can filter assets using standard Python/NumPy/Pandas tooling. Below are some example filters based on real attribute patterns in UrbanVerse-100K.

First, if you only want high-quality assets, you can filter objects using the UrbanVerse-100K quality score (0–10), as in the example below.

Example: high-quality buildings:

annotations = uvk.load_object_annotations()

high_quality_buildings = [
    uid
    for uid, ann in annotations.items()
    if ann.get("CLASS_NAME") == "building"
    and ann.get("quality", 0) >= 8
]

print("High-quality buildings (quality ≥ 8):", len(high_quality_buildings))
print(high_quality_buildings[:10])

Second, you can filter objects by their semantic category and physical attributes to match your own use cases. Some examples follow.

Example: light, movable street furniture:

annotations = uvk.load_object_annotations()

movable_light_uids = [
    uid
    for uid, ann in annotations.items()
    if ann.get("l2") == "facilities amenity"
    and ann.get("mass", 1e9) < 50.0
    and ann.get("movable", False) is True
]

print("Movable light facilities (e.g., benches, picnic tables):", len(movable_light_uids))
print(movable_light_uids[:10])

Example: static obstacles suitable as clutter:

static_obstacles = [
    uid
    for uid, ann in annotations.items()
    if ann.get("traversability") == "obstacle"
    and ann.get("movable", False) is False
]

print("Static obstacles (e.g., walls, bollards, stone blocks):", len(static_obstacles))

Example: drivable vehicles with realistic mass:

vehicles = [
    uid
    for uid, ann in annotations.items()
    if ann.get("l1") == "street user"
    and ann.get("CLASS_NAME") in {"electric car", "car", "bus", "truck"}
    and "drivable" in ann.get("affordances", [])
    and 500.0 < ann.get("mass", 0) < 5000.0
]

print("Drivable vehicles:", len(vehicles))
print(vehicles[:10])

You can define your own filters using any combination of semantic labels and physical attributes (a combined sketch follows this list), for example:

  • choose only smooth, low-roughness surfaces (e.g., surface_roughness < 0.2),

  • filter for objects at human scale (e.g., 1.0 <= height <= 2.5),

  • focus on objects with specific affordances (e.g., "sit-able" furniture, "pressable" buttons),

  • or restrict to licensed subsets (e.g., only assets with license_info["license"] == "by").
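
Combining several of these criteria is convenient with pandas (a sketch, assuming pandas is installed and that every annotation carries the height, quality, and license_info fields shown above):

import pandas as pd
import urbanverse_100k as uvk

annotations = uvk.load_object_annotations()
df = pd.DataFrame.from_dict(annotations, orient="index")

# Human-scale, reasonably high-quality assets under a "by" (CC-BY) license
mask = (
    df["height"].between(1.0, 2.5)
    & (df["quality"] >= 7)
    & (df["license_info"].apply(lambda info: info.get("license")) == "by")
)

print("Matching assets:", int(mask.sum()))
print(df[mask].index[:10].tolist())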

Downloading Assets#

UrbanVerse-100K assets are stored on Hugging Face and fetched lazily. The API downloads them to your local cache (typically under $URBANVERSE_ASSET_ROOT) and returns the corresponding file paths.
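
If you want the cache in a specific location, one option is to set the environment variable before importing the package (an assumption based on the variable name above; adjust if your setup resolves the cache root differently):

import os

# Hypothetical cache location; point it anywhere with enough disk space
os.environ["URBANVERSE_ASSET_ROOT"] = "/data/urbanverse_cache"

import urbanverse_100k as uvk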

APIs:

uvk.load_objects(
    uids: list[str],
    download_processes: int = 1,
    include_thumbnails: bool = True,
    include_renders: bool = False,
) -> dict[str, dict]

uvk.load_ground_materials(
    ids: list[str],
    download_processes: int = 1,
    include_thumbnails: bool = True,
) -> dict[str, dict]

uvk.load_sky_maps(
    ids: list[str],
    download_processes: int = 1,
    include_thumbnails: bool = True,
) -> dict[str, dict]

For objects, the returned dictionary is structured as:

{
    "<uid>": {
        "glb": "/path/to/assets_glb/<uid>.glb",
        "thumbnail": "/path/to/assets_thumbnails/<uid>.png",
        "renders": {
            "0":   "/path/to/assets_renders/<uid>_view_000.png",
            "90":  "/path/to/assets_renders/<uid>_view_090.png",
            "180": "/path/to/assets_renders/<uid>_view_180.png",
            "270": "/path/to/assets_renders/<uid>_view_270.png",
        },
    },
    ...
}

For ground materials:

{
    "<ground_id>": {
        "mdl": "/path/to/ground_materials_mdl/<subset>/<ground_id>.mdl",
        "thumbnail": "/path/to/ground_materials_thumbnails/<subset>/<ground_id>.png",
    },
    ...
}

where <subset> is typically "road" or "sidewalk" (e.g., "Asphalt016", "Road004", "PavingStones040", "wet_arc_cobble").

For sky maps:

{
    "<sky_id>": {
        "hdr": "/path/to/sky_maps_hdr/<sky_id>.hdr",
        "thumbnail": "/path/to/sky_maps_thumbnails/<sky_id>.png",
    },
    ...
}

with IDs such as "urban_street_01", "stuttgart_suburbs", "venice_sunset", "autumn_field_puresky", or "the_sky_is_on_fire".

Example: download and inspect a small random subset of objects:

import random
import multiprocessing
import urbanverse_100k as uvk

# Sample 50 random objects
all_uids = uvk.load_object_uids()
random.seed(42)
sample_uids = random.sample(all_uids, 50)

processes = multiprocessing.cpu_count()

object_paths = uvk.load_objects(
    uids=sample_uids,
    download_processes=processes,
    include_thumbnails=True,
    include_renders=True,
)

uid = sample_uids[0]
print("Paths for", uid)
print(object_paths[uid])
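
The returned .glb paths can be opened with any standard mesh library. For instance, a sketch using trimesh (an external dependency, not part of urbanverse_100k) to check the metric extents of one downloaded asset:

import trimesh

# Load the downloaded mesh and report its axis-aligned bounding-box size in meters
scene = trimesh.load(object_paths[uid]["glb"])
print("Metric extents (x, y, z in meters):", scene.extents)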

Example: download all road ground materials and a few sky maps:

road_ids = uvk.load_ground_ids(subset="road")
sky_ids = uvk.load_sky_ids()[:10]

ground_paths = uvk.load_ground_materials(road_ids)
sky_paths = uvk.load_sky_maps(sky_ids)

print("One road material:", next(iter(ground_paths.values())))
print("One sky map:", next(iter(sky_paths.values())))

Once meshes, materials, and sky domes are cached locally, subsequent calls reuse the local files, making development, visualization, and real-to-sim scene generation with UrbanVerse-Gen much faster.
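
A quick way to observe the caching behavior is to time two identical calls (a rough sketch; absolute timings depend on your network and disk):

import time
import urbanverse_100k as uvk

uids = uvk.load_object_uids(leaf_category="traffic light", limit=5)

t0 = time.time()
uvk.load_objects(uids=uids)  # first call: fetches from Hugging Face
t1 = time.time()
uvk.load_objects(uids=uids)  # second call: served from the local cache
t2 = time.time()

print(f"First call:  {t1 - t0:.1f}s (download)")
print(f"Second call: {t2 - t1:.1f}s (cache hit)")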