src.core package¶

Submodules¶

src.core.memory module¶

Mario Kart DS (MKDS) Emulator I/O & Geometry Utilities¶

This module provides a high-level, vectorized interface for reading game state from a running DeSmuME emulator and performing common geometric operations used in visualization, control, and RL policy features for Mario Kart DS.

It wraps low-level memory reads (positions, directions, camera data, objects, checkpoints, clock) and exposes a compact API for tasks like projecting world-space points to screen-space, computing distances to checkpoints and obstacles, and deriving view matrices.

The implementation favors:

Deterministic caching at frame and game lifetimes to minimize emulator I/O, and
Torch tensor computations (CPU / CUDA / MPS) for fast, batched math.

Quick Start¶

>>> from desmume.emulator import DeSmuME
>>> import torch
>>> from src.core.memory import (
...     read_position, read_direction, project_to_screen, read_forward_distance_checkpoint
... )
>>> emu = DeSmuME()
>>> # ... open ROM, load state, etc.
>>> device = torch.device("cpu")  # or "cuda", "mps"
>>> pos = read_position(emu, device=device)               # (3,)
>>> dir = read_direction(emu, device=device)              # (3,)
>>> screen = project_to_screen(emu, pos.unsqueeze(0), device=device)  # (1, 4)
>>> fwd_to_cp = read_forward_distance_checkpoint(emu, device=device)  # scalar tensor

Key Concepts & Conventions¶

Coordinate Systems¶

World Space: Right-handed, with Y as up. Many functions assume a canonical “up” vector of (0, 1, 0) and construct a right-handed orthonormal basis [right, up, forward].
Camera Space / Clip Space / NDC / Screen Space: _compute_model_view builds a model-view matrix from camera position/target; _project_to_screen applies perspective and viewport transforms to return pixel coordinates for a 256×192 screen (Nintendo DS top display).
Screen Origin: (0, 0) is top-left. X grows to the right; Y grows downward. This follows the standard raster convention and matches the (1 - ndc_y) transform used in projection.
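The screen convention above can be illustrated with a minimal sketch of the final viewport step. The function name ndc_to_pixels is hypothetical, and the exact scaling used inside _project_to_screen is an assumption; only the (1 - ndc_y) flip and the 256×192 screen size come from the text.

```python
SCREEN_WIDTH = 256
SCREEN_HEIGHT = 192


def ndc_to_pixels(ndc_x: float, ndc_y: float) -> tuple[float, float]:
    """Map NDC coordinates in [-1, 1] to pixels with a top-left origin.

    Hypothetical helper: assumes a plain linear viewport transform.
    """
    x_px = (ndc_x + 1.0) * 0.5 * SCREEN_WIDTH
    # NDC Y grows upward, screen Y grows downward, hence (1 - ndc_y).
    y_px = (1.0 - ndc_y) * 0.5 * SCREEN_HEIGHT
    return x_px, y_px


center = ndc_to_pixels(0.0, 0.0)
top_left = ndc_to_pixels(-1.0, 1.0)
bottom_right = ndc_to_pixels(1.0, -1.0)
```

Under this assumption the NDC origin lands at the screen centre, and the NDC corner (-1, 1) lands at pixel (0, 0).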

Units¶

Positions & Scalars returned from memory are derived from MKDS fixed-point formats (FX32, etc.) via helpers like read_vector_3d_fx32 and read_fx32, and are exposed as Python floats / Torch tensors.
Angles: Camera FOV is read from a 16-bit fixed-point angle and converted to radians using value * (2π / 0x10000).
Time: read_clock() returns centiseconds (10 ms units).
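The unit conversions above can be sketched in plain Python. The helper names below are hypothetical and are not the module's read_fx32 or read_camera_fov. The 12 fractional bits assumed for FX32 follow the usual Nintendo DS fixed-point layout and should be checked against the actual readers.

```python
import math

FX32_FRACTION_BITS = 12  # assumption: DS-style fx32 with 12 fractional bits


def fx32_to_float(raw: int) -> float:
    """Convert a signed fx32 integer to a float (hypothetical helper)."""
    return raw / (1 << FX32_FRACTION_BITS)


def angle16_to_radians(value: int) -> float:
    """Convert a 16-bit fixed-point angle to radians: value * (2π / 0x10000)."""
    return value * (2.0 * math.pi / 0x10000)


def centiseconds_to_seconds(cs: int) -> float:
    """Convert clock ticks in 10 ms units to seconds."""
    return cs / 100.0


one = fx32_to_float(4096)
quarter_turn = angle16_to_radians(0x4000)
seconds = centiseconds_to_seconds(250)
```

With these assumptions a raw value of 4096 corresponds to 1.0 world unit and 0x4000 to a quarter turn.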

Memory Map & Assets¶

Addresses: The module uses static addresses for key pointers (racer, course, objects, checkpoints, camera, clock). See constants: RACER_PTR_ADDR, COURSE_ID_ADDR, OBJECTS_PTR_ADDR, CHECKPOINT_PTR_ADDR, CLOCK_DATA_PTR, CAMERA_PTR_ADDR.
Course Files: courses.json maps course IDs to directory names. KCL (course_collision.kcl) and NKM (course_map.nkm) are loaded via KCLTensor.from_file(…) and NKMTensor.from_file(…).

Caching Model¶

Two decorators reduce emulator I/O:

@frame_cache — Caches the function’s single return value per emulator tick (emu.get_ticks()). Recomputes only when the tick changes.
@game_cache — Caches the function’s single return value for the process lifetime (until interpreter exit).

⚠ Important: Both caches ignore argument values. If you call a cached function with different arguments within the same lifetime (same frame or same run), the first computed result is reused. In practice, pass stable arguments (e.g., a constant device) to avoid surprises.
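To make the argument-ignoring behavior concrete, here is a minimal sketch of a tick-keyed cache. It is not the module's implementation: frame_cache_sketch and FakeEmu are hypothetical stand-ins, and the only assumption about the emulator is that it exposes get_ticks().

```python
import functools


def frame_cache_sketch(func):
    """Cache one value per emulator tick; call arguments are not part of the key."""
    state = {"tick": None, "value": None}

    @functools.wraps(func)
    def wrapper(emu, *args, **kwargs):
        tick = emu.get_ticks()
        if state["tick"] != tick:
            # Only a tick change triggers a recompute.
            state["value"] = func(emu, *args, **kwargs)
            state["tick"] = tick
        return state["value"]

    return wrapper


class FakeEmu:
    """Hypothetical stand-in exposing only get_ticks()."""

    def __init__(self):
        self.ticks = 0

    def get_ticks(self):
        return self.ticks


calls = []


@frame_cache_sketch
def read_value(emu, scale=1):
    calls.append(scale)
    return emu.get_ticks() * scale


emu = FakeEmu()
emu.ticks = 5
first = read_value(emu, scale=1)    # computed on this tick
second = read_value(emu, scale=10)  # same tick: cached result, scale ignored
emu.ticks = 6
third = read_value(emu, scale=10)   # new tick: recomputed
```

The second call returns the value computed with scale=1, which is the surprise the warning above describes. A game-lifetime cache behaves the same way, except that it never recomputes.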

Device Handling¶

Many functions accept a Torch device and return tensors allocated there. For best performance, use cuda (GPU) or mps (Apple Silicon) when available, and keep devices consistent across the call sites, especially for @game_cache results (KCL/NKM tensors are created on the device used at first call).

Public API Overview¶

Clock & Course¶

read_clock_ptr(emu) — Base pointer to clock data (cached for game lifetime).
read_clock(emu) — Current clock in 10 ms units (cached per frame).
get_current_course_id(emu) — Current course ID (byte).
get_course_path(id) — Course directory name from courses.json.
load_current_kcl(emu, device) — Parsed KCL collision mesh (game-cached).
load_current_nkm(emu, device) — Parsed NKM map data (game-cached).

Player & Objects¶

read_racer_ptr(emu) — Pointer to the player racer struct.
read_position(emu, device) — Player world position (3,).
read_direction(emu, device) — Player forward direction (3,).
read_objects(…), read_object_* helpers — Scan and query the object table. The safe_object decorator returns None for deleted objects.

Camera & Projection¶

read_camera_ptr(emu) — Pointer to camera struct.
read_camera_fov(emu) — FOV in radians.
read_camera_aspect(emu) — Aspect ratio (W/H).
read_camera_position(emu, device) — Camera world pos (3,) with elevation.
read_camera_target_position(emu, device) — Camera look-at (3,).
read_model_view(emu, device) — 4×4 model-view matrix.
project_to_screen(emu, points, device) — Projects (N,3) to (N,4): [x_px, y_px, clip_z, normalized_depth].
z_clip_mask(x) — Mask for points within Z-near/far bounds (camera space).

Checkpoints¶

read_checkpoint_ptr(emu) — Pointer to checkpoint manager.
read_current_checkpoint(emu), read_current_key_checkpoint(emu), read_current_lap(emu) — Indices for current progress.
read_ghost_checkpoint(emu), read_ghost_key_checkpoint(emu) — Ghost state.
read_checkpoint_positions(emu, device) — (C, 2, 3) segment endpoints.
read_next_checkpoint(emu, checkpoint_count) — Next index (wraps).
read_next_checkpoint_position(emu, device), read_current_checkpoint_position(emu, device) — (2,3) endpoints.
read_facing_point_checkpoint(emu, direction, device) — Intersection of a ray (from player, given direction) with next checkpoint line in XZ.
read_forward_distance_checkpoint(emu, device), read_left_distance_checkpoint(emu, device), read_direction_to_checkpoint(emu, device) — Distances/steering angle.

Obstacles (Walls / Offroad)¶

read_facing_point_obstacle(emu, position, direction, device) — Samples a cone of rays around the forward direction to find the nearest hit against wall/offroad triangles. Returns a point or None.
read_forward_distance_obstacle(emu, device), read_left_distance_obstacle(emu, device), read_right_distance_obstacle(emu, device) — Scalar distances to nearest obstacles along canonical forward/left/right rays. Return +inf when no hit.

Return Types & Shapes¶

Positions / Directions: torch.Tensor with shape (3,).
Batches of points: (N, 3).
Screen Projection: (N, 4) → [x_px, y_px, clip_z, normalized_depth].
Checkpoints: (C, 2, 3) → per checkpoint two endpoints [p1, p2].
Distances / Angles: 0-D or 1-D scalar torch.Tensor (depending on operation).

Errors & Edge Cases¶

Deleted / Ignored Objects: safe_object-wrapped functions return None when the object is deleted; callers must handle None.
No Geometry: When there are no wall/offroad triangles or raycasts miss, obstacle distance functions return +inf (as a tensor).
Empty Projections: _project_to_screen returns an empty tensor when given no points; invalid (behind-camera) points may still project with negative clip_w.

Performance Notes¶

Caching eliminates redundant memory reads across a frame / game run.
Geometry routines (ray casting, distances, projection) are vectorized in Torch; prefer GPU/MPS devices when available.
Keep devices consistent across calls that share cached state (e.g., KCL/NKM).

Examples¶

Project player and next checkpoint endpoints to screen:

>>> pts = torch.vstack([read_position(emu, device),    # (3,), stacked as one row
...                     read_next_checkpoint_position(emu, device)]).reshape(-1, 3)
>>> screen_pts = project_to_screen(emu, pts, device)
>>> screen_pts[:, :2]  # pixel coordinates

Compute lateral vs forward distance to next checkpoint:

>>> d_left  = read_left_distance_checkpoint(emu, device)
>>> d_front = read_forward_distance_checkpoint(emu, device)

Find nearest obstacle straight ahead:

>>> d_obs = read_forward_distance_obstacle(emu, device)
>>> float(d_obs) if torch.isfinite(d_obs) else float("inf")

Implementation Notes¶

_compute_orthonormal_basis builds a right-handed frame from a forward vector and an up-like reference (default (0,1,0)), normalizing each axis.
_compute_model_view constructs a 4×4 model-view matrix in row-major with basis rows [right, up, forward] and a translated origin.
_project_to_screen creates a simple perspective matrix using vertical FOV and aspect ratio; returns pixel coordinates using constants SCREEN_WIDTH = 256 and SCREEN_HEIGHT = 192.
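The basis construction described above can be sketched without Torch. This is an illustration under assumptions, not the module's _compute_orthonormal_basis: the function names are hypothetical, and the cross-product order was chosen so that right × up = forward, which is one way to obtain a right-handed [right, up, forward] frame.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def normalize(v):
    """Scale a 3-vector to unit length (raises ZeroDivisionError for a zero vector)."""
    n = math.sqrt(sum(c * c for c in v))
    return (v[0] / n, v[1] / n, v[2] / n)


def orthonormal_basis(forward, up_ref=(0.0, 1.0, 0.0)):
    """Return (right, up, forward) as unit vectors.

    Degenerate when forward is parallel to up_ref; a real
    implementation would need a fallback for that case.
    """
    f = normalize(forward)
    r = normalize(cross(up_ref, f))
    u = cross(f, r)  # already unit length because f and r are orthonormal
    return r, u, f


right, up, fwd = orthonormal_basis((0.0, 0.0, 2.0))
```

Stacking right, up, and forward as rows, together with a translation by the camera position, then gives the rotation part of a view matrix like the one described for _compute_model_view.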

Compatibility¶

Tested with DeSmuME Python bindings and Torch. Some ops may vary by backend (e.g., MPS lacks a few linear algebra kernels); this module sticks to widely supported APIs.

src.core.memory.frame_cache(func: Callable[[Concatenate[desmume.emulator.DeSmuME, P]], R]) → Callable[[Concatenate[desmume.emulator.DeSmuME, P]], R][source]¶

Decorator that caches a function’s return value once per emulator tick.

The wrapped function will only be re-executed when emu.get_ticks() changes. Useful for expensive reads that don’t change within a single frame.

Parameters:: func – A function whose first argument is a DeSmuME instance.
Returns:: A wrapped function with identical signature that returns a cached result per tick.

src.core.memory.game_cache(func: Callable[[Concatenate[desmume.emulator.DeSmuME, P]], R]) → Callable[[Concatenate[desmume.emulator.DeSmuME, P]], R][source]¶

Decorator that caches a function’s return value for the process lifetime.

The wrapped function executes once and its result is reused thereafter. Appropriate for data that remains constant across a run (e.g., course files).

Parameters:: func – A function whose first argument is a DeSmuME instance.
Returns:: A wrapped function with identical signature that returns a cached result.

src.core.memory.z_clip_mask(x: torch.Tensor) → torch.Tensor[source]¶

Compute a boolean mask for points within the view frustum Z range.

Parameters:: x – Tensor of shape (N, 3+) where x[:, 2] is the camera-space Z.
Returns:: A boolean tensor of shape (N,) where True indicates Z is between -Z_FAR and -Z_NEAR.
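A list-based sketch of the same test is shown below. The Z_NEAR and Z_FAR values are placeholders chosen for illustration, and the strict inequalities are an assumption. The real function operates on a Torch tensor and uses the module's own constants.

```python
Z_NEAR = 1.0     # placeholder value, not the module's constant
Z_FAR = 1000.0   # placeholder value, not the module's constant


def z_clip_mask_sketch(points):
    """Return one boolean per point: True when camera-space Z lies in (-Z_FAR, -Z_NEAR)."""
    return [(-Z_FAR < p[2] < -Z_NEAR) for p in points]


mask = z_clip_mask_sketch([
    (0.0, 0.0, -10.0),    # in front of the camera, inside the range
    (0.0, 0.0, 5.0),      # behind the camera
    (0.0, 0.0, -2000.0),  # beyond the far plane
    (0.0, 0.0, -0.5),     # closer than the near plane
])
```

The negative bounds reflect a camera that looks down the negative Z axis in camera space, as the return description implies.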

src.core.memory.read_clock_ptr(emu: desmume.emulator.DeSmuME)[source]¶

Read the base pointer to the game’s clock data structure.

Parameters:: emu – Emulator instance.
Returns:: Integer address of the clock data struct.

src.core.memory.read_clock(emu: desmume.emulator.DeSmuME)[source]¶

Read the current game clock value.

The value is read from the clock data structure and multiplied by 10, resulting in units of 10 ms (centiseconds).

Parameters:: emu – Emulator instance.
Returns:: Integer time in 10 ms units.

src.core.memory.get_current_course_id(emu: desmume.emulator.DeSmuME)[source]¶

Read the current course ID from memory.

Parameters:: emu – Emulator instance.
Returns:: Integer course ID (byte).

src.core.memory.get_course_path(id: int, lookup_path: str = './src/misc/courses.json')[source]¶

Resolve a course ID to the local filesystem path for its assets.

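Parameters:

id – Course ID.
lookup_path – Path to the JSON lookup table that maps course IDs to directory names (defaults to ./src/misc/courses.json).

Returns:

String path relative to ./private/courses/ for the given course.

Raises:

AssertionError – If the course ID is not present in the lookup table.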

src.core.memory.load_current_kcl(emu: desmume.emulator.DeSmuME, device)[source]¶

Load and parse the KCL collision file for the current course.

Cached for the lifetime of the process.

Parameters:

emu – Emulator instance.
device – Torch device (e.g., ‘cpu’, ‘cuda’, ‘mps’) to store tensors on.

Returns:

KCLTensor with triangle and prism data on the specified device.

src.core.memory.load_current_nkm(emu: desmume.emulator.DeSmuME, device)[source]¶

Load and parse the NKM map file for the current course.

Cached for the lifetime of the process.

Parameters:

emu – Emulator instance.
device – Torch device to store tensors on.

Returns:

NKMTensor with NKM section tensors (e.g., checkpoints) on the specified device.

src.core.memory.read_racer_ptr(emu: desmume.emulator.DeSmuME, addr: int = 35106040)[source]¶

Read the pointer to the player’s racer object.

Parameters:

emu – Emulator instance.
addr – Memory address where the racer pointer is stored.

Returns:

Integer address of the racer structure.

src.core.memory.read_position(emu: desmume.emulator.DeSmuME, device)[source]¶

Read the player’s world-space position.

Parameters:

emu – Emulator instance.
device – Torch device for the returned tensor.

Returns:

torch.Tensor of shape (3,) representing (x, y, z) in world units.

src.core.memory.read_direction(emu: desmume.emulator.DeSmuME, device)[source]¶

Read the player’s forward direction vector (world-space).

Parameters:

emu – Emulator instance.
device – Torch device for the returned tensor.

Returns:

torch.Tensor of shape (3,) representing the forward direction.

src.core.memory.read_objects_array_max_count(emu: desmume.emulator.DeSmuME, addr: int = 35108232)[source]¶

Read the maximum number of objects in the global object array.

Parameters:

emu – Emulator instance.
addr – Base address of the object array metadata.

Returns:

Signed integer max count.

src.core.memory.read_objects_array_ptr(emu: desmume.emulator.DeSmuME, addr: int = 35108232)[source]¶

Read the pointer to the global object pointer array.

Parameters:

emu – Emulator instance.
addr – Base address of the object array metadata.

Returns:

Signed integer address of the object pointer array.

src.core.memory.read_object_offset(emu: desmume.emulator.DeSmuME, id: int)[source]¶

Compute the memory offset of an object entry within the array.

Parameters:

emu – Emulator instance.
id – Object index.

Returns:

Integer byte offset to the object’s metadata entry.

src.core.memory.read_object_ptr(emu: desmume.emulator.DeSmuME, id: int)[source]¶

Read the object instance pointer for a given object ID.

Parameters:

emu – Emulator instance.
id – Object index.

Returns:

Integer address of the object struct (0 if null).

src.core.memory.read_object_flags(emu: desmume.emulator.DeSmuME, id: int)[source]¶

Read the object’s flags (type/category bits, state, etc.).

Parameters:

emu – Emulator instance.
id – Object index.

Returns:

Unsigned short flags value.

src.core.memory.read_object_position_ptr(emu: desmume.emulator.DeSmuME, id: int)[source]¶

Read the pointer to an object’s position vector in memory.

Parameters:

emu – Emulator instance.
id – Object index.

Returns:

Integer address for the object’s position struct (0 if deleted).

src.core.memory.read_object_is_ignored(emu: desmume.emulator.DeSmuME, id: int)[source]¶

Determine if an object should be ignored (null or ignored-flag set).

Parameters:

emu – Emulator instance.
id – Object index.

Returns:

True if object ptr is 0 or ignored bit is set; False otherwise.

src.core.memory.read_object_is_deleted(emu: desmume.emulator.DeSmuME, id: int)[source]¶

Check if the object has been deleted (position pointer is null).

Parameters:

emu – Emulator instance.
id – Object index.

Returns:

True if deleted; False otherwise.

src.core.memory.safe_object(func)[source]¶

Decorator that skips object reads when the object appears deleted.

The wrapped function receives (emu, id, *args, **kwargs). If the object is deleted (null position pointer), the wrapper returns None.
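The guard pattern can be sketched as follows. Unlike the real safe_object(func), this hypothetical version takes the deletion check as a parameter so the example is self-contained; the fake pointer table stands in for emulator memory.

```python
import functools


def safe_object_sketch(is_deleted):
    """Build a decorator that returns None when is_deleted(emu, id) is true."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(emu, id, *args, **kwargs):
            if is_deleted(emu, id):
                return None
            return func(emu, id, *args, **kwargs)

        return wrapper

    return decorator


# Hypothetical object table: id -> position pointer (0 means deleted).
FAKE_POSITION_PTRS = {0: 0x1000, 1: 0}


def fake_is_deleted(emu, id):
    return FAKE_POSITION_PTRS[id] == 0


@safe_object_sketch(fake_is_deleted)
def read_position_ptr(emu, id):
    return FAKE_POSITION_PTRS[id]


live = read_position_ptr(None, 0)
deleted = read_position_ptr(None, 1)
```

Callers of wrapped readers should therefore check for None before using the result.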

src.core.memory.read_object_position(emu: desmume.emulator.DeSmuME, id: int, *args, **kwargs)[source]¶: Internal wrapper used by safe_object to guard deleted objects.

src.core.memory.read_map_object_type_id(emu: desmume.emulator.DeSmuME, id: int, *args, **kwargs)[source]¶: Internal wrapper used by safe_object to guard deleted objects.

src.core.memory.read_map_object_is_coin_collected(emu: desmume.emulator.DeSmuME, id: int, *args, **kwargs)[source]¶: Internal wrapper used by safe_object to guard deleted objects.

src.core.memory.read_racer_object_is_ghost(emu: desmume.emulator.DeSmuME, id: int, *args, **kwargs)[source]¶: Internal wrapper used by safe_object to guard deleted objects.

src.core.memory.read_objects(emu: desmume.emulator.DeSmuME)[source]¶

Scan the global object table and group object indices by category.

Categories:

‘map_objects’
‘racer_objects’
‘item_objects’
‘dynamic_objects’

Returns:: Dict[str, list[int]] mapping category name to list of indices.

src.core.memory.read_camera_ptr(emu: desmume.emulator.DeSmuME, addr: int = 35105356)[source]¶

Read the pointer to the active camera structure.

Parameters:

emu – Emulator instance.
addr – Address where the camera pointer is stored.

Returns:

Integer address of the camera struct.

src.core.memory.read_camera_fov(emu: desmume.emulator.DeSmuME)[source]¶

Read the current camera field-of-view (radians).

The FOV value is stored as a 16-bit fixed-point angle; it is converted to radians.

Parameters:: emu – Emulator instance.
Returns:: Floating-point FOV in radians.

src.core.memory.read_camera_aspect(emu: desmume.emulator.DeSmuME)[source]¶

Read the camera aspect ratio from memory.

Parameters:: emu – Emulator instance.
Returns:: Float aspect ratio (width/height).

src.core.memory.read_camera_position(emu: desmume.emulator.DeSmuME, device)[source]¶

Read the camera world position, including elevation offset.

Parameters:

emu – Emulator instance.
device – Torch device for the returned tensor.

Returns:

torch.Tensor shape (3,) representing camera (x, y, z).

src.core.memory.read_camera_target_position(emu: desmume.emulator.DeSmuME, device)[source]¶

Read the camera’s target/look-at position in world space.

Parameters:

emu – Emulator instance.
device – Torch device for the returned tensor.

Returns:

torch.Tensor shape (3,) target (x, y, z).

src.core.memory.read_model_view(emu: desmume.emulator.DeSmuME, device)[source]¶

Compute and cache the camera model-view matrix for the current frame.

Parameters:

emu – Emulator instance.
device – Torch device for returned matrix.

Returns:

torch.Tensor shape (4,4) model-view matrix.

src.core.memory.project_to_screen(emu: desmume.emulator.DeSmuME, points: torch.Tensor, device, screen_dim=(256, 192))[source]¶

Convenience wrapper to project points using the current camera state.

Parameters:

emu – Emulator instance.
points – Tensor shape (N,3) of world-space points.
device – Torch device.

Returns:

Tensor shape (N,4) in screen space (see _project_to_screen).

src.core.memory.read_checkpoint_ptr(emu: desmume.emulator.DeSmuME, addr: int = 35083772)[source]¶

Read the pointer to the checkpoint manager/state.

Parameters:

emu – Emulator instance.
addr – Address where the checkpoint pointer is stored.

Returns:

Integer address for checkpoint data.

src.core.memory.read_current_checkpoint(emu: desmume.emulator.DeSmuME)[source]¶

Read the index of the current checkpoint.

Parameters:: emu – Emulator instance.
Returns:: Unsigned byte checkpoint index.

src.core.memory.read_current_key_checkpoint(emu: desmume.emulator.DeSmuME)[source]¶

Read the current key checkpoint index (special/lap-related).

Parameters:: emu – Emulator instance.
Returns:: Signed byte key checkpoint index.

src.core.memory.read_ghost_checkpoint(emu: desmume.emulator.DeSmuME)[source]¶

Read the recorded ghost’s current checkpoint index.

Parameters:: emu – Emulator instance.
Returns:: Signed byte ghost checkpoint index.

src.core.memory.read_ghost_key_checkpoint(emu: desmume.emulator.DeSmuME)[source]¶

Read the recorded ghost’s current key checkpoint index.

Parameters:: emu – Emulator instance.
Returns:: Signed byte ghost key checkpoint index.

src.core.memory.read_current_lap(emu: desmume.emulator.DeSmuME)[source]¶

Read the current lap number.

Parameters:: emu – Emulator instance.
Returns:: Signed byte lap index (0-based).

src.core.memory.read_next_checkpoint(emu: desmume.emulator.DeSmuME, checkpoint_count: int)[source]¶

Compute the next checkpoint index (wrapping to 0 at the end).

Parameters:

emu – Emulator instance.
checkpoint_count – Total number of checkpoints.

Returns:

Integer index of the next checkpoint.
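The wrap-around arithmetic can be sketched as below. The functions take the current index directly instead of reading it from the emulator, and the modulo formulation is an assumption about how the wrap is implemented.

```python
def next_checkpoint_index(current: int, checkpoint_count: int) -> int:
    """Index after `current`, wrapping to 0 past the last checkpoint."""
    return (current + 1) % checkpoint_count


def previous_checkpoint_index(current: int, checkpoint_count: int) -> int:
    """Index before `current`, wrapping to the last checkpoint from 0."""
    return (current - 1) % checkpoint_count


after_last = next_checkpoint_index(9, 10)
before_first = previous_checkpoint_index(0, 10)
```

Python's modulo returns a non-negative result for a positive divisor, so the backward wrap needs no special case.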

src.core.memory.read_previous_checkpoint(emu: desmume.emulator.DeSmuME, checkpoint_count: int)[source]¶

src.core.memory.read_checkpoint_positions(emu: desmume.emulator.DeSmuME, device)[source]¶

Build a tensor of checkpoint segment endpoints in 3D.

Reads NKM and KCL, extracts floor geometry, and converts checkpoint pairs from 2D to 3D using nearest floor elevation.

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Tensor shape (C, 2, 3) where C is number of checkpoints, containing [p1, p2] endpoints per checkpoint.

src.core.memory.read_next_checkpoint_position(emu: desmume.emulator.DeSmuME, device)[source]¶

Get the 3D endpoints of the next checkpoint segment.

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Tensor shape (2,3) representing the next checkpoint’s [p1, p2].

src.core.memory.read_previous_checkpoint_position(emu: desmume.emulator.DeSmuME, device)[source]¶

Get the 3D endpoints of the previous checkpoint segment.

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Tensor shape (2,3) representing the previous checkpoint’s [p1, p2].

src.core.memory.read_current_checkpoint_position(emu: desmume.emulator.DeSmuME, device)[source]¶

Get the 3D endpoints of the current checkpoint segment.

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Tensor shape (2,3) representing current checkpoint’s [p1, p2].

src.core.memory.read_facing_point_checkpoint(emu: desmume.emulator.DeSmuME, direction: torch.Tensor, device)[source]¶

Raycast from the player along a direction to the next checkpoint line (XZ).

Parameters:

emu – Emulator instance.
direction – Tensor shape (3,) direction vector.
device – Torch device.

Returns:

Tensor shape (3,) point of intersection in world coordinates.
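A ray-versus-line intersection in the XZ plane can be sketched in plain Python as follows. This is an illustration, not the module's code: the function name is hypothetical, it returns only (x, z) instead of a 3D point, and rejecting hits behind the origin is an assumption.

```python
def ray_line_intersection_xz(origin, direction, p1, p2, eps=1e-12):
    """Intersect a ray with the infinite line through p1 and p2, ignoring Y.

    All inputs are (x, y, z) tuples. Returns (x, z), or None when the ray
    is parallel to the line or the hit lies behind the origin.
    """
    ox, oz = origin[0], origin[2]
    dx, dz = direction[0], direction[2]
    ex, ez = p2[0] - p1[0], p2[2] - p1[2]

    denom = dx * ez - dz * ex  # 2D cross product of direction and line edge
    if abs(denom) < eps:
        return None

    wx, wz = p1[0] - ox, p1[2] - oz
    t = (wx * ez - wz * ex) / denom  # distance along the ray, in units of |direction|
    if t < 0:
        return None
    return (ox + t * dx, oz + t * dz)


hit = ray_line_intersection_xz(
    (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 5.0), (1.0, 0.0, 5.0)
)
parallel = ray_line_intersection_xz(
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 5.0), (1.0, 0.0, 5.0)
)
```

When direction is a unit vector, t is the distance from the origin to the checkpoint line along that direction.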

src.core.memory.read_forward_distance_checkpoint(emu, device)[source]¶

Compute forward distance from player to the next checkpoint line.

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Scalar torch.Tensor distance.

src.core.memory.read_left_distance_checkpoint(emu, device)[source]¶

Compute leftward distance from player to the next checkpoint line.

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Scalar torch.Tensor distance.

src.core.memory.read_direction_to_checkpoint(emu: desmume.emulator.DeSmuME, device)[source]¶

Compute a steering angle toward the next checkpoint from forward/left distances.

Angle is computed as atan(forward / left).

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Scalar torch.Tensor angle in radians.

src.core.memory.read_facing_point_obstacle(emu: DeSmuME, position: torch.Tensor | None = None, direction: torch.Tensor | None = None, device=None, **sample_kwargs)[source]¶

Raycast toward walls/offroad and return the nearest hit point.

Samples a cone of directions around the provided (or player) direction, and finds the nearest intersection against wall and offroad triangles.

Parameters:

emu – Emulator instance.
position – Optional world position (3,). Defaults to player’s position.
direction – Optional direction (3,). Defaults to player’s forward vector.
device – Torch device.

Returns:

torch.Tensor shape (3,) hit point, or None if no intersections.

src.core.memory.read_closest_obstacle_point(emu: DeSmuME, position: torch.Tensor | None = None, direction: torch.Tensor | None = None, device=None, **sample_kwargs) → torch.Tensor | None[source]¶

src.core.memory.read_forward_distance_obstacle(emu: desmume.emulator.DeSmuME, device=None, **sample_kwargs) → torch.Tensor[source]¶

Compute forward distance to the nearest wall/offroad obstacle.

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Scalar torch.Tensor distance; +inf if no hit.

src.core.memory.read_left_distance_obstacle(emu: desmume.emulator.DeSmuME, device=None, **sample_kwargs) → torch.Tensor[source]¶

Compute leftward distance to the nearest wall/offroad obstacle.

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Scalar torch.Tensor distance; +inf if no hit.

src.core.memory.read_right_distance_obstacle(emu: desmume.emulator.DeSmuME, device=None, **sample_kwargs) → torch.Tensor[source]¶

Compute rightward distance to the nearest wall/offroad obstacle.

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Scalar torch.Tensor distance; +inf if no hit.

src.core.memory.read_checkpoint_distance_altitude(emu: desmume.emulator.DeSmuME, device) → torch.Tensor[source]¶

Compute the altitude (height) of the triangle formed by player and checkpoint endpoints.

Uses the two checkpoint endpoints and the player’s position to form sides a and b, then returns the triangle altitude via triangle_altitude(a, b).

Parameters:

emu – Emulator instance.
device – Torch device.

Returns:

Scalar torch.Tensor altitude value.
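One common way to compute such an altitude is to divide twice the triangle's area by the base length. The sketch below assumes a and b are the vectors from the player to each checkpoint endpoint and that the base is the checkpoint segment; the actual triangle_altitude helper may use a different formulation.

```python
import math


def sub(a, b):
    """Component-wise difference of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def triangle_altitude_sketch(a, b):
    """Height over the base (a - b): |a × b| is twice the triangle area."""
    return norm(cross(a, b)) / norm(sub(a, b))


player = (0.0, 0.0, 0.0)
p1, p2 = (-1.0, 0.0, 2.0), (1.0, 0.0, 2.0)
altitude = triangle_altitude_sketch(sub(p1, player), sub(p2, player))
```

Geometrically this is the perpendicular distance from the player to the line through the two checkpoint endpoints.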

src.core.memory.read_touching_prism_type(emu: desmume.emulator.DeSmuME, attr_mask: Callable[[torch.Tensor], torch.Tensor], device) → bool[source]¶

src.core.memory.read_mat_c(emu: desmume.emulator.DeSmuME, device=None)[source]¶

src.core.memory.read_pos_c(emu: desmume.emulator.DeSmuME, device=None)[source]¶

src.core.memory.read_driver_pos_c(emu: desmume.emulator.DeSmuME, device=None)[source]¶

src.core.metric module¶

Pluggable metric collectors and fitness scoring for MKDS training.

This module defines a small, picklable interface for episode-level metrics that can be attached to the emulator loop without modifying the core trainer. Each metric implements a three-phase lifecycle:

reset() — called once at the start of an episode to clear state.
update(emu, device) — called every frame to accumulate data.
collect() — called once at episode end to return scalar summaries.

Design notes:

Picklability: All metric implementations are top-level classes with minimal state (floats/ints/tensors) so they can be sent to worker processes via multiprocessing (spawn or fork).
Independence: Metrics do not modify emulator state or controls; they only observe via functions provided by src.core.memory.
Units:
- Distances are in MKDS “world units” (derived from FX32).
- Time: read_clock(emu) returns centiseconds (10 ms units).
- If you compute speed as distance / centiseconds, multiply by 100 to convert to per-second rates.

Helper functions reset_all(…) and collect_all(…) operate on a list of Metric instances to simplify orchestration. A FitnessScorer protocol is provided to decouple metric collection from scalar fitness scoring.

Typical usage:

>>> metrics = [DistanceMetric(), OffroadMetric()]
>>> reset_all(metrics)
>>> while episode_running:
...     # run emulator step, inference, controls, etc.
...     for m in metrics:
...         m.update(emu, device=device)
>>> summary = collect_all(metrics)  # {"distance": ..., "offroad_dist": ...}
>>> fitness = default_fitness_scorer(summary)  # plug into selection

class src.core.metric.Metric[source]¶

Bases: object

Interface for any metric collector used during training.

Each metric observes emulator state every frame and exposes a scalar summary at episode end. Implementations should avoid heavy allocations or device transfers in update, and should keep internal state simple so the object remains cheap to pickle across processes.

abstractmethod reset() → None[source]¶

Reset internal state at the start of an episode.

Implementations must clear any rolling counters, cached positions, or flags so that a fresh episode starts from a known baseline. This method is called exactly once per episode, prior to the first update.

abstractmethod update(emu: desmume.emulator.DeSmuME, device) → None[source]¶

Record or accumulate values for the current frame.

This method is called once per emulator tick/frame. It should read any required game state via src.core.memory helpers and update internal counters accordingly.

Parameters:

emu – Live DeSmuME emulator instance for the current process.
device – Optional Torch device hint if your reads/ops require a specific device. Many src.core.memory functions accept this.

abstractmethod collect() → dict[str, float][source]¶

Return scalar summary values at the end of the episode.

Implementations should convert any tensor values to Python floats and return them under stable, unique keys. The convention in this module is to return exactly one key/value pair per metric.

Returns:: A dictionary mapping the metric’s name to a scalar float value.
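The three-phase lifecycle can be illustrated with a small custom metric. MetricSketch mirrors the interface described above but is a local, hypothetical stand-in so the example runs without the emulator; FrameCountMetric is not part of the module.

```python
from abc import ABC, abstractmethod


class MetricSketch(ABC):
    """Local stand-in for the Metric interface (reset / update / collect)."""

    @abstractmethod
    def reset(self) -> None: ...

    @abstractmethod
    def update(self, emu, device=None) -> None: ...

    @abstractmethod
    def collect(self) -> dict[str, float]: ...


class FrameCountMetric(MetricSketch):
    """Counts how many frames were observed during the episode."""

    def __init__(self):
        self.reset()

    def reset(self) -> None:
        self.frames = 0

    def update(self, emu, device=None) -> None:
        # A real metric would read game state here via src.core.memory helpers.
        self.frames += 1

    def collect(self) -> dict[str, float]:
        return {"frames": float(self.frames)}


metric = FrameCountMetric()
for _ in range(3):
    metric.update(emu=None)
summary = metric.collect()
```

Keeping the state to a single integer follows the picklability guidance: nothing in the object references the emulator or a device.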

class src.core.metric.DistanceMetric[source]¶

Bases: Metric

Signed checkpoint-to-checkpoint progress along the course.

This metric treats the checkpoint chain as a polyline and accumulates signed distance when crossing from one checkpoint segment to an adjacent one:

Moving forward (current ➔ next) adds the midpoint-to-midpoint segment length.
Moving backward (current ➔ previous) subtracts that length.

This approach is:

Robust to FPS: updates only on checkpoint transitions, not per-frame arc length integration.
Direction-aware: penalizes reversing or losing progress.
Course-agnostic: relies only on NKM checkpoints you already load.

Note

The accumulated value is a coarse approximation of path length since it uses straight-line distances between segment midpoints, not the exact racing line.
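The midpoint-to-midpoint accounting can be sketched as follows. The function names are hypothetical, checkpoints are plain tuples instead of a (C, 2, 3) tensor, and returning 0.0 for non-adjacent jumps is an assumption not stated in the text.

```python
import math


def midpoint(segment):
    """Midpoint of a checkpoint segment given as (p1, p2)."""
    p1, p2 = segment
    return tuple((a + b) / 2.0 for a, b in zip(p1, p2))


def signed_step(checkpoints, old_idx, new_idx):
    """Signed progress for a transition between adjacent checkpoints."""
    count = len(checkpoints)
    length = math.dist(midpoint(checkpoints[old_idx]), midpoint(checkpoints[new_idx]))
    if new_idx == (old_idx + 1) % count:
        return length       # moved forward
    if new_idx == (old_idx - 1) % count:
        return -length      # moved backward
    return 0.0              # assumption: ignore non-adjacent jumps


checkpoints = [
    ((-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)),     # midpoint (0, 0, 0)
    ((-1.0, 0.0, 10.0), (1.0, 0.0, 10.0)),   # midpoint (0, 0, 10)
    ((10.0, 0.0, 9.0), (10.0, 0.0, 11.0)),   # midpoint (10, 0, 10)
]
forward = signed_step(checkpoints, 0, 1)
backward = signed_step(checkpoints, 1, 0)
```

Summing these steps over an episode gives the coarse polyline progress described in the note above.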

reset() → None[source]¶: Clear progress counters and current checkpoint ID.

update(emu: DeSmuME, device: DeviceLikeType | None = None) → None[source]¶

Accumulate signed progress on checkpoint transitions.

Logic:

On first call, caches curr/prev/next checkpoint IDs and returns.
If the player remains on the same checkpoint, does nothing.
If the checkpoint index changes:
- Compute the midpoint of the previous segment and the midpoint of the newly-entered segment.
- Add their distance if we moved forward (curr ➔ next).
- Subtract if we moved backward (curr ➔ prev).
- Update the cached curr/prev/next IDs.

Parameters:

emu – Emulator instance.
device – Optional Torch device used by memory readers.

Raises:

AssertionError – If midpoints could not be computed (should not happen under normal circumstances given valid NKM data).

collect() → dict[str, float][source]¶

Return the signed checkpoint progress accumulated this episode.

Returns:

A dict with a single key, “distance”: signed scalar world-units progressed, positive when moving forward through checkpoints, negative when moving backward.

class src.core.metric.SpeedMetric(distance_metric: DistanceMetric)[source]¶

Bases: Metric

Average episode speed computed from a DistanceMetric.

This metric divides the total signed distance reported by a companion DistanceMetric by the elapsed clock time between the first observed checkpoint and the episode end.

Important

read_clock(emu) returns centiseconds. As implemented, the resulting speed is in world-units per centisecond. Multiply by 100.0 to obtain world-units per second.
This is an episode-level average; it does not reflect instantaneous speed or time spent stationary before the first checkpoint crossing.
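The unit handling can be made explicit with a short sketch. Both helpers are hypothetical, and returning 0.0 when no time has elapsed is an assumption about how a zero denominator might be handled.

```python
def average_speed_per_cs(distance: float, start_cs: int, now_cs: int) -> float:
    """Average speed in world units per centisecond."""
    elapsed = now_cs - start_cs
    if elapsed <= 0:
        return 0.0  # assumption: avoid dividing by zero before time advances
    return distance / elapsed


def per_second(speed_per_cs: float) -> float:
    """Convert world units per centisecond to world units per second."""
    return speed_per_cs * 100.0


speed_cs = average_speed_per_cs(500.0, start_cs=100, now_cs=350)
speed_s = per_second(speed_cs)
```

Here 500 world units over 250 centiseconds gives 2.0 units per centisecond, or 200.0 units per second.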

reset()[source]¶: Clear the time baseline and current speed.

update(emu: DeSmuME, device: DeviceLikeType | None = None) → None[source]¶

Update the average speed from current distance and clock.

Behavior:

Before any checkpoint is observed by DistanceMetric, captures a start_time and waits.
Afterward, divides the current distance_metric.dist by (read_clock(emu) - start_time) (centiseconds).

Parameters:

emu – Emulator instance.
device – Unused; present for interface parity.

collect() → dict[str, float][source]¶

Return the episode-average speed.

Returns:

A dict with a single key, “speed”: average speed in world-units per centisecond. Multiply by 100 to convert to world-units per second.

class src.core.metric.OffroadMetric[source]¶

Bases: Metric

Approximate distance traveled while the kart is offroad.

This metric toggles an internal is_offroad flag based on the ground prism’s collision type under the player (using read_touching_prism_type with an attribute mask matching known offroad IDs {2, 3, 5}).

When the kart transitions:

Onroad ➔ Offroad: caches the current position.
Offroad ➔ Onroad: adds the straight-line displacement between the cached entry position and the current position to offroad_dist.

Caveat:

The accumulated distance is a straight-line approximation between entry and exit points; it does not integrate arc length over time while offroad. For tight zig-zags offroad, it will underestimate true path length.
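
To make the caveat concrete, the sketch below compares the straight-line displacement between an entry and exit point with the summed length of a zig-zag path between them. The sample points are invented for illustration.

```python
import math


def path_length(points):
    """Sum of segment lengths along a polyline."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def straight_line(points):
    """Displacement between the first and last point only."""
    return math.dist(points[0], points[-1])


# Invented zig-zag in the XZ plane (Y is up).
zigzag = [(0.0, 0.0, 0.0), (3.0, 0.0, 4.0), (6.0, 0.0, 0.0), (9.0, 0.0, 4.0)]
true_length = path_length(zigzag)   # three segments of length 5
approx = straight_line(zigzag)      # sqrt(97), shorter than the true path
```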

reset() → None[source]¶: Clear cached position, offroad distance, and offroad state.

update(emu: DeSmuME, device: DeviceLikeType | None = None)[source]¶

Update offroad tracking based on current surface attribute.

Parameters:

emu – Emulator instance.
device – Torch device forwarded to memory helpers.

Behavior:

If this is the first frame, caches position and returns.
Determines if the current surface is offroad via an attribute mask: (attr == 3) | (attr == 2) | (attr == 5).
If the offroad state is unchanged, does nothing.
On entering offroad, caches the current position.
On exiting offroad, adds the straight-line distance between the cached entry position and the current position to offroad_dist, then updates is_offroad.
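
The behavior listed above can be sketched as a small state machine. This is a minimal stand-in that takes a position tuple and a surface attribute directly, rather than reading them from the emulator; the class name and its argument list are hypothetical.

```python
import math

OFFROAD_ATTRS = {2, 3, 5}  # offroad IDs named in the class description


class OffroadTrackerSketch:
    """Illustrative stand-in for the documented update() behavior."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.prev_pos = None
        self.is_offroad = False
        self.offroad_dist = 0.0

    def update(self, pos, attr):
        if self.prev_pos is None:
            # First frame: cache the position and return.
            self.prev_pos = pos
            return
        now_offroad = attr in OFFROAD_ATTRS
        if now_offroad == self.is_offroad:
            return  # state unchanged
        if now_offroad:
            self.prev_pos = pos  # entering offroad: remember entry point
        else:
            # Exiting offroad: add the straight-line displacement.
            self.offroad_dist += math.dist(self.prev_pos, pos)
        self.is_offroad = now_offroad


tracker = OffroadTrackerSketch()
tracker.update((0.0, 0.0, 0.0), 0)  # first frame
tracker.update((1.0, 0.0, 0.0), 2)  # enter offroad
tracker.update((2.0, 0.0, 5.0), 3)  # still offroad
tracker.update((4.0, 0.0, 4.0), 0)  # exit offroad
```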

collect() → dict[str, float][source]¶

Return the approximate offroad distance traveled.

Returns:

“offroad_dist”: Approximate straight-line distance covered while offroad during the episode, in world units.

Return type:

A dict with a single key

src.core.metric.collect_all(metrics: list[Metric])[source]¶

Collect scalar summaries from a list of metrics.

This is a convenience that merges per-metric outputs into a single flat dictionary. It assumes each metric returns exactly one key/value pair and uses the first item of the returned dict.

Parameters: metrics – List of initialized metric instances that have already been reset() and update(…)-ed for the episode.
Returns: A dictionary mapping metric names to scalar floats, e.g. {"distance": 123.4, "offroad_dist": 5.6}.

Example

>>> collect_all([DistanceMetric(), OffroadMetric()])
{'distance': 0.0, 'offroad_dist': 0.0}
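
The merge described above might look roughly like the following. The stub metric class and the function name are invented for this sketch; only the "take the first item of each returned dict" rule comes from the description.

```python
class ConstantMetric:
    """Hypothetical stub that reports a fixed value under one key."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def collect(self):
        return {self.name: self.value}


def collect_all_sketch(metrics):
    merged = {}
    for metric in metrics:
        # Use only the first key/value pair of each metric's output.
        key, value = next(iter(metric.collect().items()))
        merged[key] = value
    return merged


summary = collect_all_sketch(
    [ConstantMetric("distance", 123.4), ConstantMetric("offroad_dist", 5.6)]
)
```

Under this rule, two metrics that report the same key would overwrite each other, so distinct key names matter.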

src.core.metric.reset_all(metrics: list[Metric])[source]¶

Reset all metrics in-place.

Calls reset() on each metric in order. Useful at the start of every episode to ensure clean state.

Parameters: metrics – List of metric instances to be reset.

class src.core.metric.FitnessScorer(*args, **kwargs)[source]¶

Bases: Protocol

Callable protocol for converting metric dicts into scalar fitness.

A FitnessScorer consumes the merged output of collect_all(…) and returns a single float suitable for selection and ranking in an evolutionary or RL loop.

Example

>>> def scorer(m): return m['distance'] - 10.0*m.get('offroad_dist', 0.0)
>>> isinstance(scorer, FitnessScorer)
True
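
For readers unfamiliar with callable protocols, here is a self-contained sketch of how such a check can work. ScorerProtocol is a hypothetical local stand-in, not the module's FitnessScorer; it is marked runtime_checkable because isinstance() against a Protocol requires that decorator.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ScorerProtocol(Protocol):
    """Hypothetical stand-in: any callable from metric dict to float."""

    def __call__(self, metrics: dict[str, float]) -> float: ...


def penalized_scorer(metrics: dict[str, float]) -> float:
    # Reward distance, penalize offroad travel (weight chosen arbitrarily).
    return metrics["distance"] - 10.0 * metrics.get("offroad_dist", 0.0)


score = penalized_scorer({"distance": 100.0, "offroad_dist": 2.0})
is_scorer = isinstance(penalized_scorer, ScorerProtocol)
```

Note that a runtime protocol check only verifies that the object is callable; it does not validate the argument or return types.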

src.core.metric.default_fitness_scorer(metrics: dict[str, float]) → float[source]¶

Return a distance-only fitness score.

This default scorer simply returns the “distance” metric. It is intended as a minimal baseline and may raise a KeyError if the “distance” key is absent.

Parameters: metrics – Flat dictionary of scalar metrics.
Returns: The value associated with the “distance” key.
Raises: KeyError – If “distance” is not present in metrics.

src.core.model module¶

class src.core.model.NodeGene(nid, ntype)[source]¶: Bases: object

class src.core.model.ConnGene(in_id, out_id, w, enabled=True)[source]¶: Bases: object

class src.core.model.Genome(n_inputs, n_outputs, device: DeviceLikeType | None = None)[source]¶

Bases: object

mutate_weight()[source]¶

mutate_add_conn()[source]¶

mutate_add_node()[source]¶

class src.core.model.EvolvedNet(*args: Any, **kwargs: Any)[source]¶

Bases: Module

forward(x)[source]¶

class src.core.model.JSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶

Bases: JSONEncoder

default(o)[source]¶

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return super().default(o)

src.core.model.load_genome(file_path: str | Path)[source]¶

src.core.model.save_genome(genome: Genome, file_path: str | Path)[source]¶

src.core.train module¶

Parallel trainer for Mario Kart DS agents using DeSmuME, multiprocessing, shared memory frame streaming, and optional live GTK visualization.

This module orchestrates end-to-end evaluation and evolution of a population of neural network controllers (NEAT-style) for Mario Kart DS. It supports three execution modes per evaluated individual:

Headless (run_process) — fast evaluation with no display.

Display worker (run_window_process) — renders frames and writes them into a per-process shared-memory buffer; no GTK loop.

Display host (run_window_host_process) — renders frames, writes them into shared memory, and owns the GTK window that tiles and presents all display-enabled workers in real time.

Key concepts¶

Shared memory frames: Each display-enabled process writes an RGBX framebuffer of shape (SCREEN_HEIGHT, SCREEN_WIDTH, 4) (dtype np.uint8) to a POSIX shared-memory segment named f"emu_frame_{id}". The host window process opens these buffers read-only for display tiling.
Overlays: Optional per-frame overlays are computed off the main emulation loop by a single background thread fed via a queue. Overlays are composited in the worker before writing to shared memory using src.visualization.window.on_draw_memoryview().
Statistics / fitness: Each process records split times and distances at track checkpoints as a dict[int, list[tuple[float, float]]] mapping checkpoint_id -> [(delta_time, distance_at_split), ...]. A simple fitness function sums the recorded distances.
Batching & evolution: run_training_session() evaluates a subset (batch) of the population in parallel (bounded by num_proc), aggregates stats, then train() evolves the population.
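
The shared-memory frame layout can be illustrated with the standard library alone. This sketch assumes the 256×192 screen size given in the memory module's documentation, uses a demo segment name that is not the module's naming scheme, and works on raw bytes instead of a NumPy view to stay dependency-free.

```python
import os
from multiprocessing import shared_memory

SCREEN_WIDTH, SCREEN_HEIGHT = 256, 192  # DS top screen
BYTES_PER_PIXEL = 4                     # RGBX
FRAME_BYTES = SCREEN_HEIGHT * SCREEN_WIDTH * BYTES_PER_PIXEL


def pixel_offset(row: int, col: int) -> int:
    """Byte offset of a pixel in a row-major (H, W, 4) uint8 frame."""
    return (row * SCREEN_WIDTH + col) * BYTES_PER_PIXEL


name = f"frame_layout_demo_{os.getpid()}"  # demo name, not emu_frame_{id}
writer = shared_memory.SharedMemory(name=name, create=True, size=FRAME_BYTES)
try:
    off = pixel_offset(10, 20)
    writer.buf[off:off + 4] = bytes([255, 128, 0, 0])  # one RGBX pixel
    reader = shared_memory.SharedMemory(name=name)     # attach by name
    try:
        pixel = bytes(reader.buf[off:off + 4])
    finally:
        reader.close()
finally:
    writer.close()
    writer.unlink()
```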

Threading & processes¶

The DeSmuME emulator is created and used inside each process that runs it.
The GTK main loop must run in a single process. This module designates one display-enabled process per batch as the host that creates the window and drives GTK via GLib.timeout_add.
Overlays are computed by a single background thread (daemon) within each display-enabled process to keep the emulation loop responsive.

Shared-memory lifetime¶

Creation: Call safe_shared_memory() to create (or replace) a named shared-memory segment.
Ownership: Workers open their frame buffer (emu_frame_{id}) by name and keep a persistent SharedMemory handle as long as they render frames.
Teardown: After processes finish, the parent should close and unlink per-process frame segments to avoid resource-tracker warnings.

Examples

Run 10 generations with a population of 32 where only one sample is displayed each batch and overlays with IDs 0, 3, and 4 are enabled:

>>> if __name__ == "__main__":
...     train(
...         num_iters=10,
...         pop_size=32,
...         show_samples=[False],   # broadcast later per batch
...         overlay_ids=[0, 3, 4],
...     )

Notes

This module expects the mariokart_ds.nds ROM to be available in the working directory and a valid savestate at index 3.
on_draw_memoryview expects an emulator-provided RGBX memory buffer (4 bytes/pixel), and returns a premultiplied ARGB32 array suitable for Cairo.
MODEL_KEY_MAP defines a simple thresholded policy: values >= 0.5 are pressed, and the accelerator button is always pressed when any action is taken.

class src.core.train.EmulatorProcessConfig[source]¶

Bases: TypedDict

id: int¶

sample: Genome¶

host: bool¶

show: bool¶

class src.core.train.EmulatorBatchConfig[source]¶

Bases: TypedDict

size: int¶

display_shm_names: list[str]¶

device: DeviceLikeType | None¶

overlay_ids: list[int]¶

metric_factories: list[Callable[[], Metric]]¶

class src.core.train.CheckpointRecord[source]¶

Bases: object

id: int¶

times: list[float]¶

dists: list[float]¶

src.core.train.safe_shared_memory(name: str, size: int)[source]¶

Create or replace a named POSIX shared-memory segment.

This helper guarantees that a shared-memory block with the given name exists with the requested size. If a stale block exists (e.g., from an earlier crashed run), it is closed and unlinked before creating a fresh one.

Parameters:

name – Symbolic name of the shared memory region (e.g., "emu_frame_0").
size – Size in bytes to allocate for the region.

Returns:

An opened handle to the new shared-memory block. The caller owns the handle and is responsible for closing it (and unlinking at teardown time).

Return type:

multiprocessing.shared_memory.SharedMemory

Raises:

ValueError – If size <= 0.
OSError – If the OS cannot allocate or map the segment.

Side Effects:

May unlink an existing segment of the same name.
Creates a new segment in the system shared-memory namespace.
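
A create-or-replace helper along these lines could be written as follows. This is a sketch under the assumption that a stale segment is detected by attempting to attach to it by name; the function name and the demo segment name are hypothetical, and the real implementation may differ.

```python
import os
from multiprocessing import shared_memory


def create_or_replace(name: str, size: int) -> shared_memory.SharedMemory:
    if size <= 0:
        raise ValueError("size must be positive")
    try:
        stale = shared_memory.SharedMemory(name=name)  # attach if present
    except FileNotFoundError:
        pass  # nothing to clean up
    else:
        stale.close()
        stale.unlink()  # remove the leftover segment
    return shared_memory.SharedMemory(name=name, create=True, size=size)


demo_name = f"create_or_replace_demo_{os.getpid()}"
first = create_or_replace(demo_name, 64)
first.close()  # simulate a handle that was closed but never unlinked
second = create_or_replace(demo_name, 128)
replaced_size = second.size  # may be rounded up by the OS
second.close()
second.unlink()
```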

src.core.train.initialize_emulator() → desmume.emulator.DeSmuME[source]¶

Initialize and prime a DeSmuME emulator instance.

Loads the MKDS ROM, restores a savestate (slot 3), mutes audio, and cycles once to ensure memory is initialized. Then spins until the emulator reports it is running.

Returns: A ready-to-use emulator instance positioned at the savestate.
Return type: DeSmuME

Notes

This function blocks until emu.is_running() returns True.
The ROM path "mariokart_ds.nds" and savestate index are hard-coded.

src.core.train.initialize_window(emu, config: EmulatorProcessConfig, batch_config: EmulatorBatchConfig) → SharedEmulatorWindow | None[source]¶

Create and initialize the tiled GTK window for live visualization.

Computes a near-square grid (n_rows × n_cols) based on the number of display-enabled processes, instantiates a renderer bound to emu, and returns a SharedEmulatorWindow configured to read from the provided shared-memory frame names.

Parameters:

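emu – Active DeSmuME instance (used to build the renderer).
config – Per-process settings (EmulatorProcessConfig).
batch_config – Batch-level settings (EmulatorBatchConfig); display_shm_names lists the shared-memory segment names ("emu_frame_{id}") of the display-enabled workers to tile.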

Returns:

GTK window object ready to be shown.

Return type:

SharedEmulatorWindow

Side Effects:

Initializes a GTK/Cairo renderer via AbstractRenderer.
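
The exact grid computation is not shown in this documentation. One common way to obtain a near-square layout, given here purely as an assumption, is to take the ceiling of the square root for the column count and derive the rows from it:

```python
import math


def near_square_grid(count: int) -> tuple[int, int]:
    """Return (n_rows, n_cols) for tiling `count` frames (hypothetical helper)."""
    if count <= 0:
        return (0, 0)
    n_cols = math.ceil(math.sqrt(count))
    n_rows = math.ceil(count / n_cols)
    return (n_rows, n_cols)


layouts = {n: near_square_grid(n) for n in (1, 4, 5, 7)}
```

With this rule the grid always has at least `count` cells, and the row and column counts differ by at most one.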

src.core.train.initialize_overlays(config: EmulatorProcessConfig, batch_config: EmulatorBatchConfig) → Queue | None[source]¶

Start a background overlay thread and return its work queue.

Given a list of overlay IDs, looks them up in AVAILABLE_OVERLAYS and starts a single daemon thread that consumes DeSmuME instances from a queue and applies the overlays. The queue is returned to the caller, which uses it to submit per-frame overlay requests.

Parameters:

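config – Per-process settings (EmulatorProcessConfig).
batch_config – Batch-level settings (EmulatorBatchConfig); overlay_ids lists the overlay identifiers to enable (indexes into AVAILABLE_OVERLAYS), and device is the Torch device on which overlay computations (if any) should run.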

Returns:

If overlay_ids is non-empty, a Queue into which the caller should put(emu) once per frame, and put(None) on shutdown. Returns None when overlay_ids is empty.

Return type:

Queue | None

Notes

The overlay worker catches exceptions per overlay and propagates a summarized error message on failure via safe_thread().
Overlays are executed off the emulation thread to avoid jitter.
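
The queue-plus-daemon-thread pattern described here can be sketched with the standard library. In this illustration the handlers are plain callables and failures are printed to stderr; both are assumptions for the example, whereas the module routes errors through safe_thread().

```python
import queue
import sys
import threading


def start_overlay_worker(handlers):
    """Return a work queue, or None when there is nothing to run."""
    if not handlers:
        return None
    work = queue.Queue()

    def loop():
        while True:
            item = work.get()
            try:
                if item is None:  # shutdown sentinel
                    return
                for handler in handlers:
                    try:
                        handler(item)
                    except Exception as exc:
                        # One failing overlay should not stop the others.
                        print(f"overlay failed: {exc}", file=sys.stderr)
            finally:
                work.task_done()

    threading.Thread(target=loop, daemon=True).start()
    return work


seen = []
frames = start_overlay_worker([seen.append])
frames.put("frame-1")
frames.put("frame-2")
frames.put(None)  # ask the worker to exit
frames.join()     # wait until every queued item was processed
```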

src.core.train.handle_controls(emu: desmume.emulator.DeSmuME, logits: torch.Tensor, max_frame_with_noise: int = 0)[source]¶

Apply model outputs to emulator controls with a simple threshold policy.

All values >= 0.5 are considered pressed for the corresponding MODEL_KEY_MAP entry. Additionally, when any action is pressed, the accelerator (mapped to MODEL_KEY_MAP[5]) is also pressed to keep the kart moving.

Parameters:

emu – Active emulator instance whose keypad state will be updated.
logits – 1D tensor of action activations aligned with MODEL_KEY_MAP.

Side Effects:

Calls emu.input.keypad_update(0) and emu.input.keypad_add_key(...) multiple times.
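
The threshold policy can be illustrated without an emulator. In the sketch below the key labels are placeholders rather than the real MODEL_KEY_MAP contents, and the function returns the keys to press instead of calling the keypad API.

```python
# Placeholder labels; index 5 plays the role of the accelerator.
DEMO_KEY_MAP = ["LEFT", "RIGHT", "UP", "DOWN", "B", "A"]


def keys_to_press(activations, key_map, threshold=0.5, accel_index=5):
    """Select keys whose activation reaches the threshold."""
    pressed = [key_map[i] for i, v in enumerate(activations) if v >= threshold]
    # When any action is pressed, also hold the accelerator.
    if pressed and key_map[accel_index] not in pressed:
        pressed.append(key_map[accel_index])
    return pressed


steer_left = keys_to_press([0.9, 0.1, 0.2, 0.0, 0.3, 0.4], DEMO_KEY_MAP)
idle = keys_to_press([0.1, 0.1, 0.2, 0.0, 0.3, 0.4], DEMO_KEY_MAP)
```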

src.core.train.initialize_model(emu: desmume.emulator.DeSmuME, config: EmulatorProcessConfig, batch_config: EmulatorBatchConfig)[source]¶

src.core.train.safe_thread(func, proc_id, thread_id=0)[source]¶

Wrap a function for background execution with nicer error reporting.

The returned wrapper calls func(*args, **kwargs) and converts any exception into a concise message identifying the logical process and thread of origin.

Parameters:

func – Callable to wrap.
proc_id – Integer process identifier for error messages.
thread_id – Integer thread identifier for error messages.

Returns:

A new callable with identical signature that raises a concise Exception on failure.

Return type:

Callable
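
A wrapper of this kind is commonly built with functools.wraps. The following is a minimal sketch; the message format and the function name are assumptions, not the module's actual output.

```python
import functools


def safe_thread_sketch(func, proc_id, thread_id=0):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # Re-raise with the logical origin prepended; keep the cause.
            raise Exception(
                f"[proc {proc_id} / thread {thread_id}] "
                f"{type(exc).__name__}: {exc}"
            ) from exc

    return wrapper


def boom():
    raise ValueError("bad overlay")


try:
    safe_thread_sketch(boom, proc_id=3, thread_id=1)()
    message = ""
except Exception as err:
    message = str(err)
```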

src.core.train.send_window_end_signal(id)[source]¶

Zero a per-process frame buffer to signal the host window to exit.

Parameters: id – Process index whose frame buffer should be cleared.

Side Effects:

Writes zeros into the shared frame emu_frame_{id}, which is used by the host window’s polling logic to detect end-of-batch.

src.core.train.get_forward_func(emu: DeSmuME, model: EvolvedNet, device: DeviceLikeType | None = None)[source]¶

Build a closure that performs one model step.

The returned callable reads emulator memory for sensor inputs, constructs the model input vector, and computes the control logits. When a terminal condition is reached (e.g. clock > 10000), it returns None instead of logits.

Parameters:

emu – Active emulator instance to read game state from.
model – Evolved network to evaluate (expects 6 inputs → action logits).
device – Torch device on which tensors are constructed and the model runs.

Returns:

A no-argument function that returns either a 1D tensor of action logits or None, signaling the end of this individual’s run.

Return type:

Callable[[], torch.Tensor | None]

Sensor model:

Distances: forward/left/right obstacle distances are read and mapped through tanh(1 - d / s1) with s1 = 60.0 to compress range.
Angles: direction to the next checkpoint as (cos θ, sin θ, -sin θ).
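
The two feature transforms can be sketched in plain Python. The ordering of the six inputs in build_inputs is an assumption made for this example, as are the helper names.

```python
import math

S1 = 60.0  # distance scale from the sensor model


def compress_distance(d: float, s1: float = S1) -> float:
    """Map a raw distance through tanh(1 - d / s1)."""
    return math.tanh(1.0 - d / s1)


def angle_features(theta: float) -> tuple[float, float, float]:
    """Encode an angle as (cos, sin, -sin)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c, s, -s)


def build_inputs(forward, left, right, theta):
    # Assumed ordering: three distances, then three angle features.
    return [
        compress_distance(forward),
        compress_distance(left),
        compress_distance(right),
        *angle_features(theta),
    ]


inputs = build_inputs(forward=60.0, left=0.0, right=600.0, theta=math.pi / 2)
```

The transform is positive for obstacles closer than s1, zero at exactly s1, and approaches -1 for distant obstacles, so nearby walls produce the larger values.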

Notes

Checkpoint bookkeeping appends tuples of (delta_time, distance_at_split).
This function reads directly from emulator memory via utility helpers.

src.core.train.run_training_batch(batch_population: list[Genome], show_samples: list[bool], training_stats: DictProxy[int, dict[str, float]], training_stats_lock, batch_config: EmulatorBatchConfig)[source]¶

Evaluate a batch of genomes concurrently, optionally with live display.

One process in the batch is promoted to the display host (first True in show_samples) and creates a tiled GTK window that reads from the per-process shared-memory frames listed in batch_config["display_shm_names"]. Additional True entries run as display workers; False entries run headless.

Parameters:

batch_population – Slice of the population to evaluate in this batch.
show_samples – Per-individual flags controlling display mode. The first True entry becomes the host, any further True entries run as display workers, and an all-False list yields a fully headless batch.
training_stats – Manager dict where each process writes a stats dict under its local batch index.
training_stats_lock – IPC lock for synchronized writes to training_stats.
batch_config – Batch-level settings (EmulatorBatchConfig), including the enabled overlay_ids.

Side Effects:

Creates per-process shared-memory frame segments for display-enabled individuals.
Spawns processes with appropriate targets and joins them.

src.core.train.run_training_session(pop: list[Genome], num_proc: int | None = None, show_samples: list[bool] = [True], overlay_ids: list[int] = [], metric_factories: list[Callable[[], Metric]] = [], device: DeviceLikeType | None = None) → dict[int, dict[str, float]][source]¶

Evaluate the full population in parallel batches and collect statistics.

The population is partitioned into batches of size min(num_proc, remaining). Each batch is launched via run_training_batch(), which returns once all processes in the batch have completed and their stats have been merged.

Parameters:

pop – Full population of genomes to evaluate.
num_proc – Maximum number of concurrent processes. If None, uses os.cpu_count() - 1.
show_samples – List of booleans determining which individuals in each batch should display; a one-element list (e.g., [False]) is broadcast to the batch size on each iteration.
overlay_ids – Overlay identifiers to enable in display-enabled processes.

Returns:

Mapping of global population index to that individual’s stats dict.

Return type:

dict[int, dict[str, float]]

Notes

This function uses a multiprocessing.Manager dict so that per-process stats can be retrieved without explicit pipes or queues.
Shared-memory frame buffers are currently not unlinked here; consider cleaning them in a higher-level teardown if needed.
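
The batching and flag broadcasting described above can be sketched as follows. Both helper names are hypothetical, and the handling of a multi-element show_samples list (truncating to the batch size) is an assumption.

```python
def iter_batches(pop, num_proc):
    """Yield (global_start_index, batch) with batch size min(num_proc, remaining)."""
    start = 0
    while start < len(pop):
        size = min(num_proc, len(pop) - start)
        yield start, pop[start:start + size]
        start += size


def broadcast_flags(show_samples, size):
    """Expand a one-element flag list to the batch size."""
    if len(show_samples) == 1:
        return show_samples * size
    return list(show_samples[:size])  # assumption: truncate longer lists


batches = list(iter_batches(list(range(10)), num_proc=4))
flags = broadcast_flags([False], 3)
```

Adding a batch's global start index to a process's local batch index recovers the global population index used as the key in the returned mapping.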

src.core.train.fitness(pop_stats: dict[int, dict[str, float]], scorer: FitnessScorer) → list[float][source]¶

Compute scalar fitness from per-checkpoint stats.

The current fitness heuristic sums the recorded distances across all checkpoint splits for each individual.

Parameters: pop_stats – Mapping of population index to that individual’s stats dict (checkpoint_id -> [(delta_time, distance_at_split), ...]).
Returns: Fitness values ordered by population index.
Return type: list[float]

src.core.train.train(num_iters: int, pop_size: int, log_interval: int = 1, top_k: int | float = 0.1, device: DeviceLikeType | None = None, scorer: FitnessScorer = <function default_fitness_scorer>, load_checkpoint_path: str | Path | None = None, **simulation_kwargs)[source]¶

Main evolutionary training loop (selection + mutation).

Repeats:

Evaluate the current population via run_training_session().
Rank by fitness.
Keep the best, and refill the population by mutating uniformly sampled parents from the top-k set.

Parameters:

num_iters – Number of generations to run.
pop_size – Number of individuals per generation.
log_interval – Print progress every N generations.
top_k – Either the number of top individuals to sample parents from, or a fraction in (0, 1] interpreted as a proportion of the population.
**simulation_kwargs – Passed through to run_training_session().

Side Effects:

Prints best fitness per log_interval.
Mutates and replaces the population in place each generation.
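
One generation of the loop above might be sketched like this. Individuals are plain numbers and mutate is a caller-supplied function, so nothing here depends on Genome; the truncating conversion of a fractional top_k and the single-elite rule are assumptions for this example.

```python
import random


def resolve_top_k(top_k, pop_size):
    """Turn a count or a fraction into a parent-pool size in [1, pop_size]."""
    count = int(top_k * pop_size) if isinstance(top_k, float) else top_k
    return max(1, min(pop_size, count))


def next_generation(pop, scores, top_k, mutate, rng):
    # Rank individuals by fitness, best first.
    order = sorted(range(len(pop)), key=lambda i: scores[i], reverse=True)
    ranked = [pop[i] for i in order]
    parents = ranked[:resolve_top_k(top_k, len(pop))]
    # Keep the best unchanged, refill with mutated, uniformly sampled parents.
    children = [mutate(rng.choice(parents)) for _ in range(len(pop) - 1)]
    return [ranked[0]] + children


new_pop = next_generation(
    pop=[1, 2, 3, 4],
    scores=[0.1, 0.9, 0.5, 0.2],
    top_k=2,
    mutate=lambda x: x * 10,
    rng=random.Random(0),
)
```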

src.core.train.make_distance_metric()[source]¶

src.core.train.main()[source]¶
