Maze Light-Tracer Game

Navigate the dungeons of a procedurally generated maze with real-time traced light effects. Noise textures and interactive light sources mark the way.

This small game combines a procedurally generated dungeon maze with noise textures and fully ray traced light effects. Lighting is not statically precomputed but dynamic, as all lights sources can be moved by the player for example to mark a junction of interest.

Certain parts of this project base on learnings from the previous generic CPU ray tracer implementation.

Apart from playing with ray traced lighting, the performance goal is to provide a smooth real-time experience without GPU support. For example, achieving even up to 100 frames per second or overall FHD output resolution should easily be possible on contemporary hardware.

Usage and Gameplay

The goal is to navigate the dark dungeon maze by leveraging light sources to find the final emitting black hole. Collect light sources along the way to increase the power of the own torch. Sacrifice the visible range by strategically dropping lights again at will, in order to leave beacons as breadcrumbs that mark (un)visited paths for backtracking.

The locally built binary can be directly invoked:

./maze-tracer [-b] [-f] [-s seed] [-x width] [-y height]
-s seed
Define the level scene to draw as unsigned integer. Randomly chosen if omitted. Each level is fully reproducible by this number in any case.
-x width, -y height
Output resolution in pixels. The window is padded with black bars if both values do not match up.
-f
Disable subsampling, trace the full frame for pixel-correct results. This will reduce aliasing but increases the internal resolution by a factor of four. See also the USE_UPSAMPLE option.
-b
Do not accept any movement input, draw a predefined set of scenes per level and exit. This benchmark mode should produce comparable framerate performance statistics for different settings.

Depending on the USE_WASD option, movement is controlled by the corresponding input or arrow keys. Picking up and dropping lights is done via e or space, respectively. Pressing F5 will restart the level, F3 skips to generating the next level, Esc for exit.

Installation

Building the binary is done by the provided Makefile and only requires a standard GCC C++17 toolchain with X11 library development headers (e.g., libx11-dev and libxext-dev). A default non-debug build is done simply by:

make

For development, debugging, and benchmarking, there are also targets that wrap valgrind, callgrind, gdb, and clang-tidy. Documentation via doxygen can be created by make docs.

For best performance, a profile-guided build can be triggered by make profiled. The corresponding profiles of a run in benchmarking mode must be created by make profile beforehand.

Technical Background

In a first independent step, a maze is procedurally generated from the given seed, using a variant of the recursive division algorithm. This datastructure is the basis for generating the actual level scene of “dungeons”, which consists of tracing primitives such as planes, textures, and lights. For better orientation, a level-wide Perlin noise provides structured textures.

The fundamentals of the threaded processing pipeline are also derived from more or less well-known approaches:

Given this simplified intuition, the pipeline consists of a linear queue-based producer/consumer threading model. The queue of work items that connects the four main threads is actually a single-item sized stack that maintains the most recent frame context. Enqueue operations thus succeed in any case – possibly replacing an outdated task but producing “frame drop” events. Note that multiple frames can still be handled at once to a reasonable extent. The tracing and interpolation steps are in turn processed by their own thread pool, with one thread for each of the four frame corners.

+---------+      +-------+               +-------+               +-------+              
|Movement/|   +->|Trace A|--+         +->|Light A|--+         +->|Frame A|--+           
|Clip/    |   |  |-------|  |         |  |-------|  |         |  |-------|  |           
|Objects/ |   +->|Trace B|--+         +->|Light B|--+         +->|Frame B|--+           
|Lights   |   |  |-------|  |         |  |-------|  |         |  |-------|  |           
+---------+   +->|Trace C|--+         +->|Light C|--+         +->|Frame C|--+           
 ^       |    |  |-------|  |         |  |-------|  |         |  |-------|  |           
 |       |    +->|Trace D|--+         +->|Light D|--+         +->|Frame D|--+           
 |       |    |  +-------+  |         |  +-------+  |         |  +-------+  |           
 |       v    |             v         |             v         |             v           
 |      +-+  +---------------+  +-+  +---------------+  +-+  +---------------+  +------+
 |      |Q|->| Tracer Thread |->|Q|->| Lights Thread |->|Q|->| Render Thread |->|Draw X|
 |      +-+  +---------------+  +-+  +---------------+  +-+  +---------------+  +------+
+------+                                                             |                  
|Main: |<----------------------Pipeline duration feedback------------+          +------+
|Delay/|                                                                        |Input |
|Input |<------------------------------Key presses------------------------------|Thread|
+------+                                                                        +------+
                                                                                        
|--Scene-----|------Objects------|---------Light---------|--Texture--|--Sample--|-X11--|

In Xlib, the default is to “auto-repeat” keystrokes, i.e., to instantly send press and release events. This is hardly suitable for a smooth but precise movement – also in any non-axis-aligned direction. Unfortunately, the corresponding XAutoRepeatOff keyboard setting is not per-window but affects the whole X display. In order to workaround this, a special thread continuously watches the XQueryKeymap state and accumulates keypress durations.

Configuration Details

Apart from the commandline arguments for resolution, the following settings in config.hpp might be of main interest:

raster_per_second
Determines the movement speed on the basis of keypress duration.
min_fps, max_fps
Interval of the target framerate. Directly defines the delay between updates, which gets incrementally adjusted according to the sliding average of current pipeline performance.
USE_WASD
Use WASD+E instead of arrow keys for movement.
USE_UPSAMPLE
Enable bilinear interpolation when subsampling, nearest neighbour otherwise.
USE_TEXTURE_SATURATION
Saturation of texture colors, 1.0 for full intensity but varying lighting.
USE_LIGHT_PICKUP
Overall “game mode”: Explore the level with highest visibility, enable pickup and dropping of lights, and grow/shrink player light when doing so.
NO_CLIP

Tweaking Appearance

camera_z, viewport_z
Camera and viewport raster z-distance. Determines the DOV and FOV, respectively.
USE_LIGHTS
Generate light sources in levels.
USE_LIGHTBOX
Try to optimize light volume boxes by cutting down scene overlaps.
USE_TEXTURE
Use uniform, colored, or Perlin noise textures, respectively.
USE_PERLIN_INTERPOLATE
Enable polynomial noise interpolation, avoiding blocky artifacts.
USE_TEXTURE_GRADIENT
Blend textures between levels (“doors”).

Float SIMD Instructions

SIMD vector instruction optimizations on the main vertex datastructure are configured by USE_SIMD. When enabled, explicit GCC x86 extensions are used, in contrast to the immintrin.h intrinsics wrappers, which are not constexpr-compatible.

Not all target architectures might profit equally from an increased level of SIMD tweaks, some might even perform noticeably worse – YMMV. The -b benchmark mode can help with finding the local sweet spot.

Output control

USE_LOG_STATS
Log verboseness per-frame timing statistics, 03. Useful for determining which part of the pipeline is limiting.
SQUEEZE_FACTOR
When less than 1, overlap frames by premature enqueuing, thereby increasing the chance of more than one frame being processed at once. Increases the pipeline saturation when >> 4 threads are supported.
USE_X
Enable X image output and key input, should only be disabled for “headless” benchmarking.
USE_PPM
If X is disabled, write frames as PPM images for debugging instead.
USE_X_SHM
Enable X11 SHM extension for the frame raster image, avoiding the need for copy to the X server connection. Can have a noticeable performance impact. When disabled, the -lXext linker flag can be omitted.

Code & Download