On this page

  1. The Problem
  2. Three-Layer Architecture
  3. The Eleven Shot Types
  4. Narrative Phases & Pacing
  5. Camera Smoothing & Damping
  6. Depth of Field & Focus

The Problem

In a traditional game, the camera follows a rule: aim at the player, or orbit them in a controlled way. In PolyFish, there is no player. There are only creatures - fish, dolphins, manatees - living their lives. The camera must become a nature documentary director, following action that you don't control, choosing shots that tell a story, and responding to unscripted behavior in real time.

You can't predetermine every camera cut. The creatures' behavior is emergent and AI-driven. A dolphin might hunt. A school of fish might scatter in panic. A manatee might drift lazily through kelp. The system must watch, recognize the drama, and frame it beautifully - all without human intervention.

The Core Challenge: How do you write a camera system that feels like it was choreographed by a cinematographer, when the cast's dialogue is pure chaos?

Three-Layer Architecture

The cinematic camera system is built on three independent, communicating layers:

Layer 1: DocumentaryDirector (Editorial Brain)

The director decides what shot to take and when. It runs a narrative state machine with five phases (ESTABLISHING, INTRODUCE, DEVELOP, CLIMAX, RESOLVE), each with a weighted pool of shot types. The director samples from the pools, respects phase transitions, and modulates duration variance (±20%) to keep pacing alive. It asks the Cinematographer for technical help, but stays in control of the narrative.

Layer 2: Cinematographer (Camera Operator)

The cinematographer handles the mechanics of framing a shot. Given a target creature and a shot type, it computes the exact camera position, look-at point, and field of view. It applies smoothing, velocity damping to prevent jitter, and direction continuity to avoid 180° flips. It respects the physical constraints of the scene - never clipping the seafloor, never going outside world bounds - and adjusts framing in real time as the subject moves.

Layer 3: EcosystemScout (Behavior Detector)

The scout watches all creatures and detects interesting moments: feeding, reproduction, predation, fleeing, clustering, and idle behavior. It tags each creature with a narrative interest score that reflects whether something dramatic is happening nearby. The director uses this score to decide which creature to focus on. If a predator is hunting, that's a 10/10. If a fish is just grazing, that's a 2/10.
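
One way the scout's scores could feed the director's subject choice is sketched below. Only the two scores quoted above (10 for a hunting predator, 2 for a grazing fish) come from the text; the other weights, the `behavior` field, and both function names are illustrative assumptions, not PolyFish's actual code.

```javascript
// Hypothetical interest scores per detected behavior (0-10).
// Only 'predation' (10) and 'grazing' (2) are taken from the text above.
const INTEREST_BY_BEHAVIOR = {
  predation: 10,
  fleeing: 8,       // assumed
  reproduction: 7,  // assumed
  feeding: 5,       // assumed
  clustering: 4,    // assumed
  grazing: 2,
  idle: 1,          // assumed
};

// Scout: tag every creature with an interest score (0 for unknown behaviors).
function scoreCreatures(creatures) {
  return creatures.map((c) => ({
    ...c,
    interest: INTEREST_BY_BEHAVIOR[c.behavior] ?? 0,
  }));
}

// Director: pick the highest-scoring creature as the next subject.
// Ties go to the creature that appears first in the list.
function pickSubject(scored) {
  if (scored.length === 0) return null;
  return scored.reduce((best, c) => (c.interest > best.interest ? c : best));
}

const scored = scoreCreatures([
  { id: 'fish-12', behavior: 'grazing' },
  { id: 'dolphin-1', behavior: 'predation' },
  { id: 'manatee-3', behavior: 'idle' },
]);
console.log(pickSubject(scored).id); // dolphin-1
```

Always taking the maximum would make the camera fixate on one creature, so a real director would likely add some randomness or a cooldown on recently filmed subjects.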

At a glance: 3 layers, 5 narrative phases, 11 shot types, ±20% duration variance.

The Eleven Shot Types

Each shot type has a distinct geometry: a specific distance from the subject, an angle of approach, and a framing rule. The cinematographer computes all positions in world space, then applies temporal damping so cuts don't feel jarring.

1. ESTABLISHING_WIDE

High orbit, looking down at 45°. The camera is far back (5–7× subject size) and elevated 30° above the subject. Used at the start of a scene to show the broader environment - kelp forest, open water, other creatures nearby.

2. HERO_PORTRAIT

Classic rule-of-thirds: subject framed in the upper-left third of the screen, filling about 40% of the viewport. Distance is close (1.5–2.5× subject size). The camera orbits perpendicular to the subject's heading for a profile view that shows body shape.

3. CHASE_FOLLOW

Quartering angle behind a moving subject. The camera is behind and slightly to one side, so we see the subject's rear 3/4 view as it moves away. Perfect for hunting sequences or fleeing prey. Feels dynamic and kinetic.

4. SIDE_TRACK

Parallel tracking shot. The camera moves alongside the subject, maintaining distance and staying perpendicular to motion. Used for swimming sequences where you want to see the creature move across the frame.

5. GROUND_HIDE

Static low-angle from the seafloor or a rock outcrop. The camera sits still and watches the creature swim overhead. Used for moments when the camera is "hiding" and observing. Feels observational and naturalistic.

6. SNELLS_WINDOW

Looking upward from below toward the surface. Named after Snell's window, the optical phenomenon where you see the refracted sky when looking up through water. The camera is below the subject, looking up at a slight angle. Rare and striking.

7. SLOW_REVEAL

Begins with an extreme close-up on a detail (eye, fin, scales), then pulls back over 3–4 seconds to reveal the full creature. Dramatic and intimate. Often used at the start of a new narrative beat.

8. FLY_THROUGH

A dolly move that flies the camera between two points (e.g., through a gap in kelp, past a rock formation). The subject is near the path but not directly centered. Creates a sense of exploration and discovery.

9. REACTION_CUT

Quick, tight shot of a secondary subject. While the main action happens elsewhere, cut to another creature's reaction - a nearby fish flinching, a predator's head snapping up. Builds tension and interconnects the narrative.

10. MACRO_DETAIL

Extreme close-up with shallow depth of field. The camera is 20–30 cm away from a detail (mouth, fins, eye). Only that detail is in focus; the background is bokeh. Feels intimate and scientific.

11. KELP_EDGE

Positioned on the periphery of the kelp forest, low, looking up and inward at roughly 75 degrees. Frames the kelp canopy above with light filtering through - a very cinematic environmental shot that grounds the viewer in the habitat.
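
To make the geometry concrete, here is a minimal sketch of how a few of these shots could be turned into a world-space camera position. It uses plain `{x, y, z}` objects rather than Three.js vectors so it runs on its own. The distances for ESTABLISHING_WIDE and HERO_PORTRAIT are midpoints of the ranges given above. Every other angle and distance, the `subject` shape (`position`, unit `heading` in the horizontal plane, `size`), and all helper names are assumptions made for illustration.

```javascript
// Minimal vector helpers on plain {x, y, z} objects (y is up).
const add = (a, b) => ({ x: a.x + b.x, y: a.y + b.y, z: a.z + b.z });
const scale = (v, s) => ({ x: v.x * s, y: v.y * s, z: v.z * s });
const toRad = (deg) => (deg * Math.PI) / 180;

// Rotate a horizontal direction about the y axis by `deg` degrees.
function yaw(dir, deg) {
  const r = toRad(deg);
  return {
    x: dir.x * Math.cos(r) + dir.z * Math.sin(r),
    y: 0,
    z: -dir.x * Math.sin(r) + dir.z * Math.cos(r),
  };
}

// Tilt a horizontal unit direction upward by `deg` degrees (result stays unit length).
function elevate(dir, deg) {
  const r = toRad(deg);
  return { x: dir.x * Math.cos(r), y: Math.sin(r), z: dir.z * Math.cos(r) };
}

// Where the camera sits relative to the subject, per shot type.
// yawDeg is measured from the subject's heading: 90 = side-on, 180 = directly behind.
const SHOT_GEOMETRY = {
  ESTABLISHING_WIDE: { yawDeg: 90, elevDeg: 30, sizeMultiple: 6 },  // yaw assumed
  HERO_PORTRAIT: { yawDeg: 90, elevDeg: 0, sizeMultiple: 2 },
  SIDE_TRACK: { yawDeg: 90, elevDeg: 0, sizeMultiple: 3 },          // distance assumed
  CHASE_FOLLOW: { yawDeg: 150, elevDeg: 10, sizeMultiple: 3 },      // all values assumed
};

function computeCameraPosition(shotType, subject) {
  const g = SHOT_GEOMETRY[shotType];
  if (!g) throw new Error(`No geometry for shot type ${shotType}`);
  const dir = elevate(yaw(subject.heading, g.yawDeg), g.elevDeg);
  return add(subject.position, scale(dir, subject.size * g.sizeMultiple));
}

const fish = {
  position: { x: 0, y: 0, z: 0 },
  heading: { x: 1, y: 0, z: 0 },
  size: 2,
};
console.log(computeCameraPosition('HERO_PORTRAIT', fish));
console.log(computeCameraPosition('ESTABLISHING_WIDE', fish));
```

Expressing distance as a multiple of subject size means the same table frames a small fish and a manatee comparably. The look-at point, field of view, and rule-of-thirds offset would be layered on top of this position.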

Try the interactive demo below to see how each shot type positions the camera relative to a subject in the center of the scene:

Narrative Phases & Pacing

A documentary is not just a sequence of random shots. It's a story arc. The camera system drives this through five phases, each with different pacing and shot selection:

ESTABLISHING (2-3 shots)

Show the location. The shot pool includes ESTABLISHING_WIDE, SIDE_TRACK, SNELLS_WINDOW, and HERO_PORTRAIT to ensure creatures appear early. Durations are long (12-20 seconds per shot) so the viewer can absorb the space. No dramatic moments yet - just environment.

INTRODUCE (2-4 shots)

Introduce the main subject. The pool draws from HERO_PORTRAIT, SIDE_TRACK, SLOW_REVEAL, and MACRO_DETAIL. Durations are moderate. The camera gets closer, more interested in the creature's form.

DEVELOP (3-5 shots)

The action intensifies. More cuts, tighter framing. SIDE_TRACK is double-weighted in the pool, joined by CHASE_FOLLOW, SLOW_REVEAL, MACRO_DETAIL, and GROUND_HIDE. The camera follows behavior - if the creature hunts, stay close. If it flees, chase it.

CLIMAX (1-2 shots)

The moment of maximum drama. Kept deliberately short so the system cycles back to contemplation. The pool favors SIDE_TRACK, CHASE_FOLLOW, REACTION_CUT, and MACRO_DETAIL. The camera is almost twitchy - cutting on action, on gazes, on movement.

RESOLVE (2-3 shots)

The moment passes. Fade back to calm. SIDE_TRACK, SLOW_REVEAL, HERO_PORTRAIT, SNELLS_WINDOW, and ESTABLISHING_WIDE fill the pool. Durations relax. The camera settles, the pace breathes, and then the cycle begins again.
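
The phase cycle and weighted pools described above can be sketched as follows. Pool membership and shots-per-phase ranges follow the text. Every weight is assumed to be 1 except SIDE_TRACK in DEVELOP, which the text says is double-weighted. The function and constant names are hypothetical.

```javascript
const PHASES = ['ESTABLISHING', 'INTRODUCE', 'DEVELOP', 'CLIMAX', 'RESOLVE'];

// Helper: give every listed shot type a weight of 1.
const equal = (...shots) => shots.map((s) => [s, 1]);

// shots: [min, max] shots per phase; pool: [shotType, weight] pairs.
const PHASE_CONFIG = {
  ESTABLISHING: {
    shots: [2, 3],
    pool: equal('ESTABLISHING_WIDE', 'SIDE_TRACK', 'SNELLS_WINDOW', 'HERO_PORTRAIT'),
  },
  INTRODUCE: {
    shots: [2, 4],
    pool: equal('HERO_PORTRAIT', 'SIDE_TRACK', 'SLOW_REVEAL', 'MACRO_DETAIL'),
  },
  DEVELOP: {
    shots: [3, 5],
    pool: [['SIDE_TRACK', 2], ...equal('CHASE_FOLLOW', 'SLOW_REVEAL', 'MACRO_DETAIL', 'GROUND_HIDE')],
  },
  CLIMAX: {
    shots: [1, 2],
    pool: equal('SIDE_TRACK', 'CHASE_FOLLOW', 'REACTION_CUT', 'MACRO_DETAIL'),
  },
  RESOLVE: {
    shots: [2, 3],
    pool: equal('SIDE_TRACK', 'SLOW_REVEAL', 'HERO_PORTRAIT', 'SNELLS_WINDOW', 'ESTABLISHING_WIDE'),
  },
};

// Weighted random pick: each shot's chance is weight / total weight.
function sampleShot(pool, rng = Math.random) {
  const total = pool.reduce((sum, [, w]) => sum + w, 0);
  let roll = rng() * total;
  for (const [shot, weight] of pool) {
    roll -= weight;
    if (roll < 0) return shot;
  }
  return pool[pool.length - 1][0]; // guard against floating-point drift
}

// RESOLVE wraps back around to ESTABLISHING.
function nextPhase(phase) {
  return PHASES[(PHASES.indexOf(phase) + 1) % PHASES.length];
}

// Choose how many shots this phase gets, then sample each from its pool.
function planPhase(phase, rng = Math.random) {
  const { shots: [min, max], pool } = PHASE_CONFIG[phase];
  const count = min + Math.floor(rng() * (max - min + 1));
  return Array.from({ length: count }, () => sampleShot(pool, rng));
}

console.log('DEVELOP:', planPhase('DEVELOP'));
console.log('then:', nextPhase('DEVELOP'));
```

Passing `rng` as a parameter lets the same planner run with a seeded generator for repeatable timelines.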

Duration Variance: Each shot's base duration is scaled by a random multiplier between 0.8 and 1.2 (±20%). This prevents the rhythm from feeling mechanical. A 2-second base duration, for example, lands anywhere between 1.6 and 2.4 seconds. The variance is seeded per shot, so it is deterministic but appears organic.
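
A minimal sketch of per-shot seeded variance follows. The text only says the variance is seeded per shot; the specific generator (mulberry32, a small public-domain PRNG), the way the seed is mixed with the shot index, and the function names are assumptions.

```javascript
// mulberry32: a small deterministic PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Scale a base duration by a multiplier in [0.8, 1.2).
// The same (sessionSeed, shotIndex) pair always yields the same duration.
function shotDuration(baseSeconds, shotIndex, sessionSeed = 1) {
  const shotSeed = (sessionSeed ^ Math.imul(shotIndex + 1, 0x9e3779b1)) >>> 0;
  const multiplier = 0.8 + mulberry32(shotSeed)() * 0.4;
  return baseSeconds * multiplier;
}

// Durations for the first five shots with a 2-second base.
for (let i = 0; i < 5; i++) {
  console.log(i, shotDuration(2, i).toFixed(2));
}
```

Deriving the seed from the shot index rather than advancing one shared generator means a shot's duration does not change if other random draws are added or removed elsewhere in the system.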

In the demo below, click "regenerate" to see a new random narrative timeline. Notice how the phase transitions happen at fixed times, but the shots within each phase are random samples from the phase's weighted pool:

Camera Smoothing & Damping

If the camera snapped instantly to each new framing, it would feel nauseating - constant jarring cuts. Instead, the cinematographer uses velocity damping to smooth the transition between shots over 0.5–1.0 seconds.

The key insight: store the camera's velocity (not just position), and when a new shot is chosen, don't aim the velocity directly at the desired position. Instead, gradually blend the velocity toward the new direction, using a spring damping coefficient (0.1–0.2). This prevents bouncing and overshooting.

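const cameraDamping = 0.15; // blend 15% toward the target direction each frame
// Unit vector from the camera to the desired position
const toTarget = targetPos.clone()
  .sub(camera.position).normalize();
// Ease the stored velocity toward the new direction instead of snapping to it
cameraVelocity.lerp(toTarget, cameraDamping);
// addScaledVector leaves cameraVelocity intact; multiplyScalar would rescale it in place every frame
camera.position.addScaledVector(cameraVelocity, moveSpeed);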
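
Because the snippet above blends a fixed 15% per frame, the smoothing runs faster at higher frame rates. A common remedy, sketched here on a single scalar for brevity, is to derive the blend factor from the frame's delta time. This assumes the 0.15 value was tuned at 60 fps, which the text does not state; `REFERENCE_FPS` and the function names are hypothetical.

```javascript
const BASE_DAMPING = 0.15; // per-frame blend factor from the snippet above
const REFERENCE_FPS = 60;  // assumed frame rate at which 0.15 was tuned

// Blend factor for a frame of length dtSeconds that leaves the same
// remaining gap after one second regardless of frame rate.
function dampingForDelta(dtSeconds) {
  return 1 - Math.pow(1 - BASE_DAMPING, dtSeconds * REFERENCE_FPS);
}

function lerp(a, b, t) {
  return a + (b - a) * t;
}

// Ease a value from 0 toward 1 for `seconds` of simulated time at `fps`.
function simulate(fps, seconds = 1) {
  const dt = 1 / fps;
  const steps = Math.round(fps * seconds);
  let value = 0;
  for (let i = 0; i < steps; i++) {
    value = lerp(value, 1, dampingForDelta(dt));
  }
  return value;
}

// All three frame rates end up at the same value after one second.
console.log(simulate(30), simulate(60), simulate(144));
```

In a Three.js render loop the same factor could be passed to `cameraVelocity.lerp(toTarget, dampingForDelta(dt))`, with `dt` taken from the frame clock.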

Additionally, the system enforces direction continuity. If the camera is moving left-to-right, and a new shot would require a 180° flip, the system instead orbits the long way around. This avoids the disorienting "jump-cut" feeling where the scene suddenly flips.

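// Unit vectors from the subject to the current and desired camera positions
const currentDir = camera.position
  .clone().sub(subject.position).normalize();
const desiredDir = desiredCameraPos
  .clone().sub(subject.position).normalize();

// A negative dot product means the move would swing more than 90° around the subject.
// Mirror the desired position through the subject so the camera stays on its current side.
if (currentDir.dot(desiredDir) < 0) {
  desiredCameraPos = subject.position.clone()
    .add(desiredDir.clone().negate().multiplyScalar(distance)); // clone so desiredDir is not flipped in place
}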

Finally, the system applies orbit angle biasing based on the subject's heading. If a fish is swimming left, prefer shots from the left side (profile shots). If it's swimming toward the camera, bias toward head-on shots. This keeps framing consistent with the subject's direction of travel.

Depth of Field & Focus

The cinematographer also controls focus. Different shot types demand different apertures:

DEEP (f/16)

Everything in sharp focus. Used for ESTABLISHING_WIDE and FLY_THROUGH, where you want to see the entire scene. Aperture is very small.

MEDIUM (f/5.6)

Subject sharp, background soft. The default for HERO_PORTRAIT, CHASE_FOLLOW, SIDE_TRACK, and REACTION_CUT. Isolates the creature from clutter.

SHALLOW (f/2)

Subject detail sharp, background completely blurred. Used for MACRO_DETAIL and SLOW_REVEAL, where we want extreme focus separation.

The depth-of-field effect is implemented via a BokehPass in the Three.js post-processing pipeline. The focus distance is animated smoothly as the camera moves, so focus tracks the subject naturally without snapping.

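const dofProfiles = {
  'DEEP': { focus: 50, aperture: 16, bladeCount: 0 },
  'MEDIUM': { focus: 15, aperture: 5.6, bladeCount: 6 },
  'SHALLOW': { focus: 5, aperture: 2, bladeCount: 8 }
};

// Look up by profile name ('DEEP', 'MEDIUM', or 'SHALLOW'), not by raw shot type.
// currentDofProfile is an assumed variable holding the profile chosen for the active shot.
const profile = dofProfiles[currentDofProfile];
bokehPass.uniforms.focus.value = THREE.MathUtils.lerp(
  bokehPass.uniforms.focus.value,
  profile.focus,
  0.1 // close 10% of the remaining gap each frame
);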
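
The profile table above is keyed by DEEP, MEDIUM, and SHALLOW, while the director works in shot types, so something has to translate between the two. The mapping below is taken from the three profile descriptions above. The MEDIUM fallback for shot types the text does not assign (GROUND_HIDE, SNELLS_WINDOW, KELP_EDGE) and the helper names are assumptions. The second helper shows how long a 0.1-per-frame lerp takes to settle.

```javascript
// Shot type -> DOF profile name, as listed in the profile descriptions above.
const DOF_PROFILE_BY_SHOT = {
  ESTABLISHING_WIDE: 'DEEP',
  FLY_THROUGH: 'DEEP',
  HERO_PORTRAIT: 'MEDIUM',
  CHASE_FOLLOW: 'MEDIUM',
  SIDE_TRACK: 'MEDIUM',
  REACTION_CUT: 'MEDIUM',
  MACRO_DETAIL: 'SHALLOW',
  SLOW_REVEAL: 'SHALLOW',
};

// Shot types without an explicit profile fall back to MEDIUM (assumed default).
function dofProfileNameFor(shotType) {
  return DOF_PROFILE_BY_SHOT[shotType] ?? 'MEDIUM';
}

// Count frames until a per-frame lerp closes all but `tolerance` of the gap.
function framesToSettle(from, to, factor, tolerance = 0.05) {
  const gap = Math.abs(to - from);
  if (gap === 0) return 0;
  let value = from;
  let frames = 0;
  while (Math.abs(to - value) > gap * tolerance) {
    value += (to - value) * factor;
    frames++;
  }
  return frames;
}

console.log(dofProfileNameFor('MACRO_DETAIL')); // SHALLOW
console.log(framesToSettle(50, 5, 0.1));        // 29
```

A factor of 0.1 therefore needs roughly 29 frames, about half a second at 60 fps, to cover 95% of a focus change. That is worth keeping in mind when cutting from a DEEP shot to a SHALLOW one.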

At a glance: camera damping 0.15, DOF aperture range f/16–f/2, 3 DOF profiles, transition time 0.5–1.0 s.

Work in progress. The cinematic camera is the system Matt is most excited to keep pushing on. The architecture is solid and the shot types work well individually, but the overall pacing and narrative flow still need refinement. Transitions between phases can feel abrupt, subject selection sometimes misses the most interesting action, and the system doesn't yet have a strong sense of dramatic timing. It's close - but "close" in cinematography is the difference between a nature documentary and a surveillance camera. This is where the next round of iteration will focus.