Slot State Machine

SP8D’s slot state machine is the engine that powers its lock-free, high-performance message passing. This page provides a deep, practical, and visual guide to the slot lifecycle, state transitions, fairness, and recovery—essential for advanced users, implementers, and anyone extending or debugging the protocol.

Slot State Machine: The Heart of SP8D

Who should read this? Advanced users, protocol implementers, and anyone debugging or extending SP8D internals. This page is your canonical, up-to-date reference for the slot state machine: the core of SP8D’s lock-free protocol.


Quick Reference: Slot States & Transitions

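| State   | Code Value | Description                 | Allowed Transitions | Atomic Operation                |
|---------|------------|-----------------------------|---------------------|---------------------------------|
| Empty   | 0          | Slot is available for claim | → Claimed           | CAS (Producer)                  |
| Claimed | 1          | Slot is being written/read  | → Ready, → Empty    | CAS (Producer/Consumer/Sweeper) |
| Ready   | 2          | Slot contains a message     | → Claimed           | CAS (Consumer)                  |
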
  • CAS: Compare-and-swap via Atomics.compareExchange
  • Generation Tag: Incremented on reclaim/wraparound to prevent ABA problems and enable safe wraparound
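
The transition rules in the quick reference can be written down as a small lookup table, which is handy for assertions in tests or custom tooling. This is a minimal sketch: the status codes come from this page, while ALLOWED_TRANSITIONS and isAllowedTransition are hypothetical names and not part of the SP8D API.

```typescript
// Status codes from the quick-reference table.
const STATUS_EMPTY = 0;
const STATUS_CLAIMED = 1;
const STATUS_READY = 2;

type Status = typeof STATUS_EMPTY | typeof STATUS_CLAIMED | typeof STATUS_READY;

// Allowed transitions, keyed by the current status.
// Hypothetical helper for illustration; not part of the SP8D API.
const ALLOWED_TRANSITIONS: Record<Status, readonly Status[]> = {
  [STATUS_EMPTY]: [STATUS_CLAIMED],               // producer claims the slot
  [STATUS_CLAIMED]: [STATUS_READY, STATUS_EMPTY], // publish, or release/reclaim
  [STATUS_READY]: [STATUS_CLAIMED],               // consumer claims the slot
};

// Returns true if the state machine permits moving from `from` to `to`.
function isAllowedTransition(from: Status, to: Status): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}
```

A check like isAllowedTransition(STATUS_EMPTY, STATUS_READY) returns false, because a slot must pass through Claimed before it can become Ready.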

What is a Slot?

A slot is a fixed-size region in the shared buffer that holds a single message. Each slot has:

  • Status: Empty, Claimed, or Ready
  • Generation (Cycle) Tag: Prevents ABA problems and enables safe wraparound
  • Claim Timestamp: For diagnostics and recovery
  • Payload: The actual message data
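
To make the slot anatomy concrete, here is one way such slots could be carved out of a SharedArrayBuffer. The layout (three 32-bit header words followed by the payload) and the names slotSizeBytes, slotViews, and SlotView are assumptions for illustration; the real layout in sp8d-core.ts may differ.

```typescript
// Hypothetical layout: three 32-bit header words followed by the payload bytes.
const STATUS_WORD = 0;     // Empty, Claimed, or Ready
const GENERATION_WORD = 1; // generation (cycle) tag
const TIMESTAMP_WORD = 2;  // claim timestamp
const HEADER_WORDS = 3;
const HEADER_BYTES = HEADER_WORDS * Int32Array.BYTES_PER_ELEMENT;

interface SlotView {
  header: Int32Array;  // status, generation, claim timestamp
  payload: Uint8Array; // message bytes
}

// Size of one slot, rounded up so every header stays 4-byte aligned.
function slotSizeBytes(payloadBytes: number): number {
  return HEADER_BYTES + Math.ceil(payloadBytes / 4) * 4;
}

// Builds typed-array views for each fixed-size slot in the buffer.
function slotViews(buffer: SharedArrayBuffer, slotCount: number, payloadBytes: number): SlotView[] {
  const slotBytes = slotSizeBytes(payloadBytes);
  const views: SlotView[] = [];
  for (let i = 0; i < slotCount; i++) {
    const base = i * slotBytes;
    views.push({
      header: new Int32Array(buffer, base, HEADER_WORDS),
      payload: new Uint8Array(buffer, base + HEADER_BYTES, payloadBytes),
    });
  }
  return views;
}

// Usage: four slots with 16-byte payloads.
const buffer = new SharedArrayBuffer(4 * slotSizeBytes(16));
const slots = slotViews(buffer, 4, 16);
```

Because the header is an Int32Array over shared memory, each header word can be read and written with Atomics from any thread that holds the buffer.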

Slot Lifecycle & State Transitions

Diagram: Slot state machine: transitions between Empty, Claimed, and Ready, with atomic operations and sweeper recovery.


Visual: Message Flow Through the Slot State Machine

Diagram: Step-by-step message flow: producer claims and writes, consumer claims and reads, sweeper reclaims if needed.


Step-by-Step: State Transitions in Code

Below is a simplified TypeScript walkthrough of the slot state machine, based on the actual sp8d-core.ts implementation:

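// Simplified sketch of the slot state transitions (not the actual sp8d-core.ts code).
// Assumption: status and generation are one-element Int32Array views over the
// shared buffer, so index 0 addresses the value.
interface Slot {
  status: Int32Array;     // 0 = Empty, 1 = Claimed, 2 = Ready
  generation: Int32Array; // generation (cycle) tag
  claimTimestamp: number; // set when the slot is claimed (omitted below), in ms
  payload: Uint8Array;    // message bytes
}

function producerClaim(slot: Slot, message: Uint8Array): boolean {
  // Atomically claim an empty slot: Empty -> Claimed
  if (Atomics.compareExchange(slot.status, 0, 0, 1) !== 0) return false;
  // Write payload, then publish: Claimed -> Ready
  slot.payload.set(message);
  Atomics.store(slot.status, 0, 2);
  return true;
}

function consumerClaim(slot: Slot): Uint8Array | undefined {
  // Atomically claim a ready slot: Ready -> Claimed
  if (Atomics.compareExchange(slot.status, 0, 2, 1) !== 2) return undefined;
  // Copy the payload out, increment generation, then release: Claimed -> Empty
  const data = slot.payload.slice();
  Atomics.add(slot.generation, 0, 1);
  Atomics.store(slot.status, 0, 0);
  return data;
}

function sweeperReclaim(slot: Slot, now: number, sweepTimeoutMs: number): boolean {
  // Only reclaim a slot that has been stuck in Claimed for too long
  if (Atomics.load(slot.status, 0) !== 1) return false;
  if (now - slot.claimTimestamp <= sweepTimeoutMs) return false;
  Atomics.add(slot.generation, 0, 1);
  // CAS rather than a blind store, so a slot whose owner finished in the
  // meantime is not overwritten: Claimed -> Empty
  const reclaimed = Atomics.compareExchange(slot.status, 0, 1, 0) === 1;
  // Update diagnostics here when reclaimed is true
  return reclaimed;
}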
⚠️ Always use atomic operations for state transitions. Never write directly to status except via Atomics.


State Definitions

  • Empty (STATUS_EMPTY = 0): Slot is available for a producer to claim.
  • Claimed (STATUS_CLAIMED = 1): Slot is being written (by producer) or read (by consumer). No other agent may access.
  • Ready (STATUS_READY = 2): Slot contains a message, ready for a consumer to claim.

Atomic Operations & Fairness

  • Claiming a slot: Producers and consumers use Atomics.compareExchange to move a slot from Empty→Claimed or Ready→Claimed. Each CAS attempt either succeeds or fails immediately, so no agent ever blocks on a lock.
  • Generation Tag: Incremented on each wraparound or reclaim, preventing stale reads/writes and enabling robust recovery.
  • Head/Tail Pointers: Each segment tracks its own head (producer) and tail (consumer) for round-robin fairness.
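
As an illustration of how a head pointer and an atomic claim can work together, the sketch below lets a producer try the slot at the current head and advance the head only if its CAS wins. The function name tryClaimAtHead and the split into a statuses array plus a one-element head counter are assumptions; this is not necessarily how sp8d-core.ts arranges its pointers.

```typescript
const STATUS_EMPTY = 0;
const STATUS_CLAIMED = 1;

// statuses: one status word per slot in the segment.
// head: one-element array holding the index of the next slot to write.
// Returns the claimed slot index, or -1 if the head slot is not Empty
// (segment full) or another producer won the race; the caller may retry.
function tryClaimAtHead(statuses: Int32Array, head: Int32Array): number {
  const position = Atomics.load(head, 0);
  const index = position % statuses.length;
  // Empty -> Claimed; exactly one producer can win this CAS for a given slot.
  if (Atomics.compareExchange(statuses, index, STATUS_EMPTY, STATUS_CLAIMED) !== STATUS_EMPTY) {
    return -1;
  }
  // Only the winner advances the head, so slots are handed out in ring order.
  Atomics.compareExchange(head, 0, position, (index + 1) % statuses.length);
  return index;
}

// Usage: a segment with four slots.
const statuses = new Int32Array(new SharedArrayBuffer(4 * Int32Array.BYTES_PER_ELEMENT));
const head = new Int32Array(new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT));
const firstClaim = tryClaimAtHead(statuses, head); // claims slot 0
```

A failed attempt returns immediately instead of spinning, which leaves the retry policy (back off, try another segment, or report back-pressure) to the caller.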

Sweeper: Automatic Recovery

If a slot is stuck in Claimed (e.g., a thread dies mid-operation), the sweeper detects and reclaims it after a timeout (sweepTimeoutMs):

  • Increments generation tag
  • Marks slot as Empty
  • Updates diagnostics (reclaimed, errors)

Without the sweeper, a single stuck thread could permanently block a slot, reducing throughput and breaking fairness.


Visual: Slot Array Anatomy

Diagram: SP8D segment ring buffer. Head points to the next slot to write, Tail to the next slot to read. Slots contain status, generation, timestamp, and payload.


Best Practices & Gotchas

  • Never skip status transitions: Always use atomic CAS for state changes.
  • Monitor diagnostics: High conflicts or reclaimed counts may indicate contention or stuck agents.
  • Tune sweepTimeoutMs: Too low may cause false reclaims; too high may delay recovery.
  • Use generation tags: Always check generation if implementing custom consumers/producers.
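
For the generation check in particular, a custom producer can remember the generation it saw when it claimed the slot and refuse to publish if the tag has changed since, which means the sweeper reclaimed the slot in the meantime. This is a sketch under assumptions: the two-word header layout (status at index 0, generation at index 1) and the name publishIfStillOwner are hypothetical.

```typescript
const STATUS_CLAIMED = 1;
const STATUS_READY = 2;

// header[0] = status, header[1] = generation (hypothetical layout).
// generationAtClaim is the generation the producer read when it claimed the slot.
// Returns true if the message was published, false if the claim had gone stale.
function publishIfStillOwner(header: Int32Array, generationAtClaim: number): boolean {
  if (Atomics.load(header, 1) !== generationAtClaim) {
    return false; // the slot was reclaimed; do not publish into it
  }
  // Claimed -> Ready, but only if the slot is still in Claimed.
  return Atomics.compareExchange(header, 0, STATUS_CLAIMED, STATUS_READY) === STATUS_CLAIMED;
}

// Usage: a slot claimed at generation 5 that nobody has reclaimed.
const header = new Int32Array(new SharedArrayBuffer(2 * Int32Array.BYTES_PER_ELEMENT));
Atomics.store(header, 0, STATUS_CLAIMED);
Atomics.store(header, 1, 5);
const published = publishIfStillOwner(header, 5);
```

The check and the CAS are two separate atomic steps, so this narrows the window for a stale publish rather than closing it completely.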

Advanced Scenarios: Multi-Segment, MPMC, and Contention

  • Multi-Segment Scaling: Each segment is an independent ring buffer, enabling scalable MPSC/MPMC patterns. Producers and consumers are mapped to segments for load balancing.
  • Contention Handling: High contention is mitigated by round-robin head/tail pointers and atomic slot claims. Monitor conflicts in diagnostics to detect hotspots.
  • Fairness: The protocol ensures round-robin fairness by advancing head/tail pointers per segment. The sweeper keeps stuck slots from starving other agents, and diagnostics help detect agents that are falling behind.
// Example: Mapping producer/consumer IDs to segments
const segment = segments[agentId % segments.length];
// Each agent operates on its assigned segment for reduced contention

For high-throughput workloads, increase the number of segments to reduce contention and improve fairness.


Troubleshooting & Debugging Checklist

  • Stuck slots: Check diagnostics for high reclaimed or errors counts. Use .validate() to inspect slot states.
  • Starvation: Ensure all agents are making progress; use diagnostics to detect lagging segments, which may indicate contention or misconfiguration.
  • Protocol violations: Use .validate() to check for invalid slot states, and review slot generation tags for inconsistencies.
  • Performance issues: Monitor conflicts and tune segment count and sweepTimeoutMs.
  • Custom extensions: Always use atomic operations and check generation tags when implementing custom logic.
⚠️ Use the diagnostics API and .validate() method regularly during development and in production monitoring.
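
One way to act on these counters is to compare two snapshots taken some time apart and flag unusual growth. The sketch below assumes a plain object with the conflicts, reclaimed, and errors counters named on this page; the snapshot shape, the limits, and the function checkDiagnostics are hypothetical and not part of the SP8D diagnostics API.

```typescript
// Counter names follow the diagnostics mentioned on this page; the snapshot
// shape and the limits are assumptions for illustration.
interface DiagnosticsSnapshot {
  conflicts: number;
  reclaimed: number;
  errors: number;
}

interface Limits {
  maxNewConflicts: number; // tolerated conflicts between two snapshots
  maxNewReclaimed: number; // tolerated reclaims between two snapshots
}

// Compares two snapshots and returns human-readable warnings (empty if healthy).
function checkDiagnostics(
  previous: DiagnosticsSnapshot,
  current: DiagnosticsSnapshot,
  limits: Limits,
): string[] {
  const warnings: string[] = [];
  if (current.errors > previous.errors) {
    warnings.push("errors increased: check for protocol violations");
  }
  if (current.reclaimed - previous.reclaimed > limits.maxNewReclaimed) {
    warnings.push("many reclaimed slots: stuck agents, or sweepTimeoutMs too low");
  }
  if (current.conflicts - previous.conflicts > limits.maxNewConflicts) {
    warnings.push("high conflicts: consider adding segments");
  }
  return warnings;
}
```

Suitable limits depend on the workload, so they are best derived from measurements of a healthy run rather than fixed in advance.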


Sweeper & Recovery: Flowchart and Checklist

Diagram: Sweeper flowchart: detects stuck slots and reclaims them after timeout.

Recovery Checklist:

  • Monitor diagnostics for high reclaimed or errors counts
  • Tune sweepTimeoutMs for your workload
  • Ensure generation tags increment on reclaim
  • Validate slot state after recovery (use .validate())
⚠️ Setting sweepTimeoutMs too low can cause false reclaims; too high can delay recovery. Always test with your real workload.


Reference Implementation: Annotated Code

See the canonical implementation in sp8d-core.ts:

// Example: Producer claim logic (simplified)
// Assumes slot.status and slot.generation are one-element Int32Array views
// (index 0) and message is a Uint8Array holding the bytes to send.
if (Atomics.compareExchange(slot.status, 0, STATUS_EMPTY, STATUS_CLAIMED) === STATUS_EMPTY) {
  // Write payload, then set ready
  slot.payload.set(message);
  Atomics.store(slot.status, 0, STATUS_READY);
}

// Example: Consumer claim logic (simplified)
if (Atomics.compareExchange(slot.status, 0, STATUS_READY, STATUS_CLAIMED) === STATUS_READY) {
  // Read payload, increment generation, then mark empty
  const data = slot.payload.slice();
  Atomics.add(slot.generation, 0, 1);
  Atomics.store(slot.status, 0, STATUS_EMPTY);
}
