Checkpoints

A checkpoint is a named snapshot of an instance's disk state at a specific point in time. Checkpoints enable rollback, branching, and parallel execution.

What is a Checkpoint?

When you create a checkpoint, Firecase flushes all pending writes to S3 and records the current manifest version. The checkpoint is a pointer to that manifest version — it doesn't copy any data. This makes checkpoint creation nearly instant regardless of disk size.

Since chunks in S3 are immutable and manifests are versioned, the checkpoint's data is preserved even as the instance continues to write new data.

Creating a Checkpoint

Bash
curl -X POST https://api.firecase.ai/instances/{id}/checkpoints \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"label": "after-setup"}'

This triggers:

  1. A flush of all dirty data from the VM to S3
  2. A new manifest version is uploaded
  3. A checkpoint record is created pointing to that manifest version

Checkpoint Properties

PropertyDescription
idUUID of the checkpoint
instance_idThe instance this checkpoint belongs to
manifest_versionThe manifest version this checkpoint points to
labelOptional human-readable label
disk_size_bytesDisk size at checkpoint time
created_atWhen the checkpoint was created

Rollback

Rolling back restores the instance's disk to a previous checkpoint:

Bash
curl -X POST https://api.firecase.ai/instances/{id}/rollback \
  -H "Authorization: Bearer $API_KEY" \
  -d '{"checkpoint_id": "..."}'

The VM must be stopped before rolling back. After rollback:

  • The instance's manifest points to the checkpoint's version
  • All data written after the checkpoint is abandoned (but chunks still exist in S3)
  • Starting the VM will boot with the checkpoint's disk state

Forking from a Checkpoint

You can fork an instance to create a completely new instance from a checkpoint:

Bash
curl -X POST https://api.firecase.ai/instances/{id}/fork \
  -H "Authorization: Bearer $API_KEY" \
  -d '{"name": "my-fork", "checkpoint_id": "..."}'

The forked instance:

  • Gets a new instance ID and manifest lineage
  • Shares all chunks with the original (copy-on-write)
  • Can be started, modified, and checkpointed independently
  • Has zero data-copy cost — forking is instant

Copy-on-Write in Detail

When you fork or rollback, no data is physically copied. Both the original and the fork reference the same chunks in S3:

Text
Original Instance (v5):     Forked Instance (v1):
┌────┬────┬────┬────┐       ┌────┬────┬────┬────┐
│ A  │ B  │ C  │ D  │       │ A  │ B  │ C  │ D  │
└─┬──┴─┬──┴─┬──┴─┬──┘       └─┬──┴─┬──┴─┬──┴─┬──┘
  │    │    │    │              │    │    │    │
  ▼    ▼    ▼    ▼              ▼    ▼    ▼    ▼
 [S3 chunks — shared by both instances]

When the fork writes to chunk B, a new chunk B' is created:

Text
Original Instance:           Forked Instance:
┌────┬────┬────┬────┐       ┌────┬────┬────┬────┐
│ A  │ B  │ C  │ D  │       │ A  │ B' │ C  │ D  │
└─┬──┴─┬──┴─┬──┴─┬──┘       └─┬──┴─┬──┴─┬──┴─┬──┘
  │    │    │    │              │    │    │    │
  ▼    ▼    ▼    ▼              ▼    │    ▼    ▼
 [Original S3 chunks]              ▼
                            [New chunk B']

Chunks A, C, and D are still shared. Only modified chunks diverge.

Use Cases

Parallel Testing

Fork 10 instances from the same checkpoint, run different test suites in parallel, collect results:

Python
checkpoint = client.checkpoints.create(instance_id, label="pre-test")

forks = []
for suite in test_suites:
    fork = client.instances.fork(
        instance_id,
        name=f"test-{suite}",
        checkpoint_id=checkpoint.id
    )
    client.vms.start(fork.id)
    forks.append((fork, suite))

# Run tests in parallel across all forks
results = await asyncio.gather(*[
    run_test(fork.id, suite) for fork, suite in forks
])

Safe Experimentation

Checkpoint before risky operations. If something breaks, rollback:

Python
checkpoint = client.checkpoints.create(instance_id, label="before-experiment")

result = client.exec(instance_id, command=["./risky-script.sh"])
if result.exit_code != 0:
    client.vms.stop(instance_id)
    client.instances.rollback(instance_id, checkpoint_id=checkpoint.id)
    client.vms.start(instance_id)

Incremental Builds

Checkpoint after dependency installation, fork for each build:

Python
# One-time setup
client.exec(instance_id, command=["apt-get", "install", "-y", "build-essential"])
deps_checkpoint = client.checkpoints.create(instance_id, label="deps-installed")

# For each build, fork from deps checkpoint
for commit in commits:
    build_env = client.instances.fork(instance_id, checkpoint_id=deps_checkpoint.id)
    client.vms.start(build_env.id)
    client.exec(build_env.id, command=["git", "checkout", commit])
    client.exec(build_env.id, command=["make", "build"])

Limits

  • Maximum checkpoints per instance is controlled by the instance's profile (default: 10)
  • Checkpoint creation requires a running VM (to flush dirty data)
  • Rollback requires the VM to be stopped

Next Steps