Abort-Safe Incremental JSONL Persistence Rule

Rule: Persist state using an append-only, fsync-backed JSONL log with atomic checkpoints.

Requirements

  • Write updates as single-line JSON objects (one logical mutation per line).
  • Append only (O_APPEND), never modify existing lines.
  • After each write batch, call fsync (or File::sync_data) before reporting success.
  • Treat a line as committed only if it ends with \n; on recovery, ignore a trailing partial line and make sure the next append cannot extend it into a fused, corrupt record.
  • Periodically create a checkpoint:
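    • Write the full state to state.tmp
    • fsync state.tmp
    • Atomically rename state.tmp to state.jsonl (on POSIX systems, also fsync the containing directory so the rename itself is durable)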
  • On startup:
    • Load the last checkpoint
    • Replay, in order, the log lines written after that checkpoint (the checkpoint must therefore record the log position it covers)
  • On abort/panic/crash:
    • Never truncate or rewrite the log
    • Replay on the next startup recovers every line up to the last completed fsync; only writes issued after that fsync can be lost
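
The append and commit requirements above can be sketched in Rust roughly as follows. This is a minimal illustration, not the project's actual code: the function names `open_log`, `append_batch`, and `read_committed` are hypothetical, records are assumed to arrive already serialized as JSON strings, and error handling is reduced to `io::Result`.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::path::Path;

/// Open (or create) the log in append mode; on Unix this sets O_APPEND.
pub fn open_log(path: &Path) -> io::Result<File> {
    OpenOptions::new().create(true).append(true).open(path)
}

/// Append a batch of serialized JSON records, one per line, and return
/// only after the data has been synced to stable storage.
pub fn append_batch(log: &mut File, records: &[String]) -> io::Result<()> {
    let mut buf = Vec::new();
    for record in records {
        // A record must occupy exactly one line. JSON serializers escape
        // newlines inside strings, so a raw '\n' means malformed input.
        if record.contains('\n') {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "record contains a raw newline",
            ));
        }
        buf.extend_from_slice(record.as_bytes());
        buf.push(b'\n');
    }
    log.write_all(&buf)?; // hand the whole batch to the OS
    log.sync_data()?; // durability point: success is reported only after this
    Ok(())
}

/// Return every committed line, i.e. every line terminated by '\n'.
/// Bytes after the last '\n' are treated as a torn write and dropped.
pub fn read_committed(path: &Path) -> io::Result<Vec<String>> {
    let mut bytes = Vec::new();
    File::open(path)?.read_to_end(&mut bytes)?;
    // Index just past the last newline, or 0 if there is none.
    let end = bytes
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    let text = String::from_utf8_lossy(&bytes[..end]);
    Ok(text.lines().map(str::to_owned).collect())
}
```

On startup, a caller would pass the lines returned by `read_committed` that follow the checkpoint's recorded position to whatever function applies a mutation to the in-memory state. That function is application-specific and is left out of this sketch.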

Outcome

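  • Crash/abort-safe: every write acknowledged after fsync survives a crash
  • O(1) appends: the cost of a write does not depend on the size of the existing state (a checkpoint still costs O(state size))
  • Deterministic recovery: the same checkpoint and log always replay to the same state
  • Low overhead: one fsync per write batch on the write path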
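
The crash-safety outcome depends on a checkpoint never being observed half-written. The write-then-rename step from the requirements might look like the sketch below. The file names `state.tmp` and `state.jsonl` come from the rule itself; the function names `write_checkpoint`, `load_checkpoint`, and `sync_dir` are hypothetical, and the directory fsync is an assumption about POSIX-style filesystems rather than something the rule states.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Assumption: on Unix, syncing the directory makes the rename durable.
#[cfg(unix)]
fn sync_dir(dir: &Path) -> io::Result<()> {
    File::open(dir)?.sync_all()
}

/// On other platforms this sketch does nothing extra.
#[cfg(not(unix))]
fn sync_dir(_dir: &Path) -> io::Result<()> {
    Ok(())
}

/// Replace `dir/state.jsonl` with `contents` so that a reader sees either
/// the previous checkpoint or the new one, never a partial file.
pub fn write_checkpoint(dir: &Path, contents: &[u8]) -> io::Result<()> {
    let tmp = dir.join("state.tmp");
    let dst = dir.join("state.jsonl");

    // 1. Write the full state to the temporary file.
    let mut f = File::create(&tmp)?;
    f.write_all(contents)?;
    // 2. Sync data and metadata before the file becomes visible.
    f.sync_all()?;
    drop(f);
    // 3. Atomically replace the previous checkpoint.
    fs::rename(&tmp, &dst)?;
    // 4. Persist the directory entry change (no-op off Unix in this sketch).
    sync_dir(dir)
}

/// Load the last checkpoint, or `None` if no checkpoint exists yet.
pub fn load_checkpoint(dir: &Path) -> io::Result<Option<Vec<u8>>> {
    match fs::read(dir.join("state.jsonl")) {
        Ok(bytes) => Ok(Some(bytes)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

A crash before step 3 leaves the old `state.jsonl` untouched and at most a stale `state.tmp`, which the next checkpoint overwrites. A crash after step 3 leaves the new checkpoint in place.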