📋 Spec Kit

A specification is a family of documents that defines a system about to be built. Write the least documentation the job requires. The value is in the thinking the writing induces, not in the artifact it leaves behind.

Sections 1 and 2 are the laws. Read them once.
Section 3 is the build. Eleven steps, in order, top to bottom.
Sections 4 to 6 are the gates you check before you stop.

📚 Reference Specs — READ THESE FIRST

Two worked specifications ground this discipline. READ BOTH before you write a spec. They are the shape every step below is teaching you to produce.

references/context-reel.memory.spec.md — READ IT. The context-reel spec the snippets in Section 3 are drawn from.

The snippets quoted throughout Section 3 are excerpts from these files. Read the full specs to see each step in its finished context.

1. 🧭 Pick the Lightest Document

A spec is a family of documents, not one thing. Pick by the ambiguity you need to kill.

1.1 The Decision Matrix

Situation	Document
Regulated, contractual, safety-critical	SRS
Non-trivial build, design consensus needed	Design doc
Cross-team or contentious, durable consensus	RFC
One load-bearing decision and its rationale	ADR
What to build and why it matters to users	PRD
Trivial, obvious, no real trade-offs	Write code

1.2 Do This

Name the document by what it decides.
Default to a design doc for a month of work or a cross-team change.
Reserve the heavyweight SRS for audited or contractual work.

1.3 Do Not Do This

Do not reach for an SRS to ship a one-line change.
Do not skip the doc on a contentious decision because writing feels slow.
Do not journal. A spec is normative and forward-looking, not a diary.

2. ⚖️ The Laws You Write Under

Three idioms make a requirement testable. Name them once near the top, then obey them everywhere.

2.1 RFC 2119 Keywords

Use MUST, MUST NOT, SHOULD, SHOULD NOT, MAY in ALL CAPS.
Lowercase "must" carries no weight. Only the capital keyword is normative.
SHALL is a synonym of MUST. Pick one per document.

Paste this clause under Status so the keywords bind:

The key words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY in this
document are to be interpreted as described in RFC 2119 and RFC 8174
when, and only when, they appear in all capitals.
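
The capitalization rule can be checked mechanically. The following is a minimal sketch in Python, assuming requirement lines arrive as plain strings. `find_lowercase_keywords` is a hypothetical helper, not something the reference spec defines. Treat its hits as candidates for review, since a lowercase "may" is often ordinary prose.

```python
import re

# RFC 2119 keywords this document uses; only the ALL-CAPS form is normative.
KEYWORDS = ("MUST NOT", "SHOULD NOT", "MUST", "SHOULD", "MAY")

# Matches any casing of the keywords. Longer phrases come first so that
# "must not" is reported as one hit rather than as "must".
_PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def find_lowercase_keywords(line: str) -> list[str]:
    """Return keyword hits that are not written in all capitals."""
    return [
        match.group(0)
        for match in _PATTERN.finditer(line)
        if match.group(0) != match.group(0).upper()
    ]
```

Run it over each requirement bullet and rewrite any line that returns a hit.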

2.2 EARS: One Shape Per Requirement

Ubiquitous     The <system> MUST <response>.
Event-driven   When <trigger>, the <system> MUST <response>.
State-driven   While <state>, the <system> MUST <response>.
Unwanted       If <condition>, then the <system> MUST <response>.
Optional       Where <feature is present>, the <system> MUST <response>.
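
The five shapes are regular enough to recognize by their opening word. The sketch below is one possible classifier, assuming the document standardizes on MUST and writes each requirement as a single sentence. `classify_ears` and the pattern table are illustrative names, not part of EARS itself.

```python
import re
from typing import Optional

# Each EARS shape is identified by its opening keyword; Ubiquitous has none.
EARS_SHAPES = [
    ("Event-driven", re.compile(r"^When\b.+,\s.+\bMUST\b")),
    ("State-driven", re.compile(r"^While\b.+,\s.+\bMUST\b")),
    ("Unwanted", re.compile(r"^If\b.+,\s*then\b.+\bMUST\b")),
    ("Optional", re.compile(r"^Where\b.+,\s.+\bMUST\b")),
    ("Ubiquitous", re.compile(r"^(?!When\b|While\b|If\b|Where\b).+\bMUST\b")),
]


def classify_ears(requirement: str) -> Optional[str]:
    """Return the EARS shape name, or None if no shape fits."""
    text = " ".join(requirement.split())  # collapse wrapped lines
    for name, pattern in EARS_SHAPES:
        if pattern.search(text):
            return name
    return None
```

A `None` result means the sentence has no recognizable shape and is probably free-written prose.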

2.3 Given, When, Then

Acceptance criteria turn a requirement into a binary test.
Write the observable outcome, never the implementation.
One to three criteria per story. Atomic. Self-contained.
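
A criterion in this form is structured data, which is what makes it a binary test. As a rough illustration, the parser below splits one block into its clauses. It assumes one clause per line and treats "And" as extending the previous clause. `Criterion` and `parse_criterion` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    given: list[str] = field(default_factory=list)
    when: list[str] = field(default_factory=list)
    then: list[str] = field(default_factory=list)


def parse_criterion(block: str) -> Criterion:
    """Parse one Given/When/Then block; "And" extends the previous clause."""
    criterion = Criterion()
    current = None
    for raw in block.strip().splitlines():
        keyword, _, rest = raw.strip().partition(" ")
        if keyword in ("Given", "When", "Then"):
            current = getattr(criterion, keyword.lower())
        elif keyword != "And" or current is None:
            # Anything else is free prose, which the format does not allow.
            raise ValueError(f"unexpected line: {raw!r}")
        current.append(rest)
    return criterion
```

A block that raises `ValueError` contains prose that does not belong in an acceptance criterion.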

3. 🦴 The Build, Step by Step

Write the sections in this order, top to bottom. Each step names the section, the rule, and a snippet from the context-reel.spec.md reference.

The section order to produce:

Title and thesis
Status
Document Type
Context
Vocabulary
Goals
Non-Goals
Product Requirements
Proposed Design
Cross-Cutting Concerns
Acceptance Criteria
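
The order is easy to verify once a draft exists. The sketch below assumes every section after the title is a level-two markdown heading (`## Name`), which matches the snippets in this section but is still an assumption about your file. `section_order_errors` is a hypothetical helper.

```python
# Sections expected after the title and thesis, in build order.
REQUIRED_SECTIONS = [
    "Status", "Document Type", "Context", "Vocabulary", "Goals", "Non-Goals",
    "Product Requirements", "Proposed Design", "Cross-Cutting Concerns",
    "Acceptance Criteria",
]


def section_order_errors(markdown: str) -> list[str]:
    """Report missing sections and whether the present ones are out of order."""
    headings = [
        line[3:].strip() for line in markdown.splitlines() if line.startswith("## ")
    ]
    errors = [
        f"missing section: {name}" for name in REQUIRED_SECTIONS if name not in headings
    ]
    present = [h for h in headings if h in REQUIRED_SECTIONS]
    expected = [name for name in REQUIRED_SECTIONS if name in present]
    if present != expected:
        errors.append("sections out of order")
    return errors
```

An empty list means the skeleton is complete and in order. It says nothing about the quality of what is inside each section.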

Step 1 — Title and Thesis

One line that states the system's defining tension.

The thesis is your compression test. If you cannot say the system in one line, the scope is not clear enough to write yet.

# context-reel Spec

_N_ Models, 1 History.

Keep the title plain: <system> Spec.
Make the thesis the one sentence a reader could repeat back.

Step 2 — Status

Mark the draft state, then bind the keywords.

## Status

Draft.

The key words MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY in this
document are to be interpreted as described in RFC 2119 and RFC 8174
when, and only when, they appear in all capitals.

Status is Draft while the system is unshipped.
The boilerplate is not decoration. It is what makes MUST mean MUST.

Step 3 — Document Type

Name which document this is. Justify the weight by elimination.

Say why not the heavier doc, and why not the lighter one. This is where you defend the cost of the page.

## Document Type

This is a design doc. It defines the product contract for context-reel.

context-reel is not safety-critical or contractual enough for an SRS.
It is not a single decision, so it is not an ADR. The ambiguity is
product shape, vocabulary, interaction model, and trust boundary.

Name the doc kind in the first sentence.
Name the ambiguity it kills in the last.

Step 4 — Context

The landscape and why now. Name the center of gravity.

## Context

context-reel is a chat workspace where multiple frontier models
share one history.

The chat is the product center. The editor, config roster, document
rail, and shortcut rail support the same shared history.

One or two short paragraphs. Link out for depth.
State what is central and what is support. Rank it here.
No history of the world. No background you find interesting but the reader does not need.

Step 5 — Vocabulary

Define every load-bearing term as a controlled set. One concept, one name.

This is the highest-leverage section. A blurred term becomes a blurred requirement. Define the words a reviewer could otherwise conflate, like provider versus model versus selected model.

## Vocabulary

**Provider** is the company or service that offers models, such as
OpenAI, Anthropic, Google, or xAI.

**Model** is one model offered by a provider, such as GPT 5.5.

**Configured model** is a saved roster entry naming a provider, one
model, a display name, and the key environment variable.

**Selected model** is the configured model selected for the next
submitted message.

Bold the term. One sentence under it.
Pick one name per concept and use only that name everywhere.
The vocabulary you set here becomes enforceable in Step 8.

Step 6 — Goals

Bulleted. Each goal is one measurable capability.

## Goals

- One shared history contains turns from multiple frontier models.
- The selected model answers the next message.
- Every model receives the same history as context when it answers.
- Provider API keys stay on the server.
- One concept has one name in the UI, code, tests, and documentation.

Three to nine bullets. If the list runs longer, you may be specifying two products. Split them.
Goals carry impact, not implementation.

Step 7 — Non-Goals

The section everyone skips. It is your scope fence.

A non-goal is something that could reasonably be in scope and is deliberately left out. It is not a negated requirement. "Shall not crash" is a requirement, not a non-goal.

## Non-Goals

- context-reel does not define agents.
- context-reel does not define a lead model.
- context-reel does not define swarm orchestration.
- context-reel does not treat README prose as the specification.

Name the scope a reader would otherwise assume you cover.
Each non-goal you write is a scope creep you stop before it starts.

Step 8 — Product Requirements

The craft. Group by surface. One requirement, one capability.

Four properties decide a requirement: necessary, unambiguous, singular, verifiable. Verifiable is the controlling one. If a test cannot mark it pass or fail, it is not a requirement.

8.1 Group by Surface

Use a sub-heading per area, then bullet the requirements under it. context-reel uses: Application, Chat, Models, Editor, Shortcuts, Streaming, Secrets, Markdown Rendering, Persistence, Documentation, Tests.

8.2 One Capability Each

Split every requirement that uses "and" or "or". It is two requirements wearing one number.
Delete "etc." and "but not limited to". An open list cannot be verified.
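
Both rules in 8.2 reduce to a text scan. A minimal sketch, assuming each requirement is one string. `singularity_findings` is a hypothetical name. The conjunction check is deliberately blunt and flags every "and" or "or", so a closed enumeration will also be reported and needs a human decision.

```python
import re

_CONJUNCTION = re.compile(r"\b(and|or)\b", re.IGNORECASE)
_OPEN_LIST = re.compile(r"\betc\.|\bbut not limited to\b", re.IGNORECASE)


def singularity_findings(requirement: str) -> list[str]:
    """Flag wording that suggests a requirement is compound or open-ended."""
    findings = []
    if _CONJUNCTION.search(requirement):
        findings.append("conjunction")
    if _OPEN_LIST.search(requirement):
        findings.append("open list")
    return findings
```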

8.3 Pick the EARS Shape

- context-reel MUST send the full history with each chat request.
- If no model is selected, then context-reel MUST prevent submission.
- When the user submits a message, context-reel MUST append the user
  message to the history immediately.
- While a response is streaming, context-reel MUST provide a stop control.

8.4 Enforce the Vocabulary

The strongest move in this spec: turn Step 5 into requirements that fail on drift. Name the one true term, ban the synonyms, and make a test fail when the banned word appears.

- context-reel MUST use the term "selected model" for the configured
  model selected by the user.
- context-reel MUST NOT use agent, lead, active, primary, owner,
  captain, orchestrator, or target as synonyms for selected model.
- context-reel tests MUST fail if code, UI, or documentation
  introduces agent or lead-model vocabulary.
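
The third requirement asks for a failing test. One way to sketch it, assuming the project can hand the scanner a mapping of file path to file text: `vocabulary_violations` is a hypothetical helper, and the word list is copied from the requirement above. A plain word scan is blunt. A real project would likely need to scope it, for example to exclude unrelated uses such as a DOM event's `target`.

```python
import re

# Banned synonyms for "selected model", taken from the requirement above.
BANNED = (
    "agent", "lead", "active", "primary",
    "owner", "captain", "orchestrator", "target",
)
_BANNED_RE = re.compile(r"\b(" + "|".join(BANNED) + r")\b", re.IGNORECASE)


def vocabulary_violations(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (path, line number, banned word) for every hit."""
    hits = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            for match in _BANNED_RE.finditer(line):
                hits.append((path, number, match.group(0).lower()))
    return hits
```

A test suite would assert that the returned list is empty, so any drift fails the build and names the file and line.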

8.5 Make the Trust Boundary a Group

Pull every secret rule into one named block so the boundary is auditable in one place.

### Secrets

- context-reel MUST read provider API key values only on the server.
- context-reel MUST NOT send provider API key values to the browser.
- context-reel MUST NOT write provider API key values to IndexedDB,
  localStorage, sessionStorage, or history.

8.6 Do Not Do This

Do not free-write a requirement as a paragraph. The shape is the discipline.
Do not use a weak word: fast, robust, secure, user-friendly. Replace it with a measurable threshold.

Step 9 — Proposed Design

Prose. Overview first, then the parts that touch the trade-offs.

The design restates the requirements as one coherent picture. It adds no new rules. It names the boundaries the requirements imply.

## Proposed Design

context-reel keeps three top-level work areas mounted: editor, chat,
and config. View changes hide and show areas without destroying state.

Chat is the center. Editor, config, document rail, and shortcut rail
are downstream support surfaces.

Markdown rendering is a trust boundary. Output must be sanitized or
escaped before any Svelte {@html} injection.

Lead with the shape, then drill into the parts that carry risk.
State each trust boundary in plain words.
No code except one genuinely novel algorithm. Sketch and link the rest.
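
For readers unsure what "escaped" means at the rendering boundary, here is a language-neutral illustration using Python's standard library. It shows only the escape branch, not a markdown sanitizer, and it is not the context-reel implementation. `escape_model_output` is a hypothetical name.

```python
import html


def escape_model_output(text: str) -> str:
    """Turn markup characters into entities so the text renders inertly."""
    return html.escape(text, quote=True)
```

Escaped this way, `<script>alert(1)</script>` becomes `&lt;script&gt;alert(1)&lt;/script&gt;`, which a browser displays as text instead of executing.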

Step 10 — Cross-Cutting Concerns

The concerns a feature-by-feature read misses.

## Cross-Cutting Concerns

### Security
Provider API key values are secrets. They stay server-side.
Model output is untrusted. It must not become executable DOM.

### Privacy
The history is local user data. Transmit it only as part of an
explicit chat request.

### Observability
Streaming failures must be visible. Silent failure is not acceptable.

### Accessibility
Roster controls must have stable accessible names.

Cover security, privacy, observability, accessibility.
Each entry is an invariant, not a feature. Keep it short.

Step 11 — Acceptance Criteria

Given, When, Then. One block per feature. Binary outcomes.

Each block proves one requirement. The outcome is observable, never an implementation detail.

Given the roster contains multiple configured models
When the user selects one model
Then context-reel marks that model as the selected model
And context-reel does not label it agent, lead, active, or owner

Given a configured model uses OPENAI_API_KEY
When context-reel stores the roster in the browser
Then the browser storage contains OPENAI_API_KEY
And the browser storage does not contain the API key value

Given model output contains raw HTML with script behavior
When context-reel renders the message
Then the raw HTML does not execute
And the rendered DOM contains no executable script from that output

Map each block back to a requirement from Step 8.
Store the environment variable name. Never the value. The Then and And lines of the second block are the testable shape of the secret boundary.
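
The name-not-value rule can be sketched as a serializer that never sees the secret. The fields follow the Vocabulary definition of a configured model. The class, the function, and the JSON key names are assumptions for illustration, not the reference spec's code.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfiguredModel:
    provider: str
    model: str
    display_name: str
    key_env_var: str  # name of the environment variable, never its value


def serialize_roster_for_browser(roster: list[ConfiguredModel]) -> str:
    """Serialize only the fields that are safe to persist in the browser."""
    return json.dumps([
        {
            "provider": m.provider,
            "model": m.model,
            "displayName": m.display_name,
            "keyEnvVar": m.key_env_var,
        }
        for m in roster
    ])
```

Because the key value is never a field on the roster entry, it cannot reach browser storage by accident. The acceptance block then checks both halves: the name is present and the value is absent.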

4. 🚫 Ban Weak Words

Weak words feel like requirements and verify like wishes. Replace each with a measurable threshold.

4.1 The Rewrite

❌ Vague	✅ Verifiable
The system should be fast.	The system MUST respond with p95 latency under 200 ms.
The UI should be user-friendly.	A user MUST be able to complete checkout in at most 3 clicks.
Support many concurrent users.	The system MUST support 1,000 concurrent users at p95 latency under 500 ms.
Find by name, date, etc.	List every field. Delete "etc."

4.2 The Banned List

fast, quick, easy, user-friendly, flexible, robust, reliable, intuitive
secure, timely, adequate, sufficient, minimize, maximize, state-of-the-art
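
The banned list works well as a lint. A minimal sketch, assuming requirements are checked one string at a time. `weak_words` is a hypothetical helper and the word list is copied from above.

```python
import re

WEAK_WORDS = (
    "fast", "quick", "easy", "user-friendly", "flexible", "robust",
    "reliable", "intuitive", "secure", "timely", "adequate", "sufficient",
    "minimize", "maximize", "state-of-the-art",
)

# Lookarounds keep "fast" from matching inside "breakfast" and keep
# hyphenated entries such as "user-friendly" intact.
_WEAK_RE = re.compile(
    r"(?<![\w-])(" + "|".join(re.escape(w) for w in WEAK_WORDS) + r")(?![\w-])",
    re.IGNORECASE,
)


def weak_words(requirement: str) -> list[str]:
    """Return every banned word found in the requirement, lowercased."""
    return [match.group(1).lower() for match in _WEAK_RE.finditer(requirement)]
```

Each hit marks a place where a measurable threshold is still missing.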

5. 🪤 Anti-Patterns and Tells

The recurring ways specs fail. Each is cheap to spot.

Over-specification. The "how" inside the "what". Nobody reads it.
Vagueness. Weak words with no threshold. Untestable.
Skipped non-goals. The scope fence left open.
Journaling. Lessons and confessions. A spec is normative, not a diary.
Stale doc. Drifted from the code, still trusted.
Wrong weight. An SRS for a trivial change, or no doc for a contentious one.

The tells:

A requirement contains "and", "or", "etc.", or a weak word. Rewrite it singular and measurable.
The doc passes 20 pages. Split the problem into sub-problems with their own docs.
The goals list passes nine bullets. You may be specifying two products.

6. ✅ Done When

Check every gate before you stop. Each is pass or fail.

The title states the system in one line.
Status carries the RFC 2119 clause.
Document Type names the doc kind and the ambiguity it kills.
Context names the center of gravity in two paragraphs or fewer.
Vocabulary defines every load-bearing term, one name each.
Goals are three to nine measurable bullets.
Non-Goals fence the scope a reader would otherwise assume.
Every requirement is singular, carries a capitalized keyword, and is verifiable.
No requirement contains "and", "or", "etc.", or a weak word.
The vocabulary is enforced by a requirement and a failing test.
The secret boundary is one auditable group.
Proposed Design adds no new rules and names each trust boundary.
Cross-Cutting Concerns cover security, privacy, observability, accessibility.
Each acceptance block maps back to a requirement and tests an observable outcome.

🏁 Closing

Pick the lightest document. Write one only when it earns its cost. Build it in order: title, status, type, context, vocabulary, goals, non-goals, requirements, design, concerns, acceptance. Name the idiom: RFC 2119, EARS, Given-When-Then. Ban the weak word. Verifiable or it is not a requirement. Then stop.