The expansion of MANUSCRIPT_MAP.md → Data & reproducibility and the methods-rigor lens of the peer-reviewer agent. Read this before drafting Methods, a data/code availability statement, or any task the prompt-router flags as [Methods].

The reproducibility test: could a competent stranger, with your paper and your deposited materials, regenerate your result? Methods are written to pass that test — not to summarize what you did, but to enable repetition. What cannot be repeated must be labeled "reported only," not dressed up as reproducible.

Reproducible vs reported-only — name which

State this plainly in MANUSCRIPT_MAP.md → Data & reproducibility and, where the venue expects it, in the manuscript. Conflating the two is a quiet overclaim.

	Definition	Obligation
Reproducible	Inputs (data) + procedure (code/protocol) are available; a third party can rerun and get the same result.	Deposit data + code; pin versions; document the run.
Replicable	An independent team, new data, same method, reaches a consistent conclusion.	Methods detailed enough to re-run from scratch.
Reported-only	A number/result the reader must take on trust (proprietary data, a closed-API model with no pinnable version, a one-off human evaluation).	Say so. Do not imply it can be regenerated when it cannot.

If part of the work is reproducible and part reported-only, the availability statement draws the line. "Offline-evaluation results are reproducible from the deposited code; the results against the proprietary model are reported only (closed API, version not pinnable)" is honest; silence implies more than the evidence supports.

Methods written for reproduction

A Methods section is a recipe a stranger can follow. Vague methods are the most common reason a reviewer cannot sign off. Specify, with exact values:

Category	Specify	Vague (bad) → Reproducible (good)
Quantities	counts, budgets, ratios, thresholds, durations, temperatures	"a small set of tasks" → "512 held-out tasks; gate threshold τ = 0.7"
Models	name, version/checkpoint, decoding configuration	"queried an LLM" → "GPT-4o (2024-08-06), temperature 0, top-p 1.0"
Harness	framework, version, tool inventory where it matters	"an agent loop" → "ReAct loop, max 10 steps, 7 tools (see Appendix A)"
Software	name and version, key parameters, random seed	"analyzed in Python" → "Python 3.11, scipy 1.11.3; seed = 42"
Settings	every non-default parameter that affects the result	"default settings" → list them; "default" is not reproducible across versions
Procedure	order of operations, baselines, replication (n), retries	"each task was run in triplicate (n = 3 independent rollouts)"

The rule of thumb: if changing it would change the result, report it. Field-specific reporting (prompts, decoding parameters, evaluation harness, seeds) lives in agent_docs/field/<discipline>.md — read it before discipline-specific methods.

Changing what the Methods say was done is a Protected Claim (CLAUDE.md) — it alters the record of the experiment. Confirm with the author; record in tasks/decisions.md.

Raw data is immutable

data/raw/ (and sources/, submitted/, *.frozen.*) are protected by protect-sources.sh — edits are blocked unless RESEARCH_APPROVED=1. This is not bureaucracy: if raw data can be edited in place, every downstream number is unverifiable.

Never edit data/raw/. Cleaning, transforming, excluding outliers → write to a new path (data/processed/, data/derived/) with a script that takes raw as input.
The pipeline raw → script → processed → result must be re-runnable. The script is the record of every transformation; "I cleaned it in a spreadsheet" is not reproducible.
Document exclusions. Any dropped sample/point is named, counted, and justified in Methods (ties to agent_docs/statistics.md — undisclosed exclusion is p-hacking).

Data & code availability statements

Most venues now require one. State what, where, and under what access:

Data availability — The processed datasets supporting this study are openly available
at <repository> under DOI <10.xxxxx/...> [a REAL minted DOI — never fabricate it;
leave [VALUE — verify] until the deposit exists]. Raw agent rollout logs are available from
the corresponding author on reasonable request owing to file size.

Code availability — Analysis code is archived at <Zenodo/OSF DOI> and mirrored at
<github.com/...> (commit <hash>). It reproduces all figures and Tables 1–3 from the
deposited processed data.

Discipline:

The deposit DOI is a real, minted identifier off the repository — subject to the same no-fabrication rule as any citation (agent_docs/citation-discipline.md; block-fabrication.sh blocks fake-shaped DOIs). If the deposit does not exist yet, write [VALUE — verify] / [CITE], not a placeholder DOI.
"Available on request" is the weakest form; prefer a public, versioned, DOI-bearing archive. If access is restricted, state the reason (privacy, size, third-party licence) — not as a default dodge.
Pin a commit hash or release tag so "the code" means a specific, frozen state.

Pre-registration

Where the design supports it (trials, prospective studies, confirmatory analyses), pre-registration separates confirmatory from exploratory claims and is the strongest defense against HARKing (agent_docs/statistics.md).

If the study was pre-registered: cite the registration (OSF / ClinicalTrials.gov / AsPredicted ID), and flag any deviation from the registered plan in Methods. A silent deviation is worse than no registration.
Analyses not in the registration are exploratory — label them as such; the honest verb is "suggests."
The kit does not invent a registration ID any more than a DOI — if you do not have it, flag it.

FAIR data

Aim for deposits that are Findable, Accessible, Interoperable, Reusable:

Findable — a persistent identifier (DOI) and rich metadata.
Accessible — retrievable by the identifier over a standard protocol; access conditions stated even when restricted.
Interoperable — open, non-proprietary formats (CSV/NetCDF over a vendor binary) with documented variables and units.
Reusable — a clear licence (CC-BY, CC0), provenance, and a data dictionary so the columns mean something to a stranger.

A spreadsheet with cryptic column names and no units is technically "available" and practically unusable. Reusable means a stranger can understand it.

Reporting standards (use the field's checklist)

Many fields have a community reporting standard; following one is the cheapest way to not miss a required element, and reviewers expect it. Examples (use the one that fits; check agent_docs/field/<discipline>.md):

Standard	For
PRISMA	systematic reviews & meta-analyses (flow diagram + checklist)
CONSORT	randomized controlled trials
STROBE	observational epidemiology (cohort/case-control/cross-sectional)
ARRIVE	animal research
MIAME / MINSEQE	microarray / sequencing data
TOP guidelines	journal-level transparency & openness

These are checklists, not prose generators — they tell you what must be present, not what to claim. Run the relevant one before submission; a missing item is a reviewer comment you can pre-empt.