Data Catalog
The data catalog describes the kinds of public records the site is expected to support before those records exist at scale. It keeps the data section from becoming a dumping ground for CSV files and ad hoc metrics. Each data publication needs a schema, source boundary, update cadence, and correction process before it becomes citable.
The public data surface should favor small, well-described records over broad claims. Aggregate records can support transparency, but they can also be misread if the population, exclusions, and collection window are unclear. Every dataset should state what it measures and what it does not measure.
Planned Data Surfaces
| Surface | Record Type | Required Context |
|---|---|---|
| Advisory Index | Advisory summaries and canonical URLs | Status, affected product, publication date, remediation state |
| Report Index | Public report summaries | Methodology, publication state, revision history |
| Tool Registry | Browser and downloadable tool metadata | Local-only flag, version, input boundary, support status |
| Schema Registry | JSON schemas and schema versions | Version, compatibility notes, validation command |
| Aggregate Metrics | Counts and timelines derived from public records | Population, time window, exclusions, bias notes |
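The required context in the table above could be encoded as a lookup that a publishing step consults before a record goes live. The following is a minimal sketch, not a prescribed format: the surface keys, the snake_case field names, and the `missing_context` helper are all assumptions derived from the table's wording.

```python
# Hypothetical keys: snake_case renderings of the table's "Required Context" column.
REQUIRED_CONTEXT = {
    "advisory_index": ("status", "affected_product", "publication_date", "remediation_state"),
    "report_index": ("methodology", "publication_state", "revision_history"),
    "tool_registry": ("local_only", "version", "input_boundary", "support_status"),
    "schema_registry": ("version", "compatibility_notes", "validation_command"),
    "aggregate_metrics": ("population", "time_window", "exclusions", "bias_notes"),
}


def missing_context(surface: str, record: dict) -> list[str]:
    """List the required context keys that a record does not carry.

    Presence is checked by key, not truthiness, so a value such as
    local_only=False still counts as supplied.
    """
    required = REQUIRED_CONTEXT[surface]  # raises KeyError for an unplanned surface
    return [key for key in required if key not in record]
```

For example, an aggregate metric that states only its population and time window would be reported as missing `exclusions` and `bias_notes`.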
Citation Rule
A data page should be citable without private context. The reader should be able to identify the schema, the source boundary, the record date, the update cadence, and the correction path from the public page. If a record cannot supply those fields, it should remain internal or be published as narrative context instead of structured data.
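One way to apply the citation rule is a gate that refuses structured publication when any of the five fields is absent. This is a sketch under assumptions: the field names and both function names are hypothetical, and a blank string is treated as missing.

```python
# Hypothetical field names; the catalog does not prescribe these keys.
REQUIRED_CITATION_FIELDS = (
    "schema",
    "source_boundary",
    "record_date",
    "update_cadence",
    "correction_path",
)


def missing_citation_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or blank in a record."""
    missing = []
    for field in REQUIRED_CITATION_FIELDS:
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


def is_citable(record: dict) -> bool:
    """Treat a record as citable only when no required field is missing."""
    return not missing_citation_fields(record)
```

A record that fails `is_citable` would stay internal or be published as narrative context, as the rule describes.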
Correction Path
Data corrections should preserve trust by being explicit. A corrected record should identify the field changed, the reason for the change, the review date, and whether derived metrics were affected. Silent changes to public data are permitted only for formatting errors that do not alter meaning.
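A correction notice carrying the four items named above might be modeled as a small immutable record. The class name, attribute names, and the `problems` check are illustrative assumptions, not a defined schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CorrectionNotice:
    """Hypothetical shape for a public correction entry."""

    field_changed: str
    reason: str
    review_date: date
    derived_metrics_affected: bool

    def problems(self) -> list[str]:
        """Return human-readable issues that would make the notice incomplete."""
        issues = []
        if not self.field_changed.strip():
            issues.append("field_changed is empty")
        if not self.reason.strip():
            issues.append("reason is empty")
        return issues
```

Making the record frozen reflects the idea that a published correction is itself not edited silently; a later change would be a new notice.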