Dokmatiq

OAIS (ISO 14721)

International reference model for digital long-term archives — defines roles, information packages and functional entities an archive must fulfil to keep content readable and understandable for decades.

Also known as: Open Archival Information System, ISO 14721, long-term archive reference model

Short definition

OAIS (Open Archival Information System) is a reference model for digital long-term archives, published by the Consultative Committee for Space Data Systems (CCSDS) and standardised as ISO 14721. It grew out of NASA's need for a model to archive satellite data over decades. Today OAIS is the conceptual framework for any archive that wants to keep content usable across technology generations.

OAIS does not prescribe technology — it defines roles, packages and functional entities. The concrete implementation (PDF/A, WORM storage, signatures, timestamps) remains open.

The three information packages

OAIS thinks in Information Packages (IPs) — the heart of the model:

Package                                Short   Role
Submission Information Package         SIP     what the producer hands to the archive
Archival Information Package           AIP     what is stored inside the archive
Dissemination Information Package      DIP     what a consumer receives from the archive

An archive validates and transforms a SIP into an AIP (e.g. converts to PDF/A, enriches metadata, validates signatures). On request it builds DIPs from AIPs for consumers.
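The SIP to AIP to DIP flow can be sketched with plain directories. This is a minimal illustration only: the directory layout, the file name ingest.properties, and the function names ingest_sip and build_dip are assumptions made for this example, not part of OAIS or of any product API. Real ingest would add format conversion and signature validation at the marked steps.

```shell
#!/usr/bin/env bash
set -euo pipefail

# ingest_sip SIP_DIR AIP_DIR
# Validates the submission, copies it into AIP_DIR/content and writes a
# small metadata record (file and field names are illustrative only).
ingest_sip() {
  local sip_dir=$1 aip_dir=$2
  # Validation step: reject a SIP that is missing or contains no files.
  [ -d "$sip_dir" ] || { echo "SIP not found: $sip_dir" >&2; return 1; }
  [ -n "$(find "$sip_dir" -type f -print -quit)" ] || { echo "SIP is empty: $sip_dir" >&2; return 1; }
  mkdir -p "$aip_dir/content" "$aip_dir/metadata"
  # Transformation step: a real archive might convert to PDF/A here.
  cp -R "$sip_dir"/. "$aip_dir/content/"
  # Enrichment step: record when the package was ingested and how many files it holds.
  {
    printf 'ingested_at=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    printf 'file_count=%s\n' "$(find "$aip_dir/content" -type f | wc -l | tr -d ' ')"
  } > "$aip_dir/metadata/ingest.properties"
}

# build_dip AIP_DIR DIP_DIR
# In this sketch the consumer receives only the content;
# archive-internal metadata stays inside the AIP.
build_dip() {
  local aip_dir=$1 dip_dir=$2
  mkdir -p "$dip_dir"
  cp -R "$aip_dir/content"/. "$dip_dir/"
}
```

Calling ingest_sip ./incoming ./archive/aip-0001 followed by build_dip ./archive/aip-0001 ./out would produce one AIP and one DIP. Keeping the two steps separate mirrors the model: the AIP is never handed out directly.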

The six functional entities

OAIS partitions the archive into six Functional Entities:

  1. Ingest — accept SIPs, validate, transform into AIPs
  2. Archival Storage — store AIPs, monitor bit-level integrity
  3. Data Management — metadata, catalogues, search
  4. Preservation Planning — strategies for format migration, obsolescence tracking
  5. Access — deliver DIPs, enforce access rights
  6. Administration — policies, contracts, operations
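As an illustration of what "monitor bit-level integrity" in Archival Storage can mean in practice, the following sketch records SHA-256 checksums at ingest and re-checks them later. It assumes GNU coreutils (sha256sum); the function names write_manifest and check_fixity and the manifest location are hypothetical choices for this example. OAIS itself does not prescribe a hash algorithm or file layout.

```shell
#!/usr/bin/env bash
set -euo pipefail

# write_manifest CONTENT_DIR MANIFEST
# Records one SHA-256 line per file, with paths relative to CONTENT_DIR.
# Keep MANIFEST outside CONTENT_DIR so it does not hash itself.
write_manifest() {
  local content_dir=$1 manifest=$2
  ( cd "$content_dir" && find . -type f -exec sha256sum {} + | sort -k 2 ) > "$manifest"
}

# check_fixity CONTENT_DIR MANIFEST
# Re-hashes every listed file. Prints each failing file and returns
# non-zero if any file is missing or its checksum has changed.
check_fixity() {
  local content_dir=$1 manifest
  # Resolve the manifest to an absolute path before changing directory.
  manifest=$(cd "$(dirname "$2")" && pwd)/$(basename "$2")
  ( cd "$content_dir" && sha256sum --quiet -c "$manifest" )
}
```

A scheduled job could run check_fixity against every AIP and alert on a non-zero exit status, so that silent corruption is caught while an intact copy still exists.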

A GoBD-compliant archive typically covers these functions organisationally, even though the GoBD (an administrative circular of the German Federal Ministry of Finance) does not reference the model by name.

The three external roles

OAIS names three external actors:

  • Producer — creates content, submits SIPs
  • Consumer — uses the content, receives DIPs
  • Management — sets policies, bears responsibility

Archives often distinguish internal roles such as curators and IT operations, but under OAIS all of these belong to the archive itself rather than to its external environment.

Designated Community

A central OAIS concept is the Designated Community: the audience for whom the archive must keep content understandable. For tax records, that means auditors 10 years from now; for a library, researchers and the public. The Designated Community determines which Representation Information (formats, schemas, dictionaries) must be archived alongside the content so that the data does not become dead bytes.
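One way to make the Representation Information requirement checkable is a small registry that maps each archived format to the documentation the Designated Community would need. The sketch below is an assumption-laden illustration: the registry format (lines such as pdf=ISO 19005 specification), the function name missing_repinfo, and the use of file extensions as a stand-in for real format identification are all hypothetical simplifications.

```shell
#!/usr/bin/env bash
set -euo pipefail

# missing_repinfo CONTENT_DIR REGISTRY
# REGISTRY holds lines like "pdf=ISO 19005 specification" (hypothetical format).
# Prints every file extension found in CONTENT_DIR that has no registry entry.
missing_repinfo() {
  local content_dir=$1 registry=$2 ext
  find "$content_dir" -type f -name '*.*' \
    | sed 's/.*\.//' | tr '[:upper:]' '[:lower:]' | sort -u \
    | while read -r ext; do
        # Report the extension only when the registry has no matching key.
        grep -q "^${ext}=" "$registry" || echo "$ext"
      done
}
```

Any extension this function prints marks content the archive stores without the context needed to interpret it later. A production archive would identify formats by signature rather than by extension.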

OAIS and PDF/A

PDF/A is one of the most direct answers to OAIS requirements in the document domain:

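  • Visual reproducibility: a PDF/A file renders the same way in any conforming viewer, which supports the OAIS goal of long-term understandability
  • Self-contained: all fonts, colour profiles and images are embedded, so no external resources are needed for rendering
  • Open standard: specified in ISO 19005 and designed to stay readable for decades
  • Attachments: PDF/A-3 permits embedded files of any type, such as the XML invoice in ZUGFeRD archiving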

Dokmatiq produces PDF/A as AIP-grade output:

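# Step 1: convert the incoming file to PDF/A-3b.
# Step 2: pipe the converted file straight into the timestamp endpoint.
# --fail makes curl exit non-zero on HTTP errors instead of passing an error body along;
# pipefail makes the whole pipeline fail if either call fails.
set -o pipefail
curl --fail -X POST https://api.dokmatiq.com/v1/pdf/convert \
  -H "Authorization: Bearer $DOKMATIQ_KEY" \
  -H "X-Target-Profile: PDF/A-3b" \
  --data-binary "@incoming.pdf" | \
curl --fail -X POST https://api.dokmatiq.com/v1/pdf/timestamp \
  -H "Authorization: Bearer $DOKMATIQ_KEY" \
  --data-binary @- \
  -o aip.pdf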

The result is a PDF/A-3 file carrying a qualified timestamp, which gives cryptographically verifiable evidence that the document existed in this form when it reached the archive.

OAIS, GoBD, DIN 31644

OAIS underpins further standards:

  • DIN 31644 (criteria for trustworthy digital long-term archives): the German certification basis
  • nestor Seal: German certification programme, based on DIN 31644, for OAIS-aligned archives
  • CoreTrustSeal / ISO 16363: international certification and audit standards

GoBD itself does not reference OAIS directly, but its requirement of “legibility throughout the retention period” matches, in substance, the OAIS principle that content must stay understandable to the Designated Community.

Common pitfalls

  1. SIP = AIP — many archives simply store the incoming format; OAIS requires active transformation with metadata enrichment
  2. Missing Preservation Planning — without a format-migration strategy, content ages out in 10–15 years
  3. “Backup” mistaken for an archive — backup protects against data loss; an archive preserves content and understandability
  4. Ignored Designated Community — without it, the archive cannot know which context information to preserve alongside the data
