File format resources
A curated reference library of authoritative sources on file formats — official specs, codec documentation, and canonical tools. Use these when you need the source-of-truth detail rather than a summary.
Image format specifications
Official specs and reference implementations for the dominant image formats. These are the source-of-truth documents that conformant encoders and decoders implement.
- JPEG specification (ITU-T T.81)
The original JPEG standard from 1992. Defines DCT-based lossy compression and the JPEG bitstream. The JFIF file format that most .jpg files use is specified separately (ITU-T T.871). Still the canonical reference.
- PNG specification (W3C)
The current W3C PNG specification, including chunk definitions, colour space handling, and the alpha-channel model.
- WebP container specification (Google)
Official documentation for WebP's RIFF container, lossy and lossless modes, and animation chunks.
- AVIF specification (AV1 Image File Format)
The AOMedia spec for AVIF, defining how AV1-encoded images are wrapped in the HEIF container format.
- SVG 2 specification (W3C)
Current SVG specification covering paths, filters, animation, and accessibility features.
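As a taste of what the PNG spec's chunk definitions look like in practice, here is a minimal Python sketch that walks the chunks of a PNG stream. It relies only on the layout the spec describes: an 8-byte signature, then chunks of 4-byte big-endian length, 4-byte type, data, and a CRC-32 over type plus data. The helper names (`make_chunk`, `iter_chunks`, `build_tiny_png`) are our own illustrations, not part of any library, and the tiny in-memory image exists only so the example is self-contained.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk, checking the signature and CRCs."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != crc:
            raise ValueError(f"bad CRC in {chunk_type!r} chunk")
        yield chunk_type, data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC


def build_tiny_png() -> bytes:
    """Build a 1x1 8-bit greyscale PNG in memory (illustrative input only)."""
    # IHDR fields: width, height, bit depth, colour type,
    # compression method, filter method, interlace method.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    scanline = b"\x00\x00"  # filter type 0, then one grey pixel
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"IDAT", zlib.compress(scanline))
        + make_chunk(b"IEND", b"")
    )


if __name__ == "__main__":
    for chunk_type, data in iter_chunks(build_tiny_png()):
        print(chunk_type.decode("ascii"), len(data))
```

A real decoder does far more (ancillary chunks, interlacing, colour management), which is exactly where the spec itself becomes necessary.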
Audio and video codec references
Authoritative documentation for the audio and video codecs used in modern media pipelines.
- FFmpeg documentation
The de facto reference for audio and video conversion. FFmpeg is the open-source toolchain used in nearly every production pipeline that touches media files.
- ITU-T H.264 specification
The official spec for H.264 / AVC, the video codec that underpins MP4 video on essentially every device.
- AV1 specification (AOMedia)
The Alliance for Open Media's spec for AV1, the modern royalty-free video codec used in WebM and AVIF.
- FLAC format documentation (Xiph.org)
The FLAC reference, defining the lossless audio compression format and its container structure.
- ID3v2 specification
The ID3 metadata specification used for MP3 and other audio formats. Worth knowing if you care about how artist, title, and album art are encoded.
Document and office format specifications
Specifications for the office and document formats that dominate business workflows.
- PDF specification (ISO 32000)
The ISO standard for PDF and the canonical reference for the format's structure, fonts, and rendering model. Adobe's PDF Reference 1.7, on which ISO 32000-1 is based, is also freely available.
- Office Open XML (DOCX, XLSX, PPTX)
The ECMA-376 standard for Microsoft's Open XML formats. The spec is dense but it's the source of truth for DOCX, XLSX, and PPTX internals.
- OpenDocument Format (ODT, ODS, ODP)
OASIS's OpenDocument standard, the format used by LibreOffice and Apache OpenOffice. The open alternative to Microsoft's OOXML.
- EPUB 3 specification (W3C)
The current EPUB specification covering reflowable and fixed-layout ebooks, accessibility, and packaging.
- RFC 4180 (CSV)
The closest thing CSV has to a formal spec. Defines the comma-separated values format, escaping rules, and line-ending conventions.
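To illustrate the escaping and line-ending rules RFC 4180 describes, here is a small Python sketch using the standard library's `csv` module. Fields containing commas, double quotes, or line breaks are wrapped in double quotes, embedded quotes are doubled, and records end in CRLF. The function names `to_rfc4180` and `from_rfc4180` are hypothetical helpers of our own, and this is one reasonable configuration rather than a full conformance check.

```python
import csv
import io


def to_rfc4180(rows):
    """Serialize rows with minimal quoting and CRLF record endings."""
    buf = io.StringIO(newline="")  # no newline translation
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
    writer.writerows(rows)
    return buf.getvalue()


def from_rfc4180(text):
    """Parse CSV text back into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    rows = [
        ["name", "note"],
        ["Ada", 'said "hi", then left'],  # embedded quotes and a comma
        ["Bob", "line one\nline two"],    # embedded line break
    ]
    encoded = to_rfc4180(rows)
    print(repr(encoded))
    print(from_rfc4180(encoded) == rows)
```

Real-world CSV often deviates from the RFC (different delimiters, bare LF line endings, unquoted line breaks), so treat this as a baseline rather than a universal parser.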
Data and serialization formats
Specifications for the structured-data formats used in APIs, configuration, and data pipelines.
- JSON specification (RFC 8259)
The canonical JSON spec — short, readable, and the definitive reference for JSON syntax.
- YAML 1.2 specification
The current YAML spec. Important if you ever need to debug YAML's notorious type-inference edge cases.
- TOML specification
The spec for TOML, a simple, readable config-file format. The full spec fits on a single page.
- XML 1.0 specification (W3C)
The original XML spec from 1998. Verbose but still the canonical reference for XML processing.
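One reason to read the JSON spec rather than trust a library's defaults: RFC 8259 does not permit `NaN` or `Infinity` as numbers, yet Python's standard `json` module accepts and emits them unless told otherwise. The sketch below shows one way to tighten that up. `strict_loads` and `strict_dumps` are hypothetical wrapper names of our own; `parse_constant` and `allow_nan` are existing parameters of the standard library functions.

```python
import json


def _reject_constant(name):
    """Called by the decoder for 'NaN', 'Infinity', and '-Infinity'."""
    raise ValueError(f"{name} is not valid JSON per RFC 8259")


def strict_loads(text):
    """Parse JSON, rejecting the non-standard NaN/Infinity literals."""
    return json.loads(text, parse_constant=_reject_constant)


def strict_dumps(value):
    """Serialize to JSON, raising ValueError on NaN or infinite floats."""
    return json.dumps(value, allow_nan=False)


if __name__ == "__main__":
    print(strict_loads('{"ratio": 1.5}'))
    try:
        strict_loads("[NaN]")
    except ValueError as exc:
        print("rejected:", exc)
```

This covers only the numeric literals. Other leniencies, such as how duplicate object keys are handled, are left as the library's default behaviour here.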
3D model formats
Specifications and documentation for the leading 3D model exchange formats.
- glTF 2.0 specification (Khronos)
The glTF 2.0 spec. Comprehensive coverage of geometry, PBR materials, animations, and the binary GLB format.
- OBJ format reference (Wavefront)
OBJ has no formal modern spec, but the Wikipedia article is the de facto reference for the plain-text format and its MTL companion.
- STL format reference
Reference for both ASCII and binary STL — the universal 3D printing format.
- USD (Universal Scene Description)
Pixar's open-source scene description framework, the foundation for USDZ (Apple's AR format). Increasingly relevant for film and AR pipelines.
Authoritative web references
MDN, web standards, and other long-trusted references on web-relevant file formats.
- MDN: Image file types and formats
Mozilla's reference on image formats supported in browsers — coverage, when to use each, and browser compatibility tables.
- MDN: Web video codec guide
Practical guide to web video codecs (H.264, H.265, VP9, AV1) and which container formats they fit into.
- MDN: WebVTT (Web Video Text Tracks)
Reference for the WebVTT subtitle format used by HTML5 video.
- Can I use
Browser compatibility tables for web platform features, including image, video, and font formats. The reference for “is AVIF supported in Safari yet?”-class questions.
Tools we use and recommend
Open-source tools and command-line utilities for power users who want to do conversions locally rather than via a web service.
- FFmpeg
The Swiss Army knife of audio and video conversion. Command-line, free, open-source, and used in essentially every production media pipeline.
- ImageMagick
Command-line image conversion and manipulation. Reads and writes over 200 image formats and supports complex batch processing.
- libvips
A faster, lower-memory alternative to ImageMagick for image processing. The engine behind sharp, the most popular Node.js image library.
- LibreOffice
Open-source office suite that doubles as a powerful command-line document converter. Can convert between essentially every office format.
- Pandoc
Universal document converter — Markdown to anything, anything to anything. The right tool for converting between text-based formats.
- Calibre
Free ebook management and conversion. The standard tool for converting between EPUB, MOBI, AZW3, and other ebook formats.
- Tesseract OCR
Open-source OCR engine. The right tool for adding searchable text to scanned PDFs.
Want a topic added?
This page is curated, not exhaustive — we list the resources we actually use and recommend, not every page on the internet. If you know an authoritative source we've missed, email christopherfloied@outlook.com with the subject line Resource suggestion and a link.