File format resources

A curated reference library of authoritative sources on file formats — official specs, codec documentation, and canonical tools. Use these when you need the source-of-truth detail rather than a summary.

Image format specifications

Official specs and reference implementations for the dominant image formats. These are the source-of-truth documents that conformant encoders and decoders implement.

Audio and video codec references

Authoritative documentation for the audio and video codecs used in modern media pipelines.

  • FFmpeg documentation

    The de facto reference for audio and video conversion. FFmpeg is the open-source toolchain used in nearly every production pipeline that touches media files.

  • ITU-T H.264 specification

    The official spec for H.264 / AVC, the video codec that underpins MP4 video on essentially every device.

  • AV1 specification (AOMedia)

    The Alliance for Open Media's spec for AV1, the modern royalty-free video codec used in WebM and AVIF.

  • FLAC format documentation (Xiph.org)

    The FLAC reference, defining the lossless audio compression format and its container structure.

  • ID3v2 specification

    The ID3 metadata specification used for MP3 and other audio formats. Worth knowing if you care about how artist, title, and album art are encoded.
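
As a small illustration of the kind of detail the ID3v2 specification pins down, here is a minimal sketch of reading the 10-byte tag header that precedes the frames holding artist, title, and album art. The function name `parse_id3v2_header` and the returned dictionary keys are illustrative assumptions, not part of any library. Frame parsing, extended headers, and footers are left out.

```python
from typing import Optional


def parse_id3v2_header(data: bytes) -> Optional[dict]:
    """Parse the fixed 10-byte ID3v2 header at the start of `data`.

    Returns None when the bytes do not look like an ID3v2 tag.
    Hypothetical helper for illustration only.
    """
    # Layout: "ID3" marker, major version, revision, flags, 4 size bytes.
    if len(data) < 10 or data[:3] != b"ID3":
        return None

    major, revision, flags = data[3], data[4], data[5]
    size_bytes = data[6:10]

    # The size is a "synchsafe" integer: only the low 7 bits of each
    # byte carry data, so the top bit of every byte must be zero.
    if any(b & 0x80 for b in size_bytes):
        return None

    size = 0
    for b in size_bytes:
        size = (size << 7) | b

    return {
        "version": (major, revision),  # e.g. (3, 0) for ID3v2.3.0
        "flags": flags,
        "tag_size": size,  # bytes following the 10-byte header
    }
```

A header of `b"ID3\x03\x00\x00\x00\x00\x02\x01"` describes an ID3v2.3.0 tag whose body is 257 bytes long, because the last two size bytes contribute `(2 << 7) | 1`.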

Document and office format specifications

Specifications for the office and document formats that dominate business workflows.

  • PDF specification (ISO 32000)

    The ISO standard for PDF. Adobe's freely available copy of the PDF 1.7 specification is also widely used as a reference for the format's structure, fonts, and rendering model.

  • Office Open XML (DOCX, XLSX, PPTX)

    The ECMA-376 standard (also published as ISO/IEC 29500) for Microsoft's Office Open XML formats. The spec is dense, but it is the source of truth for DOCX, XLSX, and PPTX internals.

  • OpenDocument Format (ODT, ODS, ODP)

    OASIS's OpenDocument standard, the format used by LibreOffice and Apache OpenOffice. The open alternative to Microsoft's OOXML.

  • EPUB 3 specification (W3C)

    The current EPUB specification covering reflowable and fixed-layout ebooks, accessibility, and packaging.

  • RFC 4180 (CSV)

    The closest thing CSV has to a formal spec. An informational RFC that documents the common comma-separated values format, including its quoting rules and CRLF line endings.
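
The escaping rules that RFC 4180 describes are easy to see in practice. The sketch below uses Python's standard `csv` module, whose default writer settings produce output in the RFC 4180 style: fields containing commas, double quotes, or line breaks are wrapped in double quotes, embedded double quotes are doubled, and records end with CRLF. The helper names `to_csv` and `from_csv` are illustrative assumptions.

```python
import csv
import io


def to_csv(rows):
    """Serialize rows of strings to CSV text in the RFC 4180 style."""
    buf = io.StringIO(newline="")  # keep the writer's line endings untouched
    writer = csv.writer(
        buf,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\r\n",      # CRLF record separator
    )
    writer.writerows(rows)
    return buf.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))


rows = [
    ["name", "note"],
    ["Ada, Countess", 'said "hi"'],
    ["Grace", "line one\nline two"],
]
text = to_csv(rows)
```

Here the second record is written as `"Ada, Countess","said ""hi"""`, and reading the text back with `from_csv` returns the original rows, including the field that spans two lines.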

Data and serialization formats

Specifications for the structured-data formats used in APIs, configuration, and data pipelines.

3D model formats

Specifications and documentation for the leading 3D model exchange formats.

Authoritative web references

MDN, web standards, and other long-trusted references on web-relevant file formats.

  • MDN: Image file types and formats

    Mozilla's reference on image formats supported in browsers — coverage, when to use each, and browser compatibility tables.

  • MDN: Web video codec guide

    Practical guide to web video codecs (H.264, H.265, VP9, AV1) and which container formats they fit into.

  • MDN: WebVTT (Web Video Text Tracks)

    Reference for the WebVTT subtitle format used by HTML5 video.

  • Can I use

    Browser compatibility tables for web platform features, including image, audio, and video formats. The reference for “is AVIF supported in Safari yet?”-class questions.

Tools we use and recommend

Open-source tools and command-line utilities for power users who want to do conversions locally rather than via a web service.

  • FFmpeg

    The Swiss Army knife of audio and video conversion. Command-line, free, open-source, and used in essentially every production media pipeline.

  • ImageMagick

    Command-line image conversion and manipulation. Reads and writes over 200 image formats and supports complex batch processing.

  • libvips

    A faster, lower-memory alternative to ImageMagick for image processing. The engine behind sharp, a widely used Node.js image library.

  • LibreOffice

    Open-source office suite that doubles as a powerful command-line document converter. Can convert between essentially every office format.

  • Pandoc

    Universal document converter — Markdown to anything, anything to anything. The right tool for converting between text-based formats.

  • Calibre

    Free ebook management and conversion. The standard tool for converting between EPUB, MOBI, AZW3, and other ebook formats.

  • Tesseract OCR

    Open-source OCR engine. The right tool for adding searchable text to scanned PDFs.
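
For readers who want to script local conversions with the tools above, the following is a minimal sketch that maps a tool name to its basic command line and runs it. It assumes the `ffmpeg`, `magick` (ImageMagick 7), `pandoc`, and `soffice` (LibreOffice) executables are installed and on `PATH`. The function names `build_command` and `convert` are hypothetical, and only the simplest invocation of each tool is shown.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_command(tool: str, src: str, dst: str) -> List[str]:
    """Return the argument list for a basic src -> dst conversion."""
    if tool == "ffmpeg":
        # Output format is inferred from the destination extension.
        return ["ffmpeg", "-i", src, dst]
    if tool == "imagemagick":
        return ["magick", src, dst]
    if tool == "pandoc":
        return ["pandoc", src, "-o", dst]
    if tool == "libreoffice":
        # soffice takes a target format and an output directory; the
        # output file name is derived from the source file name.
        out = Path(dst)
        return [
            "soffice", "--headless",
            "--convert-to", out.suffix.lstrip("."),
            "--outdir", str(out.parent),
            src,
        ]
    raise ValueError(f"unknown tool: {tool}")


def convert(tool: str, src: str, dst: str) -> None:
    """Run the conversion, failing early if the executable is missing."""
    cmd = build_command(tool, src, dst)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} is not installed or not on PATH")
    subprocess.run(cmd, check=True)
```

For example, `convert("pandoc", "notes.md", "notes.docx")` would run `pandoc notes.md -o notes.docx`. Keeping command construction separate from execution makes the mapping easy to inspect and test without the tools installed.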

Want a topic added?

This page is curated, not exhaustive — we list the resources we actually use and recommend, not every page on the internet. If you know an authoritative source we've missed, email christopherfloied@outlook.com with the subject line Resource suggestion and a link.
