So, I'm a big fan of metaformats with generalized tooling support. Think of e.g....

ProgramMax · 2025-06-25T22:48:35 1750891715

Strangely, I was familiar with AIFF and RIFF files but never made the connection that they're both IFF. I hadn't known about IFF before your post. Thank you :)

W3C requires that we do not break old, conformant specs. Meaning if the next PNG spec would invalidate prior specs, they won't approve it. By extension, an old, conformant program will not suddenly become non-conformant.

I could see a group of people formalizing IFFv2, and adapting PNG to it. But that would effectively be PNGIFF, not PNG. It would be a new spec. Because we cannot break the old one.

That might be fine. But it comes with a new set of problems, like adoption.

Soooo I like the idea but it would probably be a separate thing. FWIW, it would actually be nice to make a formal IFF spec. If there was no governing body that owns it, we can find an org and gather interest.

I doubt W3C would be the right org for it. ISO subgroup??

saintfire · 2025-06-25T23:14:19 1750893259

They pretty much say the same thing halfway through. Don't change PNG but adapt IFF to work with PNG's flavour of IFF.

ProgramMax · 2025-06-26T00:15:35 1750896935

Right. Sorry, that was supposed to be a "yes, and..." to provide some additional context.

account42 · 2025-06-26T08:47:20 1750927640

We really shouldn't be making new standards with big endian byte order.

It's also questionable how much you actually benefit from common container formats like this since you need to know the application specific format contained anyway in order to do anything useful with it. It also causes problems where "smart" programs treat files in ways that make no sense, e.g. by offering to extract a .docx file just because it looks like a .zip

derefr · 2025-06-26T19:36:02 1750966562

> you need to know the application specific format contained anyway in order to do anything useful with it

One neat thing about IFF is that all of its "container" chunk types (LIST, FORM, CAT) are part of the standard; the expectation is that domain-specific chunk types should [mostly] be leaf nodes. As such, IFF is at least "legible" in the same way that XML or JSON or Lisp is legible (and more than e.g. ELF is legible): you're meant to decompose an object graph into individual IFF chunks for each object in the graph. Which translates to IFF files being "browseable", rather than dead-ending in opaque tables that require some other standard to tell how how they're even row-delimited.

Another neat thing is that, like with namespaced XML element names, chunk names — at least the "public" ones — are meant to have globally-unique meanings, being registered in a global registry (https://wiki.amigaos.net/wiki/IFF_FORM_and_Chunk_Registry). This means that IFF tooling can "browse" an arbitrary unknown IFF document, find a chunk it does understand the meaning of, and usefully decode it (and maybe its descendants) for you.

Many more-complex IFF formats (e.g. the AV containers like RIFF) embed data of other media types as chunks of these registered types. Think "thumbnail in a video file" or "texture in a scene file." Your tooling doesn't need to know the semantics of the outer format, to be able to discover these registered inner chunks inside it, and browse/preview/extract them. (Or replace them one-for-one with another asset of the same type; or even, if they're inside a simple LIST chunk, add or remove instances of the asset from the list!)

Also, somewhat interestingly, given the way IFF is structured, there is no inherent difference between embedding a sub-resource "opaquely" vs embedding it "legibly" — i.e. if you embed a [headerless] IFF document as the value of a chunk in another IFF document, then that's exactly the same thing as nesting the root-level chunk(s) of that sub-document within the parent chunk. It's like how an SVG sub-document inside an XHTML document isn't a separate serialized blob that gets sucked out and parsed, but rather just additional tags in the XHTML document-string, around which a boundary of "this is a separate XML sub-document" gets drawn by some "DOM document builder" code downstream of the actual XML parser.

---

But besides the technical "it can be done" points, let me also speak more in terms of the motivation. Why would you want to?

Well, have you ever wanted to open up a complex file and pull its atomic-level assets out? Your first thought when hearing that was probably "that sounds like a nightmare" — and yes, today, it is.

But back in the 1980s, with the original growth of IFF-based formats, we temporarily lived in this wonderland where there were all these different browseable / explorable file formats, that could be cracked open with exactly the same tools.

Do you wonder how and why the game modding scene first came into existence? It was basically the result of games storing their asset packs in these simple-to-parse/generate file formats — where people could easily drop-in replace one of those assets with a new one with simple command-line tools, or even with a GUI, without worrying about matching asset sizes / binary offset patching / etc — let alone with any knowledge of how the container file format works.

Do you appreciate how macOS app bundles just have a browseable, hierarchical Resources directory inside them? Before app bundles, macOS applications held their resources in a "resource fork" — essentially a set of FourCC-tagged file extended-attributes (though actually, a single on-disk packfile that acted as a random-access key-value store of those xattrs). And both of these approaches (bundle Resources dirs, and resource forks) provided the same explorability / moddability as IFF files do. People would throw a macOS program into ResEdit and pull out its icons, its fonts, its strings, whatever — where those weren't program-domain-specific things, but rather were effectively items with standardized media types (their FourCC codes being effectively the predecessor of modern MIME types.)

For that matter, consider this quote from the IFF wiki page:

> There are standard chunks that could be present in any IFF file, such as AUTH (containing text with information about author of the file), ANNO (containing text with annotation, usually name of the program that created the file), NAME (containing text with name of the work in the file), VERS (containing file version), (c) (containing text with copyright information).

Now, remember that IFF decoders are almost always expected / coded to ignore chunks they don't understand. (Especially for IFF files encoded as a toplevel stream of heterogeneous chunk types.)

That means that not only can various format authors decide to use these standard chunks... but third-party editors can also just drop chunks like this into the things they edit! You know how Windows has that "name, author, version" etc info on the Properties sheet for some file types? That info could show up and be editable for any IFF-based file format — whether the particular format has an "allowance" for it or not.

(There's nothing special about IFF here, by the way. You could just as well drop "foreign-namespaced attributes" like this into an e.g. XML-based document format. The difference is a cultural one: the developers of XML-based document formats tend to have their XML decoders validate their documents for strict conformance to an XML schema; and XML schemas tend to be [but don't have to be!] designed as whitelists of the possible tags that can be used within any given parent nesting path. IFF, meanwhile, has never had anything like a schema-based document validation. Every document was best-effort parsed, like HTML4; and so every IFF-based format decoder is a best-effort decoder, like a web browser parsing HTML4. That very lack of schema-based validation, actually opens up a lot of use-cases for IFF.)

derefr · 2025-06-26T20:53:27 1750971207

(Separate reply for space)

> We really shouldn't be making new standards with big endian byte order.

IFF isn't a wire protocol standard for efficient zero-copy; and nor is it intended for file formats amenable to being streaming-parsed.

And that's okay! Not every format needs to be suited to efficient, scalable, concurrent, [other lovely words] message passing!

IFF has two major use-cases:

1. documents that are "loaded" in some program, where "loading" is expected to occur against a random-access block device; where each chunk will be visited in turn, with either its contents being parsed into an in-memory representation; its contents' slicing bounds being stored to later stream or random-access within (or the part of the file within those bounds being mmap(2)ed — same thing); or that chunk discarded, thus allowing the load operation to skip issuing any read ops for it or its descendants entirely.

This is the PNG use-case.

(Though, interestingly enough, since PNG has only one large chunk — the image data — PNG can be made into an "effectively-streamed format" simply by keeping that big chunk at the end of the IFF document. Presuming the stream length of the PNG file is known [as in a regular HTTP fetch], the "skeleton load" process for PNG can terminate after just having parsed its way through all the other tiny chunks — perhaps with a few minimal buffer waits to skip over unknown chunks — but with no need to buffer the entire image data chunk. [It adds the image-data-chunk length to the file pointer, realizes there's no more room for chunks in the stream, and so doesn't bother to buffer+seek past that final chunk.] The IFF parser then returns to the caller, passing it the slicing bounds of [among other things] the (still not-yet-fully-received) image-data chunk. And the caller can then turn around, and hand the same FILE pointer and those slicing bounds to its streaming renderer, letting it go to town consuming the stream as needed.)

IFF in its skeleton-loading model, would also be ideal for something like e.g. a font file (which has lots of little tables, which are either eagerly parsed, or ignored, by any given renderer.)

2. simple "read-rarely" packfile documents, that act sort of like little databases, but without any sort of TOC header part; where, when you want to grab something from the packfile, you re-navigate down through it from the root, taking the IOPS hit from all the seeks to each nesting-parent chunk's preceeding sibling chunks before hitting the descendant you want to navigate into.

This is the use-case of most IFFv1 file formats — most of them were made for use by programs that would grab this or that for the program's use either once at startup, or when the thing became relevant. (Think of the types of things a Windows executable embeds as "resources" — icons, translated strings, XAML declarative-MVC-view documents, etc.)

For a parallel, IFF here is to "using an entire archive-format library like tar or zip to store these assets for random access", as "spitting CSV/XML out using template strings" is to using a library to encode a table to a Parquet/ORC/etc. table.

The parallel is that in both cases, you're trading some performance and robustness, for massively reduced complexity and ease of implementation. Like with emitting CSV, you can slop together an IFF encoder right there inside your data-emitting logic — in any language that can write out binary files, and without even having access to the Internet, let alone adding a dependency on an encoder package in some package ecosystem. You can do it in C; you can do it in assembly; you can do it in a bash script; you can do it in BASIC; you can do it in a Windows batch file; you can do it in your single-file Python or Ruby or Perl script that lives in your repo. You can probably do it in a Makefile!

(Also, given how IFF parsing works [i.e. given that any given chunk's contents is in superposition of being either an opaque binary slice or a potential stream of child chunks, with a streaming event-based parser able to decide at each juncture whether to take that step of decoding the child chunks or to leave them as an undecoded binary for now], if you start to care about performance, you can just stick some memoization in front of your "fetch a key-path-lens KP from document D" function, and now you're building a just-in-time TOC. And obviously you can put TOC chunks in your IFF-based file formats if you want — though IMHO doing so kind of goes against the spirit of IFF.)

---

In neither of those use-cases does it really matter that lengths require reading four bytes one-at-a-time with left-shifts, rather than being able to just plop the four bytes into a register. These aren't cases where the parse overhead of the the structural glue between the data, will ever be non-trivial relative to the time it takes to consume the data itself.

And even if you did want to use IFF for something crazy, like as a substitute for Protobuf: did you know that most modern CPU ISAs have a byte-shuffle instruction that can transform big-endian into little-endian [among an unbounded number of other potential transformations] in a single cycle? Endian-ness did matter in protocol design for a while... but these days, unless you're e.g. a Google engineer designing a new SAN protocol, and optimizing it for message-handling overhead on your custom SDN L7 network-switch silicon that doesn't have a shuffle op... endian-ness is mostly irrelevant again!