File Format Specification

Version: 1.2.0 · Extension: .zt

Motivation

Most tensor formats equate "tensor" with "flat array of one dtype." This works for dense weights but breaks down for sparse matrices, quantized weight groups, or newer numeric types. These formats either can't express them or force you to flatten everything into separate arrays held together by naming conventions. See the introduction for a feature comparison across formats.

Design Goals

zTensor starts from a different premise: a tensor is a composite object, not a flat buffer. The format is built around three principles:

Simple. The file is a flat sequence of data blobs followed by a single metadata index. No nested containers, no back-patching, no code execution. A minimal reader is a few dozen lines.
Performant. Data blobs are 64-byte aligned for direct memory-mapping and SIMD access. Metadata is separated from bulk data, so a reader can enumerate every object's name, shape, and type without touching any data bytes.
Extensible. New object layouts, data types, and metadata fields can be added without breaking existing readers. The format uses optional fields and an open type system, not a fixed enum, so it evolves without major version bumps (see Appendix: Version Policy).

Key Concepts

A .zt file contains named objects. An object is the format's abstraction for a tensor: instead of treating a tensor as a flat array of one dtype, an object is a composite with a shape, a format that describes its layout, and one or more components that hold the actual bytes.

A component is a single contiguous blob on disk. It has a storage type (dtype) that determines byte width, and an optional logical type (type) that says what those bytes mean. For a standard float32 object, dtype is "f32" and type is absent, so no interpretation is needed. For an FP8 object, dtype is "u8" (because FP8 is stored as raw bytes) while type is "f8_e4m3fn" (telling the reader how to decode them). This separation keeps the storage layer stable (readers always know how many bytes to read) while the logical layer can grow to cover new numeric formats without any changes to the container.

This design makes a dense object simple (one component named "data"), but also supports sparse matrices (separate values, indices, indptr components) and quantized weights (separate packed_weight, scales, zeros components) without special-casing in the container format. The object knows its own structure.

The rest of this document is organized from abstract to concrete: the manifest schema (what you write), the type system (how bytes are interpreted), the object formats (what layouts exist), and finally the binary layout (how it's arranged on disk).

1. Manifest Schema

The manifest is a CBOR-encoded map (RFC 7049) stored at the end of the file, immediately before the 16-byte footer (manifest size plus magic). Its location and how to find it are described in Binary Layout.

Root

{
  "version": "1.2.0",
  "attributes": {
    "framework": "PyTorch",
    "license": "Apache-2.0"
  },
  "objects": {
    "layer1.weight": { "..." : "..." },
    "layer1.bias": { "..." : "..." }
  }
}

Field	Type	Required	Description
`version`	string	Yes	Spec version (e.g., `"1.2.0"`).
`attributes`	map	No	Arbitrary key-value metadata for the whole file.
`objects`	map	Yes	Named object definitions. Keys are object names (e.g., `"layer1.weight"`).

Object

Each entry in objects describes one logical object.

Field	Type	Description
`shape`	`[uint64]`	Dimensions (e.g., `[1024, 768]`).
`format`	string	Layout: `dense`, `sparse_csr`, `sparse_coo`, `quantized_group`, etc. See Object Formats.
`attributes`	map	Optional per-object metadata.
`components`	map	One entry per data blob. Keys are role names (e.g., `"data"`, `"scales"`).

Component

A component points to a single contiguous blob on disk and describes how to interpret its bytes.

{
  "dtype": "u8",
  "type": "f8_e4m3fn",
  "offset": 1024,
  "length": 4096,
  "encoding": "raw",
  "digest": "sha256:8f4a..."
}

Field	Type	Default	Description
`dtype`	string	required	Storage type, one of the 13 fixed primitives. Determines byte width.
`type`	string	`null`	Logical type. When absent, the logical type equals `dtype`.
`offset`	uint64	required	Absolute byte offset in the file. Must be a multiple of 64.
`length`	uint64	required	Bytes on disk. For `"zstd"` encoding, this is the compressed size.
`uncompressed_length`	uint64	`null`	Original size before compression. Required when `encoding` is `"zstd"`.
`encoding`	string	`"raw"`	`"raw"` or `"zstd"`.
`digest`	string	`null`	Checksum of the stored (possibly compressed) bytes. Format: `"algorithm:hex"` (e.g., `"sha256:8f4a..."`).
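
As an illustration of the digest field, the sketch below checks a blob against an "algorithm:hex" string. The function name verify_digest is hypothetical, and the sketch assumes that only sha256 needs to be supported; the specification does not list the permitted algorithms.

```python
import hashlib

def verify_digest(stored_bytes: bytes, digest: str) -> bool:
    """Compare the stored (possibly compressed) bytes with an "algorithm:hex" digest."""
    algorithm, _, expected_hex = digest.partition(":")
    if algorithm != "sha256":
        # Assumption: this sketch only knows sha256 and rejects everything else.
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    actual_hex = hashlib.sha256(stored_bytes).hexdigest()
    return actual_hex == expected_hex.lower()
```

Because the digest covers the bytes as stored, a reader would run this check before decompressing a "zstd" component, not after.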

2. Type System

Storage types (`dtype`)

A closed set of 13 hardware-native primitives. Every component must use one of these as its dtype. The dtype alone determines the byte width of each element.

Category	Types	Encoding
Float	`f64`, `f32`, `f16`, `bf16`	IEEE 754 / BFloat16
Signed integer	`i64`, `i32`, `i16`, `i8`	Two's complement
Unsigned integer	`u64`, `u32`, `u16`, `u8`	Unsigned
Boolean	`bool`	1 byte. `0x00` = false, `0x01` = true.

Logical types (`type`)

An open, extensible set that gives meaning to raw dtype bytes. When type is absent, the logical type is the same as dtype, and no extra interpretation is needed.

Simple types: one storage element per logical element (1:1):

`type`	`dtype`	Notes
`f8_e4m3fn`	`u8`	NVIDIA / OCP FP8
`f8_e5m2`	`u8`	OCP FP8
`f8_e4m3fnuz`	`u8`	AMD FP8
`f8_e5m2fnuz`	`u8`	AMD FP8
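
To make the split between storage and logical type concrete, here is a minimal sketch of decoding u8 storage elements as f8_e4m3fn. It assumes the usual E4M3FN bit layout (1 sign bit, 4 exponent bits, 3 mantissa bits, exponent bias 7, no infinities, a single NaN pattern per sign); the function names are illustrative and not part of the format.

```python
import math

def decode_f8_e4m3fn(byte: int) -> float:
    """Decode one f8_e4m3fn value from its raw u8 storage element."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F and mantissa == 0x07:
        return math.nan  # the only NaN encoding; this type has no infinities
    if exponent == 0:
        # Subnormal: no implicit leading one, fixed exponent of 1 - bias.
        return sign * (mantissa / 8.0) * 2.0 ** -6
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)

def decode_f8_blob(raw: bytes) -> list:
    """Decode a whole component: one logical element per stored byte (1:1)."""
    return [decode_f8_e4m3fn(b) for b in raw]
```

A reader that does not recognize the type can still load the same blob as plain u8 values, which is the fallback the format allows.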

Compound types: multiple storage elements per logical element:

`type`	`dtype`	Ratio	Notes
`complex64`	`f32`	2:1	Interleaved `[real, imag]` pairs
`complex128`	`f64`	2:1	Interleaved `[real, imag]` pairs
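
The 2:1 ratio can be illustrated with a short sketch that pairs up little-endian f32 storage elements into complex values. The helper name decode_complex64 is an assumption made for this example.

```python
import struct

def decode_complex64(raw: bytes) -> list:
    """Turn interleaved [real, imag] f32 pairs into Python complex numbers."""
    if len(raw) % 8 != 0:
        raise ValueError("complex64 data must be a multiple of 8 bytes")
    # "<" selects little-endian, matching the byte order used for component data.
    floats = struct.unpack(f"<{len(raw) // 4}f", raw)
    return [complex(floats[i], floats[i + 1]) for i in range(0, len(floats), 2)]
```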

Computing data size:

Simple: product(shape) * byte_size(dtype)
Compound: product(shape) * ratio * byte_size(dtype)
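
The two formulas can be folded into one helper, sketched below. The table and function names are hypothetical, and the sketch assumes the formulas describe the uncompressed size of a dense object's data component, where the component covers the full object shape.

```python
from math import prod

# Byte width of each of the 13 storage types.
DTYPE_SIZES = {
    "f64": 8, "f32": 4, "f16": 2, "bf16": 2,
    "i64": 8, "i32": 4, "i16": 2, "i8": 1,
    "u64": 8, "u32": 4, "u16": 2, "u8": 1,
    "bool": 1,
}

# Storage elements per logical element for compound logical types.
COMPOUND_RATIOS = {"complex64": 2, "complex128": 2}

def expected_nbytes(shape, dtype, logical_type=None):
    """Uncompressed size in bytes; simple and absent types use a ratio of 1."""
    ratio = COMPOUND_RATIOS.get(logical_type, 1)
    return prod(shape) * ratio * DTYPE_SIZES[dtype]
```

For instance, expected_nbytes([1024, 768], "f32") gives 3145728, and a complex64 vector of 10 elements stored as f32 gives 80.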

Readers that encounter an unrecognized type MAY fall back to loading raw dtype elements.

3. Object Formats

Each object's format field selects how its components are interpreted. All index components (indices, indptr, coords) across sparse formats MUST use dtype: "u64".

`dense`

Standard contiguous array in row-major (C-contiguous) order.

Component	Description
`data`	The data elements.

Readers SHOULD memory-map data when encoding is "raw".

`sparse_csr`

Compressed Sparse Row.

Component	Description
`values`	Non-zero elements.
`indices`	Column index for each value.
`indptr`	Row pointers (length = rows + 1).
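
As a sketch of how the three components fit together, the function below expands already-decoded CSR components into a dense list of rows. It assumes a 2-D shape and plain Python lists as inputs; the name csr_to_dense is illustrative.

```python
def csr_to_dense(shape, values, indices, indptr):
    """Rebuild a dense row-major matrix from decoded CSR components."""
    rows, cols = shape
    if len(indptr) != rows + 1:
        raise ValueError("indptr must have rows + 1 entries")
    dense = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        # indptr[r]..indptr[r + 1] delimits the entries that belong to row r.
        for k in range(indptr[r], indptr[r + 1]):
            dense[r][indices[k]] = values[k]
    return dense
```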

`sparse_coo`

Coordinate list.

Component	Description
`values`	Non-zero elements.
`coords`	Coordinate indices, stored as a flat array of length `ndim * nnz` in Structure-of-Arrays order: all row indices first, then all column indices, etc.
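
The Structure-of-Arrays ordering of coords can be sketched as follows: the flat array is cut into ndim slices of nnz entries each, and entry k of every slice forms the coordinate of value k. The helper name coo_entries is an assumption for this example.

```python
def coo_entries(ndim, values, coords):
    """Pair each value with its coordinate tuple from a flat SoA coords array."""
    nnz = len(values)
    if len(coords) != ndim * nnz:
        raise ValueError("coords must contain ndim * nnz entries")
    # Slice d holds the index along dimension d for every non-zero element.
    axes = [coords[d * nnz:(d + 1) * nnz] for d in range(ndim)]
    return [(tuple(axis[k] for axis in axes), values[k]) for k in range(nnz)]
```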

`quantized_group`

Block-wise quantization (e.g., GPTQ). Packed weights with separate scale and zero-point arrays.

Component	Description
`packed_weight`	Quantized data (e.g., `i32` packing 8 x 4-bit values).
`scales`	Per-group scaling factors.
`zeros`	Per-group zero-points.

Quantization parameters (bits, group_size, packing) are stored in the object's attributes.

Example: 4-bit GPTQ, shape [4096, 4096]:

{
  "shape": [4096, 4096],
  "format": "quantized_group",
  "attributes": {
    "bits": 4,
    "group_size": 128,
    "packing": "8_per_i32"
  },
  "components": {
    "packed_weight": { "dtype": "i32", "offset": 1024,    "length": 8388608 },
    "scales":        { "dtype": "f16", "offset": 8389632, "length": 262144 },
    "zeros":         { "dtype": "f16", "offset": 8651776, "length": 262144 }
  }
}
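
The lengths in this example follow from the quantization parameters, as the sketch below recomputes. It assumes the element count divides evenly into packing words and groups, and that there is one f16 scale and one f16 zero-point per group, as in the example; the function name is hypothetical.

```python
from math import prod

def quantized_group_lengths(shape, bits, group_size,
                            word_bytes=4, scale_bytes=2, zero_bytes=2):
    """Byte lengths of packed_weight, scales and zeros for one object."""
    elements = prod(shape)
    values_per_word = (word_bytes * 8) // bits  # e.g. 8 x 4-bit values per i32
    if elements % values_per_word or elements % group_size:
        raise ValueError("shape does not divide evenly into words or groups")
    groups = elements // group_size
    return {
        "packed_weight": elements // values_per_word * word_bytes,
        "scales": groups * scale_bytes,
        "zeros": groups * zero_bytes,
    }

lengths = quantized_group_lengths([4096, 4096], bits=4, group_size=128)
```

With the example's parameters this yields 8388608, 262144 and 262144 bytes. Each of those lengths is a multiple of 64, so the three blobs sit back to back with no padding, which is why each offset equals the previous offset plus the previous length.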

4. Binary Layout

The on-disk format is append-only: data blobs are written sequentially, then the manifest is appended at the end. This makes writing simple (no seeking back to patch headers) and keeps all metadata in one place.

File structure

+---------------------------------------+ <-- Offset 0
| Magic Header (8 bytes)                |     "ZTEN1000"
+---------------------------------------+
|                                       |
| Component Blob A                      | <-- offset % 64 == 0
|                                       |
+---------------------------------------+
| Zero Padding (0-63 bytes)             |
+---------------------------------------+
| Component Blob B                      | <-- offset % 64 == 0
+---------------------------------------+
| ...                                   |
+---------------------------------------+
| CBOR Manifest (variable length)       |
+---------------------------------------+
| Manifest Size (8 bytes, uint64 LE)    |
+---------------------------------------+
| Magic Footer (8 bytes)                |     "ZTEN1000"
+---------------------------------------+ <-- EOF

Byte order

All multi-byte values in a .zt file are Little-Endian: structural integers (manifest size), component data, and all dtype elements. Writers on big-endian hosts MUST byte-swap before writing.

The only exception is CBOR's own internal length prefixes, which are Big-Endian per RFC 7049. This is handled transparently by any CBOR library.
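
In Python, the little-endian rule maps onto the "<" prefix of the struct module, which gives the same result on any host. The helper names below are illustrative only.

```python
import struct

def read_u64_le(buf: bytes, pos: int = 0) -> int:
    """Read a structural integer such as the manifest size."""
    return struct.unpack_from("<Q", buf, pos)[0]

def decode_f32(raw: bytes) -> list:
    """Decode a raw component whose dtype is f32."""
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))
```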

Alignment and padding

Every component blob starts at an offset divisible by 64. This enables direct memory-mapping and SIMD access without copying.
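Gaps between blobs are filled with 0x00 bytes. Because the magic header is only 8 bytes long, the first blob cannot start before offset 64, and the bytes in between are padding as well.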
The magic footer repeats the header (ZTEN1000) so readers can detect truncated files.
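
The alignment rules can be sketched from the writer's side. The code below writes the magic header and a set of blobs with zero padding and records where each blob landed. It stops before the manifest and footer, since encoding the manifest needs a CBOR library; the names padding_for and write_blobs are assumptions for this example.

```python
import io

MAGIC = b"ZTEN1000"
ALIGNMENT = 64

def padding_for(position: int, alignment: int = ALIGNMENT) -> int:
    """Number of zero bytes needed to reach the next aligned offset."""
    return (-position) % alignment

def write_blobs(out, blobs: dict) -> dict:
    """Write header and aligned blobs; return {name: (offset, length)}."""
    out.write(MAGIC)
    position = len(MAGIC)
    placed = {}
    for name, blob in blobs.items():
        pad = padding_for(position)
        out.write(b"\x00" * pad)  # padding is always 0x00
        position += pad
        out.write(blob)
        placed[name] = (position, len(blob))
        position += len(blob)
    return placed

buf = io.BytesIO()
placed = write_blobs(buf, {"a": b"\x01" * 100, "b": b"\x02" * 10})
```

Here blob "a" lands at offset 64 and blob "b" at offset 192. The recorded offsets and lengths are what a writer would then put into each component entry of the manifest.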

5. Reading a File

Algorithm

1. Seek to EOF - 16. Read 16 bytes.
2. Verify that the last 8 of those bytes are ZTEN1000. If not, the file is corrupt or not a .zt file.
3. Decode the first 8 of those bytes as uint64 LE to get manifest_size.
4. If manifest_size > 1 GB, abort (prevents denial-of-service via oversized manifests).
5. Seek to EOF - 16 - manifest_size. Read manifest_size bytes.
6. Decode the buffer as CBOR to get the manifest.
7. To load a component:
   - Seek to component.offset.
   - Read component.length bytes.
   - If encoding is "zstd", decompress (using uncompressed_length to pre-allocate).
   - Interpret the resulting bytes as component.dtype elements.
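
The footer-handling part of the algorithm can be sketched with the standard library alone. The function below returns the raw manifest bytes and leaves CBOR decoding to whichever library the reader uses. Interpreting "1 GB" as 2**30 bytes and rejecting a manifest that would overlap the header are assumptions of this sketch, as is the function name.

```python
import io
import struct

MAGIC = b"ZTEN1000"
MAX_MANIFEST_SIZE = 1 << 30  # assumption: "1 GB" taken as 2**30 bytes

def read_manifest_bytes(f) -> bytes:
    """Locate and return the still-encoded CBOR manifest of a .zt file."""
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    if file_size < len(MAGIC) + 16:
        raise ValueError("file too small to be a .zt file")
    f.seek(file_size - 16)
    tail = f.read(16)
    if tail[8:] != MAGIC:
        raise ValueError("bad magic footer: corrupt or not a .zt file")
    (manifest_size,) = struct.unpack("<Q", tail[:8])
    if manifest_size > MAX_MANIFEST_SIZE:
        raise ValueError("manifest too large")
    start = file_size - 16 - manifest_size
    if start < len(MAGIC):
        # Extra sanity check, not listed in the algorithm itself.
        raise ValueError("manifest size exceeds file size")
    f.seek(start)
    return f.read(manifest_size)
```

A caller would pass an open binary file, for example `with open(path, "rb") as f: raw = read_manifest_bytes(f)`, and hand the result to a CBOR decoder.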

Security rules

No code execution. Parsers MUST NOT evaluate or execute any data. No pickle, no eval.
Bounds checking. offset + length MUST NOT exceed the file size.
Decompression limits. When uncompressed_length is present, reject values exceeding a reasonable maximum before decompressing.
Padding bytes. Writers MUST set all padding to 0x00. Readers MAY ignore padding content.
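
A reader might apply the bounds and decompression checks to each component entry before touching any data, roughly as sketched below. The limit MAX_UNCOMPRESSED and the function name check_component are hypothetical; the specification leaves the actual maximum to the implementation.

```python
MAX_UNCOMPRESSED = 1 << 34  # hypothetical limit; pick one that suits the application

def check_component(component: dict, file_size: int,
                    max_uncompressed: int = MAX_UNCOMPRESSED) -> None:
    """Raise ValueError if a component entry violates the reading rules."""
    offset, length = component["offset"], component["length"]
    if offset % 64 != 0:
        raise ValueError("offset is not a multiple of 64")
    if offset + length > file_size:
        raise ValueError("component extends past end of file")
    encoding = component.get("encoding", "raw")
    if encoding == "zstd":
        uncompressed = component.get("uncompressed_length")
        if uncompressed is None:
            raise ValueError("zstd component lacks uncompressed_length")
        if uncompressed > max_uncompressed:
            raise ValueError("uncompressed_length exceeds limit")
    elif encoding != "raw":
        raise ValueError(f"unknown encoding: {encoding!r}")
```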

Appendix: Version Policy

Minor version increments (e.g., 1.1 to 1.2) only add optional fields or new logical types. Readers MUST ignore unknown fields. Major version increments (e.g., 1.x to 2.x) may change the container structure or manifest schema.
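
One way a reader could apply this policy is sketched below: accept any file whose major version matches, and drop root fields it does not know. Both function names are hypothetical, and treating every 1.x file as readable by a 1.x reader is an inference from the policy above.

```python
KNOWN_ROOT_FIELDS = {"version", "attributes", "objects"}

def can_read(file_version: str, reader_major: int = 1) -> bool:
    """A reader handles any minor or patch version within its major version."""
    major = int(file_version.split(".")[0])
    return major == reader_major

def known_fields(manifest: dict) -> dict:
    """Ignore root fields that were added by later minor versions."""
    return {k: v for k, v in manifest.items() if k in KNOWN_ROOT_FIELDS}
```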
