File Format Specification
Version: 1.2.0 · Extension: .zt
Motivation
Most tensor formats equate "tensor" with "flat array of one dtype." This works for dense weights but breaks down for sparse matrices, quantized weight groups, or newer numeric types. These formats either can't express them or force you to flatten everything into separate arrays held together by naming conventions. See the introduction for a feature comparison across formats.
Design Goals
zTensor starts from a different premise: a tensor is a composite object, not a flat buffer. The format is built around three principles:
- Simple. The file is a flat sequence of data blobs followed by a single metadata index. No nested containers, no back-patching, no code execution. A minimal reader is a few dozen lines.
- Performant. Data blobs are 64-byte aligned for direct memory-mapping and SIMD access. Metadata is separated from bulk data, so a reader can enumerate every object's name, shape, and type without touching any data bytes.
- Extensible. New object layouts, data types, and metadata fields can be added without breaking existing readers. The format uses optional fields and an open type system, not a fixed enum, so it evolves without version bumps.
Key Concepts
A .zt file contains named objects. An object is the format's abstraction for a tensor: instead of treating a tensor as a flat array of one dtype, an object is a composite with a shape, a format that describes its layout, and one or more components that hold the actual bytes.
A component is a single contiguous blob on disk. It has a storage type (dtype) that determines byte width, and an optional logical type (type) that says what those bytes mean. For a standard float32 object, dtype is "f32" and type is absent, so no interpretation is needed. For an FP8 object, dtype is "u8" (because FP8 is stored as raw bytes) while type is "f8_e4m3fn" (telling the reader how to decode them). This separation keeps the storage layer stable (readers always know how many bytes to read) while the logical layer can grow to cover new numeric formats without any changes to the container.
This design makes a dense object simple (one component named "data"), but also supports sparse matrices (separate values, indices, indptr components) and quantized weights (separate packed_weight, scales, zeros components) without special-casing in the container format. The object knows its own structure.
The rest of this document is organized from abstract to concrete: the manifest schema (what you write), the type system (how bytes are interpreted), the object formats (what layouts exist), and finally the binary layout (how it's arranged on disk).
1. Manifest Schema
The manifest is a CBOR-encoded map (RFC 7049) stored near the end of the file. Its location and how to find it are described in Binary Layout.
Root
{
"version": "1.2.0",
"attributes": {
"framework": "PyTorch",
"license": "Apache-2.0"
},
"objects": {
"layer1.weight": { "..." : "..." },
"layer1.bias": { "..." : "..." }
}
}
| Field | Type | Required | Description |
|---|---|---|---|
version | string | Yes | Spec version (e.g., "1.2.0"). |
attributes | map | No | Arbitrary key-value metadata for the whole file. |
objects | map | Yes | Named object definitions. Keys are object names (e.g., "layer1.weight"). |
Object
Each entry in objects describes one logical object.
| Field | Type | Description |
|---|---|---|
shape | [uint64] | Dimensions (e.g., [1024, 768]). |
format | string | Layout: dense, sparse_csr, sparse_coo, quantized_group, etc. See Object Formats. |
attributes | map | Optional per-object metadata. |
components | map | One entry per data blob. Keys are role names (e.g., "data", "scales"). |
Component
A component points to a single contiguous blob on disk and describes how to interpret its bytes.
{
"dtype": "u8",
"type": "f8_e4m3fn",
"offset": 1024,
"length": 4096,
"encoding": "raw",
"digest": "sha256:8f4a..."
}
| Field | Type | Default | Description |
|---|---|---|---|
dtype | string | required | Storage type, one of the 13 fixed primitives. Determines byte width. |
type | string | null | Logical type. When absent, the logical type equals dtype. |
offset | uint64 | required | Absolute byte offset in the file. Must be a multiple of 64. |
length | uint64 | required | Bytes on disk. For "zstd" encoding, this is the compressed size. |
uncompressed_length | uint64 | null | Original size before compression. Required when encoding is "zstd". |
encoding | string | "raw" | "raw" or "zstd". |
digest | string | null | Checksum of the stored (possibly compressed) bytes. Format: "algorithm:hex" (e.g., "sha256:8f4a..."). |
2. Type System
Storage types (dtype)
A closed set of 13 hardware-native primitives. Every component must use one of these as its dtype. The dtype alone determines the byte width of each element.
| Category | Types | Encoding |
|---|---|---|
| Float | f64, f32, f16, bf16 | IEEE 754 / BFloat16 |
| Signed integer | i64, i32, i16, i8 | Two's complement |
| Unsigned integer | u64, u32, u16, u8 | Unsigned |
| Boolean | bool | 1 byte. 0x00 = false, 0x01 = true. |
Logical types (type)
An open, extensible set that gives meaning to raw dtype bytes. When type is absent, the logical type is the same as dtype, and no extra interpretation is needed.
Simple types: one storage element per logical element (1:1):
type | dtype | Notes |
|---|---|---|
f8_e4m3fn | u8 | NVIDIA / OCP FP8 |
f8_e5m2 | u8 | OCP FP8 |
f8_e4m3fnuz | u8 | AMD FP8 |
f8_e5m2fnuz | u8 | AMD FP8 |
Compound types: multiple storage elements per logical element:
type | dtype | Ratio | Notes |
|---|---|---|---|
complex64 | f32 | 2:1 | Interleaved [real, imag] pairs |
complex128 | f64 | 2:1 | Interleaved [real, imag] pairs |
Computing data size:
- Simple:
product(shape) * byte_size(dtype) - Compound:
product(shape) * ratio * byte_size(dtype)
Readers that encounter an unrecognized type MAY fall back to loading raw dtype elements.
3. Object Formats
Each object's format field selects how its components are interpreted. All index components (indices, indptr, coords) across sparse formats MUST use dtype: "u64".
dense
Standard contiguous array in row-major (C-contiguous) order.
| Component | Description |
|---|---|
data | The data elements. |
Readers SHOULD memory-map data when encoding is "raw".
sparse_csr
Compressed Sparse Row.
| Component | Description |
|---|---|
values | Non-zero elements. |
indices | Column index for each value. |
indptr | Row pointers (length = rows + 1). |
sparse_coo
Coordinate list.
| Component | Description |
|---|---|
values | Non-zero elements. |
coords | Coordinate indices, stored as a flat array of length ndim * nnz in Structure-of-Arrays order: all row indices first, then all column indices, etc. |
quantized_group
Block-wise quantization (e.g., GPTQ). Packed weights with separate scale and zero-point arrays.
| Component | Description |
|---|---|
packed_weight | Quantized data (e.g., i32 packing 8 x 4-bit values). |
scales | Per-group scaling factors. |
zeros | Per-group zero-points. |
Quantization parameters (bits, group_size, packing) are stored in the object's attributes.
Example: 4-bit GPTQ, shape [4096, 4096]:
{
"shape": [4096, 4096],
"format": "quantized_group",
"attributes": {
"bits": 4,
"group_size": 128,
"packing": "8_per_i32"
},
"components": {
"packed_weight": { "dtype": "i32", "offset": 1024, "length": 8388608 },
"scales": { "dtype": "f16", "offset": 8389632, "length": 262144 },
"zeros": { "dtype": "f16", "offset": 8651776, "length": 262144 }
}
}
4. Binary Layout
The on-disk format is append-only: data blobs are written sequentially, then the manifest is appended at the end. This makes writing simple (no seeking back to patch headers) and keeps all metadata in one place.
File structure
+---------------------------------------+ <-- Offset 0
| Magic Header (8 bytes) | "ZTEN1000"
+---------------------------------------+
| |
| Component Blob A | <-- offset % 64 == 0
| |
+---------------------------------------+
| Zero Padding (0-63 bytes) |
+---------------------------------------+
| Component Blob B | <-- offset % 64 == 0
+---------------------------------------+
| ... |
+---------------------------------------+
| CBOR Manifest (variable length) |
+---------------------------------------+
| Manifest Size (8 bytes, uint64 LE) |
+---------------------------------------+
| Magic Footer (8 bytes) | "ZTEN1000"
+---------------------------------------+ <-- EOF
Byte order
All multi-byte values in a .zt file are Little-Endian: structural integers (manifest size), component data, and all dtype elements. Writers on big-endian hosts MUST byte-swap before writing.
The only exception is CBOR's own internal length prefixes, which are Big-Endian per RFC 7049. This is handled transparently by any CBOR library.
Alignment and padding
- Every component blob starts at an offset divisible by 64. This enables direct memory-mapping and SIMD access without copying.
- Gaps between blobs are filled with
0x00bytes. - The magic footer repeats the header (
ZTEN1000) so readers can detect truncated files.
5. Reading a File
Algorithm
- Seek to
EOF - 16. Read 16 bytes. - Verify the last 8 bytes are
ZTEN1000. If not, the file is corrupt or not a.ztfile. - Decode the first 8 bytes as
uint64 LEto getmanifest_size. - If
manifest_size > 1 GB, abort (prevents denial-of-service via oversized manifests). - Seek to
EOF - 16 - manifest_size. Readmanifest_sizebytes. - Decode the buffer as CBOR to get the manifest.
- To load a component:
- Seek to
component.offset. - Read
component.lengthbytes. - If
encodingis"zstd", decompress (usinguncompressed_lengthto pre-allocate). - Interpret the resulting bytes as
component.dtypeelements.
- Seek to
Security rules
- No code execution. Parsers MUST NOT evaluate or execute any data. No pickle, no eval.
- Bounds checking.
offset + lengthMUST NOT exceed the file size. - Decompression limits. When
uncompressed_lengthis present, reject values exceeding a reasonable maximum before decompressing. - Padding bytes. Writers MUST set all padding to
0x00. Readers MAY ignore padding content.
Appendix: Version Policy
Minor version increments (e.g., 1.1 to 1.2) only add optional fields or new logical types. Readers MUST ignore unknown fields. Major version increments (e.g., 1.x to 2.x) may change the container structure or manifest schema.