ArrowCodec

Defined in: packages/codecs/src/arrow-codec.ts:78

ArrowCodec - High-performance columnar data serialization.

USE CASES:

  • Embeddings / vectors: Float32Array[]
  • Query results: { ids: number[], scores: number[] }
  • Batch data: { names: string[], values: number[] }
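The shapes listed above are all column-oriented: one array per field rather than one object per record. As a rough illustration, row objects could be pivoted into that layout before being passed to serialize. The `ScoredRow` type and `toColumns` helper below are hypothetical names used only for this sketch; they are not part of ArrowCodec.

```typescript
// Hypothetical row shape, used only for illustration.
interface ScoredRow {
  id: number;
  score: number;
}

// Pivot an array of row objects into one array per field (columnar layout).
function toColumns(rows: ScoredRow[]): { ids: number[]; scores: Float32Array } {
  const ids = rows.map((row) => row.id);
  const scores = new Float32Array(rows.length);
  rows.forEach((row, i) => {
    scores[i] = row.score; // stored as 32-bit floats, so values are rounded
  });
  return { ids, scores };
}

const columns = toColumns([
  { id: 1, score: 0.9 },
  { id: 2, score: 0.8 },
]);
// columns now has the same shape as the "Query results" use case.
```

The resulting object matches the `{ ids, scores }` shape, so under that assumption it could be handed to `codec.serialize(columns)`.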

FEATURES:

  • Zero-copy reads (deserialization without copying column data)
  • Columnar format (great for analytics)
  • Cross-language (Python pandas, Rust, etc.)
  • Streaming IPC format
  • ⚡️ Zero-copy view in serialize

NOT FOR:

  • Simple objects with few fields (use MsgPackCodec)
  • Raw binary data (use RawChunksCodec)

NOTE: We don’t implement deserializeChunks for Arrow. Arrow’s streaming format requires a proper RecordBatchReader, which is complex to implement. For now, we rely on FrameBuffer merging for Arrow data.
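The merge-then-deserialize approach described in the note can be sketched as follows. FrameBuffer's actual implementation is not shown in this page, so `mergeChunks` is a hypothetical stand-in that simply concatenates chunks with Node's `Buffer.concat`; the bytes used here are placeholders, not real Arrow IPC data.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: joins chunks that arrived separately into one
// contiguous Buffer. Buffer.concat copies each chunk into the result.
function mergeChunks(chunks: Uint8Array[]): Buffer {
  return Buffer.concat(chunks);
}

// Placeholder bytes standing in for two frames of a single payload.
const merged = mergeChunks([
  new Uint8Array([0x41, 0x52]),
  new Uint8Array([0x52, 0x4f, 0x57]),
]);
// With a real payload, the merged buffer would then be passed to
// codec.deserialize(merged) in one call.
```

The trade-off of this design is one extra copy during the merge in exchange for not needing a streaming reader.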

const codec = new ArrowCodec();

// From simple object
const buffer = codec.serialize({
  ids: [1, 2, 3],
  scores: new Float32Array([0.9, 0.8, 0.7]),
});

// Deserialize always returns Table
const table = codec.deserialize(buffer);
const ids = table.getChild('ids')?.toArray();

new ArrowCodec(): ArrowCodec

Returns: ArrowCodec

readonly name: "arrow" = "arrow"

Defined in: packages/codecs/src/arrow-codec.ts:79

Human-readable codec name for debugging/logging.

Implements: Codec.name

deserialize(buffer): Table

Defined in: packages/codecs/src/arrow-codec.ts:109

Deserialize Arrow IPC format to Table.

Arrow handles zero-copy reading from the buffer automatically.

Parameters: buffer (Buffer)

Returns: Table

Implements: Codec.deserialize
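Because `getChild` can return nothing for an unknown column name (hence the `?.` in the usage example), callers may prefer an explicit error over a silent `undefined`. The sketch below uses hypothetical names (`TableLike`, `ColumnLike`, `readColumn`) and minimal structural types that cover only the two calls shown in this page, `getChild` and `toArray`; it is not part of ArrowCodec, and a plain object stands in for a real Table.

```typescript
// Minimal structural types covering only the calls used in this page.
interface ColumnLike {
  toArray(): ArrayLike<unknown>;
}
interface TableLike {
  getChild(name: string): ColumnLike | null;
}

// Returns the column's values, or throws if the column does not exist.
function readColumn(table: TableLike, name: string): ArrayLike<unknown> {
  const column = table.getChild(name);
  if (column == null) {
    throw new Error(`Column "${name}" not found in table`);
  }
  return column.toArray();
}

// Stand-in table for demonstration; a real Table would come from
// codec.deserialize(buffer).
const fakeTable: TableLike = {
  getChild: (name) => (name === "ids" ? { toArray: () => [1, 2, 3] } : null),
};

const idValues = readColumn(fakeTable, "ids");
```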


serialize(data): Buffer

Defined in: packages/codecs/src/arrow-codec.ts:86

Serialize Arrow Table, RecordBatch, or simple object to Arrow IPC format.

⚡️ OPTIMIZATION: Uses Buffer.from(view) instead of copying.

Parameters: data (ArrowSerializable)

Returns: Buffer

Implements: Codec.serialize
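For readers unfamiliar with the distinction behind the optimization note: in Node.js, `Buffer.from(typedArray)` copies the bytes, while `Buffer.from(arrayBuffer, byteOffset, length)` creates a Buffer that shares memory with the original. Which form ArrowCodec uses internally is not shown in this page; the sketch below only demonstrates the difference, and `viewAsBuffer` is a hypothetical helper name.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: wraps a Uint8Array's memory in a Buffer without copying.
function viewAsBuffer(view: Uint8Array): Buffer {
  return Buffer.from(view.buffer, view.byteOffset, view.byteLength);
}

const bytes = new Uint8Array([1, 2, 3, 4]);
const slice = bytes.subarray(1, 3); // view over the bytes 2 and 3

const shared = viewAsBuffer(slice); // shares memory with `bytes`
const copied = Buffer.from(slice); // independent copy of the same two bytes

// Mutate the original: only the shared Buffer observes the change.
bytes[1] = 99;
// shared[0] is now 99, while copied[0] is still 2.
```

A shared view avoids an allocation and a copy, but it also means the returned Buffer changes if the underlying memory is later reused or mutated.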