## Streaming
Two orthogonal modes:
- Streaming write — append rows to a sheet that’s being deflated and written to disk on the fly. Heap stays bounded no matter how many rows you push.
- Streaming read — walk a workbook’s worksheets row-by-row without materializing the full sheet. Backed by a SAX parser.
Both live behind the `xlsx-kit/streaming` entry, which is browser-safe (no `node:fs` imports). Use `xlsx-kit/node` to bridge to an `fs.ReadStream` / `fs.WriteStream`.
### Streaming write — 10M rows under 100 MB heap
```ts
// Stream millions of rows to disk in a fixed memory budget. Each row is
// deflated as it arrives — no intermediate workbook in memory.
import { toFile } from 'xlsx-kit/node';
import { createWriteOnlyWorkbook } from 'xlsx-kit/streaming';

const sink = toFile('big.xlsx');
const wb = await createWriteOnlyWorkbook(sink);
const ws = await wb.addWorksheet('Data');

ws.setColumnWidth(1, 24); // must precede the first appendRow

for (let r = 0; r < 10_000_000; r++) {
  await ws.appendRow([r, `row-${r}`, r * Math.PI]);
}

await ws.close();
await wb.finalize();
```
`createWriteOnlyWorkbook` returns a workbook whose sheets are append-only. `appendRow(values)` writes one row, immediately deflates the resulting `<row>` XML, and pushes the compressed bytes to the underlying sink. Nothing is held in memory beyond the deflate window.
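A quick way to sanity-check the bounded-heap claim is to sample `process.memoryUsage()` while appending. A sketch (the file and sheet names here are arbitrary, and the numbers printed depend on your Node version and data):

```ts
// Probe heap growth while streaming rows. Illustrative, not a benchmark:
// heapUsed should plateau rather than grow with the row count.
import { toFile } from 'xlsx-kit/node';
import { createWriteOnlyWorkbook } from 'xlsx-kit/streaming';

const wb = await createWriteOnlyWorkbook(toFile('probe.xlsx'));
const ws = await wb.addWorksheet('Probe');

for (let r = 0; r < 1_000_000; r++) {
  await ws.appendRow([r, `row-${r}`]);
  if (r % 100_000 === 0) {
    const mb = process.memoryUsage().heapUsed / (1024 * 1024);
    console.log(`row ${r}: ${mb.toFixed(1)} MB heapUsed`);
  }
}

await ws.close();
await wb.finalize();
```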
A few constraints:
- Column widths must be set before the first `appendRow`. Once any row is written, the worksheet's `<cols>` block is locked.
- No random-access edits. Once a row is appended you can't go back and change a cell.
- `ws.close()` and `wb.finalize()` are required — that's when the central directory is written. Without them you'll get a truncated zip.
For a 10M-row × 3-col workbook on commodity hardware: ~30s, ~75 MB peak heap, ~110 MB on disk.
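The write path composes naturally with other line-oriented sources. For example, a minimal sketch converting a large CSV row-by-row (assumes a local `data.csv`; the `split(',')` is deliberately naive and breaks on quoted fields):

```ts
// Convert a large CSV to .xlsx one line at a time. Heap stays bounded
// because each appended row is deflated and flushed immediately.
import { createReadStream } from 'node:fs';
import { createInterface } from 'node:readline';
import { toFile } from 'xlsx-kit/node';
import { createWriteOnlyWorkbook } from 'xlsx-kit/streaming';

const wb = await createWriteOnlyWorkbook(toFile('converted.xlsx'));
const ws = await wb.addWorksheet('Imported');

const lines = createInterface({ input: createReadStream('data.csv') });
for await (const line of lines) {
  await ws.appendRow(line.split(',')); // naive CSV: no quoted-field handling
}

await ws.close();
await wb.finalize();
```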
### Streaming read — iterate rows without loading
```ts
// Iterate huge sheets without loading the full workbook. iterRows is a SAX
// pass — it walks the file once and yields each row's cells.
import { fromFile } from 'xlsx-kit/node';
import { loadWorkbookStream } from 'xlsx-kit/streaming';

const wb = await loadWorkbookStream(fromFile('big.xlsx'));
const sheet = wb.openWorksheet(wb.sheetNames[0] ?? '');

for await (const row of sheet.iterRows({ minRow: 1, maxRow: 100 })) {
  console.log(row.map((c) => c.value));
}

await wb.close();
```
`loadWorkbookStream` parses the workbook part eagerly (sheet names, defined names, etc.) but defers the per-sheet XML. `openWorksheet(name)` returns a handle whose `iterRows()` yields each row as it's parsed.
`iterRows({ minRow, maxRow, minCol, maxCol })` lets you bound the walk. The parser skips ahead via an XML tag scan, so a `minRow: 1_000_000` query on a 10M-row sheet does not parse the first million rows — it skips them.
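For instance, a bounded scan that sums a single column. This assumes the `minCol`/`maxCol` bounds narrow each yielded row to just that column's cells (the text above doesn't spell that out), and reuses the file from the write example:

```ts
// Sum the numeric cells of column C over rows 2..1000. Only that window is
// parsed; everything outside it is skipped by the tag scan described above.
import { fromFile } from 'xlsx-kit/node';
import { loadWorkbookStream } from 'xlsx-kit/streaming';

const wb = await loadWorkbookStream(fromFile('big.xlsx'));
const sheet = wb.openWorksheet(wb.sheetNames[0] ?? '');

let total = 0;
for await (const row of sheet.iterRows({ minRow: 2, maxRow: 1000, minCol: 3, maxCol: 3 })) {
  for (const cell of row) {
    if (typeof cell.value === 'number') total += cell.value;
  }
}
console.log(`sum of C2:C1000 = ${total}`);

await wb.close();
```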
Each yielded row is a `Cell[]`: `cell.value` is `string | number | boolean | null`, and `cell.coordinate` is `'A1'`-style.
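In TypeScript terms, that implies a shape along these lines (a sketch inferred from the description above, not the library's actual exported type):

```ts
// Shape of a yielded cell, as described in the text. Illustrative only.
interface Cell {
  value: string | number | boolean | null;
  coordinate: string; // 'A1'-style, e.g. 'B7'
}
```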
### When to pick which
| Workload | Pick |
|---|---|
| Workbook fits in memory; want to mutate cells, charts, styles | `loadWorkbook` (full library) |
| Reading a huge sheet linearly | `loadWorkbookStream` + `iterRows` |
| Writing > 100k rows, no need for charts / drawings | `createWriteOnlyWorkbook` |
| Need both streaming write and charts | Build the workbook in memory; the streaming writer doesn't currently support charts |