pilcrow-typeset
The typesetting engine, as a library. The reference for any build pipeline that is not Astro, Eleventy, or Next.js.
What it is
Given a block of HTML, pilcrow-typeset opens a headless Chromium, lays the HTML out at the column width and font shorthand you tell it (or reads them from your own CSS), and replaces every <p> with per-line <span class="pt-line"> wrappers. The output is static HTML you serve as-is. No JavaScript runs at the reader.
Hyphenation runs in Node via Hyphenopoly's en-gb pattern set, which ships with the package. Soft hyphens are inserted before the typesetter sees the markup, so line breaks land at syllable boundaries rather than wherever the browser would otherwise have guessed.
Install
npm install pilcrow-typeset
npx playwright install chromium Playwright is a peer dependency; the Chromium download is roughly 170 MB on first install. Node 18.17.1 or newer. The package is published on npm.
Single-shot
For one-off documents — opens Chromium, typesets, tears down. The lifecycle is dominated by browser spin-up. Fine for a one-off; wasteful for a build job that processes many documents.
import { typeset } from 'pilcrow-typeset';
const { html, lineCount, paragraphCount } = await typeset(bodyHTML); Batch
For build integrations, keep one browser alive across many documents. The try/finally matters: if typeset() throws, close() still runs. A leaked Chromium will hang the build.
import { PlaywrightRenderer } from 'pilcrow-typeset';
const renderer = new PlaywrightRenderer();
await renderer.open();
try {
for (const doc of documents) {
const { html } = await renderer.typeset(doc.html, { dropCap: doc.dropCap });
writeFile(doc.outPath, splice(doc.outerHTML, html));
}
} finally {
await renderer.close();
} This is the shape both adapters use internally. See Eleventy and Next.js for the integration glue around the same renderer.
TypesetOptions
The empty-string and zero defaults are deliberate. They tell the renderer to read font, width, and line height from the CSS already loaded on the measurement page, so your stylesheet stays the source of truth for geometry.
| Field | Type | Default | Meaning |
|---|---|---|---|
fontShorthand | string | '' | CSS font shorthand (e.g. "18px ui-serif"). Empty reads the computed value from the page's CSS. |
maxWidth | number | 0 | Column width in CSS pixels. Zero falls back to the page's clientWidth. |
lineHeight | number | 0 | Line height in CSS pixels. Zero reads from computed style. |
postPath | string? | — | Identifier used in build warnings ("posts/foo"). |
dropCap | boolean? | true | Drop cap on the lede paragraph. Pass false to opt out. |
Return value
Both typeset() and renderer.typeset() resolve to the same shape:
{
html: string; // typeset HTML, ready to splice into your output
lineCount: number; // total pt-line spans produced
paragraphCount: number;// paragraphs the renderer touched
} The counts are useful for build logs and for deciding when a paragraph went unprocessed (a paragraph containing unsupported inline markup falls back to the flat path; the count tells you how many paragraphs reached the renderer at all).
Exports
typeset(html, options?)— single-shot convenience.PlaywrightRenderer— the renderer implementation; managed lifecycle.TypesetRenderer— the interface every renderer satisfies. Today there is one. When pretext ships server-side rendering upstream, the next implementation drops in here.TypesetOptions— the options struct documented above.hyphenateHTML(html)— the Node-side soft-hyphen injector, exposed for callers that want hyphenation without the Chromium pass.
Build environment
This package runs Playwright Chromium during your build. The build host therefore needs to allow installing and launching Chromium. Confirmed working on Vercel, Netlify, Cloudflare Pages, GitHub Actions, and GitLab CI. Locked-down CI environments without sandboxing or with no browser binaries are not supported. Build time scales with post count plus a fixed Chromium spin-up cost paid once per build.
Credit
The line-breaking primitive is pretext by Cheng Lou. Pretext is the multilingual text-measurement library that decouples layout from the DOM; it is the load-bearing wall under everything pilcrow-typeset exposes. The editorial layer above it (drop caps, sidenote-aware spans, en-gb hyphenation, the orphan guard) is the easy part. Most of the credit for what you see on the page belongs upstream.