Selection, Range & offsets

The browser exposes caret and selection in the DOM; your Rust model uses its own positions. Crossing the boundary without a single indexing convention is a frequent source of off-by-one and IME bugs.

Overview

Pair this page with Editor → Position & selection (model-level ideas) and JS ↔ WASM boundary (copy cost). Here the focus is numeric indices and Range endpoints.

UTF-16 vs UTF-8

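JavaScript strings are indexed in UTF-16 code units. A Rust String is UTF-8 and is indexed by byte; its char values are Unicode scalar values. The three units (UTF-16 code unit, UTF-8 byte, scalar value) agree only for ASCII. A JS index differs from a Rust byte offset for any non-ASCII text, and from a Rust char index for astral code points, which take two UTF-16 code units. IME input routinely produces such text.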

  • Prefer passing byte offsets in Rust and explicit scalar indices or UTF-16 offsets in JS; document which unit each API takes and returns.
  • range.startOffset in a text node counts UTF-16 code units. Never assume it equals your Rust char index (or byte offset) without conversion.
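
As an illustration of the conversion the bullets above call for, here is a minimal sketch that turns a UTF-16 code-unit offset (the unit JS reports) into a UTF-8 byte offset on the Rust side. The function name utf16_to_byte_offset is hypothetical, and returning None for out-of-range or mid-surrogate offsets is one possible policy, not a requirement.

```rust
/// Convert a UTF-16 code-unit offset (as reported by JS for a text node)
/// into a UTF-8 byte offset into `text`.
///
/// Returns `None` when the offset is past the end of `text` or falls
/// between the two halves of a surrogate pair.
/// (Hypothetical helper; the name and the `Option` policy are assumptions.)
pub fn utf16_to_byte_offset(text: &str, utf16_offset: usize) -> Option<usize> {
    let mut units = 0usize; // UTF-16 code units consumed so far
    for (byte_idx, ch) in text.char_indices() {
        if units == utf16_offset {
            return Some(byte_idx);
        }
        if units > utf16_offset {
            // The previous char was astral and the offset pointed inside it.
            return None;
        }
        units += ch.len_utf16();
    }
    // The offset may legitimately point at the very end of the string.
    if units == utf16_offset {
        Some(text.len())
    } else {
        None
    }
}
```

For the string "a😀b", the emoji occupies UTF-16 units 1..3 and bytes 1..5, so a JS offset of 3 maps to byte offset 5, while a JS offset of 2 (inside the surrogate pair) is rejected.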

Selection & Range in JS

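window.getSelection() returns a Selection whose anchorNode/anchorOffset and focusNode/focusOffset point into DOM text nodes and elements. In a text node the offset counts UTF-16 code units; in an element it counts child nodes. Your WASM layer usually should not reimplement hit-testing; it consumes normalized positions you derive in JS (paths, node ids, or stable offsets into a flat buffer).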

getTargetRanges & beforeinput

For beforeinput, getTargetRanges() returns an array of StaticRange objects describing the ranges the browser intends to mutate, when supported; the array can be empty. Coverage varies; see site scenarios on selection and input.

During IME composition, ranges may be provisional; reconcile after compositionend when possible.
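
One way to handle provisional ranges on the Rust side is to hold them apart from the committed selection until composition ends. The sketch below assumes the JS glue forwards compositionstart, selection changes, and compositionend to WASM; SelectionSync, ModelRange, and the method names are all hypothetical.

```rust
/// A selection range in model coordinates (unit is up to your API; bytes here).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ModelRange {
    pub start: usize,
    pub end: usize,
}

/// Keeps ranges seen during IME composition separate from the committed one.
/// (Hypothetical type; not part of any library.)
#[derive(Debug, Default)]
pub struct SelectionSync {
    composing: bool,
    committed: Option<ModelRange>,
    provisional: Option<ModelRange>,
}

impl SelectionSync {
    /// Call when JS sees `compositionstart`.
    pub fn composition_start(&mut self) {
        self.composing = true;
    }

    /// Call for every selection update forwarded from JS.
    pub fn selection_changed(&mut self, range: ModelRange) {
        if self.composing {
            // Provisional: remember it, but do not touch the committed range.
            self.provisional = Some(range);
        } else {
            self.committed = Some(range);
        }
    }

    /// Call when JS sees `compositionend`. `final_range` is a range read
    /// after the event, if the caller has one; otherwise the last
    /// provisional range is promoted.
    pub fn composition_end(&mut self, final_range: Option<ModelRange>) {
        self.composing = false;
        let last_provisional = self.provisional.take();
        if let Some(range) = final_range.or(last_provisional) {
            self.committed = Some(range);
        }
    }

    /// The range the model should currently treat as authoritative.
    pub fn committed(&self) -> Option<ModelRange> {
        self.committed
    }
}
```

The design choice here is that the model never acts on a range seen mid-composition; whether that is too conservative depends on how your editor renders the composing text.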

Mapping to a Rust model

Common patterns: (1) maintain a stable block/inline id map in JS and send ids + offsets to WASM; (2) serialize a thin snapshot of the selection only; (3) use a single canonical cursor in Rust and project to DOM on render.
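
Pattern (3) can be sketched as follows: the Rust model owns the cursor as a block id plus byte offset, and projects it to a block id plus UTF-16 offset that JS can apply when rendering. Cursor, DomPoint, Block, and project_to_dom are hypothetical names, and the sketch assumes each block renders as a single DOM text node.

```rust
/// Canonical caret in the Rust model (hypothetical shape): a stable block id
/// plus a UTF-8 byte offset into that block's text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cursor {
    pub block_id: u32,
    pub byte_offset: usize,
}

/// What the JS side would receive on render: the same block id plus a
/// UTF-16 code-unit offset usable as a text-node offset.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DomPoint {
    pub block_id: u32,
    pub utf16_offset: usize,
}

/// A block of text with a stable id (assumed to map to one DOM text node).
pub struct Block {
    pub id: u32,
    pub text: String,
}

/// Project the model cursor into DOM coordinates.
/// Returns `None` if the block id is unknown or the byte offset does not
/// fall on a char boundary (including being past the end of the text).
pub fn project_to_dom(blocks: &[Block], cursor: Cursor) -> Option<DomPoint> {
    let block = blocks.iter().find(|b| b.id == cursor.block_id)?;
    if !block.text.is_char_boundary(cursor.byte_offset) {
        return None;
    }
    // Count the UTF-16 code units that precede the cursor.
    let utf16_offset = block.text[..cursor.byte_offset].encode_utf16().count();
    Some(DomPoint {
        block_id: block.id,
        utf16_offset,
    })
}
```

On the JS side, the glue code would look up the text node for block_id and pass utf16_offset to Range.setStart or Selection.collapse; that lookup is where the id map from pattern (1) comes in.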

Mobile & touch selection

Touch handles and delayed selection updates differ from desktop. Tests should include real devices or automation that drives touch selection APIs where possible.