Scenario

Content Normalization: Paste, Whitespace, and DOM Hygiene

Architecting a consistent document state by neutralizing browser inconsistencies in HTML insertion and character encoding.

architecture
Scenario ID: scenario-content-normalization

Details

Overview

Browsers insert different HTML when a user pastes or presses Enter. A robust editor must normalize this “Browser Soup” into a predictable internal schema to prevent data corruption and layout breakage.

Critical Normalization Zones

1. Paste Filter & Cleansing

When pasting from external sources (Word, Excel, web pages), the incoming HTML carries large amounts of hidden metadata, proprietary CSS inside <style> blocks, and non-standard attributes. Strict sanitization is required to strip everything the editor's schema does not recognize.
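As one small piece of that cleansing, inline style attributes can be reduced to a known set of properties. The sketch below is illustrative only: the function name filterInlineStyle and the property allowlist are assumptions, not part of any library, and the naive split on ";" does not handle semicolons inside url(...) values or quoted strings.

```typescript
// Hypothetical allowlist: the CSS properties this editor's schema keeps.
const ALLOWED_STYLE_PROPS = new Set([
  "font-weight",
  "font-style",
  "text-decoration",
  "text-align",
]);

// Returns the inline style with every non-allowlisted declaration removed.
// Property names are lowercased; values are kept as written.
function filterInlineStyle(style: string): string {
  const kept: string[] = [];
  for (const declaration of style.split(";")) {
    const colon = declaration.indexOf(":");
    if (colon === -1) continue; // skip empty or malformed fragments
    const prop = declaration.slice(0, colon).trim().toLowerCase();
    const value = declaration.slice(colon + 1).trim();
    if (value !== "" && ALLOWED_STYLE_PROPS.has(prop)) {
      kept.push(`${prop}: ${value}`);
    }
  }
  return kept.join("; ");
}
```

Proprietary declarations such as Word's mso- prefixed properties fall outside the allowlist and are dropped without needing a dedicated rule.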

2. Whitespace & &nbsp; Management

Under the default CSS rule (white-space: normal), browsers collapse consecutive spaces into one when rendering. To maintain visual fidelity, editors often substitute non-breaking spaces (&nbsp;), which introduces two problems:

  • Contamination: &nbsp; suppresses the line-break opportunity at its position, so chains of &nbsp; form unbreakable runs that overflow their container. The effect is most severe in plaintext-only mode.
  • Conversion: Chrome and Edge frequently convert non-breaking spaces back to regular spaces during editing, so alignment that depended on them collapses.
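One way to limit contamination is to keep a non-breaking space only where a regular space would be collapsed. The following is a minimal sketch under that assumption; softenNbsp is a hypothetical helper name, and it operates on plain text rather than on DOM nodes.

```typescript
const NBSP = "\u00A0";

// Replaces a non-breaking space with a regular space when it sits between two
// non-space characters, where a regular space would not be collapsed anyway.
// NBSPs next to other spaces, or at the start or end of the text, are kept.
function softenNbsp(text: string): string {
  const chars = Array.from(text);
  const isSpaceOrEdge = (c: string | undefined): boolean =>
    c === undefined || c === " " || c === NBSP;
  return chars
    .map((ch, i) => {
      if (ch !== NBSP) return ch;
      // Neighbors are read from the original array, not from the output.
      return isSpaceOrEdge(chars[i - 1]) || isSpaceOrEdge(chars[i + 1]) ? ch : " ";
    })
    .join("");
}
```

This also rewrites non-breaking spaces that an author inserted on purpose between two words (for example between a number and its unit), so an editor that supports intentional &nbsp; would need a way to mark those as protected.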

3. Empty Node Pruning

Rapid editing often leaves empty <span>, <b>, or <div> tags in the DOM. These “Ghost Tags” are invisible but break selection logic and features that depend on node counts.
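A pruning pass can be sketched as a post-order walk that removes prunable elements once they hold nothing but empty text. The PrunableNode interface below is a hypothetical minimal shape that mirrors the DOM members the walk uses (nodeType, nodeName, textContent, childNodes, removeChild), and the set of prunable tags is an assumption taken from the examples above.

```typescript
// Hypothetical minimal node shape, mirroring the DOM members used below.
interface PrunableNode {
  nodeType: number; // 1 = element, 3 = text
  nodeName: string;
  textContent: string | null;
  childNodes: ArrayLike<PrunableNode>;
  removeChild(child: PrunableNode): unknown;
}

// Assumption: only these tags are treated as removable when empty.
const PRUNABLE_TAGS = new Set(["SPAN", "B", "DIV"]);

// Removes empty prunable elements below root and returns how many were removed.
// Children are pruned first, so a wrapper that only held ghost tags is removed too.
function pruneEmpty(root: PrunableNode): number {
  let removed = 0;
  // Snapshot the child list because it is mutated during the loop.
  for (const child of Array.from(root.childNodes)) {
    if (child.nodeType !== 1) continue;
    removed += pruneEmpty(child);
    const isEmpty = Array.from(child.childNodes).every(
      (n) => n.nodeType === 3 && (n.textContent ?? "") === ""
    );
    if (isEmpty && PRUNABLE_TAGS.has(child.nodeName.toUpperCase())) {
      root.removeChild(child);
      removed += 1;
    }
  }
  return removed;
}
```

Elements that contain a <br> or any other child element are left alone, so the common <div><br></div> empty-line placeholder survives the pass.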

Normalization Strategy

The Parser Pipeline

Intercept the paste event, or the beforeinput event with inputType insertFromPaste, call preventDefault(), and parse the incoming HTML with DOMParser. Apply a strict allowlist of tags and attributes to the parsed tree before inserting the result.
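The allowlist decision at the center of this pipeline can be sketched as two pure functions, kept separate from the DOM so they are easy to test. Everything named here (TagAction, classifyTag, filterAttributes, the schema contents, and the href rule) is an illustrative assumption rather than a fixed API.

```typescript
type TagAction = "keep" | "unwrap" | "drop";

// Hypothetical schema: allowed tags mapped to their allowed attributes.
const ALLOWED_TAGS = new Map<string, ReadonlySet<string>>([
  ["p", new Set<string>()],
  ["br", new Set<string>()],
  ["strong", new Set<string>()],
  ["em", new Set<string>()],
  ["ul", new Set<string>()],
  ["ol", new Set<string>()],
  ["li", new Set<string>()],
  ["a", new Set<string>(["href"])],
]);

// Tags removed together with their content.
const DROPPED_TAGS = new Set(["style", "script", "meta", "link"]);

// Assumption: only absolute http(s) and mailto links are accepted.
const SAFE_HREF = /^(https?:|mailto:)/i;

// keep: tag stays; unwrap: tag is removed but its children stay; drop: tag and content go.
function classifyTag(tagName: string): TagAction {
  const tag = tagName.toLowerCase();
  if (DROPPED_TAGS.has(tag)) return "drop";
  return ALLOWED_TAGS.has(tag) ? "keep" : "unwrap";
}

// Returns only the attributes the schema allows for this tag.
function filterAttributes(
  tagName: string,
  attrs: Record<string, string>
): Record<string, string> {
  const kept: Record<string, string> = {};
  const allowed = ALLOWED_TAGS.get(tagName.toLowerCase());
  if (!allowed) return kept;
  for (const name of Object.keys(attrs)) {
    const attr = name.toLowerCase();
    if (!allowed.has(attr)) continue;
    if (attr === "href" && !SAFE_HREF.test(attrs[name].trim())) continue;
    kept[attr] = attrs[name];
  }
  return kept;
}
```

In the browser, a walker over the body of the document returned by new DOMParser().parseFromString(html, "text/html") would call classifyTag for each element and filterAttributes for each element it keeps. Unknown wrappers such as <span> or Word's <o:p> are unwrapped rather than dropped so that their text survives.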

Whitespace Preservation (CSS over entities)

Prefer white-space: pre-wrap for preserving layout over chains of &nbsp;. If manual intervention is required, use a beforeinput handler to insert U+00A0 (\u00A0) only when the space being typed would be a trailing space.
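For the manual fallback, the character choice can be isolated in a small pure function. This sketch assumes that "trailing" means nothing but a line end follows the caret in the current block; spaceCharFor and insertSpace are hypothetical helper names, and the caller is assumed to supply the block text and caret offset from the current selection.

```typescript
const NBSP = "\u00A0";

// Chooses the character to insert for a typed space.
// textAfterCaret: text from the caret to the end of the current block.
function spaceCharFor(textAfterCaret: string): string {
  // Assumption: the space is trailing when only a line end (or nothing) follows.
  const isTrailing = textAfterCaret === "" || textAfterCaret.startsWith("\n");
  return isTrailing ? NBSP : " ";
}

// Applies the choice to a block's text and returns the new text and caret offset.
function insertSpace(
  blockText: string,
  caret: number
): { text: string; caret: number } {
  const ch = spaceCharFor(blockText.slice(caret));
  return {
    text: blockText.slice(0, caret) + ch + blockText.slice(caret),
    caret: caret + 1,
  };
}
```

In a beforeinput handler, this branch would apply when event.inputType is "insertText" and event.data is a single space. With white-space: pre-wrap on the editable element, none of this is needed.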

Variants

Each row is a concrete case for this scenario, with a dedicated document and playground.

Case                                  OS          Device                 Browser       Keyboard  Status
ce-0102-consecutive-spaces-collapsed  Windows 11  Desktop or Laptop Any  Chrome 120.0  US        draft
ce-0117-nbsp-converted-to-space       Windows 11  Desktop or Laptop Any  Chrome 120.0  US        draft
ce-0153-nbsp-line-break-prevention    Windows 11  Desktop or Laptop Any  Chrome 120.0  US        draft

Related Scenarios

Other scenarios that share similar tags or category.

Tags: paste

Clipboard API paste does not work in contenteditable

When using the Clipboard API (navigator.clipboard.readText() or navigator.clipboard.read()) to programmatically paste content into a contenteditable region, the paste operation may fail or not work as expected.

2 cases

Tags: whitespace

Code block editing behavior varies across browsers

Editing text within code blocks (<pre><code>) in contenteditable elements behaves inconsistently across browsers. Line breaks, indentation, whitespace preservation, and formatting may be handled differently, making it difficult to maintain code formatting.

4 cases

Tags: paste

Image insertion behavior varies across browsers

When inserting images into contenteditable elements, the behavior varies significantly across browsers. Images may be inserted as <img> tags, as base64 data URLs, or may not be supported at all. The size, positioning, and editing behavior also differs.

2 cases

Tags: paste

List formatting is lost when editing list items

When editing text within list items, formatting such as bold, italic, or links may be lost or behave unexpectedly. The list structure itself may also be lost when certain operations are performed, such as pasting content or applying formatting.

3 cases
