Phenomenon
On iOS Safari, when using voice dictation to input text into contenteditable elements, the system fires initial beforeinput and input events with the complete dictated text. After the initial input completes, the system re-fires beforeinput and input events with the text split into individual words, causing event handlers to execute multiple times for the same input.
Reproduction example
- Open a web page with a contenteditable element on iOS Safari (iPhone or iPad).
- Focus the contenteditable element.
- Activate voice dictation (long-press the spacebar or tap the microphone icon on the keyboard).
- Dictate text: “만나서 반갑습니다” (Korean for “Nice to meet you”; any multi-word phrase works).
- Observe the beforeinput and input events in the browser console or event log.
Observed behavior
Initial Dictation Sequence
- User activates dictation and speaks “만나서 반갑습니다”
- beforeinput event fires with:
  - inputType: 'insertText'
  - data: '만나서 반갑습니다'
  - isComposing: false
- input event fires with the complete text “만나서 반갑습니다” inserted into DOM
Duplicate Events Sequence (Bug)
- After a short delay (typically 100-500ms), beforeinput event fires again with:
  - inputType: 'insertText'
  - data: '만나서'
  - isComposing: false
- input event fires with “만나서” inserted
- beforeinput event fires again with:
  - inputType: 'insertText'
  - data: ' ' (space character)
  - isComposing: false
- input event fires with space inserted
- beforeinput event fires again with:
  - inputType: 'insertText'
  - data: '반갑습니다'
  - isComposing: false
- input event fires with “반갑습니다” inserted
Key Characteristics
- Composition events (compositionstart, compositionupdate, compositionend) do NOT fire during dictation
- isComposing is always false in all events
- Events are re-fired after the initial input completes
- Text is split at word boundaries (spaces)
- The DOM state after duplicate events is the same as after initial events (no actual change)
- Event sequence becomes out of sync with DOM state
Expected behavior
- Initial beforeinput and input events should fire once with the complete dictated text
- Events should NOT be re-fired after completion
- If events are re-fired, they should reflect actual DOM changes (not duplicate insertions)
- Event sequence should remain synchronized with DOM state
- Composition events should fire during dictation (as they do on macOS Safari)
Impact
This can lead to:
- Duplicate processing: Event handlers execute multiple times for the same input
- State synchronization issues: Application state may become inconsistent with DOM state
- Performance issues: Unnecessary processing of duplicate events
- Undo/redo corruption: Undo stack may contain duplicate or incorrect entries
- Validation issues: Validation logic may run multiple times on the same input
- Formatting issues: Formatting logic may be applied incorrectly due to split text
- Event sequence confusion: Handlers expecting a single input event receive multiple events
Browser Comparison
- iOS Safari: Dictation does not fire composition events, events are re-fired after completion with text split into words
- iOS Chrome: Same behavior as Safari (uses WebKit engine, required by Apple)
- macOS Safari: Dictation fires composition events, events are not re-fired after completion
- Chrome/Edge/Firefox (Desktop): Dictation behavior varies but generally more consistent, no duplicate re-firing
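Since the re-firing is reported only for iOS browsers, a workaround can be limited to that platform. The following is a minimal sketch rather than a definitive detection method: isLikelyIOSWebKit is a hypothetical helper name, it relies on user-agent sniffing (which is inherently fragile), and it assumes that iPadOS in desktop mode reports a Macintosh user agent while still exposing touch points.

```javascript
// Hypothetical helper: guesses whether the page runs in an iOS (WebKit-based) browser.
// It takes a plain object so it can also be exercised outside a browser.
function isLikelyIOSWebKit({ userAgent = '', maxTouchPoints = 0 } = {}) {
  // iPhone, iPod, and iPad in mobile mode identify themselves directly.
  if (/\b(iPhone|iPod|iPad)\b/.test(userAgent)) return true;
  // Assumption: iPadOS in desktop mode reports "Macintosh" but has a touch screen.
  return /\bMacintosh\b/.test(userAgent) && maxTouchPoints > 1;
}

// In a browser, pass the real navigator values.
if (typeof navigator !== 'undefined') {
  console.log('Apply dictation workaround:', isLikelyIOSWebKit(navigator));
}
```

A gate like this only narrows where the workaround runs. It does not detect dictation itself.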
Distinguishing Dictation Input
Important: There is no reliable way to detect dictation input in web applications on iOS. Web APIs do not provide dictation detection capabilities, and native iOS APIs like UITextInputContext.isDictationInputExpected are not available in web contexts.
Potential Indicators (Not Reliable)
- Absence of composition events (but this also occurs with Korean IME on iOS)
- Rapid insertion of multiple words
- Text appears to be split and re-inserted
- Events fire in quick succession with complete words
- isComposing is always false (but this is also true for Korean IME on iOS)
These indicators are not definitive and may produce false positives.
Event Sequence
The sequence of events when inputting “만나서 반갑습니다” via dictation:
Phase 1: Initial Dictation Input
| Order | Event | inputType | data | DOM State (before) | DOM State (after) |
|---|---|---|---|---|---|
| 1 | beforeinput | insertText | "만나서 반갑습니다" | "" | - |
| 2 | input | insertText | "만나서 반갑습니다" | "" | "만나서 반갑습니다" (changed) |
Phase 2: Duplicate Events (Bug)
Approximately 100-500ms after the initial input completes, the events are re-fired with the text split into words:
| Order | Event | inputType | data | DOM State (before) | DOM State (after) |
|---|---|---|---|---|---|
| 3 | beforeinput | insertText | "만나서" | "만나서 반갑습니다" | - |
| 4 | input | insertText | "만나서" | "만나서 반갑습니다" | "만나서 반갑습니다" (unchanged) |
| 5 | beforeinput | insertText | " " | "만나서 반갑습니다" | - |
| 6 | input | insertText | " " | "만나서 반갑습니다" | "만나서 반갑습니다" (unchanged) |
| 7 | beforeinput | insertText | "반갑습니다" | "만나서 반갑습니다" | - |
| 8 | input | insertText | "반갑습니다" | "만나서 반갑습니다" | "만나서 반갑습니다" (unchanged) |
Key Characteristics
- Events 1-2: Complete text inserted at once (DOM actually changes)
- Events 3-8: Text re-fired word-by-word but DOM doesn’t change
- Composition events: compositionstart, compositionupdate, compositionend events do NOT fire in any phase
- isComposing: All events have isComposing: false
- Delay between phases: 100-500ms delay between Event 2 and Event 3
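The two-phase sequence suggests one possible heuristic: remember the last multi-word insert and treat the events that follow as replays when their data consumes that text piece by piece. This is a minimal sketch under stated assumptions, not a reliable detector. createReplayFilter and windowMs are hypothetical names, the 1000 ms default is an arbitrary margin over the observed 100-500ms delay, and a user who quickly retypes the start of the same text would produce a false positive.

```javascript
// Hypothetical heuristic filter for the word-by-word replay described above.
function createReplayFilter(windowMs = 1000) {
  // Tracks the last multi-word insert: { full, remaining, at } or null.
  let pending = null;

  // Returns true when the event data looks like a replayed piece of earlier text.
  return function isReplay(data, timestamp) {
    if (typeof data !== 'string' || data.length === 0) {
      pending = null;
      return false;
    }
    const withinWindow = pending !== null && timestamp - pending.at <= windowMs;
    // A replayed piece is a prefix of what is left, and never the whole original text.
    if (withinWindow && data !== pending.full && pending.remaining.startsWith(data)) {
      pending.remaining = pending.remaining.slice(data.length);
      if (pending.remaining.length === 0) pending = null; // replay fully consumed
      return true;
    }
    // Only multi-word inserts are candidates for a later word-by-word replay.
    pending = /\s/.test(data) ? { full: data, remaining: data, at: timestamp } : null;
    return false;
  };
}

// Browser wiring (skipped outside a browser). Use one filter per event type,
// because beforeinput and input each carry the same data.
if (typeof document !== 'undefined') {
  const editor = document.querySelector('[contenteditable]');
  const isReplayedInput = createReplayFilter();
  if (editor) {
    editor.addEventListener('input', (e) => {
      if (isReplayedInput(e.data, e.timeStamp)) return; // suspected replay, skip
      // ... application-specific handling goes here ...
    });
  }
}
```

Because this inspects only event data and timing, it is best combined with a check of the actual DOM text rather than used on its own.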
Complete Event Monitoring
Code to monitor all events during iOS dictation input:
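// Log every input-related event so the dictation sequence can be inspected afterwards.
const element = document.querySelector('[contenteditable]');
const eventLog = [];
const eventsToMonitor = [
  'compositionstart', 'compositionupdate', 'compositionend',
  'beforeinput', 'input',
  'keydown', 'keyup', 'keypress'
];
eventsToMonitor.forEach(eventType => {
  element.addEventListener(eventType, (e) => {
    const eventData = {
      timestamp: Date.now(),
      type: eventType,
      // Fields that a given event type lacks are recorded as null (or false).
      // Using ?? instead of || keeps an empty-string data value intact.
      inputType: e.inputType ?? null,
      data: e.data ?? null,
      isComposing: e.isComposing ?? false,
      // Text at dispatch time: pre-change for beforeinput, post-change for input.
      textContent: element.textContent
    };
    eventLog.push(eventData);
    console.log(`[${eventType}]`, eventData);
  }, { capture: true });
});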
Events That Fire vs Do Not Fire
| Event Type | Fires? | Initial Input | Duplicate Events |
|---|---|---|---|
| beforeinput | ✅ Yes | 1 time | 3 times |
| input | ✅ Yes | 1 time | 3 times |
| compositionstart | ❌ No | - | - |
| compositionupdate | ❌ No | - | - |
| compositionend | ❌ No | - | - |
| keydown | ❌ No | - | - |
| keyup | ❌ No | - | - |
| keypress | ❌ No | - | - |
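A captured log can be reduced to the same counts as the table. The sketch below is illustrative only: summarizeEventLog is a hypothetical helper, and it assumes log entries shaped like those produced by the monitoring code (at least type and textContent). The sample log is written by hand from the documented sequence, not captured from a device.

```javascript
// Hypothetical helper: tallies events by type and splits input events into those
// that changed the text and those that left it untouched.
function summarizeEventLog(eventLog, initialText = '') {
  const counts = {};
  let inputChanged = 0;
  let inputUnchanged = 0;
  let lastText = initialText;
  for (const entry of eventLog) {
    counts[entry.type] = (counts[entry.type] || 0) + 1;
    if (entry.type === 'input') {
      // An input event that leaves textContent untouched matches the duplicate phase.
      if (entry.textContent === lastText) inputUnchanged += 1;
      else inputChanged += 1;
      lastText = entry.textContent;
    }
  }
  return { counts, inputChanged, inputUnchanged };
}

// Hand-written log mirroring Events 1-8 from the tables above.
const full = '만나서 반갑습니다';
const sampleLog = [
  { type: 'beforeinput', data: full, textContent: '' },
  { type: 'input', data: full, textContent: full },
  { type: 'beforeinput', data: '만나서', textContent: full },
  { type: 'input', data: '만나서', textContent: full },
  { type: 'beforeinput', data: ' ', textContent: full },
  { type: 'input', data: ' ', textContent: full },
  { type: 'beforeinput', data: '반갑습니다', textContent: full },
  { type: 'input', data: '반갑습니다', textContent: full },
];
const summary = summarizeEventLog(sampleLog);
console.log(summary.counts);                                // { beforeinput: 4, input: 4 }
console.log(summary.inputChanged, summary.inputUnchanged);  // 1 3
```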
Notes and possible direction for workarounds
Event Handling Considerations
- Event handlers may execute multiple times for the same input
- Events without actual DOM changes (Events 4, 6, 8) should not be processed
- Check textContent to determine if DOM actually changed
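The textContent check can be sketched as a small gate placed in front of the real input handler. This is a minimal illustration, not a complete fix: createChangeGate is a hypothetical helper name, and because textContent ignores markup, a formatting-only change would also be treated as "no change" (comparing innerHTML instead is one option if formatting matters).

```javascript
// Hypothetical helper: returns a function that reports whether the text differs
// from the text seen on the previous call.
function createChangeGate(initialText = '') {
  let lastText = initialText;
  return function hasChanged(currentText) {
    if (currentText === lastText) return false; // duplicate event, nothing to process
    lastText = currentText;
    return true;
  };
}

// Browser wiring (skipped outside a browser).
if (typeof document !== 'undefined') {
  const editor = document.querySelector('[contenteditable]');
  if (editor) {
    const hasChanged = createChangeGate(editor.textContent);
    editor.addEventListener('input', () => {
      if (!hasChanged(editor.textContent)) return; // Events 4, 6, and 8 stop here
      // ... application-specific handling goes here ...
    });
  }
}
```

The gate runs on input rather than beforeinput, because only after input does textContent reflect whether the DOM actually changed.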
Undo/Redo Stack
- Recording duplicate events in the undo stack creates duplicate undo entries
- Only record in the undo stack when there’s an actual DOM change; textContent-based deduplication ensures that only actual changes are recorded
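As an illustration of recording only real changes, here is a minimal snapshot-based sketch. SnapshotUndoStack is a hypothetical class invented for this example; real editors often store operations rather than full text snapshots, so treat it as a demonstration of the deduplication rule only.

```javascript
// Hypothetical snapshot-based undo stack that ignores entries without a text change.
class SnapshotUndoStack {
  constructor(initialText = '') {
    this.snapshots = [initialText];
  }

  // Records a snapshot only if it differs from the most recent one.
  record(text) {
    if (text === this.snapshots[this.snapshots.length - 1]) return false;
    this.snapshots.push(text);
    return true;
  }

  // Drops the latest snapshot (if any beyond the first) and returns the text to restore.
  undo() {
    if (this.snapshots.length > 1) this.snapshots.pop();
    return this.snapshots[this.snapshots.length - 1];
  }
}

const stack = new SnapshotUndoStack('');
stack.record('만나서 반갑습니다'); // Event 2: real change, recorded
stack.record('만나서 반갑습니다'); // Event 4: no change, ignored
stack.record('만나서 반갑습니다'); // Event 6: no change, ignored
stack.record('만나서 반갑습니다'); // Event 8: no change, ignored
console.log(stack.snapshots.length); // 2
console.log(stack.undo() === '');    // true
```

With this rule a single undo returns to the state before dictation, instead of stepping through three entries that changed nothing.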
Additional Considerations
Selection State
- Selection state may be reset when duplicate events fire, so don’t use selection for duplicate detection; trust only textContent
Voice Control Simultaneous Use
- Enabling both Voice Control and Dictation in iOS settings may cause text to be inserted twice
- This case actually changes the DOM, so textContent-based deduplication won’t detect it
- Recommend users enable only one
Test Environment
| iOS Version | Browser | Language | Reproduced |
|---|---|---|---|
| iOS 16.x | Safari | Korean | ✅ Confirmed |
| iOS 16.x | Safari | English | ✅ Confirmed |
| iOS 16.x | Chrome iOS | Korean | ✅ Confirmed |
| iOS 17.x | Safari | Korean | ✅ Confirmed |
| iOS 17.x | Safari | English | ✅ Confirmed |
| iOS 17.x | Chrome iOS | Korean | ✅ Confirmed |
| iOS 18.x | Safari | Korean | ⚠️ Unconfirmed |
| iOS 18.x | Safari | English | ⚠️ Unconfirmed |
Note: The same issue likely occurs across all iOS versions (shared WebKit engine). Issue appears to occur regardless of language.