Run from the Command Palette (Ctrl+Shift+P / Cmd+Shift+P). Commands are grouped into Basics and Settings / Actions.
Basics

| Command | Description |
|---|---|
| HTML Speed Viewer | Open preview |
| HTML Speed Viewer: Open Settings | Open settings panel |
| HTML Speed Viewer: Open Another Viewer | Open multiple preview windows |
| Voice Preview | Open the Voice Preview panel |

Settings / Actions

| Command | Description |
|---|---|
| HTML Speed Viewer: Toggle Grid | Toggle grid (element boxes) ON/OFF |
| HTML Speed Viewer: Disable Links | Disable links in preview |
| HTML Speed Viewer: Enable Links | Enable links in preview |

This page explains how to control HTML Speed Viewer's Voice Preview from an AI agent via MCP (Model Context Protocol). Once the MCP server is enabled, the agent can call voice_preview_say from html-speed-viewer to play English/Japanese text immediately.
For setup details and examples, see the section below.
In the video example, the AI agent's system prompt has been extended so that it automatically plays an audio summary of its work each time a task completes. Combined with voice input, this makes for a more interactive and intuitive development experience.
"html-speed-viewer": {
  "command": "npx",
  "args": [
    "-y",
    "html-speed-viewer-mcp@latest"
  ]
}
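
As a rough illustration of where the entry above goes, the following Python sketch merges it into a JSON client configuration file. The top-level `mcpServers` key, the helper name `register_server`, and the file path you pass in are assumptions made for illustration; check your MCP client's documentation for the actual file location and key name.

```python
import json
from pathlib import Path

# Entry copied from the snippet above.
SERVER_ENTRY = {
    "command": "npx",
    "args": ["-y", "html-speed-viewer-mcp@latest"],
}


def register_server(config_path: Path, servers_key: str = "mcpServers") -> dict:
    """Add the html-speed-viewer entry to a JSON config file and return the result.

    `servers_key` is an assumption: many MCP clients use "mcpServers",
    but yours may use a different key.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Keep any servers that are already registered.
    config.setdefault(servers_key, {})["html-speed-viewer"] = SERVER_ENTRY
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Writing the entry by hand works just as well; the sketch only shows that the fragment sits under a per-server name next to any servers already configured.
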
Edit ~/.codex/config.toml to register the MCP server:
[mcp_servers.html-speed-viewer]
command = "npx"
args = ["-y", "html-speed-viewer-mcp@latest"]
Add the agent prompt to ~/.codex/AGENTS.md or a similar configuration file.
voice_preview_say(text, language?): Start voice generation (returns OK immediately)
voice_preview_clear(): Clear the current job and UI
voice_preview_status(): Get status (JSON string)
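
For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such requests for the three tools listed above; in practice the agent's MCP client does this for you. The helper name `build_tool_call`, the request ids, and the sample text are hypothetical, and the argument names simply follow the signatures above.

```python
import json
from typing import Optional


def build_tool_call(request_id: int, tool: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }
    # ensure_ascii=False keeps Japanese text readable in the payload.
    return json.dumps(message, ensure_ascii=False)


# Start playback, then query the status, then clear the job.
say = build_tool_call(1, "voice_preview_say", {"text": "Build finished", "language": "english"})
status = build_tool_call(2, "voice_preview_status")
clear = build_tool_call(3, "voice_preview_clear")
```

Because `voice_preview_say` returns OK immediately, a client that needs to know when generation has finished would presumably follow up with `voice_preview_status`.
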
Notes
## Audio Playback MCP
Use `voice_preview_say` from `html-speed-viewer`.
  - Description: TTS tool for preview playback.
  - Required args:
    - text (string): Text to read aloud
    - language (enum): "english" | "japanese"
## Tool usage rules
1) When the user asks to play/say/TTS/preview/read aloud, use `voice_preview_say` from `html-speed-viewer`.
2) If a language is specified, set `language` accordingly (English→"english", Japanese→"japanese").
3) If not specified, infer from text; mixed/unknown → ask once for language. Empty text → ask once.
4) On error, return only the error content.
## Examples
User: "Say 'Good morning' in English."
→ Call:
  tool: `html-speed-viewer` `voice_preview_say`
  args: {"text":"Good morning","language":"english"}
User: "Please read this: Hello"
→ Language inference = english
→ Call:
  tool: `html-speed-viewer` `voice_preview_say`
  args: {"text":"Hello","language":"english"}
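
The inference step in rule 3 of the prompt above is left to the agent, but it can be approximated in code. The following Python sketch is one possible heuristic, not the extension's implementation: the function names are hypothetical, and treating kana or CJK ideographs as Japanese is an assumption (ideographs alone could equally be Chinese).

```python
from typing import Optional


def _is_japanese(ch: str) -> bool:
    """True for hiragana, katakana, and common CJK ideographs (heuristic)."""
    return (
        "\u3040" <= ch <= "\u30ff"  # hiragana and katakana blocks
        or "\u4e00" <= ch <= "\u9fff"  # CJK unified ideographs
    )


def infer_language(text: str) -> Optional[str]:
    """Return "english" or "japanese", or None when the agent should ask the user."""
    has_japanese = any(_is_japanese(ch) for ch in text)
    has_latin = any(ch.isascii() and ch.isalpha() for ch in text)
    if has_japanese and not has_latin:
        return "japanese"
    if has_latin and not has_japanese:
        return "english"
    return None  # empty, mixed, or unknown script: ask once
```

With this heuristic, `infer_language("Hello")` returns `"english"`, which matches the second example in the prompt, while mixed or empty input returns `None` so the agent can ask once.
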
In chat, simply ask the agent to play audio.
Please read back the summary of the previous task.
[ Give normal instructions here ]
Finally, read the execution summary aloud.
Thank you for trying out our extension. Your input helps us improve!
Thank you for using HTML Speed Viewer!