---
visibility: public
---

# Spec 003: Web terminal (xterm.js)

**Status:** Draft
**Related:** [PRD FR-12](../prd/001-picortex-v1.md#sessions), [Spec 002](002-tmux-session-spawning.md)

## Goal

Any authenticated user can open a web terminal in a chat and see and control its live tmux session (attached to Claude Code). Works on mobile Safari.

## Architecture

```
[Browser]                [Backend]                   [Chat user]
xterm.js  <-- WS -->  /ws/terminal/:chat_id  -->  runuser + tmux attach
                              |
                              +--> node-pty
```

## Endpoint

`GET /ws/terminal/:chat_id` upgrades to WebSocket. Authorization is checked before the upgrade:

- Cookie-based Noos OAuth session
- User must own the chat (v1: always Jacob)

On connection:

1. Spawn `sudo -u chat-$HEX -H tmux attach -t picortex:$CHAT_ID` under `node-pty`
2. Wire: stdin from WS text frames → pty; pty stdout → WS binary frames (base64 optional)
3. Handle `resize` message type from client: `{cols, rows}` → `pty.resize` + `tmux refresh-client -S`

Close path: client disconnect → `pty.kill('SIGHUP')` (this detaches from tmux without killing the session).

## Client

- `@xterm/xterm` v5+
- Addons: `@xterm/addon-fit`, `@xterm/addon-web-links`
- Touch keyboard support: a small toolbar with ⌃ / Tab / Esc / ↑ / ↓ buttons for mobile

## Security

- Read-write for v1 (Jacob is the only user).
- Post-v1: add a `mode=readonly` query param that starts `tmux attach -r`.
- Do not accept arbitrary commands from the WS: it carries pure PTY bytes. No JSON/RPC layer beyond the `resize` control frame.
- Rate-limit connections to 5/sec per user to prevent tmux-spam DoS.
- The WebSocket path is under the same origin and cookie as the main UI. This is CSRF-safe only if the upgrade handler also validates the `Origin` header, because browsers attach cookies to cross-site WebSocket handshakes.

## Resize protocol

Client sends a JSON control frame (distinct from PTY data frames):

```json
{"type": "resize", "cols": 120, "rows": 40}
```

Data frames are plain text (UTF-8) for client→server, binary (Uint8Array) for server→client.
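
The spec does not say how the server tells a `resize` control frame apart from ordinary PTY input, since both arrive as text frames. The following is a minimal sketch of one possible rule, not the spec's method: a text frame is treated as a control frame only if it parses as a JSON object with `type: "resize"` and positive integer `cols` and `rows`; everything else is forwarded to the pty unchanged. The names `ClientFrame`, `classifyClientFrame`, and `MAX_DIM` are hypothetical, and the upper bound on dimensions is an assumption.

```typescript
// Hypothetical sketch: classify a client→server text frame.
// ClientFrame, classifyClientFrame and MAX_DIM are illustrative names,
// not part of the spec.
type ClientFrame =
  | { kind: "resize"; cols: number; rows: number }
  | { kind: "data"; text: string };

// Assumed sanity bound on terminal dimensions (not from the spec).
const MAX_DIM = 1000;

function isValidDim(n: unknown): n is number {
  return typeof n === "number" && Number.isInteger(n) && n > 0 && n <= MAX_DIM;
}

function classifyClientFrame(text: string): ClientFrame {
  // Cheap pre-check so ordinary keystrokes never reach JSON.parse.
  if (text.startsWith("{")) {
    try {
      const msg: unknown = JSON.parse(text);
      if (typeof msg === "object" && msg !== null) {
        const m = msg as Record<string, unknown>;
        const type = m["type"];
        const cols = m["cols"];
        const rows = m["rows"];
        if (type === "resize" && isValidDim(cols) && isValidDim(rows)) {
          return { kind: "resize", cols, rows };
        }
      }
    } catch {
      // Not valid JSON: fall through and treat the frame as PTY input.
    }
  }
  return { kind: "data", text };
}
```

Under this rule a user who pastes text that is exactly a valid resize message would trigger a resize instead of sending input. If that matters, a dedicated prefix byte or a separate frame type would remove the ambiguity.
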

## Mobile considerations

- Font: SF Mono, 13px on mobile
- Lock horizontal scroll; let tmux handle horizontal overflow
- Swipe-left from terminal snaps back to file browser pane
- Copy-on-select with a long-press menu

## Testing

- **Unit:** protocol framing (resize, data, ping).
- **Integration:** real tmux behind real WS; type `echo hi` and assert output.
- **E2E:** Playwright on mobile Safari viewport; attach, type, detach, re-attach (see same scrollback).

## Open questions

- OQ1: How to detect "user typed a meaningful command" vs "user is just looking"? (For activity tracking / lifecycle.) Answer: any stdin byte counts.
- OQ2: Do we want to log every keystroke for audit? Probably not in v1 (privacy). Log session open/close only.
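
The answers to OQ1 and OQ2 can be combined in one small tracker. The sketch below is illustrative only: it treats any non-empty stdin frame as activity, records the time of the last input without storing its content, and keeps an audit trail limited to open and close events. `SessionActivity`, `AuditEvent`, and the injectable clock are hypothetical names and design choices, not part of the spec.

```typescript
// Hypothetical sketch of OQ1/OQ2: any stdin byte counts as activity,
// and only session open/close are recorded for audit.
type AuditEvent = { event: "open" | "close"; chatId: string; at: number };

class SessionActivity {
  private readonly chatId: string;
  private readonly now: () => number;
  private lastInputAt: number | null = null;
  readonly audit: AuditEvent[] = [];

  // The clock is injectable so tests can control time (an assumption).
  constructor(chatId: string, now: () => number = Date.now) {
    this.chatId = chatId;
    this.now = now;
  }

  open(): void {
    this.audit.push({ event: "open", chatId: this.chatId, at: this.now() });
  }

  // Only the byte count is inspected; keystroke content is never stored.
  onInput(byteLength: number): void {
    if (byteLength > 0) this.lastInputAt = this.now();
  }

  close(): void {
    this.audit.push({ event: "close", chatId: this.chatId, at: this.now() });
  }

  // A session with no input yet is considered idle.
  isIdle(idleMs: number): boolean {
    return this.lastInputAt === null || this.now() - this.lastInputAt >= idleMs;
  }
}
```

Resize control frames would not call `onInput` under this scheme, so rotating a phone does not count as activity.
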