Follow The Request
turn 1
1. User event
2. Controller decision
3. Private scratch draft
4. Visible result
START_ANSWER: Let vanilla Qwen write a normal answer.
ABORT_AND_REPLAN: Cancel the old draft and answer the new request.
HOLD_FLOOR: Treat "mhm" or "go on" as a backchannel, not a new task.
STOP: Stop generation and discard the active draft.
How to read this
story first, internals second
The trained multistream part is a traffic controller. It does not write the final prose. It reads the user event plus any in-progress assistant prefix, predicts one action, and the harness decides whether vanilla Qwen should write, continue, stop, or replace the scratch answer.
1. User event: A normal message, backchannel, stop, or correction arrives, sometimes while text is streaming.
2. Controller: The two-channel adapter produces a hidden state. The router converts it to one action.
3. Scratch buffer: Vanilla Qwen writes content only when the action says to answer or replan.
4. Commit gate: The harness decides what becomes visible, what is frozen, and what is discarded.
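The scratch-buffer and commit-gate steps can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual harness: the `Harness` class, its field names, and the `write` callback (standing in for vanilla Qwen) are all hypothetical. The controller itself is not modeled here; its predicted action is simply passed in. The choice to leave already-visible text untouched on STOP is an assumption about what "frozen" means.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Only the four actions named in the legend are modeled here.
    START_ANSWER = "START_ANSWER"
    ABORT_AND_REPLAN = "ABORT_AND_REPLAN"
    HOLD_FLOOR = "HOLD_FLOOR"
    STOP = "STOP"


@dataclass
class Harness:
    """Hypothetical harness holding a private draft and the visible text."""
    scratch: str = ""          # private scratch draft, not yet shown
    visible: str = ""          # text the user can currently see
    generating: bool = False   # whether a draft is active

    def apply(self, action, request, write):
        """Update the scratch buffer for one controller action.

        `write` is a stand-in for vanilla Qwen: it maps a request to text.
        """
        if action in (Action.START_ANSWER, Action.ABORT_AND_REPLAN):
            # START_ANSWER writes a fresh draft; ABORT_AND_REPLAN replaces
            # the old draft with an answer to the new request.
            self.scratch = write(request)
            self.generating = True
        elif action is Action.HOLD_FLOOR:
            # Backchannel ("mhm", "go on"): keep the current draft as is.
            pass
        elif action is Action.STOP:
            # Stop generation and discard the active draft.
            self.scratch = ""
            self.generating = False
        return self

    def commit(self):
        """Commit gate: publish the draft only while a draft is active."""
        if self.generating and self.scratch:
            self.visible = self.scratch
        # Assumption: text that was already visible stays frozen otherwise.
        return self.visible
```

In this sketch the writer is called only for START_ANSWER and ABORT_AND_REPLAN, which mirrors the rule that vanilla Qwen writes content only when the action says to answer or replan.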
[Diagram: Browser input (event) -> Two-stream rows (User, Output) -> Qwen + adapter (LoRA r=8, h=4096) -> Router head (4096 -> 9) -> Scratch buffer (state) -> Commit gate (ready). Controller action shown: START_ANSWER]
Readout equation
z = W_router h_output + b
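As a rough illustration of the readout, the sketch below computes z = W_router h_output + b in plain Python for a 4096-dimensional hidden state and 9 router outputs, the sizes given in the page's labels. The weights here are random placeholders, not the trained router. Picking the largest logit (argmax) is an assumption about how the router turns z into one action, and the mapping from the 9 indices to action names is not given in the source, so only the index is returned.

```python
import random

HIDDEN_SIZE = 4096   # hidden-state width of the output row
NUM_ACTIONS = 9      # router head output size


def router_readout(h_output, W_router, b):
    """Return the logits z = W_router h_output + b and the argmax index."""
    z = [
        sum(w * h for w, h in zip(row, h_output)) + bias
        for row, bias in zip(W_router, b)
    ]
    # Assumption: the action is chosen as the index of the largest logit.
    action_index = max(range(len(z)), key=z.__getitem__)
    return z, action_index


# Placeholder parameters and hidden state (random, for shape checking only).
rng = random.Random(0)
W_router = [
    [rng.gauss(0.0, 0.02) for _ in range(HIDDEN_SIZE)]
    for _ in range(NUM_ACTIONS)
]
b = [0.0] * NUM_ACTIONS
h_output = [rng.gauss(0.0, 1.0) for _ in range(HIDDEN_SIZE)]

z, action_index = router_readout(h_output, W_router, b)
```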
Columns: What the user does | What the controller asks | What vanilla Qwen does | What the user sees
Token-row probe matrix: hidden state on Output target row