Why Two Models?
Different jobs, different tools.
- Claude Opus 4.5 generates the trees. This is where reasoning quality actually matters.
- GPT-5.1 turns those trees into quick tables and summaries so the UI feels snappy.
// Tree generation: Claude Opus (quality matters)
const stream = anthropic.messages.stream({
  model: "claude-opus-4-5-20251101",
  max_tokens: isExploreMode ? 512 : 1500,
  // system prompt and messages omitted for brevity
});
// Table summary: GPT-5.1 (speed matters)
const tableResponse = await openai.responses.create({
  model: "gpt-5.1",
  max_output_tokens: 500,
  text: { format: { type: "json_object" } },
  // input prompt omitted for brevity
});
Tradeoff: initial tree generation can take a few extra seconds.
Upside: you get deep, structured simulations that still feel instant, because the first structured pieces stream in while the rest of the tree generates.
Streaming So Users Do Not Stare at Nothing
Full trees are around 1500 tokens. That is about 8 to 12 seconds if you block.
I stream partial results and send progress events every ~200 characters, so the UI can show branches as they grow instead of one big reveal at the end.
const encoder = new TextEncoder();

const readableStream = new ReadableStream({
  async start(controller) {
    let fullText = "";
    let lastProgressUpdate = 0;

    for await (const event of stream) {
      if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
        fullText += event.delta.text;
        // Emit a progress event roughly every 200 characters
        if (fullText.length - lastProgressUpdate > 200) {
          lastProgressUpdate = fullText.length;
          controller.enqueue(encoder.encode(JSON.stringify({
            type: "progress",
            chars: fullText.length
          }) + "\n"));
        }
      }
    }

    // Repair + parse the finished JSON, then send the full tree
    const treeData = parseJSONWithRepair(fullText);
    controller.enqueue(encoder.encode(JSON.stringify({
      type: "complete",
      data: treeData
    }) + "\n"));
    controller.close();
  }
});
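On the client, those newline-delimited events can be consumed with a plain fetch reader. A minimal sketch of the consuming side; the /api/generate path, the decision payload, and the showProgress / renderTree helpers are placeholders, not the app's real names:

// Minimal sketch of the client side (endpoint name, payload, and UI helpers are assumptions)
const res = await fetch("/api/generate", { method: "POST", body: JSON.stringify({ decision }) });
const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });

  // Events are newline-delimited JSON; keep any trailing partial line in the buffer
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? "";

  for (const line of lines) {
    if (!line.trim()) continue;
    const event = JSON.parse(line);
    if (event.type === "progress") showProgress(event.chars); // hypothetical helper
    if (event.type === "complete") renderTree(event.data);    // hypothetical helper
  }
}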
Result: you see the first branches in about 800ms, not 10 seconds of nothing.
LLMs Break JSON. I Fix It Anyway.
Sometimes Claude sends almost-correct JSON — trailing commas, single quotes, unclosed braces.
Instead of giving users an error, I repair it:
function repairJSON(jsonText: string): string {
  let text = jsonText;

  // Single quotes -> double quotes (around keys)
  text = text.replace(/'([^']+)'(\s*:)/g, '"$1"$2');

  // Kill trailing commas
  text = text.replace(/,(\s*[}\]])/g, '$1');

  // Balance braces: close anything left open
  let braceCount = 0;
  for (const ch of text) {
    if (ch === '{') braceCount++;
    else if (ch === '}') braceCount--;
  }
  while (braceCount > 0) {
    text = text + '}';
    braceCount--;
  }

  return text;
}
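The streaming handler above calls parseJSONWithRepair, which only needs to be a try-parse wrapper around repairJSON. A minimal sketch of how the two fit together; the exact fallback behavior is my assumption, not shown in the post:

// Minimal sketch: try strict parsing first, fall back to repaired JSON (assumption)
function parseJSONWithRepair(jsonText: string): unknown {
  try {
    return JSON.parse(jsonText);
  } catch {
    // Strict parse failed: fix quotes, trailing commas, unclosed braces, retry
    return JSON.parse(repairJSON(jsonText));
  }
}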
Parse success went from about 92 percent to about 99.5 percent. Less rage for everyone.
Rate Limiting With a Dev Escape Hatch
I do not want a Hacker News spike to nuke my Anthropic bill.
- Prod: 10 requests per hour per IP.
- Me: I still need to hammer this thing.
So there is a tiny dev escape hatch: a URL param that gets stashed in localStorage:
// Frontend: store dev key with expiry
const urlParams = new URLSearchParams(window.location.search);
const devParam = urlParams.get('dev');
if (devParam) {
  const expiry = Date.now() + (30 * 24 * 60 * 60 * 1000); // 30 days
  localStorage.setItem('parallel-lives-dev', JSON.stringify({ key: devParam, expiry }));
}
// API: skip rate limit for valid dev keys
const isDev = DEV_KEY && devKey === DEV_KEY;
if (!isDev) {
  const rateLimitResult = rateLimit(`generate:${clientIP}`, RATE_LIMIT_CONFIG);
  if (!rateLimitResult.allowed) {
    return NextResponse.json({ error: "Rate limit exceeded" }, { status: 429 });
  }
}
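The rateLimit helper itself doesn't need to be fancy. A minimal sketch of a fixed-window, in-memory counter; the real config shape and storage may differ, and this assumes a single server instance:

// Minimal sketch: fixed-window in-memory rate limiter (assumes one server instance)
type RateLimitConfig = { limit: number; windowMs: number };
const RATE_LIMIT_CONFIG: RateLimitConfig = { limit: 10, windowMs: 60 * 60 * 1000 }; // 10/hour

const buckets = new Map<string, { count: number; resetAt: number }>();

function rateLimit(key: string, config: RateLimitConfig): { allowed: boolean } {
  const now = Date.now();
  const bucket = buckets.get(key);

  // New key or expired window: start a fresh counter
  if (!bucket || now > bucket.resetAt) {
    buckets.set(key, { count: 1, resetAt: now + config.windowMs });
    return { allowed: true };
  }

  bucket.count++;
  return { allowed: bucket.count <= config.limit };
}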
Simple, boring, and it works.
No White Flash on Dark Mode
SSR apps love flashing white before React hydrates. Parallel Lives runs in dark mode a lot, so that flash is painful at 2 AM.
I ship the user theme before React loads:
<html lang="en" className="dark" suppressHydrationWarning>
  <head>
    <script dangerouslySetInnerHTML={{
      __html: `
        (function() {
          var saved = localStorage.getItem('parallel-lives-dark-mode');
          if (saved === 'light') {
            document.documentElement.classList.remove('dark');
            document.documentElement.classList.add('light');
          }
        })();
      `,
    }} />
  </head>
Theme applies before first paint. Zero flash.
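The toggle on the React side just has to keep that same localStorage key in sync. A minimal sketch; the component wiring around it is an assumption:

// Minimal sketch of the toggle side (component wiring is an assumption)
function toggleTheme() {
  const root = document.documentElement;
  const next = root.classList.contains("dark") ? "light" : "dark";

  root.classList.remove("dark", "light");
  root.classList.add(next);

  // Same key the inline <head> script reads before hydration
  localStorage.setItem("parallel-lives-dark-mode", next);
}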
Mobile Gestures That Feel Native
Desktop gets a floating draggable panel. Mobile gets a bottom sheet with flick-to-dismiss.
I track finger velocity myself so it does not feel like a janky web modal:
const velocityRef = useRef(0);
const lastTouchY = useRef(0);
const lastTouchTime = useRef(0);

onTouchMove={(e) => {
  const currentY = e.touches[0].clientY;
  const currentTime = Date.now();
  const timeDiff = currentTime - lastTouchTime.current;
  if (timeDiff > 0) {
    // px per ms; positive = finger moving down
    velocityRef.current = (currentY - lastTouchY.current) / timeDiff;
  }
  lastTouchY.current = currentY;
  lastTouchTime.current = currentTime;
}}

onTouchEnd={() => {
  // Fast downward flick = minimize
  if (velocityRef.current > 0.5) {
    setMobileSheetHeight(minSheetHeight);
  }
}}
Feels closer to a native app than a dashboard.
Prompt Engineering That Actually Works
One generic "You are a helpful AI" prompt gives you generic futures.
I detect the scenario type and inject domain-specific guidance into the system prompt:
const scenarioGuidance: Record<string, string> = {
  'job-offer': `
    - Total comp breakdown: base, bonus, RSUs, 401k match
    - Equity: calculate value at different valuations
    - Benefits value: health insurance ($5-20k/year), PTO
  `,
  'startup': `
    - Runway calculation: savings ÷ monthly burn = months
    - Success rates: ~10% succeed, ~40% fail completely
  `,
};

const scenarioType = detectScenarioType(templateId, decision);
const guidance = scenarioGuidance[scenarioType] || '';
const systemMessage = systemPrompt.replace('{{SCENARIO_GUIDANCE}}', guidance);
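detectScenarioType can be as simple as keyword matching on the template and the decision text. A minimal sketch; the exact keywords and the "general" fallback are my assumptions:

// Minimal sketch: keyword-based scenario detection (keywords and fallback are assumptions)
function detectScenarioType(templateId: string | null, decision: string): string {
  if (templateId) return templateId; // an explicit template wins

  const text = decision.toLowerCase();
  if (/\b(offer|salary|comp|rsu|relocat)/.test(text)) return "job-offer";
  if (/\b(startup|founder|bootstrap|runway)/.test(text)) return "startup";
  return "general"; // no guidance injected; the || '' fallback above handles it
}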
This is what makes the app feel grounded instead of "AI fanfic about your life."