Backed by Y Combinator

Every agent run,
made smarter.

ReasonBlocks catches failures mid-run, cuts tokens, and compounds intelligence across every agent you deploy.

Benchmarks — SWE-bench Pro

Toggling ReasonBlocks on the same agent loop delivers measurable lifts.

Same agent code, same prompts, same models. The only variable is whether ReasonBlocks is wrapping the client. Run on SWE-bench Pro.

42% accuracy lift · Claude Sonnet
100% accuracy lift · Claude Opus
52% token reduction per task
24% latency reduction per task

SWE-bench Pro · Claude Sonnet, Opus · vs. baseline (no ReasonBlocks)

Full methodology — token counts, latency measurement, agent loop — in the whitepaper.

Capabilities

Six capabilities. One runtime.

Each one targets a specific failure mode. None require changes to your agent code.

01

Reasoning reuse

trace lookup · 190k traces indexed
new · refactor the auth module
candidate matches
#1842 · refactor auth.py module · match
#1207 · move auth to middleware · similar
#0931 · rename auth_token env · similar
Trace available · #1842

A past run resolved the same shape of problem.
The agent decides whether to use it (the LLM keeps judgment).
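
As a rough illustration of the lookup above, here is a minimal sketch that ranks stored traces against a new task by token overlap. How ReasonBlocks actually matches traces is not described on this page; the `Trace` class, the Jaccard scoring, and the `min_score` cutoff are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Trace:
    trace_id: str
    task: str


def tokens(text: str) -> set[str]:
    # Lowercase, then split on anything that is not a letter or digit.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    # Token-set overlap: size of the intersection over size of the union.
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def rank_traces(task: str, traces: list[Trace], min_score: float = 0.1):
    # Score every stored trace, drop weak candidates, best match first.
    scored = [(jaccard(task, t.task), t) for t in traces]
    kept = [pair for pair in scored if pair[0] >= min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


traces = [
    Trace("#1842", "refactor auth.py module"),
    Trace("#1207", "move auth to middleware"),
    Trace("#0931", "rename auth_token env"),
]
ranked = rank_traces("refactor the auth module", traces)
best_score, best = ranked[0]
print(best.trace_id, round(best_score, 2))
```

In this sketch the ranking only surfaces candidates; whether a surfaced trace is used stays with the agent, as described above.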

02

Semantic file memory

semantic index · 218 files indexed
src/auth.py · 12.4kb cached

Loads AUTH_TOKEN via os.getenv, validates JWT, exposes authenticate() and refresh().

src/config/loader.py · 8.1kb cached

Parses .env and merges with defaults. Returns Config dataclass.

src/middleware.py · 6.7kb cached

Mounts auth + rate-limit middleware on FastAPI app.
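
One way to picture a cache of per-file summaries like the ones above is a store keyed by path and content hash, so a summary is recomputed only when the file changes. This is a sketch under assumptions, not the ReasonBlocks implementation: `FileMemory`, its `summary` method, and the stand-in summarizer are hypothetical names.

```python
import hashlib
from typing import Callable, Dict, Tuple


class FileMemory:
    """Caches one summary per file, keyed by path and content hash."""

    def __init__(self, summarize: Callable[[str], str]):
        self._summarize = summarize  # hypothetical hook, e.g. an LLM call
        self._cache: Dict[str, Tuple[str, str]] = {}  # path -> (digest, summary)
        self.misses = 0  # how many times a summary had to be computed

    def summary(self, path: str, content: str) -> str:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        cached = self._cache.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]  # file unchanged: reuse the stored summary
        self.misses += 1
        text = self._summarize(content)
        self._cache[path] = (digest, text)
        return text


# Stand-in summarizer: the first line of the file. A real one would describe behavior.
memory = FileMemory(lambda content: content.splitlines()[0] if content else "")

original = "# Loads AUTH_TOKEN via os.getenv\nimport os\n"
first = memory.summary("src/auth.py", original)
again = memory.summary("src/auth.py", original)             # served from cache
changed = memory.summary("src/auth.py", "# Validates JWT\n")  # recomputed
```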

03

Loop detection

trace monitor · window: 6 turns
turn 22 · grep -r "auth_token" src/
turn 23 · grep -ri "AUTH_TOKEN" src/
turn 24 · rg "auth[_-]token" --hidden
turn 25 · grep -r "getenv" src/
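
The turns above are near-duplicates of one search. A minimal sketch of catching that pattern, assuming a sliding window of 6 turns and a crude normalization of each command to its quoted pattern; `search_key`, `LoopDetector`, and the threshold of 3 are hypothetical choices, not the product's actual detector.

```python
import re
from collections import Counter, deque


def search_key(command: str) -> str:
    # Reduce a search command to its quoted pattern, lowercased, letters and
    # digits only, so `grep -r "auth_token"` and `rg "auth[_-]token"` collide.
    match = re.search(r'"([^"]*)"', command)
    pattern = match.group(1) if match else command
    return re.sub(r"[^a-z0-9]", "", pattern.lower())


class LoopDetector:
    def __init__(self, window: int = 6, threshold: int = 3):
        self.recent = deque(maxlen=window)  # keys for the last `window` turns
        self.threshold = threshold

    def observe(self, command: str) -> bool:
        # Record the command; report a loop while any key appears at least
        # `threshold` times inside the window.
        self.recent.append(search_key(command))
        return max(Counter(self.recent).values()) >= self.threshold


detector = LoopDetector()
turns = [
    'grep -r "auth_token" src/',
    'grep -ri "AUTH_TOKEN" src/',
    'rg "auth[_-]token" --hidden',
    'grep -r "getenv" src/',
]
flags = [detector.observe(cmd) for cmd in turns]
print(flags)
```

The flag first goes up on the third variant of the same search and stays up on the next turn, because the three repeats are still inside the window.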
04

Tool supervision

tool calls · last 5 turns
turn 24 · read · src/auth.py · ×3
turn 25 · search · "token" · ×4
turn 26 · read · src/auth.py
turn 27 · search · "token"
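
The panel above shows the same read and the same search repeating across turns. A small sketch of how such repeats could be counted, assuming each call is recorded as a (turn, tool, argument) tuple; the `repeated_calls` helper and its `limit` are hypothetical and only illustrate the idea.

```python
from collections import Counter
from typing import Dict, List, Tuple

ToolCall = Tuple[int, str, str]  # (turn, tool name, argument)


def repeated_calls(
    calls: List[ToolCall], last_turns: int = 5, limit: int = 3
) -> Dict[Tuple[str, str], int]:
    # Count identical (tool, argument) pairs within the most recent turns
    # and return the ones seen more than `limit` times.
    if not calls:
        return {}
    newest = max(turn for turn, _, _ in calls)
    recent = [(tool, arg) for turn, tool, arg in calls if turn > newest - last_turns]
    return {call: n for call, n in Counter(recent).items() if n > limit}


# The calls shown in the panel: three reads and four searches, then one more of each.
calls = (
    [(24, "read", "src/auth.py")] * 3
    + [(25, "search", "token")] * 4
    + [(26, "read", "src/auth.py"), (27, "search", "token")]
)
flagged = repeated_calls(calls)
print(flagged)
```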
05

Context compression

context window · budget: 32k tokens
turns 1–18 · summarized · 47.2k → 1.8k tokens · folded
turn 19 · Read "src/config/auth.py" · 6.4k
turn 20 · Bash "git diff src/" · 3.1k
turn 21 · Edit "src/config/auth.py" · 4.8k
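
The folding shown above can be sketched as: when the history exceeds the budget, keep the most recent turns verbatim and replace everything older with a single summary. This is an assumption-laden sketch; `Turn`, `compress`, `keep_recent`, and the stand-in summarizer (which simply returns the 1.8k figure from the panel) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    label: str
    tokens: int


def compress(
    turns: List[Turn],
    budget: int,
    keep_recent: int,
    summarize: Callable[[List[Turn]], Turn],
) -> List[Turn]:
    # Under budget: leave the history untouched.
    if sum(t.tokens for t in turns) <= budget:
        return turns
    # Over budget: fold all but the last `keep_recent` turns into one summary.
    split = max(len(turns) - keep_recent, 0)
    old, recent = turns[:split], turns[split:]
    if not old:
        return turns
    return [summarize(old)] + recent


history = [
    Turn("turns 1-18", 47_200),
    Turn('turn 19 Read "src/config/auth.py"', 6_400),
    Turn('turn 20 Bash "git diff src/"', 3_100),
    Turn('turn 21 Edit "src/config/auth.py"', 4_800),
]

# Stand-in summarizer: a real one would call a model; the size mirrors the panel.
folded = compress(
    history,
    budget=32_000,
    keep_recent=3,
    summarize=lambda old: Turn("turns 1-18 (summarized)", 1_800),
)
total = sum(t.tokens for t in folded)
print(total)
```

With the panel's numbers, the history drops from 61.5k tokens to 16.1k, which fits inside the 32k budget.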
06

Reasoning-aware context pruning

reasoning trace · math word problem · 115 tokens

Problem: A factory makes 12 widgets per hour. After the first 4 hours, output halves, but this throttle only applies on weekdays. Today is Saturday. The factory runs for 6 hours. How many widgets are produced?

Step 1: Base rate is 12 widgets per hour, before any adjustments.

Step 2: After 4 hours, the throttle would halve output to 6 per hour for the rest of the shift.

Step 3: Critical: the throttle only applies on weekdays. Today is Saturday, a weekend day.

Step 4: Since it's a weekend, the throttle does not activate. The factory runs at the full 12 per hour for all 6 hours.

Step 5: Total widgets = 12 × 6 = 72.

Answer: 72 widgets.

Full trace · 115 tokens

A short math problem with a catch. The small words "only on weekdays" and "does not activate" are what make the answer 72. Drop them and you get 60.
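
The two answers can be checked directly. The sketch below encodes the problem as stated; the function name and parameters are only illustrative.

```python
def widgets_produced(
    hours: int,
    rate: int = 12,
    full_rate_hours: int = 4,
    throttle_active: bool = False,
) -> int:
    # Without the throttle, every hour runs at the full rate.
    if not throttle_active:
        return rate * hours
    # With the throttle, output halves after the first `full_rate_hours` hours.
    full = min(hours, full_rate_hours)
    return rate * full + (rate // 2) * (hours - full)


saturday = widgets_produced(6, throttle_active=False)  # weekend: throttle off
pruned = widgets_produced(6, throttle_active=True)     # qualifiers dropped
print(saturday, pruned)
```

The first call gives 12 × 6 = 72. The second gives 12 × 4 + 6 × 2 = 60, the answer a reader reaches if the weekday qualifier is pruned from the trace.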

Integrations

Drops into the agent framework you already use.

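from langchain.agents import create_agent
from langchain_anthropic import ChatAnthropic
from reasonblocks import ReasonBlocks

# Create the ReasonBlocks client and a middleware for this agent and task.
rb = ReasonBlocks(api_key="rb_live_...")
mw = rb.middleware(agent_name="bugfixer", task="Refactor the auth module")

# Attach the middleware; the rest of the agent definition stays the same.
agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5"),
    tools=your_tools,  # your existing list of tools
    middleware=[mw],
)

# Run the agent inside the middleware's context.
with mw:
    agent.invoke({"messages": [("user", "Refactor the auth module")]})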

Other framework? Drop us a line.