Claude Code Setup Log #9: Three Agent Techniques Working for Me

Three techniques this week:

Giving agents real-time X search through xurl, so they discover new agent techniques the same day those techniques ship.
Asking an agent to ask agents: dispatch a few subagents, judge their answers, hand me the best one.
Printing Press: print agent-native CLIs for any service in seconds, instead of wiring up token-bloated MCPs.

The thread across all three: each one collapses the cost of giving an agent a new capability. New techniques get discovered overnight. New patterns get applied via dispatch. New services come online via on-demand CLI generation.

1. Giving agents X.com search capabilities

X is where you see the experimentation in real time. New skills, new MCPs, new CLI factories, new prompt patterns ship daily. By the time something appears in a blog post or aggregator, it has already been productized by power users on X. Standard web search lags by days or weeks. Agent tooling is moving too fast for that lag to be acceptable.

I gave my Hermes agent read access to X through the official xurl CLI (shipped by @XDevelopers) and the Hermes xurl skill. The skill is a thin wrapper that translates natural-language questions into v2 search operators (from:, lang:en, -is:retweet, since:), runs the query, parses the JSON, and synthesizes the signal with handles cited.
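
To make the translation step concrete, here is a minimal Python sketch of how structured parts of a question might be assembled into a query string. It is not the Hermes skill's code: the function name build_query and its parameters are hypothetical, and it covers only the from:, lang:, and -is:retweet operators named above.

```python
def build_query(keywords, handles=(), lang="en", exclude_retweets=True):
    """Assemble a search query string from structured parts (hypothetical helper)."""
    parts = []
    # Quote multi-word phrases so they are matched as a unit.
    for kw in keywords:
        parts.append(f'"{kw}"' if " " in kw else kw)
    # Several authors are OR-ed together inside parentheses.
    if handles:
        authors = " OR ".join(f"from:{h.lstrip('@')}" for h in handles)
        parts.append(f"({authors})" if len(handles) > 1 else authors)
    if lang:
        parts.append(f"lang:{lang}")
    if exclude_retweets:
        parts.append("-is:retweet")
    return " ".join(parts)


query = build_query(["agent", "CLI factory"], handles=["@mvanhorn"])
print(query)  # agent "CLI factory" from:mvanhorn lang:en -is:retweet
```

The resulting string would then be handed to whatever runs the search; that call is left out here because it depends on how xurl is set up locally.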

The scheduled discovery loop

The xurl skill is not only invoked manually. A Hermes cron job scans my Claude Code, Codex, and Hermes session transcripts each day, identifies the pitfalls I hit (errors, repeated debugging cycles, manual fixes for things that should be automated), and queries X and GitHub for techniques or tools that solve those specific problems.

Findings flow into my second brain (gbrain) so the next day’s agent starts with yesterday’s lessons already absorbed. The scheduled task uses the xurl skill for X queries, the gh CLI for GitHub queries, and a small synthesis prompt to produce a daily “what to try next” report.
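
The pitfall-identification step could be sketched roughly as below. This is an illustration under stated assumptions, not the actual cron job: it assumes transcripts are available as plain lines of text, and the regex, the function name find_recurring_pitfalls, and the sample lines are all hypothetical.

```python
import re
from collections import Counter

# Assumption: error-like lines can be spotted with a simple keyword match.
ERROR_PATTERN = re.compile(r"(error|exception|traceback|failed)", re.IGNORECASE)


def find_recurring_pitfalls(transcript_lines, min_count=2):
    """Return (normalized line, count) pairs for error-like lines that recur."""
    counts = Counter()
    for line in transcript_lines:
        if ERROR_PATTERN.search(line):
            # Collapse digits so "line 12" and "line 48" count as the same pitfall.
            normalized = re.sub(r"\d+", "N", line.strip().lower())
            counts[normalized] += 1
    return [(text, n) for text, n in counts.most_common() if n >= min_count]


sample = [
    "ModuleNotFoundError: no module named 'yaml' (line 12)",
    "user: try again",
    "ModuleNotFoundError: no module named 'yaml' (line 48)",
    "Build failed once",
]
for text, n in find_recurring_pitfalls(sample):
    print(f"{n}x {text}")
```

Each recurring pitfall would then become the seed for an X or GitHub query; one-off errors are dropped so the daily report stays short.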

What it caught this week

The scheduled task flagged a thread from @mvanhorn arguing that even good MCPs cost roughly 35x more tokens than the equivalent CLI doing the same task, and proposing a tool called Printing Press that prints CLIs from any API spec. I tried Printing Press the same afternoon. It became item 3 of this post.

That is the loop working as designed. My agents discover the techniques I should be using before I would have stumbled on them through manual scrolling.

Operator lesson

Real-time signal sources (X, HN, GitHub trending) need to be wired in as agent capabilities, not as bookmarks I visit. The xurl skill plus the scheduled scan turns “I should keep an eye on X for new agent tools” into a passive system that updates my agent’s knowledge daily.

Links

xurl CLI: github.com/xdevplatform/xurl
xurl skill (Hermes): hermes-agent.nousresearch.com/docs/user-guide/skills/bundled/social-media/social-media-xurl

2. Asking an agent to ask agents

The technique: instead of asking Claude Code or Codex to do a task, ask it to dispatch a few subagents to do the task in parallel, judge their answers, and return the best one.

@VictorTaelin crystallized this on X this week:

“Don’t ask Codex to do stuff. Ask Codex to ask Codex to do stuff. Rejoice as you watch it handling and correcting all the dumb shit that it does and that you’d be dealing with otherwise.”

When someone asked how to actually prompt it, his reply was the whole implementation:

“ask 4 agents to do the following: …; select the best answer and report back to me.”

I already had every substrate for this: polyclaude council, Claude Code’s /teams, dispatching-parallel-agents, parallel-pr-review, and multi-agent-tdd-implementation. Each does some flavor of dispatch-and-evaluate. The gap was the default verb. I would say “fix this bug” or “draft this email” when I should have been saying “ask 4 agents to do this, judge their answers, give me the best one.”
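
Changing the default verb can be as small as wrapping the task before sending it. The helper below is a hypothetical sketch (dispatch_prompt is not part of any of the skills listed); it only rephrases a plain task in the shape of the quoted prompt.

```python
def dispatch_prompt(task, n_agents=4, criteria=()):
    """Wrap a plain task in dispatch-and-judge phrasing (hypothetical helper)."""
    prompt = f"Ask {n_agents} agents to do the following: {task.rstrip('.')}."
    if criteria:
        # Spell out how the answers should be judged, if criteria were given.
        prompt += " Judge their answers on: " + "; ".join(criteria) + "."
    prompt += " Select the best answer and report back to me."
    return prompt


print(dispatch_prompt("fix this bug"))
```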

The 4-subagent dispatch in practice

A real example from this week. I told Claude Code to spin up 4 subagents to draft the outreach copy for a close-lost re-engagement play. Specific instructions: anchor on the original close-lost reason from the CRM, no generic “just checking in” openers, three-touch sequence.

Two of the subagents came back with versions that ignored the no-generic-openers constraint and produced generic templates anyway. The other two opened with the original close-lost reason from the CRM, which is exactly what a good rep would do.

The orchestrator agent then created the best version based on my evaluation criteria (specificity to the original close reason, no generic AI phrases, three-touch sequence shape). It picked the best opener from one variant, the best follow-up from another, and assembled the final copy. That is the version we shipped.
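
In my setup the judging is done by the orchestrator model itself, but the three criteria are mechanical enough to sketch in code. The Python below is an illustration only: the scoring weights, the phrase list, and the sample drafts are assumptions, and it selects one whole draft rather than assembling pieces from several.

```python
# Assumption: a short illustrative list of phrases that count as generic.
GENERIC_PHRASES = ("just checking in", "hope this finds you well", "touching base")


def score_draft(draft, close_reason):
    """Score one draft (a list of touches) against the three criteria."""
    text = " ".join(draft).lower()
    score = 0
    if draft and close_reason.lower() in draft[0].lower():
        score += 2  # opener anchors on the original close-lost reason
    if not any(phrase in text for phrase in GENERIC_PHRASES):
        score += 1  # no generic phrases anywhere in the sequence
    if len(draft) == 3:
        score += 1  # three-touch sequence shape
    return score


def pick_best(drafts, close_reason):
    """Return the highest-scoring draft; ties go to the earliest one."""
    return max(drafts, key=lambda d: score_draft(d, close_reason))


drafts = [
    ["Just checking in on our last chat.", "Following up.", "Last note from me."],
    ["You paused over pricing last year; here is what changed.",
     "A short case study.", "Open to a quick call?"],
]
best = pick_best(drafts, close_reason="pricing")
print(best[0])
```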

Where else it earns its keep

Code bugs, especially the kind where one approach feels right but produces ugly downstream effects. I now ask 4 subagents to fix the same bug in different ways and have the orchestrator pick the most elegant solution. The dead ends that the other three attempts hit expose failure modes I would otherwise have walked into.

It also works for naming things: variables, functions, files, concepts. Four subagents propose names, and the orchestrator picks based on the conventions encoded in CLAUDE.md.

The honest tradeoff

This pattern is definitely not token-efficient. Four subagents plus an orchestrator is roughly 5x the token spend of a single instance. Worth it when the single instance keeps getting it wrong. Not worth it when the task is straightforward.

The way I decide: if I would be cleaning up the agent’s first pass anyway, the dispatch pattern pays for itself in saved cleanup time. If the first pass usually works, skip it.
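
That decision rule can be written down as a rough break-even check. The model below is a simplification with loud assumptions: all costs are in the same unit (for example dollars, with cleanup time priced in), the dispatch run is assumed to need no cleanup, and the function name and example numbers are hypothetical. Only the 5x multiplier comes from the estimate above.

```python
def dispatch_pays_off(single_run_cost, cleanup_cost, first_pass_success_rate,
                      multiplier=5.0):
    """Compare the expected cost of one agent plus cleanup with a dispatch run."""
    # A single run costs its tokens plus cleanup whenever the first pass fails.
    expected_single = single_run_cost + (1.0 - first_pass_success_rate) * cleanup_cost
    # Assumption: the dispatch run costs `multiplier` single runs and needs no cleanup.
    expected_dispatch = multiplier * single_run_cost
    return expected_dispatch < expected_single


# First pass usually fails and cleanup is expensive: dispatch wins.
print(dispatch_pays_off(1.0, 10.0, first_pass_success_rate=0.3))  # True
# First pass usually works: skip the dispatch.
print(dispatch_pays_off(1.0, 10.0, first_pass_success_rate=0.9))  # False
```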

Links

Victor Taelin (@VictorTaelin) on X: x.com/VictorTaelin
The skills that already do this in Claude Code: polyclaude council, /teams, dispatching-parallel-agents, parallel-pr-review, multi-agent-tdd-implementation

3. Printing Press: agent-native CLIs from any service

I wanted my agent to have access to my Strava data so I could ask “summarize my runs from the last 30 days, compare to the prior 30, flag any trends in mileage or pace” without manually pulling export files.

Strava’s API exists. There is no official Strava MCP. The community-built Strava MCPs I tried were all token-inefficient (verbose schemas, bring-everything responses that bloat agent context). I had been ignoring this gap for months.

@mvanhorn and @trevin shipped Printing Press, which closes this gap. It is a CLI factory plus a CLI library that prints fast, local, SQLite-backed CLIs from any API spec, website, or browser-traffic capture.

The MCP token problem

Most MCPs waste tokens. Verbose schemas and bring-everything responses bloat the agent’s context, and even the well-designed MCPs cost roughly 35x more tokens than the equivalent CLI doing the same task (per @mvanhorn’s benchmarks). I work across PandaDoc internal tools, Snowflake, Salesloft, Recurly, and a dozen personal services. Every one that ships only as an MCP burns tokens I would rather spend on reasoning.

The factory pattern

The interesting thing about Printing Press is not the 50+ pre-printed CLIs in its library (Linear, ESPN, Flight GOAT, Contact Goat, and more). It is the factory itself.

Type /printing-press <service> in Claude Code. Printing Press figures out the service’s API, runs the OAuth dance (one-time), and generates a token-efficient CLI that works identically across Claude Code, Codex, and Hermes. The CLI is local, SQLite-backed for caching, and exposes the API as agent-readable subcommands.
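
To show what “SQLite-backed for caching” can mean in practice, here is a minimal Python sketch of a response cache. It is not Printing Press’s implementation: the ResponseCache class, the table layout, the one-hour TTL, and the fake fetch function are all assumptions made for illustration.

```python
import json
import sqlite3
import time


class ResponseCache:
    """Cache JSON-serializable API responses in SQLite (illustrative sketch)."""

    def __init__(self, path=":memory:", ttl_seconds=3600):
        self.conn = sqlite3.connect(path)
        self.ttl = ttl_seconds
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, body TEXT, fetched_at REAL)"
        )

    def get_or_fetch(self, key, fetch):
        """Return a fresh cached value, or call fetch() and store its result."""
        row = self.conn.execute(
            "SELECT body, fetched_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row and time.time() - row[1] < self.ttl:
            return json.loads(row[0])
        data = fetch()
        self.conn.execute(
            "INSERT OR REPLACE INTO cache (key, body, fetched_at) VALUES (?, ?, ?)",
            (key, json.dumps(data), time.time()),
        )
        self.conn.commit()
        return data


calls = []


def fake_fetch():
    # Stand-in for a real API request; records how often it is called.
    calls.append(1)
    return {"activities": 3}


cache = ResponseCache()
first = cache.get_or_fetch("athlete/activities", fake_fetch)
second = cache.get_or_fetch("athlete/activities", fake_fetch)
print(first == second, len(calls))  # True 1
```

The second lookup is served from SQLite, so the agent can repeat a question without triggering a second API request.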

Walking through the Strava example end-to-end

I typed /printing-press strava in Claude Code. One-time Strava OAuth login in a browser. About a minute later, I had a working CLI.

Now I can ask my agent “summarize my runs from the last 30 days, compare to the prior 30, and flag any trends in mileage or pace,” and it:

Calls the Strava CLI to pull activity logs for both 30-day windows.
Computes the comparison (total miles, average pace, longest run, frequency).
Identifies trends.
Writes a short summary into my daily Obsidian note.
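
The comparison step in that list might look like the sketch below. The record shape (date, miles, minutes) and the sample runs are made up for illustration; the real CLI’s output format is not shown here, so treat the field names as assumptions.

```python
from datetime import date, timedelta


def window_stats(runs, start, end):
    """Summarize runs whose date falls in the half-open window [start, end)."""
    selected = [r for r in runs if start <= r["date"] < end]
    miles = sum(r["miles"] for r in selected)
    minutes = sum(r["minutes"] for r in selected)
    return {
        "runs": len(selected),
        "total_miles": miles,
        # Distance-weighted pace: total minutes over total miles.
        "avg_pace_min_per_mile": minutes / miles if miles else None,
        "longest_run_miles": max((r["miles"] for r in selected), default=0.0),
    }


def compare_windows(runs, today, days=30):
    """Return (recent stats, prior stats, change in total miles)."""
    recent_start = today - timedelta(days=days)
    prior_start = today - timedelta(days=2 * days)
    recent = window_stats(runs, recent_start, today)
    prior = window_stats(runs, prior_start, recent_start)
    return recent, prior, recent["total_miles"] - prior["total_miles"]


today = date(2025, 6, 1)
runs = [  # hypothetical activity records
    {"date": date(2025, 5, 20), "miles": 5.0, "minutes": 45.0},
    {"date": date(2025, 5, 10), "miles": 3.0, "minutes": 30.0},
    {"date": date(2025, 4, 15), "miles": 4.0, "minutes": 40.0},
]
recent, prior, mileage_change = compare_windows(runs, today)
print(recent["total_miles"], prior["total_miles"], mileage_change)  # 8.0 4.0 4.0
```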

That is a workflow I would never have built by hand, because the token cost of going through a verbose Strava MCP would have outweighed the value of the answer. The CLI version uses a fraction of the tokens of an equivalent MCP query.

Plays nicely with the CLI-first stack

Printing Press fits cleanly alongside the CLI-first tooling I already have: Plaid CLI for finance data (shipped May 7, the substrate from Log #8), gog-cli for Google Workspace, gbrain CLI for my second brain, and the agent dispatchers from item 2.

Every service I touch is one prompt away from being agent-readable. That is a big shift from where things were six months ago, when every new service meant either writing a custom adapter or accepting a bloated MCP.

Links

Printing Press: printingpress.dev
Printing Press factory: github.com/mvanhorn/cli-printing-press
Matt Van Horn (@mvanhorn) on X: x.com/mvanhorn

The thread across all three

Three different techniques (passive discovery via xurl, dispatch-and-evaluate prompting, on-demand CLI factories), but they rhyme.

Each one collapses the cost of giving an agent a new capability:

Discovery cost drops to ~zero when xurl plus a scheduled scan finds new tools in real time.
Prompt cost drops to one verb when I just say “ask 4 agents” instead of crafting the perfect single prompt.
Access cost drops to about a minute when Printing Press prints a CLI for any service I have not wired up yet.

The compounding effect: my agent’s surface area is growing daily without me writing more glue code. New techniques get discovered overnight. New patterns get applied via dispatch. New services come online via Printing Press.

The pattern across all three: when an agent technique earns its keep, the next move is to make using it the default, not the exception. Better default discovery layer, better default prompting verb, better default tool wrapper. Spend the saved tokens on reasoning.

Last log: Setup Log #8: Three Agent Setups Working for Me.