Tollgate: Why I Built a Confirmation Layer for AI (and Human) Agents (and Wrote My First Go Project to Do It)


A few months ago, Claude deployed a Next.js project to Vercel without asking me.

I had been working through a stubborn build issue. The deployment kept failing on dependency resolution, and I had Claude helping me sort it out. We worked through the problem, isolated the issue, fixed it. Then, while working through the dependency alignments, Claude ran vercel deploy --prod to verify the fix worked.

It worked. The deploy succeeded. Production was now running the change.

The problem isn't that the deploy succeeded. The problem is that I never said "yes, please deploy." I had a rule in my CLAUDE.md asking for confirmation before any deploy. The rule didn't hold. In the flow of debugging, the agent decided the deploy was the natural next step and took it. The instruction got out-prioritized by the task.

The deploy was fine; it was what I was trying to do anyway. But the failure mode was the thing I couldn't stop looking at. If that had been a destructive operation, an irreversible one, or a customer-impacting one, it would have been too late by the time I found out the command had run. The mistake wasn't Claude's so much as it was my workflow's. I was relying on an instruction in a Markdown file to be a hard guardrail, and instructions in a Markdown file are not hard guardrails. They are suggestions in a context window.

So I built tollgate.

What it is

Tollgate is a small command-line tool that puts a confirmation step in front of sensitive operations on your machine. You install it, it places shim binaries named git and gh ahead of the real ones on your $PATH, and from then on, any process (agent, script, or your own typing) that tries to run a watched command (git push, gh pr create, gh repo delete, and so on) gets stopped at a terminal prompt. You either approve it or you don't.
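
To make the "watched command" idea concrete, here is a minimal sketch in Go of how a shim might decide whether an invocation needs a prompt. The watched table and the names isWatched and hasPrefix are assumptions for illustration, not tollgate's actual code; the entries simply mirror the examples above.

```go
package main

import "fmt"

// watched maps a tool name to the subcommand prefixes that require approval.
// Hypothetical table: it mirrors the examples in the text, not the real list.
var watched = map[string][][]string{
	"git": {{"push"}},
	"gh":  {{"pr", "create"}, {"repo", "delete"}},
}

// isWatched reports whether args begin with any watched prefix for tool.
func isWatched(tool string, args []string) bool {
	for _, prefix := range watched[tool] {
		if hasPrefix(args, prefix) {
			return true
		}
	}
	return false
}

// hasPrefix reports whether args starts with every element of prefix, in order.
func hasPrefix(args, prefix []string) bool {
	if len(args) < len(prefix) {
		return false
	}
	for i, p := range prefix {
		if args[i] != p {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isWatched("git", []string{"push", "origin", "main"})) // true
	fmt.Println(isWatched("git", []string{"status"}))                 // false
	fmt.Println(isWatched("gh", []string{"pr", "create", "--fill"}))  // true
}
```

A matcher this simple would miss forms like git -C some/dir push, where a global flag comes before the subcommand, so a real implementation would need to account for those.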

That's the entire concept. It's not a sandbox. It's not a firewall. It's not a permissions system. It's a tap on the shoulder at the moments that matter most. The friction is the feature.

The first version handles GitHub. The next will handle Vercel, closing the loop on the original anecdote. Eventually it'll cover whatever else turns out to need gating: package publish commands, infrastructure deploys, maybe some destructive rm-style operations. But the architecture is the same regardless of the target. Intercept the command, ask the human, log everything.

What it looks like

Very basic, to be frank. When a watched call goes out, I am presented with a prompt.

As expected, clicking allow lets the command through, and clicking deny blocks it.

Why structural beats instructional

The lesson I took from the Vercel deploy wasn't "Claude is dangerous" or "agents need to be more careful." It was simpler: the layer at which you enforce a rule determines whether it can be ignored.

If the rule lives in a Markdown file the agent reads alongside its task, the rule competes with the task for attention. Sometimes the task wins. That's not a bug; that's how attention and instruction-following work in language models. It's also how attention works in humans. We override our own rules all the time when context makes them feel less important.

If the rule lives in the operating system, in the binary that gets called when an agent tries to do the thing, the rule cannot be ignored. The binary doesn't read the conversation. It doesn't know whose task this is. It doesn't care if the agent is being helpful. It just stops, asks, and waits.

That's the whole design philosophy. Tollgate isn't smarter than the agent. It doesn't understand intent. It just sits at a chokepoint and refuses to let things through without a human in the loop. Putting the safety check at the OS level rather than the prompt level was the entire point.

There's a side effect of this design that I want to name plainly: tollgate gates me too. When I run git push myself, I get the same prompt an agent would. Some people will read that as a flaw, since I'm getting interrupted on my own work. I read it as a trade-off I am willing to accept. A tool that only enforces its rules on agents would have to know which keystrokes came from agents and which came from humans, and there's no clean way to do that. Trying to draw that line introduces edge cases and false confidence. Treating every invocation the same makes the tool simple, predictable, and trustworthy.

If I want to bypass the prompts for a session of manual work, there are escape hatches: an environment variable, a session-level pause, a global toggle. I can opt out anytime. But the default is symmetric, and that's intentional. Asymmetric safety tools have a way of failing quietly at exactly the wrong moment.
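
As a rough illustration of the environment-variable escape hatch, the sketch below shows one way the check could be written. The variable name TOLLGATE_BYPASS and the function shouldPrompt are hypothetical, chosen only for this example; the README is the place to look for the real names.

```go
package main

import (
	"fmt"
	"os"
)

// bypassVar is a hypothetical variable name used only in this sketch.
const bypassVar = "TOLLGATE_BYPASS"

// shouldPrompt reports whether a command still needs a human answer.
// lookup is injected so the logic can be exercised without touching
// the real environment; os.LookupEnv has the matching signature.
func shouldPrompt(watched bool, lookup func(string) (string, bool)) bool {
	if !watched {
		return false // unwatched commands never prompt
	}
	value, ok := lookup(bypassVar)
	return !(ok && value == "1") // prompt unless the bypass is explicitly on
}

func main() {
	// The result depends on whether the variable is set in this shell.
	fmt.Println("prompt needed:", shouldPrompt(true, os.LookupEnv))
}
```

Note that the default falls on the safe side: anything other than an explicit opt-out still prompts.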

Why Go

I'm primarily a Ruby on Rails engineer, with Node and TypeScript as strong secondary skills. I had not written Go before this project. I want to be honest about that, because it shapes the second half of this post. I also want to be honest about the choice: Go was the right tool for this job, despite being the unfamiliar one, and that was the entire reason to use it.

Tollgate runs on the hot path of every git invocation in your terminal. Every single one. If you type git status while tollgate is installed, the shim binary runs first, decides not to intercept (status isn't a watched command), and exec's into the real git. If that path is slow, tollgate becomes annoying. If it's fast, you forget tollgate is there at all.

Go's cold startup is around 5 milliseconds. Node is closer to 40. Python is 70 to 150. Multiplied across the dozens of times you run git in a session of real work, those differences are felt. Picking Go meant the unwatched path stays imperceptible, which is the only way the watched path's friction is acceptable.

The other piece is distribution. Tollgate ships as a single static binary. No runtime, no node_modules, no virtual environment, no version conflicts with whatever else the user has installed. v0.1 installs from source via git clone and make build, but the destination is a single binary on disk and nothing else. A Homebrew tap is on the v1.x roadmap, but even today the install story is dramatically simpler than it would be in any other language I considered. Half the reason small CLIs don't get adopted is that they're annoying to install. Go takes that problem and makes it disappear.

I considered Node. I considered Python. Both would have shipped faster, since I'd have been productive in either by the end of the first day. But the tool would have been worse: slower on the hot path, more painful to distribute, and harder to justify as "the right tool for the job" when readers asked.

The portfolio framing matters here too, and I'll be honest about that as well. Reaching for Go specifically, not because it was familiar but because the constraints of the work pointed there, is a clearer signal of judgment than building it in the language I already knew. Right tool for the job is a thing engineers say all the time. Living it is a thing engineers do less often. I wanted to hold myself accountable to it.

What I learned

I am not a Go developer now. I learned enough Go to ship one tool well, and I'm holding that line carefully. I'd rather be honest about the depth of my fluency than oversell what one project taught me. But the project did teach me things, and a few of them surprised me enough to be worth writing down.

Errors-as-values is verbose, but it's honest. Coming from Ruby's exception-based flow, I expected to find Go's if err != nil { return err } rhythm tiresome. I didn't. Every place a thing can fail is visible in the code. The verbosity isn't ceremony, it's documentation. After a couple of weeks, I stopped reaching mentally for try/catch and started thinking about which callers actually need to handle which errors. That's a clearer mental model than "anything could blow up at any time, hope your rescue blocks are in the right places."

Test-driven development fits Go's culture as much as it fits Ruby's. This was a welcome discovery. Ruby has a celebrated TDD tradition, but the language is so flexible (so willing to let you defer decisions, mock anything, monkey-patch anything) that TDD often feels like a discipline imposed on top of the language. Go doesn't give you those escape hatches. The compile loop is fast, the standard testing package is ergonomic, table-driven tests are the obvious idiom, and there's no framework decision to litigate. TDD in Go felt less like a virtuous practice and more like the path of least resistance.

syscall.Exec is a magic word. Tollgate's pass-through path, for unwatched commands like git status, uses syscall.Exec to replace the shim process entirely with the real binary, rather than spawning the real binary as a child and waiting for it. The shim doesn't sit in memory. There's no wrapper process left between your shell and git. It's just gone. That kind of OS-level control isn't available in most higher-level languages, and even when it is, it's nervous-making. In Go it's a one-line standard library call. Knowing it exists is the kind of thing that makes you think differently about what's possible at the edges of a tool.
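
A minimal sketch of that pass-through, assuming a Unix-like system (syscall.Exec is not available on Windows). The helper execArgs is a hypothetical name, and the lookup here uses exec.LookPath for brevity; a real shim would have to skip its own directory when resolving git, or it would find itself.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// execArgs builds the argv for the real binary: argv[0] is the tool name,
// followed by the user's original arguments, unchanged.
func execArgs(tool string, userArgs []string) []string {
	return append([]string{tool}, userArgs...)
}

func main() {
	// Demo-only guard: a real shim would pass through a bare `git` as well.
	if len(os.Args) < 2 {
		fmt.Println("usage: shim <git arguments>")
		return
	}
	realGit, err := exec.LookPath("git")
	if err != nil {
		fmt.Fprintln(os.Stderr, "shim: git not found:", err)
		os.Exit(127)
	}
	// On success syscall.Exec never returns: the process image is replaced
	// by git, which keeps the same PID, stdin, stdout, and stderr.
	err = syscall.Exec(realGit, execArgs("git", os.Args[1:]), os.Environ())
	fmt.Fprintln(os.Stderr, "shim: exec failed:", err)
	os.Exit(1)
}
```

Because the process is replaced and not wrapped, git's exit code and signals reach the caller directly, with no forwarding logic in the shim.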

Single-binary distribution is the underrated feature. I knew this in theory. Living it changed how I think about CLI tools. Every CLI I've ever written or used in another language carried install friction with it. Tollgate doesn't. That's not a small thing, and it's not just about user experience. It's about whether the tool gets used. A tool with painful installation gets installed by enthusiasts and abandoned by everyone else. A tool that installs in two seconds gets tried by people who'd never have tried otherwise.

The boundaries of an idea are part of the idea. The biggest design lesson came from learning what tollgate cannot protect. The shim approach relies on $PATH inheritance: when any process runs git, the OS searches $PATH in order and runs the first match. Tollgate wins by being first. That works perfectly for the common case (a Claude Code session you launched from a terminal that has your updated $PATH), but it does not cover every case. A pre-existing Claude Code session that was open before you ran tollgate install already has the old $PATH. Cron jobs and CI runners typically start with a minimal environment that does not include your shell config. Anything that calls git with an absolute path skips $PATH lookup entirely. I spent a real amount of time being frustrated by these gaps before I realized the right move was to document them clearly in the README, not to engineer around them. The tool I have is honest about what it covers. A tool that pretended to cover everything would be worse, because users would trust it where they should not. Naming the limits is part of the design.

What's next

Tollgate is shipped. It's running on my machine, on my actual work, with my actual agents. The discipline of dogfooding has already shaped its design. The prompt format went through three iterations in the first week, settling into the simplest possible form: [y/n/a]. I had originally drafted more options (allow-once, deny-once, allow-session, deny-session, show details), and using the tool every day taught me that almost no one needs deny-session. If you want to deny something, you deny it. If a class of operation should be auto-denied, that's a config change, not a session preference. Cutting options made the tool better.

Vercel support is next, which closes the original loop. After that, the design opens up to whatever else turns out to need gating. I've thought about npm publish and cargo publish. I've thought about destructive Terraform operations. I've thought about a generic pattern-matching layer that lets users gate anything they want without me having to ship support for it. I'll figure out which ones matter as I keep using the tool.

If you're working with agents and you've ever felt the small unease of "I'm trusting that the model read my CLAUDE.md and will follow it," tollgate might be worth a look. It moves the safety guarantee from the prompt layer to the OS layer, and you sleep better at night. That was the whole point.

The repo is at github.com/rockwellwindsor/tollgate, with full install and usage documentation in the README. If you build something on top of it, or have ideas for what should get gated next, I'd genuinely love to hear about it.