Building a custom Claude Code plugin for Sublime Text


I wanted watchexec, but for code review.

That sentence is most of the project. Watchexec is the tool that popularized a pattern most working developers use without thinking about it now: tell it which files to care about and which command to run, and it runs that command every time those files change. Tests on save. Builds on save. Whatever you want, scoped to what you care about. The filter is the feature.

I'd been using watchexec for years. Then Claude Code came along and I started reaching for it as a code reviewer, dropping into a terminal mid-flow to ask "anything I should know about what I just wrote?" It worked, but the friction was real. You had to remember to ask. You had to paste in the file you wanted reviewed. You had to switch contexts. The signal-to-friction ratio was bad enough that some days I'd write a hundred lines without asking for a review at all, not because I didn't want one, but because the asking was its own task.

So I thought: I want this to behave like watchexec.

That's the entire origin story of claude-watcher. It's a Sublime Text plugin that listens for save events in projects I care about and, when one fires on a file extension I care about, automatically opens a Terminus panel with Claude Code asking for a quick review. The same review I used to ask for manually, now automatic. Same scope discipline as watchexec: only the files I care about, only the projects I'm actively working on, nothing else.

The repo is at https://github.com/rockwellwindsor/claude_watcher. This post is about why it exists, why the filter matters more than the AI, and what twelve evening hours of polish work taught me about shipping a side project as a portfolio piece.

Why scope is the feature

The interesting choice in this design isn't the Claude part. It's the filter.

The naive version of this tool reviews every file you save. Every time. Across every project. You'd burn through tokens reviewing CSS tweaks in your spouse's birthday-card website and migration files in a Rails project that has been working fine for three years. The noise would teach you to ignore the tool within a week.

The version that earns a permanent place in your workflow does the opposite. It reviews almost nothing. You configure a small list of projects that are actively in flux, you configure a list of file extensions that are interesting (mostly source code, not assets), and you let it run silently the rest of the time. When a save passes the filter, a review happens. When a save doesn't, nothing happens at all. There's no notification, no panel opening, no token spent.

This is the same logic that makes watchexec good. Watchexec doesn't run your tests every time anything changes in your filesystem. It runs them when something changes that you said to watch for. The discipline of telling the tool what to care about is what makes the tool useful.

In a world where every AI tool wants to be ambient and pervasive, scoping is countercultural. The pitch for most AI dev tools is "it's watching everything, it knows everything, it's always available." The pitch for this one is "it's watching the four projects you told it to watch, it ignores everything else, it'll get out of your way the moment you stop wanting it." Less ambient. More opinionated. Better.

How it actually works

The plugin lives in your Sublime packages directory. When you save a file, Sublime fires an event. The plugin checks three things in order:

First, is the watcher enabled? You toggle it on and off from the Command Palette. The default is off when Sublime launches, so you opt in to the kind of session where you want reviews running.

Second, is the saved file inside one of your watched projects? You list them as substrings in a settings file: a full path, a project folder name, whatever matches. Match means in-scope; no match means stop.

Third, is the extension one you care about? The default list covers most languages I work in, and it's editable in settings.
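The three checks above can be sketched as one small pure function. This is a minimal illustration, not the plugin's actual code: the name `should_review`, its parameters, and the default extension list are all hypothetical, and the real plugin reads its lists from a settings file. In a Sublime plugin, a function like this would typically be called from an `EventListener` save hook with the saved view's file path.

```python
import os

# Hypothetical defaults for illustration; the real list lives in the plugin's settings.
DEFAULT_EXTENSIONS = (".py", ".rb", ".js", ".ts", ".go")


def should_review(path, enabled, watched_projects, extensions=DEFAULT_EXTENSIONS):
    """Return True only if a saved file passes all three checks, in order."""
    # 1. Is the watcher switched on for this session?
    if not enabled:
        return False
    # 2. Does the path contain any watched-project substring?
    #    Empty strings are skipped so a blank setting cannot match every file.
    if not any(project and project in path for project in watched_projects):
        return False
    # 3. Is the file extension on the list?
    _, ext = os.path.splitext(path)
    return ext.lower() in extensions


# Usage: projects are plain substrings, either a full path or a folder name.
watched = ["/Users/me/code/claude_watcher", "billing-api"]
in_scope = should_review("/Users/me/code/billing-api/app/invoice.rb", True, watched)
out_of_scope = should_review("/Users/me/sites/birthday-card/style.css", True, watched)
```

Each check returns early, so a save that fails the first test costs almost nothing, which is what lets the tool stay silent most of the time.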

If all three pass, the plugin debounces briefly (so rapid saves collapse into one review) and then opens a Terminus terminal panel running Claude Code with a small prompt: "I just saved <filename>. Quick review for bugs, refactors, security issues, or improvements. Be brief."
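Here is a rough sketch of the debounce and the fixed prompt, using only the standard library. The `Debouncer` class, `build_prompt`, and the delay value are assumptions made for illustration; the plugin may well schedule its delay through Sublime's own timer API instead of `threading.Timer`, and its real debounce interval is not stated here.

```python
import threading

# The prompt is a fixed string with one variable: the filename.
PROMPT_TEMPLATE = (
    "I just saved {filename}. Quick review for bugs, refactors, "
    "security issues, or improvements. Be brief."
)


def build_prompt(filename):
    """Fill the fixed prompt with the name of the saved file."""
    return PROMPT_TEMPLATE.format(filename=filename)


class Debouncer:
    """Run `callback` once after `delay` seconds of quiet.

    Every new trigger cancels the pending timer and starts a fresh one,
    so a burst of saves collapses into a single call with the last arguments.
    """

    def __init__(self, delay, callback):
        self.delay = delay
        self.callback = callback
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self, *args):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.delay, self.callback, args)
            self._timer.daemon = True
            self._timer.start()


# Usage: three rapid saves produce one prompt, for the last file saved.
prompts = []
debouncer = Debouncer(0.05, lambda name: prompts.append(build_prompt(name)))
for name in ("invoice.rb", "invoice.rb", "payment.rb"):
    debouncer.trigger(name)
```

This version keeps a single timer for the whole editor, so the last save in a burst wins. A per-file or per-project timer would be an equally reasonable choice.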

The Terminus panel stays open per-project. If you keep saving files in the same project, the same conversation continues, so Claude has context from earlier saves. When you switch projects, the previous session's content is written to a log file on disk before the panel is replaced. The logs are useful for two reasons: they let you scan a session's review history after the fact, and they preserve context that would otherwise be lost when the panel closes.
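The log-on-switch step might look something like the sketch below. It assumes the panel's text has already been captured as a string (how that text is read out of Terminus is not covered here), and the function name, the log directory, and the file naming scheme are all hypothetical.

```python
import os
import time


def write_session_log(log_dir, project_name, content):
    """Write one session's panel text to a timestamped file and return its path."""
    os.makedirs(log_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # Keep the file name filesystem-safe regardless of the project name.
    safe_name = "".join(
        ch if ch.isalnum() or ch in "-_" else "_" for ch in project_name
    )
    path = os.path.join(log_dir, "{}-{}.log".format(safe_name, stamp))
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    return path
```

Calling this just before the panel is replaced would preserve the outgoing conversation. The timestamp has one-second resolution, so two switches within the same second would overwrite each other; that seems acceptable for a tool driven by manual project changes.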

That's the whole tool. Maybe a hundred and fifty lines of Python plus a small settings file. The behavior I just described is the behavior. There's no roadmap to make it smarter. Smarter is the wrong direction. The thing I want from this plugin is exactly what it does and no more.

Why classical filtering, not classical AI

There's a version of this tool I deliberately did not build: a smart filter. One that tries to figure out which saves are "interesting" using heuristics or, worse, by asking a model. "Is this save worth reviewing?" passed to Claude, with the answer determining whether the actual review runs.

The reason I didn't build that is the same reason watchexec doesn't try to be clever about which file changes are interesting. The user knows what's interesting. The user is the one configuring the filter. Adding a layer of "AI guesses what you mean" between the user's configuration and the tool's behavior makes the tool less predictable, not more useful. Predictable is the entire point.

The same logic applies to the prompt. I could have built something that adapts the review prompt based on file type, on recent edits, on the user's apparent skill level. I didn't. The prompt is a fixed string with one variable: the filename. The strength of the tool is that you always know what's about to happen when you save a watched file. Variability would erode that.

This is part of a thesis I've been working out across several side projects: the interesting work in agent-driven development isn't about making agents smarter or more capable. It's about building the scaffolding that lets a human and an agent share a working environment with clarity about who's responsible for what. The agents are getting more capable on their own schedule. The scaffolding is the part I can build now.

What's next

I'm using claude-watcher daily, in the projects I care about, on the file types I want reviewed. The defaults that emerged from real use are probably the most useful artifact of the whole exercise: the debounce interval that feels right, the extension list that catches the right things, the prompt format that gets brief useful answers from Claude instead of verbose ones.

Future versions are on the roadmap but not urgent. Package Control submission is the obvious next step so installation gets simpler. Linux testing is the second. Eventually there might be a "review just the changed lines instead of the whole file" mode for big files, but I've found whole-file review is usually what I want anyway, so that one might never ship.

The repo is at https://github.com/rockwellwindsor/claude_watcher, with install instructions and full documentation. If you build something on top of it, or have ideas for what else should be filtered, scoped, and made boring, I'd genuinely like to hear about it.