If you’ve started building skills for Claude Code, you’ve probably had the experience of writing one that never triggers. You ask for the exact thing the skill was built to do, and Claude just… does it the long way, ignoring the skill entirely. The frustrating part is that the fix is rarely in the logic. It’s in how the skill is written.
So before any of the patterns, it helps to understand what a skill actually is.
How skills actually work
A skill isn’t executable code. It’s a prompt template that gets injected into the conversation when Claude decides it’s relevant. Claude sees a formatted list of every available skill — just names and descriptions — sitting in its system prompt. When a task comes in, it reads that list and decides whether to load one. If it does, the full instructions become part of what it’s working from.
The key thing to sit with: skill selection is pure reasoning. There’s no keyword matching, no embeddings, no classifier deciding for you. Claude reads your description and makes a call. That means your description is the single most important piece of the whole skill. Everything else can be perfect, and a vague description will still leave the skill on the shelf.
Three levels of disclosure
The architecture is built around loading as little as possible until it’s needed:
- Metadata — always in context. Just the name and description from the frontmatter, roughly a hundred words. Every skill you have installed shares one budget here, so this layer stays lean by design.
- The skill body — loaded only when the skill triggers. The main instructions. Keep it under 500 lines.
- Bundled resources — loaded on demand. Scripts, reference docs, templates. These only enter context when Claude actually reaches for them, and scripts can run without ever being read into context at all.
That structure is the reason a well-built skill can be large without being expensive. Most of it is dormant until the moment it’s relevant.
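To make the three levels concrete, here is a toy model in Python. It is only a sketch of the idea, not how Claude Code is implemented: the Skill class, the context_for function, and the sample skill contents are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Toy stand-in for a skill folder; not Claude Code's real data model."""
    name: str
    description: str
    body: str
    resources: dict[str, str] = field(default_factory=dict)


def context_for(skill: Skill, triggered: bool = False,
                opened: tuple[str, ...] = ()) -> str:
    """Return the text that would sit in context at each disclosure level."""
    parts = [f"{skill.name}: {skill.description}"]  # level 1: always present
    if triggered:
        parts.append(skill.body)  # level 2: only after the skill triggers
        # level 3: only the resources that were actually opened
        parts.extend(skill.resources[name] for name in opened)
    return "\n\n".join(parts)


# Hypothetical skill used only for this illustration.
pdf_skill = Skill(
    name="processing-pdfs",
    description="Extract text and tables from PDF files. Use when the user mentions PDFs.",
    body="Quick start instructions. For form filling, see FORMS.md.",
    resources={"FORMS.md": "Detailed form-filling instructions."},
)

idle = context_for(pdf_skill)
active = context_for(pdf_skill, triggered=True)
with_forms = context_for(pdf_skill, triggered=True, opened=("FORMS.md",))
print(len(idle), len(active), len(with_forms))
```

The point of the model is the ordering: the idle cost is just the metadata line, and each further level adds text only when something asks for it.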
The description is the whole game
Community testing across a couple hundred prompts found that optimizing a description lifts activation from around 20 percent to roughly 50 percent — and adding concrete examples pushes it close to 90. Claude tends to undertrigger: it won’t reach for a skill it’s unsure about. So you write descriptions that are a little pushy.
Here’s the difference. Weak:
description: Helps with documents
Strong:
description: >
  Extract text and tables from PDF files, fill forms, merge documents.
  Use when working with PDF files or when the user mentions PDFs, forms,
  document extraction, or needs to process .pdf files in any way.
The rules that get you there:
- Write in third person — “Processes Excel files,” not “I can help you with spreadsheets.”
- Say both what it does and the specific situations that should trigger it.
- List the actual words and phrases someone would say.
- Lean toward over-matching. A skill that fires when it’s marginally relevant beats one that never fires at all.
- Stay under 1024 characters.
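Some of these rules are mechanical enough to check before you ever run a test prompt. Here is a rough Python sketch of a description linter. The function name, the word list, and the heuristics (looking for the word “when”, an eight-word minimum) are my own assumptions, not an official validator; only the 1024-character limit comes from the rules above.

```python
import re

MAX_DESCRIPTION_CHARS = 1024  # limit from the rules above
FIRST_PERSON_WORDS = {"i'm", "i'll", "i've", "we", "my", "our"}  # assumed word list


def lint_description(description: str) -> list[str]:
    """Return a list of likely problems with a skill description."""
    problems = []
    if len(description) > MAX_DESCRIPTION_CHARS:
        problems.append(f"too long: {len(description)} characters")
    words = re.findall(r"[A-Za-z']+", description)
    if any(w == "I" or w.lower() in FIRST_PERSON_WORDS for w in words):
        problems.append("written in first person; use third person")
    # Heuristic: a trigger clause usually contains the word "when".
    if "when" not in description.lower():
        problems.append("does not say when the skill should be used")
    # Heuristic threshold, chosen arbitrarily for this sketch.
    if len(words) < 8:
        problems.append("too short to list concrete trigger phrases")
    return problems


weak = "Helps with documents"
strong = (
    "Extract text and tables from PDF files, fill forms, merge documents. "
    "Use when working with PDF files or when the user mentions PDFs, forms, "
    "document extraction, or needs to process .pdf files in any way."
)
print(lint_description(weak))
print(lint_description(strong))
```

A linter like this only catches the obvious misses. Whether the description actually triggers is still something you find out by testing.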
Patterns that hold up
Once the skill is triggering, the rest is structure. A handful of patterns show up again and again in skills that work.
Route, don’t dump. The dominant pattern in production skills is a short body that acts as a router, pointing to detailed files only when they’re needed. The body might cover the quick-start path, then say “for form filling, see FORMS.md.” Claude loads FORMS.md only when forms come up. One critical rule: keep references one level deep. Don’t chain a skill into one file that points to another that points to a third — Claude may only preview a nested file rather than read it fully.
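The one-level-deep rule is easy to check mechanically. The sketch below, in Python, takes a mapping of file names to their text and reports any file referenced from the skill body that itself points at another Markdown file. The regex, the SKILL.md entry name, and the file contents are assumptions for illustration; a real check would read the skill folder from disk.

```python
import re

# Assumed convention: references are written as bare file names ending in .md
MD_REF = re.compile(r"[\w./-]+\.md\b")


def find_refs(text: str) -> set[str]:
    """Collect every .md file name mentioned in the text."""
    return set(MD_REF.findall(text))


def nested_references(files: dict[str, str], entry: str = "SKILL.md") -> list[tuple[str, str]]:
    """Return (file, reference) pairs that sit two levels below the entry file."""
    first_level = find_refs(files[entry]) - {entry}
    nested = []
    for name in sorted(first_level):
        deeper = find_refs(files.get(name, "")) - {name, entry}
        nested.extend((name, ref) for ref in sorted(deeper))
    return nested


# Hypothetical skill contents.
chained = {
    "SKILL.md": "Quick start here. For form filling, see FORMS.md.",
    "FORMS.md": "Field details are in FIELDS.md.",
    "FIELDS.md": "Full field list.",
}
print(nested_references(chained))
```

An empty result means every reference is reachable directly from the body, which is the shape you want.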
Split by domain. When a skill spans several areas, break the references out by variant — finance, sales, product — and give the body a short directory so Claude pulls only the slice that’s relevant. Same progressive-disclosure idea, applied to breadth instead of depth.
Build in a validation loop. This is one of the highest-impact patterns. Complex operations should follow do-the-thing, validate, fix, repeat — and the instruction should say only proceed when validation passes. For code skills the validator is a script. For writing skills it can be a checklist or a style guide to review against. Either way, the loop catches mistakes before they compound.
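The loop itself is small. Here is a minimal Python sketch of do, validate, fix, repeat, using a commit-subject checklist as the validator. Every name here (run_with_validation, validate_subject, fix_subject) and the 50-character limit are illustrative assumptions; in a real skill the “fix” step is Claude revising its own output, not a function.

```python
from typing import Callable


def run_with_validation(
    produce: Callable[[], str],
    validate: Callable[[str], list[str]],
    fix: Callable[[str, list[str]], str],
    max_fixes: int = 3,
) -> str:
    """Produce a draft, then validate and fix until it passes or attempts run out."""
    draft = produce()
    for attempt in range(max_fixes + 1):
        errors = validate(draft)
        if not errors:
            return draft  # only proceed when validation passes
        if attempt < max_fixes:
            draft = fix(draft, errors)
    raise RuntimeError("Validation still failing: " + "; ".join(errors))


def validate_subject(subject: str) -> list[str]:
    """Checklist-style validator for a commit subject line (assumed rules)."""
    errors = []
    if len(subject) > 50:
        errors.append(f"Subject is {len(subject)} characters; limit is 50")
    if subject.endswith("."):
        errors.append("Subject ends with a period")
    return errors


def fix_subject(subject: str, errors: list[str]) -> str:
    """Naive fixer: drop trailing periods and truncate."""
    return subject.rstrip(".")[:50]


result = run_with_validation(
    lambda: "Add retry logic to uploader.", validate_subject, fix_subject
)
print(result)
```

The cap on attempts matters as much as the loop: a fix step that never converges should surface as a failure rather than spin.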
Plan, validate, execute. For high-stakes or batch work, add an intermediate artifact — a plan file — that gets validated before anything runs. Analyze, write the plan, check the plan, then execute. The trick that makes this work is verbose error messages. “Field ‘signature_date’ not found. Available fields: customer_name, order_total, signature_date_signed” tells Claude exactly how to self-correct. A bare “validation failed” doesn’t.
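Here is what a validator with that kind of error message might look like, as a short Python sketch. The check_fields function is hypothetical, and the “Did you mean” suggestion is my own addition on top of the message quoted above, built with the standard library’s difflib.

```python
import difflib


def check_fields(plan_fields: list[str], available_fields: list[str]) -> list[str]:
    """Return one verbose error per planned field that does not exist."""
    errors = []
    for name in plan_fields:
        if name in available_fields:
            continue
        message = (
            f"Field '{name}' not found. "
            f"Available fields: {', '.join(available_fields)}"
        )
        # Addition for this sketch: suggest the closest existing field name.
        close = difflib.get_close_matches(name, available_fields, n=1)
        if close:
            message += f". Did you mean '{close[0]}'?"
        errors.append(message)
    return errors


available = ["customer_name", "order_total", "signature_date_signed"]
errors = check_fields(["customer_name", "signature_date"], available)
for line in errors:
    print(line)
```

Listing the available fields costs a few tokens per failure and usually saves a whole retry.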
Guide the decision, don’t front-load it. Rather than dumping every instruction at once, walk Claude through decision points: creating new content goes one way, editing existing content goes another. If a branch gets heavy, push it into its own file and tell Claude which one to read.
Bundle the scripts. Even when Claude could write a script itself, a pre-made one is more reliable, costs no generation tokens, and stays consistent across runs. Be explicit about how each one is meant to be used. There’s a real difference between “run analyze_form.py to extract the fields” — where the output enters context but the code never does — and “see analyze_form.py for the extraction logic,” which you only want when Claude needs to understand how it works.
Show, don’t describe. For anything where output quality matters, give input/output pairs. A couple of concrete examples of a commit message, before and after, beat a paragraph describing the format you want.
Hand over a checklist. For long multi-step tasks, give Claude a checklist it can copy and track as it goes. It keeps a five-step process from quietly losing a step in the middle.
How to actually write the thing
The patterns are structural. The prose inside them has its own rules, and most of them come down to respecting the context window as a shared resource.
Be concise. Only add what Claude doesn’t already know. Challenge every paragraph — does it really need this, or can you assume it? Don’t explain what a PDF is or how a Python import works. That’s pure waste, and waste in a skill is waste in every conversation that loads it.
Explain why, not just what. Instead of stacking up MUSTs and NEVERs, give the reasoning. “Filter test accounts from queries — they pollute metrics and have caused incorrect executive reports” lands better than “ALWAYS filter test accounts.” When Claude understands why a rule exists, it applies it more sensibly at the edges.
Match freedom to risk. Fragile operations like database migrations want low freedom: exact scripts, no room to improvise. Where a preferred pattern exists, give pseudocode or a script with parameters. Where many approaches are valid, give the general direction and trust the model. Over-constraining a flexible task is its own kind of failure.
A note on naming, since it trips people up: use the gerund form (processing-pdfs, writing-documentation) or a clear variant like pdf-processing. Avoid names like helper, utils, tools, or documents; they’re too vague to mean anything. Use lowercase letters, numbers, and hyphens only.
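The naming rules fit in a few lines of Python. This is a sketch under my own assumptions: the regex is one reading of “lowercase, numbers, and hyphens only” (it also rejects leading, trailing, and doubled hyphens), and the vague-name list is just the four examples above.

```python
import re

# Assumed reading of the rule: lowercase alphanumeric segments joined by single hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
VAGUE_NAMES = {"helper", "utils", "tools", "documents"}


def check_skill_name(name: str) -> list[str]:
    """Return a list of problems with a proposed skill name."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("use lowercase letters, numbers, and hyphens only")
    if name in VAGUE_NAMES:
        problems.append("too vague to say what the skill does")
    return problems


for candidate in ["processing-pdfs", "pdf-processing", "PDF_Helper", "utils"]:
    print(candidate, check_skill_name(candidate))
```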
The traps
A few anti-patterns are worth naming because they’re easy to walk into:
- Deeply nested references. Claude may only preview a nested file instead of reading it. Keep everything one level deep.
- Too many options. “Use pypdf, or pdfplumber, or PyMuPDF, or…” Pick a default. Mention alternatives only for the specific edge cases that need them.
- Assuming tools are installed. Put the install command in the skill, ahead of the first step that uses the tool.
- Time-sensitive instructions. Don’t write “if before August, use the old API.” The instruction goes stale the moment the date passes. Put outdated approaches in their own “old patterns” section instead.
- Over-explaining what Claude already knows. The most common and most expensive waste.
- Vague descriptions. Still the number one reason skills don’t trigger.
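For the installed-tools trap, a bundled script can also check for itself before doing any work. Here is a small Python sketch of that preflight. The missing_packages function and the package mapping are hypothetical, and this complements the install command in the skill rather than replacing it.

```python
import importlib.util


def missing_packages(required: dict[str, str]) -> list[str]:
    """Map top-level import names to pip package names; return install commands for any that are absent."""
    return [
        f"pip install {package}"
        for module, package in required.items()
        if importlib.util.find_spec(module) is None
    ]


# Hypothetical requirements for a PDF skill's scripts.
commands = missing_packages({"pypdf": "pypdf", "pdfplumber": "pdfplumber"})
if commands:
    print("Missing dependencies. Run:")
    for command in commands:
        print("  " + command)
```

Printing the exact command follows the same logic as verbose validation errors: the failure message should say what to do next.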
Build the test before the docs
The development loop that works is less about writing and more about observing. Do a task once without a skill and notice what context you keep providing by hand — that repetition is the skill. Draft it. Then write two or three test prompts, the kind of thing someone would actually say, and run them against a fresh Claude. Watch where it struggles. Bring those observations back and refine.
The principle underneath all of it: build your evaluations before you write extensive documentation. It’s the difference between a skill that solves a real problem and one that documents a problem you imagined. You only find out which one you’ve built by testing it — ideally across models, since what’s obvious to Opus may need spelling out for Haiku.
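An evaluation set doesn’t need infrastructure to start with. Here is a minimal Python sketch for keeping score: you run each test prompt against a fresh Claude, note by hand whether the skill loaded, and record it. The Trial class, the report fields, and the sample prompts are all assumptions for illustration, and nothing here calls a model.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    prompt: str
    should_trigger: bool  # what you expected
    did_trigger: bool     # what you observed, recorded by hand


def activation_report(trials: list[Trial]) -> dict:
    """Summarize how often the skill fired when it should have, and when it should not."""
    expected = [t for t in trials if t.should_trigger]
    hits = sum(t.did_trigger for t in expected)
    return {
        "activation_rate": hits / len(expected) if expected else 0.0,
        "misses": [t.prompt for t in expected if not t.did_trigger],
        "false_fires": [
            t.prompt for t in trials if not t.should_trigger and t.did_trigger
        ],
    }


# Hypothetical observations from one round of testing.
trials = [
    Trial("Pull the tables out of report.pdf", True, True),
    Trial("Fill in this PDF form for me", True, True),
    Trial("Can you merge these two documents?", True, False),
    Trial("Summarize this spreadsheet", False, False),
]
report = activation_report(trials)
print(report)
```

The misses list is the useful part. Each entry is a phrase your description didn’t cover, and re-running the same set on another model shows whether the fix holds there too.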
Skills reward iteration more than they reward getting it right on the first pass. Ship a rough one, watch it fail, fix the part that failed. The patterns here are a head start, not a substitute for that loop.