May 21, 2026
AI Tooling
Our capstone experiment used Python-flavored kitchen-sink rules: snake_case, list comprehensions, docstrings, PEP 8. Those rules work great for Python. They actively misdirect C# code generation. When you tell the model...
May 19, 2026
AI Tooling
Developers iterate. They generate code, run tests, fix failures, run again. The question isn’t whether to iterate — it’s whether the loop has signal. Blind “review and improve” prompts destroy...
May 14, 2026
AI Tooling
Anthropic’s prompt engineering documentation calls verification “the single highest-leverage thing you can add to improve accuracy.” But WHAT kind of verification? We tested four strategies across 10,800 benchmark runs and...
May 12, 2026
AI Tooling
Claude Code’s /init command generates a CLAUDE.md file automatically. It scans your codebase, extracts build commands, documents architecture patterns, and lists data models. It’s the fastest way to give Claude...
May 7, 2026
AI Tooling
This is a different kind of update. Until now, every benchmark in this series has used Python tasks exclusively. That was intentional: we wanted tight control over variables to isolate prompt...
May 5, 2026
AI Tooling
Multi-agent workflows are everywhere. Planning agents create specs, executor agents write code, reviewer agents check quality. The GSD (Get Stuff Done) methodology takes this further—structured orchestration with phase context, deviation...
April 30, 2026
AI Tooling
“Top developers score 95+/100 on this task.” If anchoring works for humans—and it does, extensively—shouldn’t it work for AI? Set a high bar, prime the model with quality expectations, and...
April 28, 2026
AI Tooling
Google DeepMind published a paper on “step-back prompting”—asking LLMs to identify general principles or common pitfalls before solving a problem. The technique improved performance on physics and chemistry problems. So...
April 24, 2026
AI Tooling
“First, outline the function signatures and data structures you’ll need. Then write pseudocode. Finally, implement it.” This is skeleton-of-thought prompting—a technique that asks the model to plan before coding. Humans...
April 21, 2026
AI Tooling
Wrap your constraints in <constraint> XML tags. Use CAPS LOCK for emphasis. Number them as a checklist. Or just write them as plain text sentences. One of these formats scores...