Johnathen Chilcher, Author at TechLoom

May 21, 2026 AI Tooling

Universal Phrasing Beats Language-Specific Instructions Across 12,960 Benchmarks

Our capstone experiment used Python-flavored kitchen-sink rules: snake_case, list comprehensions, docstrings, PEP 8. Those rules work great for Python. They actively misdirect C# code generation. When you tell the model...

May 19, 2026 AI Tooling

Closed-Loop Iteration Beats Blind Review (23,760 Benchmarks)

Developers iterate. They generate code, run tests, fix failures, run again. The question isn’t whether to iterate — it’s whether the loop has signal. Blind “review and improve” prompts destroy...

May 14, 2026 AI Tooling

“Trace 3 Examples” Is the Best Verification Instruction — But Only for JS and C# (10,800 Benchmarks)

Anthropic’s prompt engineering documentation calls verification “the single highest-leverage thing you can add to improve accuracy.” But WHAT kind of verification? We tested four strategies across 10,800 benchmark runs and...

May 12, 2026 AI Tooling

/init + Persona Is the Strongest CLAUDE.md Strategy We’ve Measured (4,320 Benchmarks)

Claude Code’s /init command generates a CLAUDE.md file automatically. It scans your codebase, extracts build commands, documents architecture patterns, and lists data models. It’s the fastest way to give Claude...

May 7, 2026 AI Tooling

CoT Helps Go and C# But Hurts Python: When Prompt Advice Flips by Language (5,760 Benchmarks)

This is a different kind of update. Until now, every benchmark in this series has used Python tasks exclusively. That was intentional: we wanted tight control over variables to isolate prompt...

May 5, 2026 AI Tooling

Does Multi-Agent Orchestration Prompting Help? It Depends on What You Add (1,800 Benchmarks)

Multi-agent workflows are everywhere. Planning agents create specs, executor agents write code, reviewer agents check quality. The GSD (Get Stuff Done) methodology takes this further—structured orchestration with phase context, deviation...

April 30, 2026 AI Tooling

Telling AI to Write 95/100 Code Doesn’t Make It Write 95/100 Code (5,760 Benchmarks)

“Top developers score 95+/100 on this task.” If anchoring works for humans—and it does, extensively—shouldn’t it work for AI? Set a high bar, prime the model with quality expectations, and...

April 28, 2026 AI Tooling

Step-Back Prompting Doesn’t Improve AI Code (4,050 Benchmarks)

Google DeepMind published a paper on “step-back prompting”—asking LLMs to identify general principles or common pitfalls before solving a problem. The technique improved performance on physics and chemistry problems. So...

April 24, 2026 AI Tooling

Does Outlining Code Before Writing It Help AI? No. (5,040 Benchmarks)

“First, outline the function signatures and data structures you’ll need. Then write pseudocode. Finally, implement it.” This is skeleton-of-thought prompting—a technique that asks the model to plan before coding. Humans...

April 21, 2026 AI Tooling

XML Tags, CAPS LOCK, or Plain English? Formatting Constraints for AI Code (4,680 Benchmarks)

Wrap your constraints in <constraint> XML tags. Use CAPS LOCK for emphasis. Number them as a checklist. Or just write them as plain text sentences. One of these formats scores...
