TechLoom - Your Vision, My Craft.

I run experiments and publish what I find.

Empirical benchmarks and field notes on AI coding, prompt engineering, and infrastructure — from Johnathen Chilcher, Senior SRE at GoDaddy with 10+ years inside Fortune 100 production systems.

May 7, 2026

CoT Helps Go and C# But Hurts Python: When Prompt Advice Flips by Language (5,760 Benchmarks)

Until now, every benchmark in this series has used Python tasks exclusively, keeping variables tightly controlled to isolate the effect of prompt technique. This post revisits chain-of-thought across Go, C#, and Python and finds the advice flips by language.

May 5, 2026

Does Multi-Agent Orchestration Prompting Help? It Depends on What You Add (1,800 Benchmarks)

Multi-agent workflows are everywhere — planning agents write specs, executor agents write code, reviewer agents check quality. This benchmark isolates what actually moves the needle in structured orchestration prompts.

April 30, 2026

Telling AI to Write 95/100 Code Doesn’t Make It Write 95/100 Code (5,760 Benchmarks)

“Top developers score 95+/100 on this task.” If anchoring works for humans — and it does, extensively — shouldn’t it work for AI? Set a high bar, prime the model with quality expectations, and see what happens.

About Johnathen

I’m a Senior Site Reliability Engineer at GoDaddy with 10+ years building and running infrastructure inside Fortune 100 companies. TechLoom is where I publish the experiments I run, the open-source tools I build, and what I learn about AI, prompt engineering, and reliability along the way.

Everything here is empirical. Real data, real benchmarks, shown the way I’d want someone else to show me — with the methodology, the numbers, and the parts that didn’t work.

Building something interesting? I’m occasionally open to research collaborations — drop me a line.
