Does this work with models other than Claude?

Yes. It works with Claude Code, Cursor, Aider, and anything else that loads markdown skills.

Does the CLI need an API key?

No. Pure regex, zero network, zero runtime deps.
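To make "pure regex" concrete, here is a minimal sketch of what a regex-only rule could look like. The rule ID is taken from this page; the pattern and the `scan` function are assumptions made for illustration, not lemmaly's actual implementation, and the pattern deliberately misses harder cases such as nested parentheses in the loop header.

```javascript
// Hypothetical rule table: one ID, one pattern. The regex is an illustrative guess.
const RULES = [
  {
    id: "js-await-in-for-loop",
    // `for (...) {` followed by `await` before the first closing brace.
    pattern: /\bfor\s*\([^)]*\)\s*\{[^}]*\bawait\b/g,
  },
];

// Scan source text and report the 1-based line where each match starts.
function scan(source) {
  const findings = [];
  for (const rule of RULES) {
    for (const match of source.matchAll(rule.pattern)) {
      const line = source.slice(0, match.index).split("\n").length;
      findings.push({ rule: rule.id, line });
    }
  }
  return findings;
}
```

Everything here is string matching on the file contents, so a scanner built this way needs no parser, no network call, and no API key.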

Why eight commands and not ten or twenty?

One per discipline AI assistants skip. Fewer commands would overload each one; more would fragment them.

Is lemmaly for senior engineers or juniors?

Seniors using AI as leverage. It's a forcing function, not a teacher.

Does installing the skill actually change what frontier AI writes?

Honestly, we haven't run a rigorous head-to-head eval. Anecdotally, on Haiku 4.5 / Sonnet 4.6 / Opus 4.7 the frontier models already reach for Set, HyperLogLog, and the right structures most of the time without prompting. The CI scanner is the part we trust — it catches the lazy code deterministically on every PR, no matter which model wrote it.

Then why ship the skills at all?

Three reasons: (1) they codify a reviewing standard a team can point at; (2) they push the model toward explicit `// Time: O(?)` comments in the code, which the CI scanner can then audit; (3) weaker / older models plausibly need them. We just haven't measured that.

an agent skill · v0.1.0

Catch the slow code AI writes by default.

59 deterministic rules. Finds O(n²) loops, N+1 queries, and brute force in AI output, before it ships.

one line. zero config. runs in CI.

works through your agents

fit for · embedded · iot · edge · zero api key · zero network

Six classes, one input.

constant finishes first · exponential never does

Complexity Visualizer · Big-O race · n = 100k

O(1) · constant · constant work

O(log n) · logarithmic · halve each step

O(n) · linear · one pass ← /complexity points here

O(n log n) · linearithmic · divide + merge

O(n²) · quadratic · nested pass ← unguided AI often ships this

O(2ⁿ) · exponential · exhaustive recursion

the rules and methodology aim your team at the top rows

What it catches

The patterns that pass review and die in production.

AI writes code that works on 1k rows and falls over at 100k. lemmaly flags the four most common ways.

arr.includes() inside a loop

O(n²). 1k rows: fine. 100k rows: 3 seconds.

js-unique-via-indexof
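A minimal sketch of the shape this rule targets, next to a linear alternative. The function names are made up for illustration.

```javascript
// Quadratic: includes() rescans `seen` for every element of `items`.
function uniqueSlow(items) {
  const seen = [];
  for (const item of items) {
    if (!seen.includes(item)) seen.push(item);
  }
  return seen;
}

// Linear on average: Set membership is a hash lookup,
// and a Set iterates in insertion order, so first occurrences keep their order.
function uniqueFast(items) {
  return [...new Set(items)];
}
```

Both versions compare values the same way (SameValueZero), so swapping one for the other should not change the result, only the cost.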

await query inside a for-loop

N+1. One request becomes one hundred.

js-await-in-for-loop
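As an illustration, the sketch below uses a hypothetical in-memory `db` object that counts round trips; `getPost` and `getPosts` are stand-ins for whatever single-row and batched calls your data layer offers.

```javascript
// Hypothetical in-memory "database" that counts round trips.
const db = {
  calls: 0,
  posts: new Map([[1, "a"], [2, "b"], [3, "c"]]),
  async getPost(id) {
    this.calls++;
    return this.posts.get(id);
  },
  async getPosts(ids) {
    this.calls++;
    return ids.map((id) => this.posts.get(id));
  },
};

// N+1 shape: one awaited round trip per id, executed one after another.
async function loadPostsSlow(ids) {
  const out = [];
  for (const id of ids) {
    out.push(await db.getPost(id));
  }
  return out;
}

// Batched shape: one round trip for the whole list.
async function loadPostsBatched(ids) {
  return db.getPosts(ids);
}
```

With three ids the loop version makes three round trips and the batched version makes one; the gap grows linearly with the list.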

recursion with no memoization

exponential. fib(40) freezes the tab.

js-recursion-no-memo
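The classic instance, sketched with illustrative function names: the naive version recomputes the same subproblems exponentially often, while the memoized version computes each one once.

```javascript
// Exponential: every call spawns two more until the base case.
function fibSlow(n) {
  return n < 2 ? n : fibSlow(n - 1) + fibSlow(n - 2);
}

// Linear: each value of n is computed once and cached in `memo`.
function fibMemo(n, memo = new Map()) {
  if (n < 2) return n;
  if (memo.has(n)) return memo.get(n);
  const value = fibMemo(n - 1, memo) + fibMemo(n - 2, memo);
  memo.set(n, value);
  return value;
}
```

`fibMemo(40)` returns 102334155 after a few dozen calls; `fibSlow(40)` returns the same number after hundreds of millions.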

SELECT without LIMIT

unbounded scan. Slows down as the table grows.

sql-select-no-limit
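One way to keep the scan bounded is to make the limit part of the query builder, so no call site can forget it. The sketch below is an assumption-laden example: the `users` table, its columns, the default of 50, and the cap of 200 are all hypothetical, and the `{ text, values }` return shape follows the query-config convention used by node-postgres.

```javascript
// Hypothetical helper: build a bounded, parameterized search query.
function boundedUserSearch(term, limit = 50) {
  // Clamp the caller's limit into [1, 200] so the result set stays bounded.
  const capped = Math.min(Math.max(1, limit), 200);
  return {
    text: "SELECT id, email FROM users WHERE email ILIKE $1 ORDER BY id LIMIT $2",
    values: [`${term}%`, capped],
  };
}
```

A caller asking for 100,000 rows gets 200, and a caller that passes no limit gets 50, so the query cost stops tracking table size.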

Captured · 498 real responses · 5 models · 10 tasks

When optimization is the task, AI nails it. When it's a side concern, AI quietly skips it.

Five frontier LLMs, ten runs per task, every response saved and classified for the exact anti-patterns lemmaly catches. The textbook prompts ask the model to optimize, and it does. The production prompts ask it to ship a feature; optimization is a detail it drops along the way. Code in /sandbox/ai-sessions.

Optimization is the task

dedupe · two-sum · fib · find-by-email

0% · 0 / 199

Ask for `twoSum` and you get the one-pass hashmap. Optimization is what the prompt asks for, so the model returns it. The training data is saturated with these problems; the bugs aren't here.

Building the feature is the task

csv-dedupe · user-search · notification-feed · csv-export · prefix-search · users-with-posts

55% · 163 / 299

Express handlers against real tables. The model gets the structure right and the optimization wrong: N+1 imports, missing `LIMIT`, full-result fetches into memory. The bad shape is the easy shape when building is what you asked for.

Per-model rate · the prompts that fail

model

prefix-search · %query% + no LIMIT

user-search · ILIKE no LIMIT

csv-export · full fetch into memory

csv-dedupe · SELECT-in-loop

claude-sonnet-4-5