The Universal Needle In A Haystack Finder
Three months ago, the idea of using AIs to help debug code sounded like complete nonsense to me, given that they couldn’t even write code well. In my experience, the AI models I’ve tried still can’t write code very well, but it turns out this is a completely different skill from finding bugs. In reality, AIs are already superhumanly good at finding logic errors, and while Anthropic’s Mythos is usually what comes to mind, much weaker models can actually find the same security flaws if given specific instructions in highly constrained environments.