• @jarfil@beehaw.org
    link
    fedilink
    4
    edit-2
    11 months ago

    People tend to deify LLMs, because of the vast amounts of knowledge trained into them, but their answers are more like a single “reasoning iteration”.

    How many human coders are capable of sitting down, typing a bunch of code at 100 WPM out of the blue, then end up with zero security flaws or errors? About absolutely none, not even if they get updated requirements, and the same holds up for LLMs. Coding is an iterative job, not a “zero shot” one.

    Have an LLM iterate several times over the same piece of code (“think” about it), have it explain what it’s doing each time (“reason” about it)… then test run it, fix any compiler errors… run a test suite, fix for any non-passing tests… then ask it to take into account a context of best practices and security concerns. Only then the code can be compared to that of a serious human coder.

    But that takes running the AI over and over and over with a large context, while AIs are being marketed as “single run, magic bullet”… so we can expect a lot of shit to happen in the near future.

    On the bright side, anyone willing to run an LLM a hundred times over every piece of code, like in a CI workflow, in an error seeking mode, could catch flaws that would otherwise take dozens of humans to spot.