Has anyone tried in organization to use self hosted llm models for agentic programming?

Im curious if it makes any sense. My organization spends fortune on tokens from us companies. I want to recommend something…

  • PeeOnYou [he/him]@lemmygrad.ml
    link
    fedilink
    arrow-up
    1
    ·
    2 days ago

    our CEO has been buying new hires desktop gaming machines for this reason… currently they don’t have squat for graphics cards but once the rug is pulled from the cloud model pricing he said he’ll spend the $10k per machine to put a 96gb vram card in peoples’ machines to run shit locally

  • Eager Eagle@lemmy.world
    link
    fedilink
    English
    arrow-up
    7
    ·
    4 days ago

    Qwen 3.6 and gemma4 models are the only ones usable for agentic prog sessions that I and my employer run locally. It’s less stable and slower than third-party services, even on much better hardware (as it’s with my employer). The best way is to go with a provider hosting deepseek flash/pro if your privacy policy allows though. It’s going to be hard to beat their price.