• mrpoops@alien.topB
    10 months ago

    Best cheap option to run smaller AI models on?

    Like the GGUF’d Mistral 7B versions that are lighter on memory, for example. I need fast inference and I don’t really feel like depending on OpenAI, or paying them a bunch of money. I’ve fucked up and spent like $200 on API charges before, so I'm definitely trying to avoid that.

    I have a 980 Ti and it’s just too damn old. It works with some stuff but it’s super hit or miss with any of the newer libraries.
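
    For what it's worth, a GGUF'd 7B can be run locally with llama-cpp-python; a minimal sketch, assuming you've already downloaded a 4-bit Mistral 7B quant (the model path below is a placeholder):

    ```python
    # Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
    # The model path is a placeholder -- point it at any GGUF file, e.g. a
    # Q4_K_M Mistral 7B quant downloaded from Hugging Face.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # placeholder path
        n_gpu_layers=-1,  # offload all layers to the GPU if VRAM allows
        n_ctx=4096,       # context window
    )

    out = llm("Q: What is the capital of France? A:", max_tokens=16)
    print(out["choices"][0]["text"])
    ```

    With all layers offloaded (`n_gpu_layers=-1`), a 4-bit 7B quant fits comfortably in 8 GB of VRAM.
    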

    • cupatkay@alien.topB
      10 months ago

      Maybe a used RTX 30 series? They have tensor cores, which help a lot in running AI stuff. I got a used 3070 for $240 a few weeks ago.

    • YoloSwaggedBased@alien.topB
      10 months ago

      Consider something with a bit more VRAM, like a 2080 Ti or a 3080. The extra headroom (10–11 GB) lets you run at higher quantisation precision (e.g. 8-bit instead of 4-bit). A full unquantised 7B model needs a bit over 14 GB, so you’ll need quantisation either way.
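
      The 14 GB figure comes straight from the parameter count: weights alone take roughly `params × bits / 8` bytes. A quick back-of-the-envelope calculator (rule of thumb only; real usage adds overhead for the KV cache, activations, and runtime buffers, often another 1–2 GB):

      ```python
      # Rough weight-memory estimate for an LLM at a given precision.
      # Rule of thumb only: excludes KV cache, activations, and runtime
      # buffers, which typically add another 1-2 GB on top.

      def model_vram_gb(n_params_billion, bits_per_weight):
          """Approximate weight memory in GB for n_params at a given precision."""
          bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
          return bytes_total / 1e9

      # Mistral 7B (~7.24B parameters) at common precisions:
      for bits in (16, 8, 4):
          print(f"{bits:>2}-bit: ~{model_vram_gb(7.24, bits):.1f} GB")
      # 16-bit: ~14.5 GB   8-bit: ~7.2 GB   4-bit: ~3.6 GB
      ```

      So a 4-bit quant of Mistral 7B fits easily on a 10 GB card with room for context, while fp16 doesn’t fit on anything consumer-grade short of a 24 GB 3090/4090.
      
      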