Hi all, I’d like to hear some suggestions on self-hosting LLMs on a remote server and accessing them via a client app or a convenient website. I’d love to hear about your setups, or about products that left a good impression on you.

I’ve hosted Ollama before, but I don’t think it’s intended for remote use. Then again, I’m not really an expert, and maybe there are add-ons or other options that make it work.
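(For context: Ollama does expose a plain HTTP API on port 11434, so "remote use" is mostly a matter of making that port reachable, e.g. starting the server with `OLLAMA_HOST=0.0.0.0` and putting a firewall or reverse proxy in front. A minimal Python sketch of talking to it from a client machine; `myserver` and the model name are placeholders, not anything from this thread:)

```python
# Sketch: calling a remote Ollama instance over its HTTP API.
# Assumes the server was started with OLLAMA_HOST=0.0.0.0 so it
# listens on all interfaces; "myserver" is a placeholder hostname.

import json
from urllib import request

def build_generate_request(host: str, model: str, prompt: str,
                           port: int = 11434):
    """Build the URL and JSON payload for Ollama's /api/generate."""
    url = f"http://{host}:{port}/api/generate"
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON response instead of a stream
    }
    return url, payload

def generate(host: str, model: str, prompt: str) -> str:
    """POST the prompt and return the model's text response."""
    url, payload = build_generate_request(host, model, prompt)
    req = request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # needs a reachable server
        return json.loads(resp.read())["response"]

# Example (requires a running server, so commented out here):
# print(generate("myserver", "llama3", "Why is the sky blue?"))
```

If you expose it beyond a LAN, note that the API has no authentication built in, so a VPN or an authenticating reverse proxy in front is the usual approach.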

Thanks in advance!

  • Bluefruit@lemmy.world · 15 days ago

    Me personally, I use my AMD 7700 XT to run Ollama on my main PC. It can be helpful for troubleshooting as the internet gets worse and worse to search, especially when I don’t know what the issue is. That’s my main use case for it, but I’d like to set up something with RAG and use it to help me with documentation when I have questions.

    I don’t think it’s super worth it to use a VPS for an LLM if you already have a decent GPU that you can run it on. If you don’t already have the hardware, plenty of older GPUs can run the models pretty well. My 1070 Ti still kicks ass all these years later, and you can find them used for $100 or less on eBay. I’ve used it for Ollama as well and it does just fine.

    Will it be super fast, or run a really big model? No, but for personal use I don’t see any benefit to paying a monthly subscription, and like I said, it works well enough for me.

    It’s also more secure and private to host it yourself, if that’s worth anything to you. That’s one of the biggest reasons I self-host.

    • ddh@lemmy.sdf.org · 11 days ago

      If you do set up a RAG store, please post the tech stack you use, as I’m in a similar situation. The built-in document store management in ollama+openwebui is a bit clunky.
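      (Not a full stack, but the retrieval half of RAG can be quite small. A toy sketch below: it uses bag-of-words cosine similarity as a stand-in for real embeddings, which you’d swap out for an actual embedding model in practice; the chunks and questions are made-up examples:)

```python
# Toy retrieval step for a RAG pipeline: pick the document chunk most
# similar to the question, then prepend it to the prompt for the LLM.
# Bag-of-words cosine similarity stands in for real embeddings here.

import math
from collections import Counter

def bow_vector(text: str) -> Counter:
    """Crude 'embedding': word-count vector of the lowercased text."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question: str, chunks: list[str]) -> str:
    """Return the chunk most similar to the question."""
    q = bow_vector(question)
    return max(chunks, key=lambda c: cosine(q, bow_vector(c)))

def build_prompt(question: str, chunks: list[str]) -> str:
    """Stuff the retrieved chunk into the prompt as context."""
    context = retrieve(question, chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"

# Made-up documentation chunks for illustration:
chunks = [
    "To restart the service, run systemctl restart myapp.",
    "Logs are written to /var/log/myapp/ by default.",
]
print(build_prompt("where do the logs go?", chunks))
```

      The prompt that `build_prompt` returns is what you’d send to the model (e.g. via Ollama’s API); the real work in a production stack is chunking, embedding, and storing the documents, which is exactly the part that’s clunky today.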

      • Bluefruit@lemmy.world · edited · 11 days ago

        If I can figure it out, I’ll be sure to post something lol.

        So far I’ve found a Python project that is supposed to enable RAG, but I have yet to try it, and after reinstalling my Linux PC with Pop!_OS, I’m having less than stellar success getting Ollama to run.