Self-Hosted AI is pretty darn cool

chagall@lemmy.world · 1 year ago

Self-Hosted AI is pretty darn cool

coffee_with_cream@sh.itjust.works · 1 year ago

Uncensored models are so much better, too. ChatGPT is like one of those plastic children’s toy hammers, whereas real models are titanium hammers.

HumanPerson@sh.itjust.works · 1 year ago

Yeah, I like it too. My only issue is Ollama’s lack of Intel support. I have been looking at issue 1590 on their GitHub. For now I have a 1050 Ti in a cardboard box PC with other hardware being 10+ years old and a mixed set of RAM totalling 12 GB. It also has a 100 Mbit NIC, so I can’t take advantage of full internet speed when downloading models. The worst part is they can support Intel, but haven’t merged the solution because of an issue with the Windows Intel drivers. Linux is fine but I can’t have it. I wasn’t planning to rant, but I already typed it so… enjoy?

chagall@lemmy.world · edit-2 1 year ago

Yeah, I have an NVIDIA GPU and it is magic. The best part: while you are using Ollama, open a second terminal window and run watch -n 0.5 nvidia-smi, and you can see your GPU usage go up and down in real time as you ask the model questions. Pretty cool.

Hopefully they get the Intel Arc folks up and running soon.
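
For anyone who would rather log those numbers than eyeball them, here is a minimal Python sketch of the same polling idea. It assumes nvidia-smi is on your PATH and that every queried field comes back as a plain integer (some cards report "[N/A]" for certain fields, which this does not handle). The function names parse_gpu_line, read_gpus and watch_gpus are made up for illustration.

```python
import shutil
import subprocess
import time

# Fields requested from nvidia-smi; values come back as CSV, one line per GPU.
QUERY = "utilization.gpu,memory.used,memory.total"


def parse_gpu_line(line):
    """Parse one CSV line such as '45, 1024, 4096' into a dict."""
    util, used, total = (int(field.strip()) for field in line.split(","))
    return {"util_pct": util, "mem_used_mib": used, "mem_total_mib": total}


def read_gpus():
    """Call nvidia-smi once and return one dict per GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_gpu_line(l) for l in result.stdout.splitlines() if l.strip()]


def watch_gpus(interval=0.5, samples=10):
    """Print GPU utilization and memory use every `interval` seconds."""
    for _ in range(samples):
        for index, gpu in enumerate(read_gpus()):
            print(
                f"GPU {index}: {gpu['util_pct']}% util, "
                f"{gpu['mem_used_mib']}/{gpu['mem_total_mib']} MiB"
            )
        time.sleep(interval)


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        watch_gpus()
```

Run it in the second terminal while you send prompts to Ollama, and the utilization figure should climb during generation.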

superglue@lemmy.dbzer0.com · 1 year ago

What kinds of specs do you need to run it well? I’ve got a laptop with a 3070.

coffee_with_cream@sh.itjust.works · edit-2 1 year ago

You probably want 48 GB of VRAM or more to run the good stuff. I recommend renting GPU time instead of using your own hardware, via AWS or other vendors - runpod.io is pretty good.

NotMyOldRedditName@lemmy.world · 1 year ago

Kinda defeats the purpose of doing it private and local.

I wouldn’t trust any claims a 3rd party service makes with regard to being private.

31337@sh.itjust.works · 1 year ago

IDK, looks like 48GB cloud pricing would be $0.35/hr => $255/month. Used 3090s go for $700. Two 3090s would give you 48GB of VRAM and cost $1400 (I’m assuming you can do “model-parallel” with Llama; I’ve never tried running an LLM, but it should be possible and work well). So, the break-even point would be <6 months. Hmm, but if Serverless works well, that could be pretty cheap. It would probably take a few minutes to process and load a ~48GB model on every cold start, though?
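
The arithmetic above, as a quick Python sketch so you can plug in your own prices. It assumes the rented GPU runs 24/7 (about 730 hours a month) and ignores electricity, a PSU upgrade, and resale value; the function names are made up for illustration.

```python
# Average hours in a month: 8760 hours per year / 12.
HOURS_PER_MONTH = 730


def monthly_cloud_cost(rate_per_hour, hours=HOURS_PER_MONTH):
    """Cost of renting a GPU for `hours` hours at `rate_per_hour` dollars."""
    return rate_per_hour * hours


def break_even_months(hardware_cost, rate_per_hour, hours=HOURS_PER_MONTH):
    """Months of renting after which buying the hardware would have been cheaper."""
    return hardware_cost / monthly_cloud_cost(rate_per_hour, hours)


if __name__ == "__main__":
    rate = 0.35          # $/hr for a 48GB cloud GPU (figure from the comment)
    hardware = 2 * 700   # two used 3090s at $700 each
    print(f"cloud: ${monthly_cloud_cost(rate):.2f}/month")
    print(f"break-even: {break_even_months(hardware, rate):.1f} months")
```

If you only use the GPU a few hours a day, pass a smaller hours value and the break-even point moves out accordingly, which is where renting starts to look better.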

ffhein@lemmy.world · 1 year ago

Assuming they already own a PC, if someone buys two 3090s for it they’ll probably also have to upgrade their PSU, so that might be worth including in the budget. But it’s definitely a relatively low cost way to get more VRAM; there are people who run 3 or 4 RTX 3090s too.

CallMeButtLove@lemmy.world · 1 year ago

Is there a way to host an LLM in a docker container on my home server but still leverage the GPU on my main PC?

Dataprolet@lemmy.dbzer0.com · 1 year ago

Isn’t this using a lot of computing power?

Phoenicianpirate@lemm.ee · 1 year ago

I am going to be buying a monster high end machine and I want to do all the AI stuff on it.

EonNShadow@pawb.social · 1 year ago

“learned some things like Linux, command line, docker, and networking/pfsense” “I don’t consider myself technical”

Don’t sell yourself short, I work in IT and have colleagues on our helpdesk who would struggle endlessly with those concepts.

I hereby dub you a tech person, like it or not, those skills can and do pay the bills.

damnthefilibuster@lemmy.world · 1 year ago

Now that you’ve dubbed OP a tech person…

Hey OP, can you help me fix my printer? It’s only printing “RED RUM RED RUM” for some reason.

sugar_in_your_tea@sh.itjust.works · 1 year ago

Have you tried giving it red rum?

Oh, and make sure you hold it out with the insides of your arms exposed, it’ll feel less threatening that way.

webghost0101@sopuli.xyz · 1 year ago

Thank you for this. I consider myself technical and those words felt like a punch in the gut.

chagall@lemmy.world · 1 year ago

I’m sorry if I offended. I can’t code or understand existing code and have always felt that technical people code. I guess I should expand my definition. Again, sorry that my words felt like a punch in the gut… wasn’t my intention at all.

IsoKiero@sopuli.xyz · 1 year ago

It depends heavily on what you do and what you’re comparing yourself against. I’ve been making a living with IT for nearly 20 years and I still don’t consider myself to be an expert on anything, but it’s a really wide field, and what I’ve learned is that the things I consider ‘easy’ or ‘simple’ (mostly with Linux servers) are surprisingly difficult for people who’d (for example) wipe the floor with me if we competed on planning and setting up a server infrastructure or building enterprise networks.

And of course I’ve also met the other end of the spectrum. People who claim to be ‘experts’ or ‘senior techs’ at something are so incompetent at their tasks, or their field of knowledge is so ridiculously narrow, that I wouldn’t trust them with anything above first tier helpdesk, if even that. And the sad part is that those ‘experts’ often make way more money than me because they happened to score a job at some big IT company and their hours are billed accordingly.

And then there’s the whole other can of worms on forums like this, where ‘technical people’ range from someone who can install an operating system by following instructions to the guys who write assembly code for some obscure old hardware just for the fun of it.

toynbee@lemmy.world · 1 year ago

With all respect, the first paragraph seems self-contradictory.

Appoxo@lemmy.dbzer0.com · 1 year ago

Very technical vs not can be very subjective.
It can be a 50 year old sysadmin vs Adam I pulled from the street, or a graybeard Linux admin vs a beginner sysadmin only in it for the career instead of the passion (those can be very non-technical but good problem solver folks).

I know my comparison is flawed

chasingtheflow@lemmy.world · 1 year ago

Very cool! You can use something like Tailscale to access your local services remotely without exposing them to the internet.