Smorty [she/her]
I’m a person who tends to program stuff in Godot and also likes to look at clouds. Sometimes they look really spicy!


- 22 Posts
- 91 Comments
Smorty [she/her]@lemmy.blahaj.zoneto AI@lemmy.ml•Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.
1 · 1 year ago

i totally agree… with everything. 6GB really is smol and, cuz imma crazy person, i currently try to optimize everything for the llama3.2 3B Q4 model so people with even less VRAM can use it. i really like the idea of people just having some smol LLM laying around on their pc and devs being able to use it.
i really should probably opt for APIs, you’re right. the only API I ever used was Cohere, cuz yea their CR+ model is real nice. but i still wanna use smol models for a smol price, if any. imma have a look at the APIs you listed. Never heard of Kobold Horde and Samba, so i’ll have a look at those… or i’ll go the lazy route and choose deepseek, cuz it’s apparently unreasonably cheap for SOTA perf. so eh…
also yes! Lemmy really does seem anti-AI, and i’m fine with that. i just say:

“yeah companies use it in obviously dum ways, but the tech is super interesting”

which is a reasonable argument i think. so yes, local llm go! i wanna get that new top amd gpu once it gets announced, so i’ll be able to run those spicy 32B models. for now i’ll just stick with 8B and 3B cuz they work quick and kinda do what i want.
Smorty [she/her]@lemmy.blahaj.zoneto AI@lemmy.ml•Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.
1 · 1 year ago

could you define “right settings”?
I assume Q4, and maybe the context window at Q8 as well. Anything else to tweak?
I just have a smol gtx1060 with 6GB VRAM, so i probably can’t fit it on mine and imma have to partly use the cpu. but maybe other readers here can! (I’m just a silly ollama user, not knowing anything more complex than the tokenizer… so yea, maybe put a lil infodump in here to make us all smarter please <3 )
EDIT: brucethemoose probably referred to this model named “Medius”. there is no 14B in the name.
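For the “can I fit it in 6GB VRAM” question above, here is a rough back-of-envelope sketch. The numbers are my own assumptions, not from the thread: Q4 formats average roughly 4.5 bits per weight (the quantisation scales add overhead), and I budget about 1 GB extra for the KV cache and runtime buffers.

```python
# Rough estimate: does a Q4-quantised model fit in a given amount of VRAM?
# Assumptions (mine): ~4.5 bits/weight average for Q4, ~1 GB extra overhead.

def q4_vram_gb(params_billions: float, bits_per_weight: float = 4.5,
               overhead_gb: float = 1.0) -> float:
    # weights: params * bits / 8 gives bytes, divide by 1e9 for GB
    weight_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb + overhead_gb

print(round(q4_vram_gb(14), 1))  # a 14B model at Q4: ~8.9 GB, won't fit in 6 GB
print(round(q4_vram_gb(3), 1))   # a 3B model: ~2.7 GB, comfortable on a gtx1060
```

So a 14B model at Q4 would indeed spill into CPU offloading on a 6GB card, while 3B-class models fit with room for context.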
Smorty [she/her]@lemmy.blahaj.zoneto AI@lemmy.ml•Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.
1 · 1 year ago

i luv command R+ so very much, and now i wanna try that smoler model. but also the newly released r7b model was really not the best, so i got sad…
Smorty [she/her]@lemmy.blahaj.zoneto AI@lemmy.ml•Man learns he’s being dumped via “dystopian” AI summary of texts
1 · 1 year ago

i like how when ai summarizes a sad dramatic thing, people go :o like it’s something special and not exactly what it was trained to do.
Oh they totally will try. Microsoft is dum enough to try it, just like they are dum enough to try to train massive LLMs, and damn, they not be showin’ successes til now :)
I remember, I think I also saw that video by that one Linux/windows tech guy
They cannot do that. The foss community is too strong to fail, many people use GNU/Linux specifically because it’s not owned by EvilCo™ and EvilHoldings™.
Could you maybe name some distros and DEs which have this feature pre-built?
I have used mint, Debian and Fedora and none of them seem to have this kind of feature.
ok fair u got me there. Lutris really is a solution, but it still feels like some game-specific launcher you gotta run to run normal programs.
i was mostly complaining about the double click thing.
I’m using an index. You can use Envision, as it supports wireless headsets and gives you a nice interface to set things up.
That is also what I use, as Envision also has support for cabled headsets.
In general, the lvra website is a great source for cool foss VR stuff on linux.
I actually never heard of this saying, but I just looked it up. Woah, that’s really a phrase they use internally, hm? Crazy.
And it does accurately describe what they try to do here. It can’t really work like that, since many people use GNU/Linux specifically because it’s not owned by EvilCo™. But they could probably take over some part of the server-hosting business like this. And that is a scary thought.
Imagine, they could make it super easy to deploy things by incorporating premade docker containers into their UI thingy. That’s - like - real bad.
Small addition:
Now that VR works essentially perfectly on GNU/Linux, even on Wayland with Gnome and an nvidia GPU, I have now stopped dualbooting for occasional VR Chat and Beat Saber (which are VR games).

In my opinion, when looking away from online games with anticheat, Microsoft’s Office and Adobe’s whatever software, there is no reason to use Winblows anymore.
The amount of configuration GNU/Linux gives me is truly empowering, running any scripts I want using shortcuts being a big one for me.
Some shortcuts I use daily:

- Super+E -> Nautilus (obvious)
- Super+W -> Firefox
- Super+Y -> Youtube
- Super+C -> Local LLM chat
- Super+G -> Launch Godot
- Generally vim navigation
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
2 · 1 year ago

yup, renamed it to […].rules.backup. Thanks for responding though!
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
2 · 1 year ago

why did u delete your comment?
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
1 · 1 year ago

Just tried it, and sadly that didn’t change anything after a reboot.
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
1 · 1 year ago

Fair, but they supported it a bit before that too, I think. Like, they allowed it to show up on the login screen.
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
2 · 1 year ago

Unfortunately that did not fix it for me. I have now renamed the file to […].backup, but it still only displays X11 options.
Something similar to this already kinda exists on HF with the 1.58-bit quantisation, which seems to get very similar performance to the original Llama 3 8B model. That’s essentially a two-bit quantisation with reasonable performance!
I’m even more excited for running 8B models at the speed of 1B! Laughably fast ok-quality generations in JSON format would be crazy useful.
Also yeah, that 7B on mobile was not the best example. Again, probably 1B to 3B is the sweet spot for mobile (I’m running Qwen2.5 0.5B on my phone and it works well for simple JSON).
EDIT: And imagine the context lengths we would be able to run on our GPUs at home! What a time to be alive.
but i wanna have a website others can access too. I tried using VPNs for cool stuff already (like controlling my lil raspberry robot from work with my phone) but I want this website to be available to all the people…
should i just bite the bullet and rent some hosting service? Or is there still hope for me putting “setup home website server” on my resume?
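For what it’s worth, the smallest possible starting point for a home-hosted site is Python’s stdlib server; this is a minimal sketch, not production advice. To be reachable from outside you’d still need port forwarding on your router (or a tunnel service) plus a reverse proxy with TLS in front.

```python
# Minimal static-site server using only the Python standard library.
# Serves the current working directory on all network interfaces.
from http.server import ThreadingHTTPServer, SimpleHTTPRequestHandler

def make_server(port: int = 8080) -> ThreadingHTTPServer:
    # "0.0.0.0" listens on all interfaces, not just localhost,
    # so other machines on the LAN can reach it
    return ThreadingHTTPServer(("0.0.0.0", port), SimpleHTTPRequestHandler)

# To actually serve (blocks forever): make_server().serve_forever()
```

For anything public-facing you’d typically put nginx or Caddy in front for TLS, but as a resume-line proof of concept this is where most home setups start.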