Smorty [she/her]
I’m a person who tends to program stuff in Godot and also likes to look at clouds. Sometimes they look really spicy!


- 22 Posts
- 91 Comments
Smorty [she/her]@lemmy.blahaj.zoneto AI@lemmy.ml•Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.
1 · 1 year ago

i totally agree… with everything. 6GB really is smol and, cuz imma crazy person, i currently try to optimize everything for the llama3.2 3B Q4 model so people with even less VRAM can use it. i really like the idea of people just having some smol LLM laying around on their pc and devs being able to use it.
i really should probably opt for APIs, you’re right. the only API I ever used was Cohere, cuz yea their CR+ model is real nice. but i still wanna use smol models for a smol price, if any. imma have a look at the APIs you listed. Never heard of Kobold Horde and Samba, so i’ll have a look at those… or i’ll go the lazy route and choose deepseek, cuz it’s apparently unreasonably cheap for SOTA perf. so eh…
also yes! Lemmy really does seem anti-AI, and i’m fine with that. i just say:

“yeah companies use it in obviously dum ways, but the tech is super interesting”

which is a reasonable argument i think. so yes, local llm go! i wanna get that new top amd gpu once it gets announced, so i’ll be able to run those spicy 32B models. for now i’ll just stick with 8B and 3B cuz they work quick and kinda do what i want.
Smorty [she/her]@lemmy.blahaj.zoneto AI@lemmy.ml•Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.
1 · 1 year ago

could you define “right settings”?
I assume Q4, and maybe the context window at Q8 as well. Anything else to tweak?
I just have a smol gtx1060 with 6GB VRAM, so i probably can’t fit it on mine and imma have to partly use the cpu. but maybe other readers here can! (I’m just a silly ollama user, not knowing anything more complex than the tokenizer… so yea, maybe put a lil infodump in here to make us all smarter please <3 )
EDIT: brucethemoose probably referred to this model named “Medius”. there is no 14B in the name.
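For the “can I fit it in 6GB VRAM” question above, here is a rough back-of-envelope sketch. The numbers are my own assumptions, not from the thread: Q4 formats average roughly 4.5 bits per weight (the quantisation scales add overhead), and I budget about 1 GB extra for the KV cache and runtime buffers.

```python
# Rough estimate: does a Q4-quantised model fit in a given amount of VRAM?
# Assumptions (mine): ~4.5 bits/weight average for Q4, ~1 GB extra overhead.

def q4_vram_gb(params_billions: float, bits_per_weight: float = 4.5,
               overhead_gb: float = 1.0) -> float:
    # weights: params * bits / 8 gives bytes, divide by 1e9 for GB
    weight_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb + overhead_gb

print(round(q4_vram_gb(14), 1))  # a 14B model at Q4: ~8.9 GB, won't fit in 6 GB
print(round(q4_vram_gb(3), 1))   # a 3B model: ~2.7 GB, comfortable on a gtx1060
```

So a 14B model at Q4 would indeed spill into CPU offloading on a 6GB card, while 3B-class models fit with room for context.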
Smorty [she/her]@lemmy.blahaj.zoneto AI@lemmy.ml•Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.
1 · 1 year ago

i luv command R+ so very much, and now i wanna try that smoler model. but also the newly released r7b model was really not the best, so i got sad…
Smorty [she/her]@lemmy.blahaj.zoneto AI@lemmy.ml•Man learns he’s being dumped via “dystopian” AI summary of texts
1 · 1 year ago

i like how when ai summarizes a sad dramatic thing, people go :o like it’s something special and not exactly what it was trained to do.
Oh they totally will try. Microsoft is dum enough to try it, just like they are dum enough to try to train massive LLMs, and damn, they not be showin’ successes til now :)
I remember, I think I also saw that video by that one Linux/windows tech guy
They cannot do that. The foss community is too strong to fail, many people use GNU/Linux specifically because it’s not owned by EvilCo™ and EvilHoldings™.
Could you maybe name some distros and DEs which have this feature pre-built?
I have used mint, Debian and Fedora and none of them seem to have this kind of feature.
ok fair u got me there. Lutris really is a solution, but it still feels like some game-specific launcher you gotta run to run normal programs.
i was mostly complaining about the double click thing.
I’m using an index. You can use Envision, as it supports wireless headsets and gives you a nice interface to set things up.
That is also what I use, as Envision also has support for cabled headsets.
In general, the lvra website is a great source for cool foss VR stuff on linux.
I actually never heard of this saying, but I just looked it up. Woah, that’s really a phrase they use internally, hm? Crazy.
And it does accurately describe what they try to do here. It can’t really work like that, since many people use GNU/Linux specifically because it’s not owned by EvilCo™. But they could probably take over some part of the server-hosting business like this. And that is a scary thought.
Imagine, they could make it super easy to deploy things by incorporating premade docker containers into their UI thingy. That’s - like - real bad.
Small addition:
Now that VR works essentially perfectly on GNU/Linux, even on Wayland with Gnome and an nvidia GPU, I have now stopped dualbooting for occasional VR Chat and Beat Saber (which are VR games).

In my opinion, when looking away from online games with anticheat, Microsoft’s Office and Adobe’s whatever software, there is no reason to use Winblows anymore.
The amount of configuration GNU/Linux gives me is truly empowering, running any scripts I want using shortcuts being a big one for me.
Some shortcuts I use daily:

- Super+E -> Nautilus (obvious)
- Super+W -> Firefox
- Super+Y -> Youtube
- Super+C -> Local LLM chat
- Super+G -> Launch Godot
- Generally vim navigation
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
2 · 1 year ago

yup, renamed it to […].rules.backup. Thanks for responding though!
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
2 · 1 year ago

why did u delete your comment?
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
1 · 1 year ago

Just tried it, and sadly that didn’t change anything after a reboot.
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
1 · 1 year ago

Fair, but they supported it a bit before that too, I think. Like, they allowed it to show up on the login screen.
Smorty [she/her]@lemmy.blahaj.zoneOPto Linux@lemmy.ml•No wayland option despite fresh gnome install (Debian 12, testing, nvidia)
2 · 1 year ago

Unfortunately that did not fix it for me. I have now renamed the file to […].backup, but it still only displays X11 options.
Something similar to this already kinda exists on HF with the 1.58-bit quantisation, which seems to get very similar performance to the original Llama 3 8B model. That’s essentially a two-bit quantisation with reasonable performance!
I’m even more excited for running 8B models at the speed of 1B! Laughably fast ok-quality generations in JSON format would be crazy useful.
Also yeah, that 7B on mobile was not the best example. Again, probably 1B to 3B is the sweet spot for mobile (I’m running Qwen2.5 0.5B on my phone and it works well for simple JSON).
EDIT: And imagine the context lengths we would be able to run on our GPUs at home! What a time to be alive.
but i wanna have a website others can access too. I tried using VPNs for cool stuff already (like controlling my lil raspberry robot from work with my phone) but I want this website to be available to all the people…
should i just bite the bullet and rent some hosting service? Or is there still hope for me putting “setup home website server” on my resume?
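For what it’s worth, the smallest possible starting point for a home-hosted site is Python’s stdlib server; this is a minimal sketch, not production advice. To be reachable from outside you’d still need port forwarding on your router (or a tunnel service) plus a reverse proxy with TLS in front.

```python
# Minimal static-site server using only the Python standard library.
# Serves the current working directory on all network interfaces.
from http.server import ThreadingHTTPServer, SimpleHTTPRequestHandler

def make_server(port: int = 8080) -> ThreadingHTTPServer:
    # "0.0.0.0" listens on all interfaces, not just localhost,
    # so other machines on the LAN can reach it
    return ThreadingHTTPServer(("0.0.0.0", port), SimpleHTTPRequestHandler)

# To actually serve (blocks forever): make_server().serve_forever()
```

For anything public-facing you’d typically put nginx or Caddy in front for TLS, but as a resume-line proof of concept this is where most home setups start.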