In search for a new self-hosted LLM

Tanka@lemmy.ml · edit-2 4 days ago

zorflieg@lemmy.world · 3 days ago

Gemma4 e4b at 8-bit quantization (quant8) will fit in 12 GB and is good.
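
As a rough sanity check on whether a quantized model fits in a given amount of VRAM: the weights take about (parameter count × bits per weight ÷ 8) bytes, plus some headroom for the KV cache and runtime buffers. A minimal sketch of that arithmetic follows. The 8B parameter count and the 2 GB overhead are illustrative assumptions, not measured figures for any particular model.

```python
def estimate_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    # billions of parameters * bytes per parameter = gigabytes
    return params_billions * bits_per_weight / 8


def fits_in_vram(params_billions: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed overhead fit in vram_gb.

    overhead_gb is a guessed allowance for KV cache and runtime buffers;
    real usage grows with context length.
    """
    return estimate_weight_gb(params_billions, bits_per_weight) + overhead_gb <= vram_gb


# Hypothetical 8B-parameter model on a 12 GB card
print(estimate_weight_gb(8, 8))    # 8-bit weights
print(fits_in_vram(8, 8, 12))      # 8-bit: weights + overhead vs 12 GB
print(fits_in_vram(8, 16, 12))     # 16-bit: same check without quantization
```

Treat the result as a first-pass filter only; actual memory use depends on the runtime, the context length, and the exact quantization format.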

Evotech@lemmy.world · 2 days ago

I’d use some Chinese model. Qwen3.5 Claude 4.6 distilled abliterated is what I use.

Matt@lemmy.ml · 3 days ago

Qwen is pretty good. Also try LFM models.

Jozzo@lemmy.world · 4 days ago

I find Qwen3.5 is the best at tool calling and agent use; otherwise Gemma4 is a very solid all-rounder and should be the first you try. Tbh gpt-oss is still good to this day. Are you running into any problems with it?

Tanka@lemmy.ml · 4 days ago

No problems per se. I just realized that I had not checked for an update in a long time.

jacksilver@lemmy.world · 3 days ago

You’re probably aware, but updating the model periodically is a good idea just because things do change over time.

A model from two years ago was trained on data that is at least two years old, meaning any changes in technology, code, or world events since then wouldn’t be reflected in the model.

SuspciousCarrot78@lemmy.world · edit-2 3 days ago

What sort of coding and what sort of automation tasks? The latter is an easier ticket to fill than the former, though I might have an idea for you on that end if coding is a must.

theunknownmuncher@lemmy.world · 4 days ago

How much VRAM?

carzian@lemmy.ml · 4 days ago

I’m in the same boat. You’ll get better responses if you post your machine specs.

jaschen306@sh.itjust.works · 4 days ago

I’m running gemma4 26b MoE for most of my agent calls. I use glm5:cloud for my development agent because 26b struggles when the context window gets too big.
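
A split like this can be sketched as a small router that sends short prompts to the local model and long ones to the hosted model. This is only an illustration: the model labels, the 16,000-token threshold, and the 4-characters-per-token estimate are all placeholder assumptions, not settings from any real setup.

```python
LOCAL_MODEL = "local-model"   # placeholder label for the self-hosted model
CLOUD_MODEL = "cloud-model"   # placeholder label for the hosted model
MAX_LOCAL_TOKENS = 16_000     # assumed point where the local model degrades


def rough_token_count(text: str) -> int:
    """Crude estimate assuming about 4 characters per token.

    A real tokenizer gives different counts; this is only a heuristic.
    """
    return len(text) // 4


def pick_model(prompt: str, max_local_tokens: int = MAX_LOCAL_TOKENS) -> str:
    """Return the local model label for short prompts, the cloud one otherwise."""
    if rough_token_count(prompt) <= max_local_tokens:
        return LOCAL_MODEL
    return CLOUD_MODEL


print(pick_model("Summarize this short note."))
print(pick_model("x" * 100_000))
```

In practice the threshold would be tuned by watching where the local model starts dropping instructions or losing track of earlier context.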