Hey :) I’ve been using gpt-oss-20b on my home lab for a while now for lightweight coding tasks and some automation. I’m not really up to date with the current self-hosted LLMs, and since the model I’m using was released at the beginning of August 2025 (from an LLM development perspective, that feels like an eternity to me), I wanted to tap the collective wisdom of Lemmy and maybe replace my model with something better.
Edit:
Specs:
GPU: RTX 3060 (12 GB VRAM)
RAM: 64 GB
gpt-oss-20b does not fit into VRAM completely, but it is partially offloaded and is reasonably fast (fast enough for me).
Gemma4 e4b at an 8-bit quant will fit in 12 GB and is good.
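If you want a rough sanity check for "will it fit", here is a back-of-the-envelope sketch. It only counts the weights (parameter count times bits per weight) plus a flat allowance for KV cache and runtime buffers. The function names, the 1.5 GiB overhead, and the parameter counts in the usage lines are all made-up illustrative assumptions, not official figures for any model mentioned in this thread.

```python
GIB = 1024 ** 3  # bytes per GiB


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def fits_in_vram(n_params: float, bits_per_weight: float,
                 vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """Rough check: weights plus a flat overhead guess must fit in VRAM.

    overhead_gib is an assumed allowance for KV cache and buffers;
    real usage grows with context length.
    """
    return weight_gib(n_params, bits_per_weight) + overhead_gib <= vram_gib


if __name__ == "__main__":
    # Hypothetical 8B-parameter model at 8 bits per weight on a 12 GiB card.
    print(round(weight_gib(8e9, 8), 2), fits_in_vram(8e9, 8, 12))
    # Hypothetical 20B-parameter model at 16 bits per weight on the same card.
    print(round(weight_gib(20e9, 16), 2), fits_in_vram(20e9, 16, 12))
```

Treat the result as a first filter only. Anything that fails the check can still run with partial offloading to system RAM, just slower.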
I’d use some Chinese model. Qwen3.5 Claude 4.6 distilled abliterated is what I use.
Qwen is pretty good. Also try LFM models.
I find Qwen3.5 is the best at tool calling and agent use; otherwise Gemma4 is a very solid all-rounder and should be the first one you try. Tbh gpt-oss is still good to this day. Are you running into any problems with it?
No problems per se. I just realized that I had not checked for an update in a long time.
You’re probably aware, but updating the model periodically is a good idea simply because things do change over time.
A model from two years ago was trained on data that is at least two years old, meaning any changes in technology, code, or world events since then wouldn’t be reflected in the model.
What sort of coding and what sort of automation tasks? The latter is an easier ticket to fill than the former, though I might have an idea for you on that end if coding is a must
How much VRAM?
I’m in the same boat. You’ll get better responses if you post your machine specs.
I’m running gemma4 26b MoE for most of my agent calls. I use glm5:cloud for my development agent because 26b struggles when the context window gets too big.




