

Not intentionally :)


I would be super interested to hear if it could do that. I genuinely don’t know, because I haven’t tried it.
If you can export your emails in the correct format, it might actually work. Try a small batch and report back.
PS: you DON’T HAVE TO run >>summ if you don’t want to. You can ask questions against the raw files too. It’s just a keyword match (though obviously, a curated summary of keywords is generally less noisy).
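If you’re curious what “just a keyword match” looks like, it’s roughly this (an illustrative sketch only; the real router’s file layout, extensions, and matching rules are its own):

```python
from pathlib import Path

def keyword_search(query: str, root: str = "vault") -> list[tuple[str, str]]:
    """Naive keyword match: return (filename, line) pairs containing any query term."""
    terms = [t.lower() for t in query.split() if len(t) > 2]  # skip tiny words
    hits = []
    for path in Path(root).rglob("*.md"):
        for line in path.read_text(errors="ignore").splitlines():
            if any(t in line.lower() for t in terms):
                hits.append((path.name, line.strip()))
    return hits
```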
Wishing you luck! I didn’t make this enterprise-grade, but if it works, use it.


Ah. So -
First prize: picture of you
Second prize: two pictures
?
:P


Oh it can try…but you can see its brain. That’s the glass box part of this. You can LITERALLY see why it says what it says, when it says it. And, because it provides references, you can go and check them manually if you wish.
Additionally (and this is the neat part): the router actually operates outside of the jurisdiction of your LLM. Like, the LLM can only ask it questions. It can’t affect the router’s (deterministic) operation. The router gives no shits about your LLM.
Sometimes, the LLM might like to give you some vibes about things. Eg: IF YOU SHOUT AT IT LIKE THIS, the memory module of the router activates and stores that as a memory (because I figured, if you’re shouting at the LLM, it’s probably important enough in the short term. That or you’re super pissed).
The LLM may “vibe” a bit (depending on the temp, seed, top_k etc.), but 100/100, ALL CAPS >8 WORDS = store that shit into facts.json
Example:
User: MY DENTIST APPOINTMENT IS 2:30PM ON SATURDAY THE 18TH.
LLM: Gosh, I love dentists! They’re soooo dreamy! <----PS: there’s no fucking way your LLM is saying this, ever, especially with the settings I cooked into the router. But anywayz
[later]
User: ?? When is my dentist appointment again?
LLM: The user’s dentist appointment is at 2:30 PM on Saturday, the 18th. The stored notes confirm this time and date, with TTL 4 and one touch count. No additional details (e.g., clinic, procedure) are provided in the notes.
Confidence: high | Source: Stored notes
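For the skeptics: the trigger really is that dumb, and that’s deliberate. A minimal sketch of the rule, assuming Python and illustrative names (this is not the actual module, and facts.json’s real schema may differ):

```python
import json
import time
from pathlib import Path

FACTS = Path("facts.json")

def maybe_store_fact(user_msg: str) -> bool:
    """Deterministic memory rule: ALL CAPS and more than 8 words -> store it."""
    words = user_msg.split()
    is_shouting = user_msg == user_msg.upper() and any(c.isalpha() for c in user_msg)
    if not (is_shouting and len(words) > 8):
        return False
    facts = json.loads(FACTS.read_text()) if FACTS.exists() else []
    facts.append({
        "text": user_msg,
        "stored_at": time.time(),
        "ttl": 4,          # matches the "TTL 4" in the example answer above
        "touch_count": 0,  # bumped whenever the fact gets retrieved
    })
    FACTS.write_text(json.dumps(facts, indent=2))
    return True
```

No temperature, no sampling, no vibes: the same input always produces the same store/skip decision, which is exactly why this lives in the router and not in the LLM.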
Yes, I made your LLM autistic. You’re welcome


This is a quote from Deming, one of the fathers of modern data analysis. It basically means “I don’t trust you. You’re not god. Provide citations or retract your statement”.


Correct. Curate your sources :)
I can’t LoRA stupid out of a model…but I can do this. If your model is at all obedient and non-stupid, and reasons from good sources, it will do well with the harness.
Would you like to see the benchmarks for the models I recommend in the “minimum reccs” section? They are very strong…and not chosen at random.
Like the router, I bring receipts :)
I’ll cop to that. At a high level it is “tool calling + RAG + guardrails”.
Ok.
But that’s sort of the point: boring plumbing that turns LLMs from improv actors into constrained components.
Addressing your points directly as I understand them -
1) Doesn’t prevent lying
If you mean “LLMs can still hallucinate in general”, yes. No argument. I curtailed them as much as I could with the tools I had.
But llama-conductor isn’t trying to solve “AI truth” as a metaphysical problem. It’s trying to solve a practical one:
In Mentats mode, the model is not allowed to answer from its own priors or chat history. It only gets a facts block from the Vault. No facts → refusal (not “best effort guess”).
That doesn’t make the LLM truthful. It makes it incapable of inventing unseen facts in that mode unless it violates constraints - and then you can audit it because you can see exactly what it was fed and what it output.
So it’s not “solving lying,” it’s reducing the surface area where lying can happen. And making violations obvious.
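To make that concrete, the Mentats path is shaped roughly like this (a sketch under assumptions; `retrieve_facts`, the refusal wording, and the prompt are stand-ins, not the real implementation):

```python
def mentats_answer(question: str, llm, vault) -> str:
    """Answer strictly from Vault facts; refuse when retrieval comes back empty."""
    facts = vault.retrieve_facts(question)  # deterministic retrieval, no LLM involved
    if not facts:
        # No facts -> refusal. The model never gets the chance to improvise.
        return "No stored facts cover that. Refusing rather than guessing."
    prompt = (
        "Answer ONLY from the facts below. If they don't cover the question, say so.\n\n"
        "FACTS:\n" + "\n".join(f"- {f}" for f in facts)
        + f"\n\nQUESTION: {question}"
    )
    return llm.generate(prompt)
```

The audit trail falls out for free: log `prompt` and the completion, and you have the evidence on both sides.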
2) Wouldn’t a normal search algorithm be better?
I don’t know. Would it? Maybe. If all you want is “search my docs,” then yes: use ripgrep + a UI. That’s lighter and more portable.
The niche here is when you want search + synthesis + policy. A plain search algorithm can do wonders at the first part, but it doesn’t give you a consistent behavioral contract across chat, memory, and retrieval.
3) “Everything looks like a nail”
Maybe. But the nail I’m hitting is: “I want local LLMs to shut up when they don’t know, and show receipts when they do.”
That’s a perfectly cromulent nail to hit.
If you don’t want an LLM in the loop at all, you’re right - don’t use this.
If you do want one, this is me trying to make it behave like infrastructure instead of “vibes”.
Now let’s see Paul Allen’s code :P