The model has to pass extensive redteam testing in which the redteam tries to get it to lie, misrepresent its internal state, etc. Merely being wrong is okay (although of course I won't allow any silly rules-lawyering, e.g. a language model so stupid it can't lie). It has to be a language model I care about.
If the redteam clearly isn't very good, or is incentivized not to find anything, it doesn't count. For instance, if the redteam is part of the organization building the language model, that probably won't count.
"Redteam" is being used loosely here: if releasing it to the public + giving a bounty for catching it in a lie doesn't find a lie after a month, that counts.
If the model lies a little I may still accept, but given the lack of an explicit testing procedure I cannot state a hard cutoff. It certainly needs to be more honest than a human.
If the model makes contradictory statements, but not in the same context window, that does not necessarily count. Contradictory statements within the same context window (whatever that happens to mean in 2027) definitely do count as lies.