Direct answer

Yes — anonymize by default, with eyes open about the cost. Strip the "OpenAI said" / "Gemini said" labels before a moderator synthesizes the debate. This reduces a specific, documented failure mode: LLM judges tend to favor outputs from themselves or their siblings. But you pay for it. Identity carries calibration signal — which model is strong at security, which is strong at concurrency, which tends to over-flag — and anonymization throws that signal away.

The right answer depends on the consensus mode. For moderator-decides, anonymize. For a weighted voting threshold where per-provider weight encodes domain expertise, do not. Joint Chiefs anonymizes before the final synthesis and keeps provider attribution in the transcript so you can audit afterward.


What anonymization actually means

In a multi-model debate, each spoke emits a list of findings. Each finding has a title, a severity, a file and line, a rationale, and — if left unredacted — a provider label. Anonymization is the step that removes the label before the moderator's context window is assembled.

What survives in the anonymized view:

  • the finding's title and severity
  • the file and line it points at
  • the full rationale, verbatim

What gets removed:

  • the provider label

The point is to force the moderator — often an LLM itself — to weight findings by the quality of their argument rather than by pattern-matching on brand. Research on LLM-as-judge has found repeatedly that judges tend to prefer outputs from themselves or from models in their own family. Removing the label removes the shortcut.
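The redaction step is mechanical. A minimal sketch, assuming a hypothetical finding schema (the field names here are illustrative, not Joint Chiefs' actual types):

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class Finding:
    """One spoke's finding. Illustrative schema only."""
    title: str
    severity: str   # e.g. "low" | "medium" | "high"
    file: str
    line: int
    rationale: str
    provider: str   # e.g. "openai", "gemini" -- the label to strip

def anonymize(findings: list[Finding]) -> list[dict]:
    """Build the moderator's view: everything survives except the provider label."""
    view = []
    for f in findings:
        d = asdict(f)
        del d["provider"]   # the only field removed
        view.append(d)
    return view

findings = [
    Finding("SQL injection in query builder", "high", "db/query.py", 42,
            "User input is interpolated into the SQL string.", "openai"),
    Finding("Unbounded cache growth", "medium", "cache.py", 17,
            "Entries are never evicted.", "gemini"),
]

for item in anonymize(findings):
    assert "provider" not in item   # the moderator never sees the label
```

The title, severity, location, and rationale pass through verbatim; only the label is dropped before the moderator's context window is assembled.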


Three ways identity leaks through

Anonymization is not airtight. A sufficiently attentive reader — human or model — can often guess which provider wrote which finding, for three reasons.

1. Writing style

Models have voices. One tends toward long, qualifying sentences with hedged confidence. Another is terse and list-heavy. A third opens every finding with a brief restatement of the problem. These fingerprints are stable across prompts and hard to scrub without rewriting the content itself. A moderator from the same family as a spoke can often recognize the sibling.

2. Characteristic vocabulary

Word choice leaks identity. One model favors "utilize" where another says "use." One says "consider" where another says "you might." Technical vocabulary leaks harder: a model trained more heavily on one security corpus will reach for a specific set of terms (TOCTOU, use-after-free, prototype pollution) that another model uses less often. You cannot remove this without rewriting the finding into something generic, at which point you have degraded the content.

3. Formatting preferences

Bullet depth, bold usage, code fence style, use of headers inside a finding, presence of trailing summary — all fingerprints. Some models obey output schemas rigidly. Others drift. A judge that has seen enough output from each provider starts recognizing layout alone.

None of this means anonymization is pointless. Removing the explicit label still meaningfully reduces bias — the leak is a degraded signal, not a clean one. But anonymization is a mitigation, not a guarantee, and anyone designing around it should understand that.
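To make the formatting leak concrete, here is a toy fingerprint over a few surface features. This is purely illustrative: real stylometry uses far richer signals, and none of these thresholds come from Joint Chiefs.

```python
def format_fingerprint(text: str) -> dict[str, float]:
    """Crude formatting features that can leak identity after labels are stripped."""
    lines = text.splitlines() or [""]
    return {
        # share of lines that are bullets
        "bullet_ratio": sum(l.lstrip().startswith(("-", "*", "•")) for l in lines) / len(lines),
        # bold spans per line (each span uses a pair of "**" markers)
        "bold_per_line": text.count("**") / 2 / len(lines),
        # number of fenced code blocks
        "code_fences": text.count("```") // 2,
        # average line length
        "avg_line_len": sum(len(l) for l in lines) / len(lines),
    }

terse = "• Missing null check\n• Race on init\n• Fix: lock before read"
verbose = "The initialization path may, under some schedules, race with the reader, which could plausibly lead to a stale view of the configuration."

assert format_fingerprint(terse)["bullet_ratio"] == 1.0
assert format_fingerprint(verbose)["bullet_ratio"] == 0.0
```

A judge that has internalized these distributions per provider does not need a label to guess the author; stripping the label only removes the easiest cue.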


When anonymization helps most

The clearest win is the moderator-decides mode. In this mode, after spokes debate through their rounds, the final decision is written by a single model — in Joint Chiefs, this defaults to Claude — which reads the full transcript and produces the consensus output. This is the step where LLM-as-judge bias has the most documented effect, and the step where anonymization has the most to offer.

Three specific cases where anonymization is worth the calibration loss:

  • The moderator shares a family with one of the spokes. Without anonymization, the moderator disproportionately endorses its sibling's findings even when the arguments are weaker.
  • One provider has a reputation (deserved or not) that precedes it. Both humans and models carry priors about "which lab is the serious one." Labels activate those priors, even when they are wrong for the specific finding.
  • A minority position is well-argued but comes from a model with a weaker prior reputation. Anonymization makes it harder for the moderator to dismiss a well-reasoned finding on brand alone. The argument has to carry itself.

When anonymization hurts

Anonymization has one specific failure mode: it throws away information the voter or moderator could have used well.

The clearest case is a voting-threshold mode with per-provider weights. Joint Chiefs supports per-provider weighting from 0.0 to 3.0, so you can say "weight Grok at 1.5 on security findings and 0.8 on idiomatic-style findings." Weighting is only useful if the vote-counter knows which provider produced which finding. If you anonymize and also weight, the weights become dead code — they have no labels to apply to.
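A sketch makes the "dead code" point concrete. The weight table and domain keys below are hypothetical, but the mechanism matches the description: weights are keyed by provider, so anonymized votes can only ever receive the default.

```python
# Hypothetical per-(provider, domain) weights; Joint Chiefs allows 0.0-3.0.
WEIGHTS = {
    ("grok", "security"): 1.5,
    ("grok", "style"): 0.8,
}
DEFAULT_WEIGHT = 1.0

def weighted_tally(votes: list[dict], domain: str) -> float:
    """Sum weighted votes for one finding. Requires the 'provider' label:
    on anonymized votes every lookup falls through to the default,
    and the configured weights are dead code."""
    total = 0.0
    for v in votes:
        provider = v.get("provider")   # None if anonymized
        total += WEIGHTS.get((provider, domain), DEFAULT_WEIGHT)
    return total

labeled = [{"provider": "grok"}, {"provider": "gemini"}]
anonymous = [{}, {}]

assert weighted_tally(labeled, "security") == 2.5    # 1.5 + 1.0
assert weighted_tally(anonymous, "security") == 2.0  # weights never apply
```

With labels, Grok's security vote counts 1.5x; without them, every vote collapses to the default and the configuration is inert.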

A second case, subtler: in domains where one model has a large and well-calibrated advantage — say, a model with known state-of-the-art on a specific language or framework — a human reader scanning the transcript may want to see attribution so they can weigh findings themselves. Anonymization in the moderator's context is fine; anonymization in the final report the human reads often is not.

Joint Chiefs threads this by anonymizing inside the moderator's context and restoring per-finding attribution in the output transcript you read. You get the bias reduction on the decision step and the calibration signal on the audit.


What Joint Chiefs does and why

Joint Chiefs ships with anonymization on by default for the moderator-decides mode. Findings are stripped of provider labels before Claude reads the debate and writes the final synthesis. The orchestrator still tracks attribution internally and writes it to the transcript, so when you read the output you can see which provider raised which finding — just not during the model's decision step.
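The two-view approach can be sketched as follows. The dict shapes and alias scheme are assumptions for illustration, not Joint Chiefs' real internals:

```python
def split_views(findings: list[dict]) -> tuple[list[dict], list[dict]]:
    """Produce two views of the same findings: an anonymized one for the
    moderator's context, and an attributed one for the output transcript."""
    alias_of: dict[str, str] = {}
    moderator_view, transcript = [], []
    for f in findings:
        provider = f["provider"]
        # Stable aliases ("Model A", "Model B", ...) in order of first appearance.
        alias = alias_of.setdefault(provider, f"Model {chr(ord('A') + len(alias_of))}")
        moderator_view.append(
            {k: v for k, v in f.items() if k != "provider"} | {"speaker": alias}
        )
        transcript.append(dict(f))  # attribution preserved for the human reader
    return moderator_view, transcript

findings = [
    {"provider": "openai", "title": "Race in session refresh"},
    {"provider": "gemini", "title": "Missing CSRF check"},
    {"provider": "openai", "title": "Token logged in plaintext"},
]
mod, log = split_views(findings)

assert all("provider" not in f for f in mod)
assert mod[0]["speaker"] == mod[2]["speaker"] == "Model A"
assert log[0]["provider"] == "openai"
```

The aliases keep the moderator's view coherent (it can still tell that two findings came from the same speaker, which matters for debate continuity) without revealing which lab that speaker is.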

For the voting-threshold mode with weights, anonymization is off by design. Weighting requires labels. Running both simultaneously would be incoherent.

The other two consensus modes — strict majority and best-of-all — don't require anonymization because their decision rules are mechanical. Strict majority counts agreement across spokes; best-of-all picks the finding with the highest severity across the panel. Neither involves a judge model reading prose, so there is no bias for anonymization to remove.
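Both mechanical rules fit in a few lines. The severity scale below is an assumption; the decision logic is the point:

```python
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

def strict_majority(votes_for: int, panel_size: int) -> bool:
    """Strict majority: more than half the spokes must agree. No judge involved."""
    return votes_for > panel_size / 2

def best_of_all(reported: list[str]) -> str:
    """Best-of-all: keep the highest severity any spoke assigned to the finding."""
    return max(reported, key=SEVERITY_RANK.__getitem__)

assert strict_majority(2, 3) and not strict_majority(2, 4)
assert best_of_all(["low", "high", "medium"]) == "high"
```

Neither function ever reads prose or a provider label, so there is no judgment step for a brand cue to bias.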

The practical rule: anonymization is a tool for removing judge bias in modes that have a judge. It is not a universal good, and applying it to mechanisms that don't have a judge is just noise.

Key takeaways

  • Anonymize when a moderator LLM will read the debate and write the final decision — this is where brand bias has the most effect.
  • Writing style, vocabulary, and formatting leak identity even without labels. Anonymization is a mitigation, not a guarantee.
  • Do not anonymize when you are using weighted voting — the weights need labels to attach to.
  • Keep attribution in the output transcript even when it is hidden from the moderator. Human readers benefit from calibration signal the model cannot be trusted with.
  • Joint Chiefs anonymizes in moderator-decides mode, preserves attribution in transcripts, and skips anonymization where the decision rule is mechanical.

Frequently asked questions

What does it mean to anonymize model outputs before consensus?

Strip the provider label from each finding before the moderator reads the debate. The moderator sees the arguments, not the brands. The goal is to force the final decision to be driven by reasoning rather than by trust or distrust of any particular lab.

Do LLMs actually show brand bias when judging other LLMs?

Research on LLM-as-judge has repeatedly found that models tend to prefer outputs from themselves or from siblings in the same family, even when those outputs are not better. This is a documented failure mode of unanonymized judging and the reason anonymization exists as a mitigation.

Doesn't anonymization also strip useful signal?

Yes. A reader who knows which model is strong at security or at concurrency can use that as a prior when weighing findings. Anonymization removes that prior. The trade-off is real: you lose calibration signal in exchange for reducing bias toward any single lab.

Can anonymization really hide which model wrote what?

Not perfectly. Writing style, characteristic vocabulary, and formatting preferences leak identity even when the label is gone. Models tend to have recognizable voices. Anonymization still helps because it removes the explicit cue, but a sufficiently attentive judge can often guess.

Does Joint Chiefs anonymize outputs?

Yes. Findings are anonymized before the final synthesis so the moderator judges arguments rather than brands. The orchestrator preserves per-finding provider attribution in the transcript you can read afterward, but the moderator's context does not include it during the decision step.

When should anonymization be turned off?

When you are using a weighted voting-threshold mode and the weights carry domain meaning — for example, giving a security-strong model a heavier vote on security findings. Weighting is only useful if the vote-counter knows which model produced which finding. In those modes, anonymization strips the labels the weights attach to, leaving the weights inert.