A model released by Rio de Janeiro’s city government as a homegrown artificial intelligence achievement is now at the centre of a credibility crisis. The Rio LLM, formally known as Rio-3.5-Open-397B, was presented by IplanRIO, the city’s municipal IT agency, as an original 397-billion-parameter model trained from scratch. Researchers at Nex-AGI, the lab that says its own model was used, argue that’s simply not true, and they’ve published the receipts to prove it.
- Rio LLM was presented as an original 397B model trained by IplanRIO, but evidence shows it’s a direct weight merge.
- The Rio LLM identifies itself as ‘Nex, from Nex-AGI’ 79% of the time when its custom system prompt is removed.
- Every weight tensor in the model matches a 0.6 Nex / 0.4 Qwen blend across all 60 layers with no anomalies.
- The case raises serious questions about transparency and accountability in government-funded AI projects.
What IplanRIO Claimed
The pitch was compelling, politically and technically. A major Latin American city building its own frontier-scale language model would be a meaningful milestone — the kind of story that attracts investment, earns headlines, and signals digital sovereignty. IplanRIO made exactly that pitch when it published Rio-3.5-Open-397B on Hugging Face, positioning the Rio LLM as a locally developed model trained by the city’s own infrastructure team. At 397 billion parameters, it sits in the same weight class as serious frontier models — which is precisely why the claim drew attention.
Training a model at that scale requires enormous compute, months of engineering work, vast datasets, and deep technical expertise. It’s not impossible for a government agency to pull off, but it would be extraordinary. And extraordinary claims, as the saying goes, require extraordinary evidence.

The Rio LLM Doesn’t Know Its Own Name
Nex-AGI, a company that built its own large language model called Nex, noticed something immediately suspicious. When they probed Rio-3.5-Open-397B while bypassing or removing the hardcoded “You are Rio” system prompt that its deployed interface applied to every conversation, the Rio LLM didn’t identify itself as Rio at all. Instead, it called itself “Nex, from Nex-AGI” in 79% of test cases. It identified as “Rio” in none of them.
It gets more specific than that. The model didn’t just use the name Nex; it recited Nex-AGI’s bespoke organizational backstory word-for-word. That is hard to explain as a hallucination quirk or incidental training-data bleed. It is what you would expect from a model whose weights carry Nex’s trained-in identity. A system prompt is a thin cosmetic veneer: strip it away, and the behaviour learned during training is what remains.
This kind of identity test isn’t new. Researchers frequently probe rebranded or fine-tuned models by asking them directly who they are, especially when base model contamination is suspected. The technique is blunt but effective — and in this case, the result was unambiguous.
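A tally like the 79% figure above can be produced with very little code once the model's replies have been collected. The following is a minimal sketch, not Nex-AGI's actual harness: the function names, the candidate name list, and the whole-word matching rule are all assumptions made for illustration, and collecting the replies from the model is left out.

```python
import re
from collections import Counter


def classify_identity(response, names):
    """Return the first candidate name found as a whole word, else 'other'.

    Whole-word matching avoids false hits such as 'rio' inside 'period'.
    The candidate names are supplied by the caller (hypothetical here).
    """
    for name in names:
        pattern = rf"\b{re.escape(name)}\b"
        if re.search(pattern, response, flags=re.IGNORECASE):
            return name
    return "other"


def identity_rates(responses, names):
    """Fraction of responses attributed to each candidate name or 'other'."""
    if not responses:
        raise ValueError("need at least one response to compute rates")
    counts = Counter(classify_identity(r, names) for r in responses)
    labels = list(names) + ["other"]
    return {label: counts.get(label, 0) / len(responses) for label in labels}
```

Calling `identity_rates(replies, ["Nex", "Rio"])` on replies to a question such as "Who are you?", asked with no system prompt, gives the share of answers naming each identity. Keyword matching is crude, so a real audit would also read a sample of the replies by hand.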
The Tensor-Level Evidence Is Even Harder to Dispute
Identity probing is suggestive. Weight analysis is definitive. Nex-AGI went deeper, performing a tensor-by-tensor comparison of the Rio LLM against both their own Nex model and Alibaba’s Qwen3.5-397B-A17B base model. The result: every single weight tensor in Rio-3.5-Open-397B matches a linear blend of approximately 0.6 Nex and 0.4 Qwen — across all 60 layers and every component of the network architecture.
The match to that ratio, they report, is thousands of standard deviations tighter than noise alone would produce, meaning this isn’t a statistical coincidence or a loose approximation. It’s a consistent blend. Nex-AGI also points out that this pattern can’t be explained by fine-tuning. When you fine-tune a model, the weights shift in complex, non-linear ways tied to gradient updates on new data. A clean, consistent element-wise interpolation ratio held across every single layer is the fingerprint of a merge operation, not a training run.
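The blend claim can be checked tensor by tensor with a one-parameter least-squares fit. The sketch below is one way to do it and is not necessarily the procedure Nex-AGI used. It assumes each tensor has already been loaded and flattened into a plain list of floats, and the function and argument names are hypothetical. If merged = alpha * A + (1 - alpha) * B, then merged - B = alpha * (A - B), so alpha is the projection of (merged - B) onto (A - B), and the leftover residual shows how well a pure blend explains the tensor.

```python
import math


def fit_blend_ratio(merged, model_a, model_b):
    """Least-squares alpha for merged ~= alpha*model_a + (1 - alpha)*model_b.

    All three inputs are flattened tensors: equal-length sequences of floats.
    Returns (alpha, relative_residual); a residual near 0 means a pure linear
    blend explains the merged tensor almost exactly.
    """
    if not (len(merged) == len(model_a) == len(model_b)):
        raise ValueError("tensors must have the same number of elements")

    # Work relative to model_b so the fit has a single unknown, alpha.
    dir_ab = [a - b for a, b in zip(model_a, model_b)]
    off_mb = [m - b for m, b in zip(merged, model_b)]

    denom = sum(x * x for x in dir_ab)
    if denom == 0.0:
        raise ValueError("model_a and model_b are identical; alpha is undefined")

    alpha = sum(x * y for x, y in zip(dir_ab, off_mb)) / denom

    # Whatever the blend fails to explain, relative to the size of the offset.
    resid = math.sqrt(sum((y - alpha * x) ** 2 for x, y in zip(dir_ab, off_mb)))
    scale = math.sqrt(sum(y * y for y in off_mb))
    relative_residual = resid / scale if scale > 0 else 0.0
    return alpha, relative_residual
```

Running this over every tensor and comparing the fitted alphas is the idea: a merge should give nearly the same alpha and a near-zero residual everywhere, while a fine-tune should not. The residual matters as much as alpha, because an unrelated tensor can still produce an alpha close to 0.6 by accident while leaving a large residual.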
Model merging is a legitimate technique. It’s widely used in the open-source community, where tools like mergekit let researchers blend models with a few lines of config. The process can produce genuinely useful results and has become a popular method for combining the strengths of two models without the compute cost of training. There’s nothing inherently wrong with it. The problem here isn’t the merge — it’s the story told about it.
Why This Matters Beyond Rio
Government-funded AI projects occupy a different accountability space than private ones. When a startup oversells its technology, investors bear the risk. When a city government claims to have built original AI infrastructure, the public is on the hook — financially, politically, and in terms of trust in the institutions making those claims.
IplanRIO hasn’t publicly responded to Nex-AGI’s findings at the time of writing. That silence is itself notable, given that the evidence has been posted publicly on GitHub and is attracting significant attention from the technical community. If the analysis is wrong, a rebuttal with counter-evidence would be straightforward. If it’s right, the implications for the Rio LLM’s credibility — and for whoever signed off on the public communications around it — are significant.
This also isn’t an isolated incident in the AI space. The past two years have seen a wave of AI announcements from organizations — governments, universities, and companies alike — that have later turned out to be fine-tunes or wrappers presented as original training work. The gap between what’s technically happening and what’s communicated publicly has become one of the more uncomfortable features of the current AI moment. Most end users, and most policymakers, have no way of auditing these claims themselves.
The Open-Source Community as Watchdog
What’s worth paying attention to here is how this was caught. Not by a regulatory body, not by a journalist with source access, but by another AI lab that recognized its own model’s fingerprints in someone else’s release. The open-weight AI ecosystem has inadvertently created a peer-review mechanism: because the weights are public, anyone with the right tools and motivation can inspect them. Nex-AGI had both.
That’s a genuinely healthy dynamic, even if it emerges from a messy situation. The alternative — a world where model weights are always closed and claims about training are unverifiable — would be far worse for accountability. The fact that Nex-AGI could publish their methodology, show the tensor comparisons, and invite others to “judge for yourself” reflects what open-source norms at their best are supposed to enable.
The question now is what happens next. If the Rio LLM was funded by public money on the premise of original development, there are real questions about how that money was spent and who approved the framing. More broadly, this case is likely to accelerate calls for standardized disclosure requirements when governments release AI systems — including documentation of training data, compute used, and whether any base models were incorporated. The EU’s AI Act includes some provisions in this direction, but enforcement and specificity remain works in progress.
For now, the Rio LLM stands as one of the cleaner examples of why the AI industry’s self-reporting problem isn’t going away on its own — and why the people best positioned to catch it are often the ones whose work was quietly borrowed in the first place.
Source: Hacker News

