C Undefined Behavior Is Everywhere — Shocking Proof

  • C undefined behavior affects virtually all non-trivial code, even when written by 30-year veterans of the language.
  • C undefined behavior isn’t just a compiler trick — it corrupts intent at every stage from source to hardware execution.
  • Simple operations like casting a pointer or calling isxdigit() can silently trigger dangerous undefined behavior.
  • The industry’s continued reliance on C and C++ in 2026 is increasingly hard to justify given modern alternatives.

Nobody Can Write Safe C — Not Even the Experts

C undefined behavior isn’t a niche concern for academics or a footnote in language specifications. It’s a landmine buried in virtually every non-trivial C or C++ program ever written — and one of the most experienced voices in systems programming is now saying so explicitly. Thomas Habets, a software engineer with roughly 30 years of daily C and C++ experience, has laid out a damning case: not only is C riddled with undefined behavior, but it’s so pervasive that blaming individual programmers for it is, at this point, genuinely unfair.

That’s not a hot take from someone who stumbled into C last year. Habets describes himself as someone who listens to C++ podcasts, watches conference talks, and genuinely enjoys the language. He’s not a detractor. He’s a true believer who’s slowly, reluctantly, arrived at the conclusion that the environment C was designed for — 1972, single-threaded, predictable hardware, no hostile actors — simply doesn’t exist anymore. C++ fares only marginally better; 1985 isn’t 2026 either.

What C Undefined Behavior Actually Means

There’s a persistent myth among developers that C undefined behavior is mostly a compiler trick — that as long as you compile without optimizations, the compiler won’t “cheat” and exploit your sloppy code. This is wrong, and it’s worth being precise about why.

C undefined behavior doesn’t mean the compiler is looking for opportunities to punish you. It means the compiler is allowed to assume your code is valid. When your code contains UB, the compiler doesn’t need to handle the impossible case, because according to the language spec that case can’t happen. The optimizer doesn’t generate fallback logic for something it’s been told will never occur. The result isn’t deliberate sabotage; it’s a program that quietly and confidently does something other than what you meant, and that may work perfectly today and fail catastrophically tomorrow when you change compilers, update a library, or move to new hardware.
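One well-known instance of this reasoning, offered here as an illustration rather than an example taken from Habets' article, is signed integer overflow. Because overflowing an int is undefined, a compiler may assume a + b never overflows, so an after-the-fact test such as a + b < a can legally be optimized away. The following is a minimal sketch of a check that stays within defined behavior; the helper name checked_add is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: adds a and b only if the result fits in an int.
 * The range test runs BEFORE the addition, so no signed overflow
 * (which would be undefined behavior) is ever evaluated. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;   /* would overflow: report failure, leave *out untouched */
    }
    *out = a + b;
    return true;
}
```

The point of the design is ordering: the check compares against INT_MAX and INT_MIN using arithmetic that cannot itself overflow, so there is nothing for the optimizer to assume away.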

And that last point matters more than most developers appreciate. The x86 architecture is unusually forgiving. Misaligned memory reads, certain atomicity assumptions, and a range of other technically undefined operations often just work on x86. That forgiveness has masked decades of latent C undefined behavior in codebases that are now being ported to ARM, RISC-V, or future architectures that may not share x86’s generous hardware behavior.

C Undefined Behavior Hides in Surprisingly Mundane Code

Habets walks through several examples that illustrate just how deep the problem goes — and they’re not contrived edge cases. They’re the kind of code you’d find in any production codebase reviewed and approved by experienced engineers.

Take pointer alignment. Consider a function that simply dereferences an integer pointer. When the pointer it receives is not aligned for an int, the outcome depends on the platform:

  • On Linux Alpha hardware, an unaligned pointer dereference would sometimes trap to the kernel, which would silently emulate the intended behavior — masking the bug entirely.
  • On SPARC, the same code produces a SIGBUS and crashes the program.
  • On x86, it typically works fine — possibly even as an atomic read — because the hardware handles it transparently.

Three different outcomes for the same C undefined behavior, depending on architecture. Now consider what happens as your codebase ages and gets compiled for a future processor where integer pointer registers don’t even populate the lowest bits, because the architecture guarantees aligned pointers. The compiler, entirely within its rights, generates instructions that assume alignment. Your decades-old code breaks silently — or loudly, in production.

But here’s where it gets worse: the UB doesn’t start at the dereference. It starts at the cast. Writing (const int*)bytes to reinterpret a byte buffer as an integer pointer is itself undefined behavior whenever the resulting pointer is not correctly aligned for int, before you even touch the data. An implementation is permitted to assign semantic meaning (garbage collection tags, security bits, provenance metadata) to the lower bits of typed pointers. The act of creating the misaligned pointer is already the mistake.
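A common way to sidestep both the cast and the misaligned dereference is to copy the bytes into a properly typed object with memcpy. The sketch below is one possible shape for that, not code from Habets' article; the helper name load_u32 and the choice of uint32_t are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a 32-bit value stored at any byte offset.
 * memcpy operates on bytes, so no misaligned uint32_t pointer is ever
 * created or dereferenced. The value is read in the host's byte order.
 * The caller must ensure offset + sizeof(uint32_t) stays inside the buffer. */
uint32_t load_u32(const unsigned char *bytes, size_t offset)
{
    uint32_t value;
    memcpy(&value, bytes + offset, sizeof value);
    return value;
}
```

Calling load_u32(buf, 1) reads four bytes starting at an odd address without ever forming an int-typed pointer to that address. Mainstream compilers typically turn a fixed-size memcpy like this into a single load where the target allows it, so the defined version usually costs nothing.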

The isxdigit() Trap

One of the more subtle examples Habets raises involves isxdigit(), a function so basic it barely registers as a risk. Pass it a character, get back whether it’s a hex digit. Simple. Except the function signature takes an int, not a char, because it also needs to handle EOF — a value that by definition can’t fit in an unsigned char.

Here’s the catch: whether char is signed or unsigned is implementation-defined in C. On architectures or compilers where char is signed, which is common, a char holding a byte value above 127 becomes a negative integer after promotion. The C standard only defines isxdigit() for EOF and for values representable as unsigned char, so passing any other negative value is undefined behavior. A valid implementation of isxdigit() might use that integer as a lookup table index. A negative index means reading from arbitrary memory. In the worst case, that’s not a crash; it’s a read from I/O-mapped memory that does something. Something unintended. Something that varies by platform and runtime state.
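The usual defensive pattern is to convert the character to unsigned char before it reaches isxdigit(), so the argument is always in the range the standard allows. A minimal sketch, with is_hex_digit as a hypothetical wrapper name:

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical wrapper: safe to call with any char value.
 * Converting to unsigned char first guarantees the argument passed to
 * isxdigit() lies in 0..UCHAR_MAX, which is what the C standard requires
 * for every argument other than EOF. */
bool is_hex_digit(char c)
{
    return isxdigit((unsigned char)c) != 0;
}
```

With this wrapper, a byte such as 0xE9 stored in a signed char is passed as 233 rather than as a negative number, so the call is well defined whichever way the platform treats char.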

This isn’t theoretical. This is the kind of bug that ships, lurks, and surfaces years later as a CVE with a memory corruption label that everyone scratches their heads over.

The Industry Has Known About This for Years

Habets notes that the conversation about C’s fundamental unsafety isn’t new. He recalls reading, roughly a decade ago, an argument that continued use of C++ in financial systems could constitute a SOX violation — a reference to the Sarbanes-Oxley Act, which imposes strict controls on software that handles financial reporting. Whether or not that argument holds up legally, the underlying point has only become harder to dismiss: using a language where correct code is essentially impossible to guarantee is a liability, not just a technical inconvenience.

That view has gained substantial institutional weight since then. The NSA, CISA, and the White House’s Office of the National Cyber Director have all issued guidance in recent years recommending that organizations migrate away from memory-unsafe languages — C and C++ specifically — toward alternatives like Rust, Go, and Swift. Microsoft has publicly stated that roughly 70% of its CVEs over the past decade trace back to memory safety issues. Google has reported similar figures for Android vulnerabilities.

The C undefined behavior problem isn’t separate from the memory safety conversation — it is the memory safety conversation, just described from a language-specification angle rather than a vulnerability-exploitation angle.

Why C Still Runs the World Anyway

None of this is news to the systems programming community, and yet C and C++ remain the dominant languages in operating systems, embedded firmware, game engines, and performance-critical infrastructure. The reasons are familiar: decades of existing code, unmatched performance tuning, hardware proximity, and a developer base that has built careers around these languages.

There’s also a subtler inertia at work. C undefined behavior is often invisible. Code that works on x86 in a CI pipeline doesn’t fail until it hits different hardware, a different compiler version, or a different optimization flag. The feedback loop is broken. Engineers write code that works in practice, ship it, and never encounter the UB — until they do, catastrophically, in a context that’s hard to reproduce and harder to debug.

Habets’ point is sharp here: if all non-trivial C code contains undefined behavior, then undefined behavior isn’t a programmer error — it’s a language design failure. You can’t screen for it in code review. You can’t lint it away entirely. You can’t rely on sanitizers in production. The tools help, but they’re patches on a language that was designed before modern hardware complexity, before modern security threat models, and before we understood what we were getting into.

What C Undefined Behavior Means for the Future of Systems Code

Rust has made significant inroads precisely because it makes the class of errors Habets describes either impossible or explicit. Misaligned pointer casts, uninitialized memory access, and data races are either compile-time errors or require the programmer to opt into an unsafe block — a clearly marked perimeter that auditors, security teams, and automated tools can focus on.

That’s not a perfect solution, and Rust has its own learning curve and ecosystem gaps. But it represents a genuine architectural response to the problem Habets is describing, rather than another layer of tooling on top of a fundamentally ambiguous language specification.

The harder question is what happens to the hundreds of millions of lines of existing C and C++ code that run critical infrastructure — the Linux kernel, embedded automotive systems, medical devices, industrial control systems. Rewriting them isn’t realistic on any near-term timeline. That code will keep running, and C undefined behavior will keep lurking inside it, waiting for the hardware or compiler combination that finally surfaces it.

For new code, though, the argument for C is getting thinner by the year. When a 30-year expert in the language tells you that nobody can write correct C — not even him — that’s not pessimism. That’s an honest engineering assessment, and the industry is increasingly treating it as one.

Source: https://blog.habets.se/2026/05/Everything-in-C-is-undefined-behavior.html

Yasir Khursheed