Why not Rust. Why not TypeScript. Why Gleam.
Gleam is my language of choice for LLM-assisted development of concurrent systems. The short answer for why: it's Pareto-optimal across the criteria that actually matter. The long answer is this post.
I'm not claiming Gleam is the best language, full stop. That claim dies to "best for what?" My claim is narrower: every alternative wins on at least one axis and fails on at least one axis. Gleam is the only language that's acceptable on all of them. Pareto-optimal means you can't improve on one dimension without losing on another.
The criteria, and why Gleam hits them
Most language comparisons use criteria invented by the language's advocates. Before picking, I wrote down what I'm actually building: long-running concurrent systems, written with heavy LLM assistance. Here's what I care about, and where Gleam lands on each.
Type system you can't defeat
LLMs reach for escape hatches when they can't satisfy the type checker. any and as in TypeScript. .unwrap() and .clone() and Rc<RefCell<T>> in Rust. @ts-ignore comments. These defeat the guarantees the type system is supposed to provide. A language with no escape hatches is a language where the LLM either writes correctly-typed code or writes code that doesn't compile. Both are acceptable. The middle ground is not: code that compiles but defeats the guarantees. Gleam has no unsafe, no any, no as-style casts, no .unwrap(). There's no escape hatch for the LLM to reach for.
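A minimal sketch of what "no escape hatch" means in practice. int.parse returns a Result, and the only way past it is to handle both cases; there is no assertion or cast for the model to reach for:

```gleam
import gleam/int
import gleam/io

pub fn main() {
  // int.parse returns Result(Int, Nil) -- there is no .unwrap(),
  // so both branches must be written before this compiles.
  case int.parse("42") {
    Ok(n) -> io.println("Parsed: " <> int.to_string(n))
    Error(Nil) -> io.println("Not a number")
  }
}
```

Delete the Error branch and the program stops compiling. That's the whole mechanism: the middle ground doesn't exist.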
Fast feedback loop
AI-assisted work is iteration-bound. Model writes code, compiler runs, tests run, errors feed back, model fixes. Compile speed is part of it, but so is test speed, because the compiler can't catch every invariant and the tests have to cover what the types don't. You already wait a few minutes for the model to write code. If you have to wait a few more just to find out the code doesn't compile or the tests fail, do that a few times and half your afternoon is gone. Gleam compiles quickly and tests run quickly on BEAM. LSP is responsive. Iteration feels like TypeScript, not like Rust.
Static typing with exhaustiveness
Sum types, pattern matching, compile-time guarantees that you've handled every case. Exhaustiveness matters specifically because it's the failure mode LLMs produce when they don't know every variant of a sum type. The compiler refuses to let you forget a case. Gleam has all of this. Errors are Result values, not exceptions. No null.
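A sketch with a hypothetical Payment type. Add a fourth variant later, and every case expression over Payment anywhere in the codebase becomes a compile error until it handles the new case:

```gleam
pub type Payment {
  Card(number: String)
  Transfer(iban: String)
  Cash
}

pub fn describe(payment: Payment) -> String {
  case payment {
    Card(number) -> "card ending " <> number
    Transfer(iban) -> "transfer from " <> iban
    Cash -> "cash"
    // Omitting any branch here is a compile error, not a warning.
  }
}
```

This is exactly the failure mode exhaustiveness targets: an LLM that doesn't know every variant can't silently handle a subset.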
Fault tolerance as a runtime property
The strategy that actually works for concurrent systems in production is "isolate the failure, restart from a known-good state." BEAM is the only runtime that provides this natively: supervision trees, process isolation, let-it-crash semantics as runtime primitives, not library conventions. Some libraries approximate this, but you're fighting the grain. Gleam sits on BEAM. Forty years of production hardening from Ericsson, WhatsApp, Discord, and everyone else who runs BEAM, inherited for free.
LLM-friendly ecosystem access
Young languages get penalized for thin ecosystems. But if the LLM can generate the missing pieces, or the language can FFI to a mature ecosystem, the penalty shrinks. FFI is mechanical. Read upstream docs, write type signatures, done. Exactly the kind of task LLMs are reliable at. This is the criterion most comparisons get wrong because they're still pricing ecosystem depth the way it mattered a few years ago. Gleam's own ecosystem is young, but Gleam compiles to Erlang, and FFI to Erlang/Elixir is an @external declaration with types on the Gleam side. Net effect: Gleam's type safety on top of Erlang's ecosystem. gen_server, Mnesia, Broadway, Phoenix components, every mature BEAM library.
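The mechanical shape of that FFI, binding two real Erlang stdlib functions. The Gleam-side type signature is the entire cost:

```gleam
// Bind erlang:unique_integer/0 -- a typed Gleam function backed
// by the Erlang runtime, no wrapper code needed.
@external(erlang, "erlang", "unique_integer")
pub fn unique_integer() -> Int

// Same shape for math:pi/0.
@external(erlang, "math", "pi")
pub fn pi() -> Float
```

This is the task described above: read the upstream docs, write the signature, done. The types are a claim the author makes at the boundary, but everything downstream of the declaration is checked.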
Simplicity
Small language, few ways to be clever. This matters for LLM-assisted work specifically: fewer features means fewer features the LLM can misuse, fewer idioms means less disagreement about how to write things, fewer abstractions means less reading comprehension load when reviewing LLM output. Go is the exemplar, adopted at scale partly because the language was boring, predictable, and hard to misuse. Haskell is the opposite extreme: enormous feature surface, multiple competing idioms for the same task, cultural pressure toward clever abstractions. The same function gets written five different ways by five different Haskellers. Gleam is Go-like: a small keyword set, no macros, no typeclasses, no HKTs, no inheritance, minimal metaprogramming. Combined with gleam format and a standard library that's idiomatically opinionated, most Gleam code converges toward a common style. There's usually an obvious way.
Low reasoning load
FP reduces both the state the model has to track and the mechanical code the model has to infer intent from. No mutation means every variable is what it was bound to. Fewer live variables, fewer control-flow branches, less working memory needed. Named operations like map, filter, and fold make intent explicit where imperative loops make it implicit in the mechanics. A pipeline is three named operations the model recognizes instantly; the equivalent imperative code is a for-loop the model has to parse to realize it's doing filter-then-map. Both effects matter specifically for refactoring and debugging, where the model is reasoning about existing code rather than writing new code. Tracing pure functional code is following data. Tracing imperative code is simulating a mental VM with mutable state. Models are worse at the latter. Gleam is immutable, pipeline-oriented, pattern-matching-first. All the mechanisms apply.
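The pipeline shape the paragraph describes, as a sketch. Three named standard-library operations carry the intent; there is no loop to simulate and no mutable accumulator to track:

```gleam
import gleam/int
import gleam/list

// Filter, map, fold: the intent is in the names, not the mechanics.
pub fn sum_of_even_squares(numbers: List(Int)) -> Int {
  numbers
  |> list.filter(int.is_even)
  |> list.map(fn(x) { x * x })
  |> list.fold(0, fn(acc, x) { acc + x })
}
```

The imperative equivalent buries filter-then-map-then-fold inside one loop body; the model has to reconstruct that structure before it can reason about a change. Here the structure is the code.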
Training data
The honest awkward one. LLM code quality correlates with training data volume. A language with millions of examples is one the model writes fluently; a language with thousands is one where the model hallucinates. This alone would favor Python, TypeScript, or Java. Gleam is on the thin side, and I'm not going to pretend otherwise. The compensating factors: the language is small (fewer things to hallucinate), the compiler is fast and precise (tight feedback corrects hallucinations quickly), and the standard library is tight (fewer functions for the model to invent).
How every alternative fails
TypeScript wins on training data, tooling, and ecosystem by a wide margin. Template literal types, discriminated unions, exhaustive checks via never, powerful type narrowing. tsserver is the best language server in existence. Every LLM speaks TypeScript fluently.
It loses on escape hatches, hard. LLMs trained on a decade of TypeScript code have absorbed the habit of reaching for any, as unknown as, @ts-ignore, and non-null assertions. The behavior is consistent across models. You get code that passes the type checker but has the runtime bugs the type checker was supposed to prevent.
Rust wins on type-system power for correctness. Ownership-as-correctness, affine types, const generics, sealed traits, #[must_use], trait bounds. More invariant-encoding machinery than anything else in production. The compiler catches more classes of wrong code than any mainstream alternative.
It loses on compile speed. Rust's compile-test loop is slow enough that even when the compiler would catch more, the iteration cost is real. Every cycle where the model writes code and waits for rustc is a cycle where you're not getting feedback. The loop matters more than the theoretical ceiling of what the compiler can prove. It also loses on runtime model for my use case. Rust's concurrency story is excellent for shared-memory parallelism and wrong for actor-model supervision.
OCaml and Haskell both have mature type systems with exhaustive matching; Haskell is more expressive than Gleam on every type-system axis (HKTs, typeclasses, GADTs, type families, linear types). Both lose on the runtime criterion: no OTP, no supervision as a runtime primitive. Haskell additionally loses on simplicity, for the reasons laid out earlier: huge surface area, competing idioms, pressure toward cleverness. LLMs pick whichever abstraction they saw most in training, which isn't always the right one for your codebase. OCaml is better on that axis but still out on the runtime.
Pony is the one most people haven't heard of. Statically-typed actor-model language with reference capabilities encoded in the type system. The compiler enforces data-race freedom at compile time by tracking how references can be shared. No null, no exceptions, no data races, deadlock-free by construction. Native compilation via LLVM.
Pony has stronger concurrency guarantees in its type system than Gleam. It also has more type-system machinery (six reference capabilities), which is the Haskell tradeoff applied to concurrency specifically. More expressive, less simple. It loses on maturity, tooling, community, and training data. There's no equivalent of BEAM's 40 years of production hardening. No OTP. Almost no LLM training data. Language development has slowed. If Pony had BEAM behind it, this post would be harder to write.
JVM languages (Scala + Akka/Pekko, Kotlin + coroutines, Clojure + core.async). Win on ecosystem, tooling, training data. Lose on the runtime: supervision is an approximation, not a primitive. Also lose on weight: startup time, memory footprint, GC pauses. You can't run millions of lightweight processes on the JVM the way you can on BEAM. Loom's virtual threads help but don't close the gap.
.NET languages (C#, F#). F# has real type-system merit and a clean ML lineage. Same runtime problem as the JVM: supervision is an approximation. Also loses on vibes. .NET is the thing your company makes you use, not the thing you'd pick.
Go loses on three of my criteria. Type-system power: no sum types, no exhaustive matching, nil pointers everywhere, interfaces are structural but limited. Runtime model: goroutines are great but there's no supervision. Low reasoning load: Go is deliberately verbose, with explicit if err != nil everywhere, mutable state, and no pipelines or pattern matching. The model has to track more state and infer intent from mechanics. Simplicity alone isn't enough.
Elixir is the hardest case, and José Valim made the strongest version of it in Why Elixir is the best language for AI. Read it, it's the best counter-argument to this post. I still pick Gleam today because it's statically typed with exhaustive matching, which Elixir's set-theoretic type system is approaching but doesn't yet match. When typed Elixir catches what Gleam catches in practice, the ecosystem advantage probably flips the balance.
What would change my mind about Gleam
Fast proof checking plus symbolic synthesis. This is the real horizon. Today's LLM workflow is already multi-stage. Intent becomes a spec, spec becomes a plan, plan becomes code. LLMs write most of it. In theory humans review at each transition: spec against intent, plan against spec, code against plan. In practice, spec and plan get reviewed because they're short and human-scaled. Code often doesn't, because the model generates faster than anyone can meaningfully read.
At the horizon, the split between "types catch what they can" and "tests fill the rest" collapses. Specs become theorems, universal statements over all inputs, and the code isn't passing tests that sampled the state space. It's carrying a proof that the theorem holds. A proof checker verifies it in milliseconds. Human review compresses toward the top of the stack, where it actually belongs: reviewing intent against spec. Everything downstream is machine-verified. The "I'm not reviewing code" failure mode stops being a failure.
@VictorTaelin's Bend2 is the most promising candidate I've seen. Dependently-typed, compiling to HVM (@higherordercomp's parallel runtime based on interaction combinators). Other dependently-typed languages have proof checkers that take ages. Bend2's is fast enough to fold into the compile step. Pair that with NeoGen, Bend2's symbolic synthesizer that constructs functions from specs directly, and you have the two ingredients the horizon needs.
The interesting question is what happens if something like Bend2 becomes a universal frontend that compiles to arbitrary target languages. Correctness guarantees get proven once, upstream. Every downstream language inherits them for free. You'd write and verify in Bend2, emit Gleam or Rust or whatever the runtime demands, and the target language's type system becomes almost irrelevant because the hard invariants were already discharged. The target becomes a transport format, not a correctness layer.
The hard part is whether LLMs can write good theorems (or equivalently, under the Curry-Howard correspondence, good types). The training trajectory is already pointing there. Today we RL models to produce code that passes tests. The same machinery works for any verifiable output, including theorems and proofs, which is active research. Once models generate theorems with the reliability current models produce TypeScript, the workflow closes. We're not there today. When we are, this post is obsolete.
The honest bottom line
Gleam isn't the best at anything, individually. It's the best combination under the criteria I actually care about for AI-assisted concurrent systems. Every alternative fails on at least one axis; Gleam clears the bar on all of them.
Pick the bridge you can actually walk across. That's Gleam, for me, for now.