An epic treatise on error models for systems programming languages
Target audience: Practitioners interested in programming language design and familiar with representations of errors in at least a few different languages such as error codes, checked/unchecked exceptions, tagged unions, polymorphic variants etc.
Estimated reading time: 60 to 90 mins.
In research papers on programming languages,
there is often a focus on sophisticated
type system features to rule out certain classes of errors,
whereas error handling itself receives relatively little attention, despite its importance. (This doesn’t seem too dissimilar to problems in the database community/literature, where less “cool” topics like better representation for strings receive relatively little attention compared to their importance. See also: What are important data systems problems, ignored by research? and DuckDB creator Hannes Mühleisen’s CIDR 2023 keynote on Developing Systems in Academia for related discussion.)
For example, in
Simple testing can prevent most critical failures [PDF] (readers in a rush can skim sections 1 and 2 and read the ‘Findings’ in bold in section 4)
, Yuan et al.
found that in the context of distributed data-intensive systems:
- “Almost all catastrophic failures (92%) are the result of incorrect handling of non-fatal errors explicitly signaled in software” (Finding 10, p256)
- Out of these, “35% [..] are caused by trivial mistakes in error handling logic — ones that simply violate best programming practices; and that can be detected without system specific knowledge.” (Finding 11, p256)
In contrast, there is some excellent long-form writing on error models out there by practitioners.
- Joe Duffy’s blog post on the error model in Midori: This describes a variant of C# used to write a microkernel, drivers, and a large amount of user code. Duffy describes a two-pronged error model – “abandonment” (fail-fast error handling) and a variation on checked exceptions (utilizing class hierarchies).
- The design doc for Swift’s error model: This discusses the pros and cons of error models in several different languages.
- The TigerBeetle docs on “Tiger Style”: This describes how the TigerBeetle database tries to ensure various forms of safety. The discussion on assertions in the Safety section is particularly worth reading.
- Rust RFC 236 - Error conventions: This describes conventions for how Rust libraries should use different error handling mechanisms in different situations.
- Joe Armstrong’s talk The Do’s and Don’ts of Error Handling:
Armstrong covers the key requirements for handling and recovering
from errors in distributed systems, based on his
PhD thesis from 2003 (PDF). (By this point in time, Armstrong was about 52 years old, and had 10+ years of experience working on Erlang at Ericsson.)
Out of the above, Armstrong’s thesis is probably the most holistic, but its grounding in Erlang means that it does not take into account one of the most widespread forms of static analysis we have today – type systems. Here’s an excerpt from page 209:
We can impose a type system on our programming language, or we can impose a contract checking mechanism between any two components in a message passing system. Of these two methods I prefer the use of a contract checker
In this post, I want to approach error models in a way that is both holistic and geared towards ecosystems of systems PLs.
By holistic, I mean that I will approach error models from several different perspectives:
- From a “product” perspective: What are the core assumptions, what are all the requirements and how are they informed by practical use cases.
- From a type systems perspective: How can the type system assist the programmer to accomplish what they want.
- From a language design perspective: How API design guidelines, conventions and tooling can be combined with type system features to enable people to write more robust applications.
(The term “systems programming language” inevitably seems to trigger people into rehashing the same “arguments” in comment sections – many people seem to like using it as a term for excluding other languages because they use reference counting or tracing GC as the default memory management strategy. I’m picking a definition here for the sake of this post. If you dislike this definition, you can either mentally replace all usages of “systems PL” with “language with XYZ memory management strategy” or you can stop reading the post.)

By geared towards ecosystems of systems PLs, I mean that the error model must take into account the following needs:
- Graceful degradation in the presence of errors, because the underlying platform and/or acceptance criteria may offer limited recovery capabilities.
- Optimizability – either by the compiler, the programmer or both – as performance bottlenecks arise within the system rather than from external sources. This may involve trading off flexibility in error handling for performance.
- Facilitation of interoperability between and co-evolution of libraries and applications.
Some examples of such software include databases, high-performance web servers, and interpreters. I will be ignoring use cases such as small-scale scripting for one-off tasks, interactive data exploration etc. I will also be ignoring ABI considerations, because those can largely be resolved by varying levels of indirection in the run-time representation (e.g. see Swift).
The structure of this ~~thesis~~ blog post is as follows:
- Section 1 discusses definitions for important terms such as ‘error’, ‘error model’ etc.
- Section 2 goes over the primary theses about errors and how people reason about them, along with examples.
- Section 3 covers the key requirements for an error model, and how these are justified based on the theses in Section 2.
- Section 4 describes a hypothetical programming language Everr
and its error model.The reason for introducing a fake language
for the sake of discussion, instead of say proposing extensions
to an existing language, is that it offers a clean slate for combining
ideas across different existing languages without having to
worry about things like compatibility with existing
language features and/or idioms.
- Section 5 describes the error models of various existing and upcoming programming languages.
- Section 6 compares Everr with other languages based on the requirements laid out in Section 3.
- Section 7 concludes with some questions for you, an unorthodox choice to introduce two new terms, a potpourri of metaphors, and some fluffy exhortations.
For the die-hard readers, there is also an Appendix with (EDIT: Mar 9, 2025) ~~7~~ 8 sections.
Accordingly, this post is fairly large, clocking in at 15K+ words at the time of writing. You have been warned.
Still here? Let’s get started.
Section 1: Key definitions
For the sake of this post, I’m going to use the following definitions:
Error: I’ll use this term in two ways:
- A program state that is undesirable or non-ideal in some way.
- A value reifying an undesirable program state into something
that can be operated upon by the language’s constructs.
For example, errors may be represented as error codes (in C), exceptions (in Java), error values (in Go), a Result::Err case (in Rust), and so on. This value may or may not have some metadata attached, such as a stack trace or some indicator of progress made before hitting the error.
Error propagation: The act of passing an error received from a function that was called to one’s own caller. During error propagation one might want to attach more metadata, release some resources etc.
Error handling: The act of inspecting an error value and/or its metadata, and making some decision based on that. For example, one might decide to log the error, propagate it, discard it, or convert it to a different form.
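To make the propagation/handling distinction concrete, here is a minimal Rust sketch (the function names and the fallback value are invented for illustration):

```rust
use std::num::ParseIntError;

// Propagation: `?` passes the error from `parse` on to our own caller,
// without inspecting it.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    let n: u16 = s.trim().parse()?;
    Ok(n)
}

// Handling: inspect the outcome and make a decision based on it
// (here: fall back to a default port).
fn port_or_default(s: &str) -> u16 {
    match parse_port(s) {
        Ok(p) => p,
        Err(_) => 8080,
    }
}
```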
Error model: The overall system for representing, creating, propagating, and handling errors in a programming language, including best practices and common idioms.
In common parlance, this is just called “error handling”, but it’s always felt a bit weird to me that seemingly very different actions like declaring, propagating and inspecting errors would get lumped together under “handling”. So I’m borrowing this term from Duffy’s blog post instead.
Exhaustive: An error (type) is said to be exhaustive when all the possible data for that error is known up-front. For example, conversion from an Int64 value to a UInt32 value will only fail in a couple of ways: either the value is negative, or it exceeds 2^32 - 1. Thus, such a conversion operation supports recording an exhaustive error in case of failure. In contrast, error types from external systems, such as third-party APIs, are typically non-exhaustive. Exhaustiveness has two axes: fields and cases. (I’m deliberately avoiding a discussion on records vs variants (or structs vs enums/tagged unions, or classes vs case classes) here; more on that in later sections.)
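As a concrete sketch of an exhaustive error, here is how the Int64-to-UInt32 conversion might look in Rust. (The error type and names are invented for illustration; Rust’s standard library actually uses the opaque TryFromIntError here.)

```rust
// The two ways an i64 -> u32 conversion can fail, modeled exhaustively:
// the set of cases and fields is fully known up-front.
#[derive(Debug, PartialEq)]
enum ConvError {
    Negative,
    TooLarge, // exceeds 2^32 - 1
}

fn to_u32(v: i64) -> Result<u32, ConvError> {
    if v < 0 {
        Err(ConvError::Negative)
    } else if v > u32::MAX as i64 {
        Err(ConvError::TooLarge)
    } else {
        Ok(v as u32)
    }
}
```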
Hopefully, none of the above are too controversial. Let’s keep moving.
Section 2: Key theses
The rest of this post is going to be based on a few key theses:
- Whether a program state is an error or not is contextual.
- Whether an error can be recovered from or not is sometimes contextual.
- Errors need to be able to carry metadata.
- Robust error handling sometimes requires detailed understanding of possible error cases.
- Errors and metadata in unstructured form are difficult to handle reliably.
- Programs typically need to handle both exhaustive and non-exhaustive errors.
- Programmers mostly operate with incomplete knowledge about errors.
Let’s dig in to each of these one-by-one.
Thesis 1: Whether a program state is an error or not is contextual
Let’s consider the classic example of opening a file. You have some code trying to open a file and that file is not found. Is that an error? Well, it depends!
For example, say you are writing a CLI application that is doing a recursive lookup for a configuration file in ancestor directories from the working directory.
Instead of first checking if the file exists
(e.g. using stat(2)
on Linux)
and then opening the file,
you write your code to open the file directly.You do this because the existence check + open strategy can still fail at the open stage with a ‘File not found’ error if some other process deleted the file in the middle of your operations.
However, in this case, the file-opening operation failing with ‘File not found’ is not undesirable or incorrect – it is not an error, but part of normal operation.
Now consider the situation where the path to the configuration file is obtained via a command-line argument. In this case, if you get a ‘File not found’ error when opening the file, it’s likely that the user made a mistake when providing the command-line argument, so it would make sense to surface the ‘File not found’ to the user.
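A hedged Rust sketch of both situations (the function names are invented, and the lookup logic is reduced to a single probe for brevity):

```rust
use std::fs;
use std::io::ErrorKind;

// Recursive config lookup: NotFound is part of normal operation,
// so it is mapped to Ok(None) rather than surfaced as an error.
fn try_read_config(path: &str) -> std::io::Result<Option<String>> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(Some(contents)),
        Err(e) if e.kind() == ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e), // permission errors etc. are still errors
    }
}

// User-supplied path: NotFound likely means the user made a typo,
// so here it *is* an error and gets propagated to the caller.
fn read_user_config(path: &str) -> std::io::Result<String> {
    fs::read_to_string(path)
}
```

The same underlying failure (ENOENT) flows through two different code paths purely because of the calling context.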
Thesis 2: Whether an error can be recovered from or not is sometimes contextual
Some common examples of programming bugs that are often considered non-recoverable in the context of application languages are out-of-bounds array accesses, unwrapping empty optionals (or nil/null dereferences), and out-of-memory – all of these are listed as examples in the Swift docs and Joe Duffy’s blog post.
Rust RFC 236 categorizes errors into three distinct types:
catastrophic errors, contract violations and obstructions.
Out of these, out-of-memory is considered as a catastrophic error,
whereas index out of bounds is considered a contract violation. (Yes, I understand that this is a conventions RFC and that individual libraries may deviate from the conventions if they have a different set of needs. However, conventions strongly affect the design of language features and standard library APIs, so I think it’s worth discussing this here as well.)
For both of these, the RFC states that
“The basic principle of the conventions is that:
Catastrophic errors and programming errors (bugs) can and should only
be recovered at a coarse grain, i.e. a task boundary.”
I think it’s important to recognize that even for these errors, the recoverability is contextual. For example, in video game code, if there is an off-by-one error in some rare cases in collision detection or lighting, it’s possible that the game still works mostly fine, and that’s good enough.
For out-of-memory in a web server, you may want to limit the amount of memory a single task can consume, to prevent the whole server from being terminated when the server process runs out of memory.
Even if you don’t have per-request limits, it’s possible that you breach the process-wide memory limit outside of the context of a particular task handling some request or background work. If the server has the opportunity to recover from this, it might be OK to shed load by terminating the tasks for a few requests (e.g. if they don’t use global state, or only use it in a limited way that allows cleanup) instead of terminating the entire process.
To be clear, I’m not saying that such examples represent the majority of cases. However, my point is that even for cases which seem relatively cut-and-dry, there are situations where the classification of an error into recoverable/non-recoverable is not clear cut.
Thesis 3: Errors need to be able to carry metadata
Once code grows beyond a certain scale, understanding errors requires collecting metadata about where/what/when/how/why. For example, in a web server, you might care about tracking the following:
- A stack trace for the place where the error was originally created.
- Severity of the error.
- For data validation errors such as in validating JSON, some ‘path’ within the larger structure (at the time of the error), as well as the value of the unexpected portion.
- Retryability of operations, such as DB transactions or RPC calls in a framework like gRPC.
Additionally, one needs to be able to have some logic that can make use of this metadata (e.g. as methods on an error type, if the language supports methods in some form), such as something for generating key-value pairs for observability, computing equality/hashes for checking sameness etc.
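As a rough Rust sketch, a metadata-carrying error type with such logic might look like this (all names, fields, and the key-value scheme are invented for illustration):

```rust
// A hypothetical validation error carrying structured metadata.
#[derive(Debug, PartialEq)]
struct ValidationError {
    path: Vec<String>, // 'path' within the larger JSON-like structure
    found: String,     // the unexpected value that was encountered
    retryable: bool,   // whether the surrounding operation may be retried
}

impl ValidationError {
    // Logic operating on the metadata, e.g. generating key-value
    // pairs for an observability pipeline.
    fn log_fields(&self) -> Vec<(String, String)> {
        vec![
            ("path".to_string(), self.path.join(".")),
            ("found".to_string(), self.found.clone()),
            ("retryable".to_string(), self.retryable.to_string()),
        ]
    }
}
```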
Corollary: Error codes as the primary language-supported way of error handling are inadequate for many use cases (see also: Zig issue #2647 - Allow returning a value with an error).
Thesis 4: Robust error handling sometimes requires detailed understanding of possible error cases
In Designing Data-Intensive Applications, Martin Kleppmann gives an excellent example involving database transactions and retryability:
popular object-relational-mapping (ORM) frameworks [..] don’t retry aborted transactions [..]. This is a shame, because the whole point of aborts is to enable safe retries.
Although retrying an aborted transaction is a simple and effective error handling mechanism, it isn’t perfect:
- If the transaction actually succeeded, but the network failed while the server tried to acknowledge the successful commit to the client [..], then retrying the transaction [is unsound without extra app-level de-duplication]
- If the error is due to overload, retrying the transaction will make the problem worse [..]
- It is only worth retrying after transient errors ([..] deadlock, isolation violation, temporary network interruptions, and failover); after a permanent error (e.g. constraint violation) a retry would be pointless
For a library or framework which provides APIs for interacting with a SQL database, it is necessary to be able to distinguish the various cases of network errors and database errors if it wants to support automatic retries for aborted transactions.
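Under that requirement, a library’s retry decision might be sketched in Rust roughly as follows. (The error taxonomy here is invented and greatly simplified; real database drivers expose far more detail.)

```rust
// A simplified classification of database errors into the cases
// Kleppmann lists.
#[derive(Debug, PartialEq)]
enum DbError {
    Deadlock,
    IsolationViolation,
    NetworkInterrupted,
    Failover,
    ConstraintViolation,
    Overload,
}

// Only transient errors are worth retrying. Overload is deliberately
// excluded: plain retries would make the problem worse.
fn should_retry(e: &DbError) -> bool {
    match e {
        DbError::Deadlock
        | DbError::IsolationViolation
        | DbError::NetworkInterrupted
        | DbError::Failover => true,
        DbError::ConstraintViolation | DbError::Overload => false,
    }
}
```

The point is that this function is only writable at all because the error cases are distinguishable; with a single opaque “database error”, automatic retries would be guesswork.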
Of course, not all error handling needs this level of rigor in analyzing the various cases. Depending on the context, it might be OK to just log an error and keep going if one is confident that it won’t negatively impact the overall system.
Corollary: An API specification language should probably discourage hiding error information from return values (see also: GraphQL’s default machinery for error handling).
Thesis 5: Errors and metadata in unstructured form are difficult to handle reliably
I imagine this point is probably the least controversial.
Stuffing error case information and metadata into strings makes an API harder to use for a client which cares about error handling.
If you expose an API where the only way a conscientious user can extract more data for an error is by parsing a string, they’re going to write that error parser, and all the options when you want to change that error message are going to suck (see also: Hyrum’s Law).
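A small Rust sketch of the contrast (names invented):

```rust
// Anti-pattern: the only way to distinguish cases is parsing the message.
// This breaks the moment the wording, casing, or language changes.
fn is_not_found_fragile(msg: &str) -> bool {
    msg.contains("not found")
}

// Structured alternative: the case is a first-class value that the
// compiler and the client can both see.
#[derive(Debug)]
enum LookupError {
    NotFound { key: String },
    PermissionDenied,
}

fn is_not_found(e: &LookupError) -> bool {
    matches!(e, LookupError::NotFound { .. })
}
```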
Corollary: A standard library should probably not encourage people to easily attach arbitrary data to errors in a way that cannot be recovered later (see also: Go’s fmt.Errorf function, which implicitly encourages users to create unstructured errors).
Thesis 6: Programs typically need to handle both exhaustive and non-exhaustive errors
When a part of a program does not interact with any external systems, but behaves like a pure function with a well-understood set of possible behaviors, using exhaustive error types allows one to model this certainty.
On the other hand, when interacting with external systems which may change in the future, such as third-party APIs across a network, OS APIs, databases etc., the program needs a way to model and handle errors where all the cases and fields are not known up front. In such situations, ideally, the addition of new error cases and/or fields should not break existing third-party clients at a source level, and lead to some reasonable dynamic behavior.
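In Rust, this pair of needs is commonly modeled with plain enums vs #[non_exhaustive] enums; a rough sketch (the types are invented for illustration):

```rust
// Exhaustive: all cases are known up-front, so downstream matches can
// cover every case and get compile-time exhaustiveness guarantees.
#[derive(Debug)]
enum ParseBoolError {
    Empty,
    NotABool,
}

fn parse_bool(s: &str) -> Result<bool, ParseBoolError> {
    match s {
        "" => Err(ParseBoolError::Empty),
        "true" => Ok(true),
        "false" => Ok(false),
        _ => Err(ParseBoolError::NotABool),
    }
}

// Non-exhaustive: new cases may be added later without a breaking
// change. Note that #[non_exhaustive] only takes effect across crate
// boundaries: downstream crates are forced to write a `_` arm,
// while the defining crate is not.
#[non_exhaustive]
#[derive(Debug)]
enum ApiError {
    RateLimited,
    Unauthorized,
}

fn describe(e: &ApiError) -> &'static str {
    match e {
        ApiError::RateLimited => "rate limited",
        ApiError::Unauthorized => "unauthorized",
        // In a downstream crate, this wildcard arm would be mandatory;
        // new cases then fall through here with reasonable behavior.
        #[allow(unreachable_patterns)]
        _ => "unknown",
    }
}
```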
Thesis 7: Programmers mostly operate with incomplete knowledge about errors
Outside of safety critical contexts, I think it’s fair to say that for most production systems, most people working on them (myself included) have a fairly limited picture of all the different things that could go wrong.
I suspect this is probably true even if one limits the scope to just the situations where the underlying APIs are written by oneself and just involve pure (but complex) computation, as well as situations where all the inputs and system parameters are within “acceptable” ranges.
To be clear, I mean this as an observation and not as a value judgement
– I think there are various contributing reasons for this;
competing priorities, high system complexity, poor docs,
minimal language/tooling support and perhaps even optimism. (For more discussion on optimism specifically, see Appendix A7.)
Discussing this more would probably take a full blog post or more by itself,
so I’m going to stop there.
Section 3: Key criteria for an error model
Based on the above theses and my personal experience, I believe
that an error model should be judged based on how well it
satisfies the following key criteria. (If these sound a bit too abstract, I’ll be discussing them in more detail shortly.)
Error declarations must support:
Exhaustiveness annotation: The ability to mark a declaration as being exhaustive (or not). This must support both axes: fields and cases.
Case extension: The ability to extend errors with new cases (at the declaration site itself) in a backward-compatible way, if the error was declared to be non-exhaustive in terms of cases.
Field extension: Analogous to the above but for fields.
Case refinement: The ability to refine previously defined cases into more fine-grained cases over time (at the declaration site itself) in a backward-compatible way.
Error propagation must support:
Explicit marking: The ability to force code to be written in a way such that:
- Possible errors from primitive operations are indicated with explicit marks.
- Propagating errors from an invoked operation to one’s own caller requires an explicit mark.
The absence of explicit marks must cause a localized static error. (The choice of the word “ability” is intentional. Using an explicit marking discipline may or may not be the default, and may or may not be conventional, but following it must be possible.)
Structured metadata attachment: The ability to attach structured metadata to an error. Attaching metadata must preserve the fact that the error is still an error.
Error combination: The ability to combine errors into a larger error.
Erasure: The ability to abstract fine-grained errors into more coarse-grained errors.
Error handling must support:
Exhaustiveness checking: For errors declared to be exhaustive, the ability to match exhaustively against all cases. Non-exhaustive matches for exhaustive errors must be diagnosed statically. Additionally, for non-exhaustive errors, attempting an exhaustive match against the known cases must be diagnosed statically. (This needs to correctly account for access control rules. For example, in Rust, the #[non_exhaustive] attribute has no effect within the same crate.)
Structured metadata extraction: The ability to extract structured metadata attached to an error (the dual of Structured metadata attachment).
Error projection: The ability to inspect individual sub-errors out of a combined error (the dual of Error combination).
Unerasure: The ability to unerase fine-grained information out of coarse-grained errors (the dual of Erasure).
Criteria for error conventions:
Error category definitions: Different categories of errors must be documented, so that the ecosystem can rely on centralized definitions. These must be accompanied by examples of when an error should be put in a certain category, and when it may be considered as part of another category.
Guidelines on error definitions: These should cover what ought to be documented, which annotations should be considered and/or avoided, as well as considerations for downstream users of libraries.
Guidelines on error propagation: These should cover when it is appropriate to return errors that are erased vs unerased.
Guidelines on error handling: These should cover when it is and is not appropriate to handle errors within and across library boundaries.
Guidelines should generally be accompanied by rationale, as well as curated lists of potential reasons to deviate from the guidelines.
Criteria for tooling:
Lint support: Certain classes of errors are likely best avoided through lints/style checkers, rather than type system features.
- (Important) A lint that prevents error values from being discarded using standard shorthands (e.g. _ = <expr>) without an explicit annotation, such as a comment or a call to an earmarked function (to allow for ‘Find references’) etc.
- (Nice-to-have) If exhaustiveness of cases/fields is the default at the language level, a lint to require manual exhaustiveness annotations on every type.
- (Nice-to-have) A lint for enforcing that fields and cases of error types must be documented.
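For reference, Rust approximates the “no silent discards” part of this today with the #[must_use] attribute and the unused_must_use lint; a small sketch (the function is invented, and the annotation-requirement part of the proposed lint has no Rust equivalent):

```rust
// `#[must_use]` makes silently dropping the result a compiler warning.
// (For `Result` specifically, the lint fires even without the attribute,
// since Result itself is marked #[must_use] in the standard library.)
#[must_use]
fn flush() -> Result<(), String> {
    // Hypothetical best-effort flush; always succeeds in this sketch.
    Ok(())
}

fn demo() {
    // Writing `flush();` here would trigger the `unused_must_use` lint.
    // `let _ = ...` is the standard shorthand for an explicit discard;
    // the lint proposed above would additionally require an annotation
    // like the trailing comment.
    let _ = flush(); // deliberately ignored: best-effort flush
}
```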
Editing support:
- (Important) A structured flow for adding new error cases and fields. The editing environment should identify locations which potentially need changes – these will generally be locations which materialize or handle errors of the same type.
For each of these criteria, the following sub-sections describe why they are useful.
Criteria for error declarations
Exhaustiveness annotation is necessary because of Thesis 6 - typical programs need to work with both exhaustive and non-exhaustive errors. This should probably take the form of an annotation or similar, rather than two entirely different type system features (e.g. subclasses for the non-exhaustive case vs standard sum types for the exhaustive case), because:
- It is possible for a non-exhaustive error to become exhaustive (if the underlying API stops evolving)
- The small “conceptual diff” would probably be best reflected as a small syntax diff in code, rather than an entirely different way of organizing the program.
Once one accepts that non-exhaustive errors are permitted, for such errors to be usable across project boundaries, it naturally follows that the language must support adding new cases and fields to a non-exhaustive error type without breaking source-level backward compatibility.
Lastly, when an API developer gains more knowledge about a previously broad error case (recall Thesis 7 – programmers mostly operate with incomplete knowledge about errors), they need a way of communicating that to new clients without breaking existing clients. This requires some way of expressing “this one case is now actually N other cases, where N >= 2”, which is exactly how I described case refinement.
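Rust has no direct case-refinement feature, but one can approximate it; here is a hedged sketch using a nested non-exhaustive “kind” (all types and names are invented):

```rust
// Approximating case refinement: give the broad case a nested,
// non-exhaustive "kind", so the case can be split further over time
// without breaking clients that only matched on the broad case.
#[non_exhaustive]
#[derive(Debug)]
enum StoreErrorKind {
    // Hypothetically, v1 shipped only a broad `Io` case; v2 refined it
    // into these sub-cases. Old clients matching `Io { .. }` still compile.
    DiskFull,
    PermissionDenied,
}

#[non_exhaustive]
#[derive(Debug)]
enum StoreError {
    Io { kind: StoreErrorKind },
    Corrupt,
}

// An old client that only cares about the coarse case keeps working,
// regardless of how finely `Io` is later subdivided.
fn is_io(e: &StoreError) -> bool {
    matches!(e, StoreError::Io { .. })
}
```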
Criteria for error propagation
The ability to force code to be written using explicit marks for error handling is valuable because it enables modular reasoning about control flow and error handling, one function at a time.
Even if a programmer is able to magically keep track of which functions can silently return which errors in their head, they may stop working on the codebase, and the next programmer working on the code will need assistance in gradually assembling a new theory of the codebase they’ve inherited.
The ability to attach structured metadata is valuable because not all
errors can be described by simple primitive values,
and often, it is necessary to have contextual information
in order to debug why an error occurred
(e.g. where in the JSON file is there a missing ,
again, goddammit).
The ability to combine errors is valuable in the same way that collections like arrays, sets and maps are valuable.
The ability to erase errors is valuable since not all code cares about the details of an error. For example, code inside a supervisor task/process might potentially only care about how to log an error.
Criteria for error handling
Exhaustiveness checking is valuable because it provides clarity that all cases have been handled. However, in some cases, it is impossible to have that clarity, since:
- One might not have sufficient time to understand and analyze all the cases.
- If the error comes from a dependency:
  - One might be unable to change the code in the dependency.
  - One might be unable to replace the dependency.
  - One might not want to (or potentially, cannot) lock to a certain version of the dependency.
This means that non-exhaustive errors must also get proper treatment during exhaustiveness checking.
Features like structured metadata extraction, error projection, and unerasure are needed because their corresponding duals only make sense when used in concert with them.
Now that we’ve covered a fair bit of ground related to error models themselves, let’s switch gears and talk about what a language design grounded in these observations could look like.
Section 4: An immodest proposal
I’m going to describe an error model in terms of a hypothetical
systems language Everr (“Evolving errors”)
and its ecosystem.
I’ll demonstrate Everr using examples. (Normally, I’d hope that this is understood, but since this is the internet, I’m going to spell this out explicitly: the concrete syntax being used here is not the point. This is a language I literally just made up to illustrate some concepts.)
After that, we’ll see how well Everr’s error model compares
to existing mainstream languages with respect to the criteria
outlined in the previous section.
Here’s a rough summary of Everr’s core language constructs:
- Everr supports semantic attributes on declarations, similar to C++, Rust, Swift etc.
- Everr supports structs (a.k.a. records / product types) and enums (a.k.a. variants / sum types).
These can be exhaustive or non-exhaustive.
- Enums are second-class, like Cap’n Proto’s (tagged) union types. However, they support dedicated sugar making them feel essentially first-class.
- Everr supports pattern matching using a minor variation of Cheng and Parreaux’s “Ultimate Conditional Syntax”.
- Everr has simple namespaces for grouping things, similar to namespaces in C++ and modules in Rust. Types themselves are not namespaces. (This is largely for simplifying the description here. If types were allowed to function as namespaces, or alternately, if namespaces were allowed to have type parameters, that would necessitate different syntax for referring to “outer” generic parameters vs introducing fresh generic parameters, and we’d end up going into the weeds.)
- Everr supports union types, but only with a mandatory upper bound. For brevity, I’ll refer to these as “bounded union types”. Bounded union types come in two flavors – exhaustive and non-exhaustive. (When I say “union types”, I mean it in the type-theoretic sense, not in the sense of untagged unions as in C, C++ etc.)
- Everr supports a Rust-like trait mechanism as well as a trait deriving mechanism. (The exact mechanism powering trait derivations – whether they be hard-coded in the compiler, implemented as macros, compiler plugins etc. – is not terribly important for this post, so I will ignore it.)
- Everr supports a delegation mechanism for field accesses and method calls from one type to another.
First, let’s do a tour of these core language features. After that, I’ll describe Everr’s error model.
A tour of Everr
Everr has an Optional
type for representing optional values:
@exhaustive
enum Optional[A] {
| None {}
| Some { value: A }
}
This is syntax sugar for the following code:
namespace Optional {
// ↓ None will never have new fields
@exhaustive(fields)
struct None {}
// ↓ Some will never have new fields
@exhaustive(fields)
struct Some[A] { value: A }
}
// ↓ Optional will never have new fields
@exhaustive(fields)
struct Optional[A] {
// ↓ Actual enum syntax, without the sugar
case: @exhaustive(cases) enum {
| type Optional.None
// ↑ refers to struct None
| type Optional.Some[A]
// ↑ refers to struct Some[A]
}
}
So Some and None represent first-class types, not just cases of Optional. (For comparisons with Scala’s case classes and Rust’s proposed enum variant types, see Appendix A1. This kind of design does not preclude niche optimizations; see Appendix A2.)
Optional
values support pattern matching:
fn demo0(x: Optional[Str]) -> Str {
if x.case is {
Optional.None {} -> return "Got None"
// _ represents a value being discarded
Optional.Some { value: _ } -> return "Got Some"
} // Compiler checks exhaustiveness for:
// 1. Cases of Optional
// 2. Fields of None
// 3. Fields of Some
}
Even though pattern matching looks like it happens against types directly, internally it works like tagged union types in other languages, so exhaustiveness checking is supported, unlike pattern matching with open class hierarchies.
To reduce verbosity, Everr has some syntax sugar for pattern matching.
- Everr supports implicit projection of the field named case to provide a more familiar syntax. (This is the only special case (ahem) where Everr inserts a field projection operation implicitly.)
- To allow the user to omit the type name in common cases, Everr supports looking up namespaces with the same name as the type, so one can use “leading dot syntax”, similar to Swift etc.
Rewriting the code using the above sugar:
fn demo1(x: Optional[Str]) -> Str {
if x is {
.None {} -> return "Got None"
.Some { value: _ } -> return "Got Some"
}
}
Neat! 😃 Things get more interesting when Everr’s non-exhaustive enums come into the mix.
@non-exhaustive
enum Dessert {
| Cake { icing : Str }
| IceCream {}
}
The above type declaration desugars to:
namespace Dessert {
@non-exhaustive(fields)
struct Cake { icing : Str }
@non-exhaustive(fields)
struct IceCream {}
}
@non-exhaustive(fields)
struct Dessert {
case: @non-exhaustive(cases) enum {
| type Dessert.IceCream
| type Dessert.Cake
}
}
Here’s how one might write some string conversion functions for these types in a different project. (Everr’s rules around mandatory handling of future cases and fields are similar to the behavior of Rust’s #[non_exhaustive] – they only apply across access control boundaries.)
fn cake_to_str(cake: Dessert.Cake) -> Str {
if cake is Dessert.Cake { icing: i, !__ } {
i.append(" cake");
return i
}
}
There are two sigils here: ! and __.

__ means “ignore any label-value pairs, if present”. Since Cake has a @non-exhaustive(fields) annotation, the __ is mandatory, similar to mandatory catch-all clauses for cases of an enum.
!
means “if the next identifier matches one or more known label-value pairs,
or one or more cases, please issue a warning”.This is a generalized version of @unknown
in Swift (docs for @unknown).
Since Cake
has a @non-exhaustive(fields)
annotation, if a new field is
added to it in the future, then a warning will be issued inside
cake_to_str
.
This design allows the author of the type to add new fields without breaking source-level backward compatibility, while still allowing the user of the type to opt in to a “notification” when the type definition changes.
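For readers more familiar with Rust, here is a rough sketch of the analogous behavior using #[non_exhaustive]; the Dessert type below mirrors the running example, and note that Rust has no counterpart to the ! warning sigil:

```rust
// Across crate boundaries, matches on a #[non_exhaustive] enum
// must include a wildcard arm, and `..` must be used to ignore
// struct-variant fields that may be added later.
#[non_exhaustive]
#[derive(Debug)]
pub enum Dessert {
    Cake { icing: String },
    IceCream,
}

pub fn dessert_to_str(d: &Dessert) -> Option<String> {
    match d {
        Dessert::Cake { icing, .. } => Some(format!("{icing} cake")),
        Dessert::IceCream => Some("ice cream".to_string()),
        // Mandatory in downstream crates; within the defining
        // crate this arm is merely unreachable.
        #[allow(unreachable_patterns)]
        _ => None,
    }
}
```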
Using the above function, one can write a function to convert
a Dessert
to a string.
fn dessert_to_str(d: Dessert) -> Optional[Str] {
if d is Dessert { case: dcase, !__ }
and dcase is { // (1)
.Cake c ->
return .Some { value: cake_to_str(c) }
.IceCream { __ } ->
return .Some{value: "ice cream"} // (2)
!_ -> return .None{} // (3)
}
}
I know there’s a lot going on in the above code example, so let’s take it step-by-step.
if d is Dessert { case: dcase, !__ }
and dcase is { // (1)
There are a few things going on here:
- dcase is a new binding for the case field of d. It is immediately matched on again after and, using the aforementioned Ultimate Conditional Syntax.
- The __ matches the rest of the fields not covered by the pattern, and their values are ignored. Omitting __ would trigger a compiler error, since Dessert is a non-exhaustive struct.Alternately, if one didn’t care about a new field being added to Dessert, then one could directly use if d.case is.
- Due to the !, the compiler will issue a warning if the __ matches one or more fields, similar to the logic in cake_to_str.
Let’s look at the branch (2):
.IceCream { __ } ->
return .Some{value: "ice cream"} // (2)
Here again, __ means that future fields are being ignored. The absence of ! means that this code will not issue any warnings if the IceCream type has new fields in the future.
Lastly, since d.case
has a non-exhaustive enum type,
a catch-all pattern is mandatory:
!_ -> return .None{} // (3)
_ means “ignore a single value”, similar to other languages. Due to the !, this line will trigger a compiler warning if one or more new cases are added to Dessert.
Whew, that was a lot! Here’s an XKCD for a short break.
Let’s continue. Suppose we have two people Alice and Bob
and we want to write two functions to describe which desserts
they like, in the same project as Dessert
.If it were in a different project, the initialization syntax would not be available for non-exhaustive types.
Here are their preferences:
- Alice likes cake with ganache on top, and also ice cream, but Alice is not open to trying new desserts.
- Bob likes cake with buttercream on top, but not ice cream. However, Bob is open to trying new desserts if there are more options in the future.
These preferences can be modeled using Everr’s bounded union types.
First, let’s model Alice’s preference.
pub fn alice_likes() -> Array[Dessert:union[.Cake | .IceCream]] {
return Array.new(
.Cake{icing: "ganache"},
.IceCream{},
)
}
Here, Dessert:union[.Cake | .IceCream]
represents an exhaustive (bounded) union type. It implies that values of types Cake
and IceCream
are allowed, and in the future, even if Dessert
gains new cases, those will not be returned. Similar to exhaustive enums, if a caller tries to pattern match on the union value, they can match exhaustively, without needing a catch-all pattern.
Now let’s model Bob’s preference.
pub fn bob_likes() -> Array[Dessert:union+[.Cake]] {
return Array.new(.Cake{icing: "buttercream"})
}
Here Dessert:union+[.Cake]
represents a non-exhaustive (bounded) union type. It implies that only values of type Cake
are allowed, but in the future, other cases of the top type – in this case, Dessert
– may also appear. Similar to non-exhaustive enums, if a caller tries to pattern match on the union value, they must account for new cases being added (including IceCream
if Bob changes his mind).
The exhaustiveness of a union type only applies to its cases. When pattern matching, one still needs to be explicit about handling unknown fields (using __ or !__) since both Cake and IceCream have @non-exhaustive(fields).
Since enum cases have first-class struct types, different enums can share cases.
enum BakeryItem {
| Bread {}
| type Dessert.Cake // OK
}
This allows reuse of case types without needing to copy the case definition.
Everr supports traits and trait derivations.
Everr has a built-in UpCast
trait meant to represent O(1) conversions
from enum cases to a type containing the enum.
trait UpCast[From, To] {
fn up_cast(_: From) -> To
}
This trait is automatically implemented for enum cases. So you’d have:
impl UpCast[Dessert.Cake, Dessert] { ... }
impl UpCast[Dessert.IceCream, Dessert] { ... }
impl UpCast[BakeryItem.Bread, BakeryItem] { ... }
impl UpCast[Dessert.Cake, BakeryItem] { ... }
Similar to interfaces in other languages, these can also be implemented by hand.
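As a point of comparison, the UpCast idea can be sketched in Rust using the std From trait, which similarly expresses an O(1) conversion from a case's payload type into the containing enum; the types below are illustrative stand-ins for the running example:

```rust
// A case's payload type (Cake) converts into the containing
// enum (Dessert) via From, analogous to UpCast[Cake, Dessert].
#[derive(Debug, PartialEq)]
pub struct Cake {
    pub icing: String,
}

#[derive(Debug, PartialEq)]
pub enum Dessert {
    Cake(Cake),
    IceCream,
}

impl From<Cake> for Dessert {
    fn from(c: Cake) -> Self {
        Dessert::Cake(c)
    }
}
```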
Lastly, Everr supports a delegation mechanism across types. For example, if you have code like:
struct BakeryProduct {
@delegate
base: BakeryItem
ingredients: Array[Ingredient]
price: Money
}
At usages of the .
operator for field access and method calls
on a BakeryProduct
value, the compiler first checks if
BakeryProduct
has the field or method, and if not,
checks if BakeryItem
has the field or method.
At most one field can have a @delegate
annotation to maintain
predictability in the face of field re-ordering without needing
ad-hoc tie-breaking rules.
Because cases are represented through a special .case
field,
pattern matching directly on a BakeryProduct
value will work,
because BakeryItem
has a .case
field which allows pattern-matching.
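A rough Rust analogue of this delegation can be sketched with Deref, which gives a similar "check the outer type first, then the delegate" method lookup; using Deref for non-pointer types is a debated pattern in Rust rather than a dedicated feature like @delegate, and all names below are illustrative:

```rust
use std::ops::Deref;

// The outer type's own fields/methods win; anything not found
// on BakeryProduct is looked up on the delegated BakeryItem.
pub struct BakeryItem {
    pub name: String,
}

impl BakeryItem {
    pub fn label(&self) -> &str {
        &self.name
    }
}

pub struct BakeryProduct {
    pub base: BakeryItem,
    pub price_cents: u64,
}

impl Deref for BakeryProduct {
    type Target = BakeryItem;
    fn deref(&self) -> &BakeryItem {
        &self.base
    }
}
```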
Okay, that concludes the tour of Everr’s core language features! Now let’s talk about Everr’s error model.
Everr’s error model - Overview
Everr’s error model is based on the core observation that the handling and recovery from different errors is generally context-dependent, so it provides flexibility and broad mechanisms for different error handling strategies in different contexts.
There are two modes for error propagation and handling.
- Fail-slow error handling: This is done using the standard Result type, which is an exhaustive enum equivalent to that in Rust and Swift.
- Fail-fast error handling: This comes in two flavors:
  - Recoverable: This is done using panic and catch panic primitives, similar to panic/recover in Go and panic!/catch_unwind in Rust. Panicking and panic catching themselves use memory pre-allocated at program startup to avoid out-of-memory in the middle of a panic, while accepting the risk of potentially needing to discard some relevant data.
  - Non-recoverable, i.e. program termination. This is done using an exit primitive.
OS signals are handled using callbacks and potentially manifested using one of the above depending on the exact signal and configuration.
Notably, there is no support for asynchronous termination of non-cooperative tasks via asynchronous exceptions, such as that in Haskell, OCaml or Erlang.
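The Rust primitives cited above can illustrate the split between the two modes; this is only a sketch of the analogy (the function names are illustrative), not Everr itself:

```rust
use std::panic;

// Fail-fast: panic! aborts the current computation immediately.
fn double_positive(n: i32) -> i32 {
    if n < 0 {
        panic!("negative input");
    }
    n * 2
}

// Recoverable fail-fast: catch_unwind stops the unwind at this
// boundary, similar in spirit to Everr's catch panic, and the
// outcome is surfaced fail-slow as a Result.
fn call_safely(n: i32) -> Result<i32, String> {
    panic::catch_unwind(|| double_positive(n)).map_err(|_| "caught a panic".to_string())
}
```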
For primitive operations, the defaults are as follows:
- Numeric operations on bounded integer types panic on overflow, prioritizing safety over performance.
- Assertion failures trigger a panic.
- Out-of-memory for the heap aborts the program.
- Stack overflow aborts the program.
However, these can be customized; more details on each of them soon.
One important aspect of how Everr code negotiates what is and is not allowed is the designation of certain core language features as capabilities. Examples of capabilities include heap allocation, panicking, program termination and foreign function interface (FFI) usage.
Capabilities manifest at the boundaries of packages, which are Everr’s units of distribution. Each package has a manifest file, which supports specifying:
- The capabilities used by the given package, in addition to the defaults. Each capability has three levels (other than Unavailable):
  - Implicit: All code is granted access to the capability, without needing any annotation.
  - Explicit: Code which uses the capability must have an explicit annotation.
  - Binding: Code which uses the capability must have an explicit annotation, and this annotation is considered part of the API contract; weakening the annotation is considered a breaking change.One can optionally configure the standard linter to enforce that functions which do not use the capability also have an explicit annotation indicating that the function is guaranteed to not use the capability in the future or that it reserves the right to use the capability later on.
- The capabilities permitted for different dependencies.
If that sounds a bit too abstract, don’t worry, I’ll explain the capability system later with examples.
First, let’s discuss fail-slow and fail-fast error handling.
Fail-slow error handling
For very basic operations where only a single failure mode is possible (e.g. a map lookup), the failure is exposed using the standard Optional type.
In most other cases, either the Result or Hurdle types are used.Result and Hurdle are recognized specially by the standard Everr linter, which issues warnings if _ is used to ignore the Fail case or the problem field.
// Result represents computations where errors block progress.
//
// Most commonly used for short-circuiting logic.
@exhaustive
enum Result[A, E] {
| Pass { value: A }
| Fail { error: E }
}
// Hurdle represents computations where problems do not
// block progress, but they still need to be bubbled up,
// such as in the form of warnings.
@exhaustive
struct Hurdle[A, E] {
data: A
problem: E
}
For the E
type parameter, Everr programmers are recommended
to use domain-specific struct and enum types which define errors.
Since both enums and structs support exhaustiveness annotations,
it is easy to mark error types for future evolution.
Since enums are desugared to structs:
- It is possible to add a field common to all cases of an enum, without breaking source-level backward compatibility.
- It is possible to refine a coarse case into multiple sub-cases by adding a case: enum { ... } field to the corresponding case.
For example, say one has a float parsing function which returned an error with the following type:
@exhaustive(cases)
enum FloatParseError {
| UnexpectedChar { byte_offset: UInt64 }
| NotRepresentable {}
}
Since this is marked specifically as being @exhaustive(cases)
,
it means that adding new fields is allowed without it being
considered a breaking change.
One could add a field to the NotRepresentable
case, and refine it,
without breaking backward compatibility:
@exhaustive(cases)
enum FloatParseError {
| UnexpectedChar { byte_offset: UInt64 }
| NotRepresentable {
// Represents the lower bound on the number of bits
// that would be needed for a floating point type
// to be able to represent the given value.
min_bits_needed: UInt64
case: @exhaustive enum {
| TooSmall {}
| TooLarge {}
}
}
}
Once the type is refined, both of these forms of pattern-matching work.
if err is {
.UnexpectedChar { .. } -> ..
.NotRepresentable { __ } -> ..
}
// or more fine-grained
if err is {
.UnexpectedChar { .. } -> ..
.NotRepresentable nr and nr is {
.TooSmall {} -> ..
.TooLarge {} -> ..
}
}
Case refinement in Everr provides optionality without a-priori factoring out enum definitions into separate enum and case struct definitions, unlike typical languages in the ML family. The cost of additional verbosity is paid when refinements are introduced.
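For comparison, here is a Rust sketch of the refined FloatParseError, with the sub-cases expressed as a nested enum; note that unlike Everr, adding the new field in Rust would break existing struct-variant patterns unless they already used the .. rest pattern:

```rust
// The refinement of the NotRepresentable case lives in a
// nested enum carried by the coarse case.
#[derive(Debug, PartialEq)]
enum NotRepresentableKind {
    TooSmall,
    TooLarge,
}

#[derive(Debug, PartialEq)]
enum FloatParseError {
    UnexpectedChar { byte_offset: u64 },
    NotRepresentable {
        min_bits_needed: u64,
        kind: NotRepresentableKind,
    },
}

fn describe(err: &FloatParseError) -> String {
    match err {
        FloatParseError::UnexpectedChar { byte_offset } => {
            format!("unexpected char at byte {byte_offset}")
        }
        // Fine-grained handling of the refined sub-cases:
        FloatParseError::NotRepresentable { kind: NotRepresentableKind::TooSmall, .. } => {
            "value too small".to_string()
        }
        FloatParseError::NotRepresentable { kind: NotRepresentableKind::TooLarge, .. } => {
            "value too large".to_string()
        }
    }
}
```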
Errors can be propagated using a try operator, which can be used at different granularities: for one or more statements, or for a specific (tree of) expressions, including individual call expressions.
enum ABCError {
| type SubError1
| type SubError2
| type SubError3
| type SubError4
}
fn abc() -> Result[Int, ABCError] {
try {
sub_op1(...)
sub_op2(...)
}
let a = try sub_op3(...).some_method(...)
let b = sub_op4(a).@try
return ok(b.other_method())
}
try
allows for a single level of automatic conversion of errors,
via the previously mentioned UpCast
trait.
This is similar to ?
and Try
in Rust.
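The Rust analogue can be sketched as follows: the ? operator propagates the error, converting it into the function's error type through the From trait; the types here are illustrative stand-ins for the sub-errors:

```rust
#[derive(Debug, PartialEq)]
struct SubError1;

#[derive(Debug, PartialEq)]
enum AbcError {
    Sub1(SubError1),
}

// The single level of automatic conversion: SubError1 -> AbcError.
impl From<SubError1> for AbcError {
    fn from(e: SubError1) -> Self {
        AbcError::Sub1(e)
    }
}

fn sub_op1(fail: bool) -> Result<i32, SubError1> {
    if fail { Err(SubError1) } else { Ok(21) }
}

fn abc(fail: bool) -> Result<i32, AbcError> {
    // On Err, `?` returns early with From::from applied.
    let a = sub_op1(fail)?;
    Ok(a * 2)
}
```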
Everr programmers are encouraged to define and use structured error types, and attach metadata to errors by defining new struct fields. Structs can be easily serialized using the same mechanism as for trait derivation, with minimal boilerplate. This integrates well with observability libraries such as those providing APIs for structured logging and tracing.
The Everr language server has a code action for defining error types for a given function based on the error types of the functions that are called. It also can intelligently suggest modifications to error cases as the function body evolves over time, taking contextual rules such as access control into consideration.
The Everr standard library exposes a standard type for call traces.A call trace covers different kinds of traces such as a stack trace, Zig-style error return trace, as well as async stack traces. While these are all recorded differently, they fall under the same concept (sequence of source code locations which describe “how did we get here”).
Even though the use of structured errors is encouraged, the Everr standard library provides APIs for working with unstructured errors.
- An AnyErrorContext type which exposes convenient APIs for:
  - Attaching key-value pairs.
  - Capturing and recording call traces.
- An AnyError type which pairs an error value along with an AnyErrorContext and zero-or-more child AnyError values (similar to error in Go). This exposes convenient APIs for:
  - Initialization from a specific struct/enum error type.
  - Merging several sub-errors into a larger error.
  - Tree traversal and inspection.
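A loose Rust sketch of this kind of unstructured wrapping, using only the std Error trait and its source() chain; the ParseFailed type and chain_messages helper are illustrative, not a real Everr or Rust API:

```rust
use std::error::Error;
use std::fmt;

// A wrapper error holding one child error, i.e. a degenerate
// error tree with a single branch.
#[derive(Debug)]
struct ParseFailed {
    cause: Box<dyn Error>,
}

impl fmt::Display for ParseFailed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "parse failed")
    }
}

impl Error for ParseFailed {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.cause.as_ref())
    }
}

// Traversal and inspection: walk the chain, collecting each
// error's message.
fn chain_messages(err: &dyn Error) -> Vec<String> {
    let mut out = vec![err.to_string()];
    let mut cur = err.source();
    while let Some(src) = cur {
        out.push(src.to_string());
        cur = src.source();
    }
    out
}
```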
Fail-fast error handling
Recoverable fail-fast error handling is done via panics, which are similar to unchecked exceptions in languages like Java and C++.
Function declarations and types can optionally be annotated with a
@panics attribute which covers whether the function might panic,
might panic only when compiled in development mode
(but never in release mode), or never panics.
Panicking is a capability.
The default capability level for panicking is Implicit, so most
packages do not use @panics
annotations.
For packages compiled with Explicit or Binding level for panic marking,
the Everr compiler checks if a function’s @panics
annotation
(or lack thereof) matches with the annotations on other
functions that are called by it.
The standard linter recognizes functions with @panics
annotations
and recommends adding a section to the function’s API docs
describing when the function might panic.
For the Binding level in particular, the standard linter
has an optional lint that requires explicit @panics(maybe)
or @panics(never) annotations on all functions,
to avoid accidental API additions without @panics
annotations (which would prevent the implementation
from using assertions).
The Everr core libraries, including the standard library, use Explicit panic marking.
A small portion of the Everr ecosystem, chiefly some minimal alternatives to the standard library and related packages meant to be used in the context of embedded systems, default to using the Binding level for panic marking.
Similar to panicking, program termination using the exit
primitive
is treated as a capability; it has a matching @exits
attribute.
However, unlike panicking, the default level for the termination capability is Explicit, as it is fairly rare to require it in library code.
Primitives - Bounded integer operations
In Everr, numeric operations on bounded integer types panic on overflow.
One can opt-in to alternate behavior on overflow (commonly wrapping)
at the granularity of a package, one or more statements,
or an expression. Similar to try
, this operates at a syntactic level.
let sum_plus_one = @math.wrap { 1 + vec.sum() }
// Addition semantics in the sum() call itself are unaffected.
When overflow behavior is overridden at the package level, the Everr language server supports showing inlay hints for the overflow behavior at function granularity.
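For comparison, Rust expresses per-operation overflow behavior through method variants such as wrapping_add; a small sketch:

```rust
// Wrapping semantics are chosen per operation here; ordinary
// `+` elsewhere still panics on overflow in debug builds.
fn sum_plus_one_wrapping(v: &[u8]) -> u8 {
    v.iter().fold(1u8, |acc, &x| acc.wrapping_add(x))
}
```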
Primitives - Assertions
Everr’s assertions are customizable.I’m deliberately not describing the exact APIs for assertions here as there is some room for building interesting APIs, such as sometimes assertions.
By default, an assertion failure triggers a panic
.
However, a binary package can customize the semantics of assertions,This customizability introduces some more complexity when considering the interaction with default capability levels and capability checking. I’ve not fully thought through the ramifications, but my gut feeling is that this is a solvable problem.
by specifying an “assertion overlay” in its package manifest.
This acts as an override for the default assertion related APIs,
so every package in the dependency graph ends
up using one’s custom implementation of assertions.
The two most commonly used assertion overlays are:
- Profiling overlay: This helps collect metrics related to individual assertions in production.
- Toggling overlay: This helps disable assertions in dependencies, either statically or dynamically.
Out-of-memory handling
By default, out-of-memory for the heap results in program termination.
Heap usage is a capability, with the default level of Implicit.
Primitives which utilize heap allocation have fallible alternatives,
making it possibleThe design of ergonomic and performant APIs for using custom allocators while maintaining memory safety (or, at least, trying to reduce the number of footguns) is an open problem with several competing approaches. For more details, see Section 6 and Appendix A4.
to build APIs on top without having to rely on
heap allocation always being successful.
Stack overflow handling
By default, stack overflow results in program termination.
Recursion and usage of indirect calls are both considered capabilities, with the default level of Implicit.
Code which cannot afford to have a stack overflow, such as code following NASA’s rules for safety-critical code, can easily disable both of these features, allowing the potential call graph of the program to be statically computed.
This allows computing the total stack usage of the program at the time of compilation.
Stack usage is not guaranteed to be stable across minor or patch versions of the Everr compiler, but since qualifying the compiler for safety-critical code tends to be a time-consuming affair that is less frequent than compiler releases, this is considered an acceptable trade-off.
For an alternative design which allows for more expressivity at the cost of more implementation complexity, see Appendix A5.
Section 5: Error models in the wild
The Swift Error Handling Rationale and Proposal doc I linked earlier covers the error model followed in C, C++, Objective-C, Java, C#, Go, Rust and Haskell, so I will not elaborate on those much more here.
Some older statically typed languages missing from that list include D, OCaml and Ada. Some newer languages understandably missing include Pony, Nim, Zig, Odin, Roc, Jai and Hare.
Scala does not market itself as a systems programming language, but its type system offers interesting ways of expressing errors, so it is included below. Other languages such as Dart, Kotlin, Gleam and Unison are excluded because I did not see any unusual or interesting ideas related to error handling in their docs, and they seem to be more targeted towards applications with less stringent performance requirements.
(EDIT: Mar 09, 2025) Common Lisp is not statically typed, but supports
an interesting system for resumable exceptions so I’ve included it as well.Thanks to feedback from Manuel Simoni (@manuel) and Tony Finch (@fanf) on Lobste.rs and Gavin Howard (@GavinHoward) on Hacker News.
I’m going to try to summarize the error models very quickly here, based on the language’s own docs. My summaries should not be considered authoritative, especially for pre-1.0 languages where conventions are probably still in flux, and docs are perhaps more likely to be out-of-date.
D
As of Feb 2025, the official D docs recommend using unchecked exceptions as the primary way of communicating errors.
However, at DConf 2020, Walter Bright, the creator of D, stated that he thinks that exceptions are obsolete.
Other noteworthy aspects:
- Functions can be marked nothrow.
- The type Exception is the base class for all “errors that are safe to catch and handle”. Throwing an Exception is not allowed in nothrow functions.
- The type Error is the base class of “all unrecoverable runtime errors”. Throwing an Error is allowed in nothrow functions.
- The default assert function/built-in throws an Error.
OCaml
Error propagation in OCaml comes in several flavors:
- Unchecked exceptions - These are widely used by the standard library. Notably, assertion failures get converted to exceptions.
- option and result types - Increasingly, library APIs have alternate variants which return option or result in the failure case rather than an exception.
- Async exceptions are used for out-of-memory, stack overflow and for OS signals like SIGINT.
Additionally, OCaml supports syntax sugar
for short-circuiting evaluation/chaining of operations using option
and result
.
The official OCaml docs are probably best described as being somewhat neutral in terms of prescribing a particular error propagation strategy, but they do state:
It tends to be considered good practice nowadays when a function can fail in cases that are not bugs (i.e., not assert false, but network failures, keys not present, etc.) to return type such as ’a option or (’a, ’b) result (see next section) rather than throwing an exception.
The book Real World OCaml
recommends using the Base
library by Jane Street
in addition to the standard library. This library includes additional
helper types such as Error.t
– a lazy string with various helper functions
for ease of construction – and helper functions such as Or_error.try_with
to convert an exception throwing computation to a result
.
Real World OCaml concludes:
If you’re writing a rough-and-ready program where getting it done quickly is key and failure is not that expensive, then using exceptions extensively may be the way to go. If, on the other hand, you’re writing production software whose failure is costly, then you should probably lean in the direction of using error-aware return types.
[..] If an error occurs sufficiently rarely, then throwing an exception is often the right behavior.
Also, for errors that are omnipresent, error-aware return types may be overkill. A good example is out-of-memory errors, which can occur anywhere, and so you’d need to use error-aware return types everywhere to capture those.
Ada
Ada supports unchecked exceptions using values of a specific
type exception
that can optionally carry a string payload.
If I understand correctly, exceptions have a notion of identity that is separate from the string payload.
The pre-defined exceptions include:
- Constraint_Error for out-of-bounds accesses, overflow, null dereference etc.
- Storage_Error for allocation failure and stack exhaustion.
In one Stack Overflow discussion, I found two different practitioners stating:
- “The reason for [not having a way to chain exceptions] is that the language has been designed for exceptions to be used scarcely”
- “Ada exceptions are expected to be used for truly exceptional problems.”
However, the Ada style guide
on Wikibooks has a guideline “Use Exceptions to Help Define an Abstraction”.
One of the code examples in this section is a stack where the Pop
operation
raises an exception when the stack is empty.
I/O operations in the Ada standard library pervasively use exceptions. I was not able to verify if there are other widely used standard library alternatives.
GNAT (Ada compiler toolchain in GCC) supports several settings for
restricting exception usage such as
No_Exception_Handlers
,
No_Exception_Propagation
,
and No_Exceptions
.
For example, No_Exception_Propagation
requires functions to handle exceptions in callees.
Scala
Scala 3 supports using unchecked exceptions as well as a tagged union type Try which enables easily catching exceptions and converting them to a sum type representation.
Some popular libraries such as Li Haoyi’s libraries for IO use exceptions for errors such as a file not being found.
I’m guessing the more FP-oriented parts of the ecosystem probably use case classes and/or Result to a greater extent, but I was not able to validate this within 5~10 minutes of searching.
Nim
Based on a quick skim of the Nim docs, I surmised the following:
- Nim supports exception handling, and this is used by standard library APIs such as for opening files.
- Nim supports
raises
annotations on functions which indicate the types of exceptions a function may throw (e.g.{.raises: [ParseError]}
). - Nim supports a way to mandate explicit
raises
annotations across a module, by putting{.push raises: [].}
at the start of a module. This setting ignores exceptions which inherit fromsystem.Defect
, which is Nim’s way of signaling errors such as division by zero and assertion failure. - Nim also has optional and result types.
The Nim tutorial section on exceptions states that
A convention is that exceptions should be raised in exceptional cases, they should not be used as an alternative method of control flow.
The rest of the tutorial is descriptive – it covers how to use different language features related to exceptions.
This unofficial Nim style guide has more detailed recommendations on modeling and handling errors. I was unable to find similar prescriptive language in the official Nim docs.
The below languages are pre-1.0 as of March 2025.The exception is Odin which currently uses date-based versioning. I’m assuming that as being equivalent to pre-1.0 in terms of stability guarantees given that Odin is less than 10 years old.
Pony
The Pony tutorial has a page on the error
expression,
which allows a function to abort execution until the enclosing try
block.
All partial functions must be annotated with ?
.
As far as I can tell, there is no way to attach any data
when using the error
statement.
If that is correct,
it means that the error
primitive is similar
to an optional type which must be explicitly unwrapped.
Looking at unofficial sources,
based on this blog post
and StackOverflow discussion on distinguishing between different types of errors,
it seems like the most common way to do error handling
in Pony is using union types (e.g. Int64 | OverflowError).
As of Feb 2025, I could not find any official docs
further explaining how errors ought to be modeled and
how error handling should be done, outside of the
performance cheat sheet
which recommends avoiding error
and union types in performance-sensitive code,
both for different reasons.
Based on the docs, it looks like union types are compiled
using a type ID pointer as the tag
plus implicit boxing for the value.This makes sense, as it allows the natural subtyping relationship T <: (T | U)
to be implemented with zero run-time cost.
Zig
The official Zig way to do error handling is by defining “error sets”:
const FileOpenError = error{
AccessDenied,
OutOfMemory,
FileNotFound,
};
Error sets are structurally typed – identically named cases in different error{...}
declarations in different files are interchangeable.I’m curious to see how this decision pans out over time as the Zig ecosystem grows. Will library authors eventually start adding unique prefixes to library-specific errors, similar to prefixes in C and Objective-C, to avoid collisions?
This allows coercion of errors
from a subset to a superset, as well as merging using a ||
operator.
Errors cannot carry payloads. As a substitute, people use different patterns such as out parameters and avoiding the standard error handling machinery.
Functions may either spell out the error set they return or leave it to be inferred.
// Inferred error set
pub fn parse_f32(...) !f32 { ... }
// Explicit error set
pub fn parse_f32(...) FloatParseError!f32 { ... }
Here FloatParseError!f32
is an “error union type”.
For handling an error union value returned from a function,
one can use catch
keyword along with a default value or a code block.
Zig supports try
as a shortcut for error propagation, which amounts
to being sugar for catch |err| return err
.
Zig supports non-exhaustive enums. The docs do not explicitly mention support for non-exhaustive errors and structs (but they do state that “An error set is like an enum”).
Zig programs are able to use a bevy of compile-time introspection facilities, such as compile-time iteration over the fields of a struct. The Zig docs do not state how compile-time introspection interacts with non-exhaustive enums.
Zig uses whole program compilation. During this:
- Unique integer values are picked for different error cases.
- The maximum height of the call graph is computed (recursion is capped to height 2), and that is used for pre-allocating a buffer for error return traces in the Debug and ReleaseSafe modes.
- Return traces are implemented using a hidden function parameter. catch and try are integrated with return traces.
Zig has a defer
statement for resource cleanup.
It also has an errdefer
statement which runs a cleanup operation
only if the enclosing function returns an error.
Zig supports a customizable @panic
operation.
By default, the implementation prints a stack trace and terminates the program.
However, the “root file” (the one containing main
) can
be used to override this implementation to do something else.
Odin
Odin supports sum types using the union
keyword,
and C-style enums using the enum
keyword.
The Odin docs do not have a dedicated section on the error model, but the code examples in the docs showcase the usage of sum types for errors.
Error :: union #shared_nil {
File_Error,
Memory_Error,
}
File_Error :: enum {
None = 0,
File_Not_Found,
Cannot_Open_File,
}
Memory_Error :: enum {
None = 0,
Allocation_Failed,
Resize_Failed,
}
Similar to Go, all types in Odin have a zero value.
Generally, nil is a valid value for sum types.This can be overridden with a #no_nil annotation on the declaration, in which case the sum type must have a default value.
The nil
values for the types for individual cases can be merged
using the #shared_nil
keyword, which is used in the above example.
In 2018, the creator of Odin wrote a blog post Exceptions — And Why Odin Will Never Have Them.
One of the consequences of exceptions is that errors can be raised anywhere and caught anywhere. This means that the culture of pass the error up the stack for “someone else” to handle. I hate this culture and I do not want to encourage it at the language level. Handle errors there and then and don’t pass them up the stack. You make your mess; you clean it.
The Odin docs do not mention non-exhaustive structs or unions.
Roc
Roc’s sum types are structural.
For error handling, Roc recommends the standard Result
sum type,
and standard algebraic types for errors.
I searched for a bit, but could not tell if Roc supports some way of representing non-exhaustive structs and enums.
Roc deliberately does not define an Optional
type
and recommends the consistent use of Result
for error handling instead.
Integer overflow in Roc translates to an irrecoverable program crash, similar to Swift.
Jai
I watched some of Jonathan Blow’s videos a few years back and do not recall any particular discussion on the error model.
The unofficial Jai docs do not have any notable mentions of error handling.
Hare
Hare supports declaring anonymous tagged union types with implicit tags using |
.
type index = size;
type offs = size;
export fn main() void = {
let z: (index | offs) = 1337: offs;
assert(z is offs);
};
Based on my reading of the docs, the type A = B
syntax introduces
a newtype in Haskell-speak, not a type alias
unlike most other languages that I know of
using the same syntax (e.g. Rust, Go, Swift, Haskell, OCaml).In most languages, substitution of type aliases is expected to not affect program semantics. Recursive type aliases are also usually not permitted.
Hare’s tagged union types actually implement union type semantics in the type-theoretic sense, not sum type semantics. For more discussion, see Appendix A3.
Error types in Hare can be declared using a prefix `!`.
type error = !(io::error | invalid | unexpectedeof);
Such an error type can be handled using `match`, and supports implicit injection from individual component types.
For functions returning errors, apart from pattern matching, the error can be propagated to the caller using postfix `?` and used to trigger a crash with postfix `!`.
The Hare docs do not mention non-exhaustiveness for fields or cases.
Common Lisp
(EDIT: Mar 9, 2025: This section was newly added):
Common Lisp has a condition system which allows for resumable exceptions. Conditions do not require unwinding the call stack.
Rust historically used to have a condition system,
but that was removed.Thanks to Steve Klabnik for pointing this out in the Lobste.rs thread.
In the linked PR, Alex Crichton writes:
Functions raising conditions are not always clear that they are raising conditions. This suffers a similar problem to exceptions where you don’t actually know whether a function raises a condition or not. The documentation likely explains, but if someone retroactively adds a condition to a function there’s nothing forcing upstream users to acknowledge a new point of task failure.
The notion of resumptions is also closely related to continuations and in more recent research, algebraic effects. For a more detailed discussion, including potential concerns around integration with tracking ownership and lifetimes, see Appendix A8.
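To make the contrast with unwinding concrete, here is a small sketch in Rust of a condition-style protocol: instead of unwinding to a distant handler, the low-level routine consults a caller-supplied handler at the point of failure and resumes. All names (`ParseRecovery`, `parse_all`, `on_bad_record`) are hypothetical; this only emulates the control-flow shape of restarts, not a full condition system.

```rust
// Hypothetical sketch: restart-like recovery without stack unwinding.
// The handler decides how to proceed, and execution resumes in place.

enum ParseRecovery {
    UseDefault(i64), // "restart": substitute a value and continue
    Skip,            // "restart": drop the bad record and continue
}

fn parse_all(inputs: &[&str], on_bad_record: &dyn Fn(&str) -> ParseRecovery) -> Vec<i64> {
    let mut out = Vec::new();
    for raw in inputs {
        match raw.parse::<i64>() {
            Ok(n) => out.push(n),
            // The stack is NOT unwound here; we ask the handler what to
            // do and continue from the exact point of failure.
            Err(_) => match on_bad_record(raw) {
                ParseRecovery::UseDefault(d) => out.push(d),
                ParseRecovery::Skip => continue,
            },
        }
    }
    out
}

fn main() {
    let fixed = parse_all(&["1", "oops", "3"], &|_| ParseRecovery::UseDefault(0));
    assert_eq!(fixed, vec![1, 0, 3]);
    let skipped = parse_all(&["1", "oops", "3"], &|_| ParseRecovery::Skip);
    assert_eq!(skipped, vec![1, 3]);
}
```

The key property mirrored from Common Lisp is that the policy (what to do) lives in the caller while the mechanism (detecting the error) stays in the callee, without tearing down the frames in between.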
Section 6: Everr vs the World
The limits of my language mean the limits of my world.
– Ludwig Wittgenstein
In this section, I’m going to compare Everr against other existing programming languages based on the key criteria outlined in Section 3.
I will be focused on native support, i.e. whether a language supports a direct way of expressing a particular construct, not whether it can be emulated using other constructs.
I recognize that some languages prefer minimizing in-language complexity in favor of pushing it out into libraries and/or applications (e.g. by offering general purpose mechanisms such as macros, compile-time reflection etc.). However, as a trend, natively supported features generally get special treatment in terms of syntax sugar, recognition by tooling and dedicated error messages, as well as more mentions in official documentation, so it makes sense to focus on native support.
As the saying goes, “In theory, there is no difference
between theory and practice. In practice, there is.”
This is going to be a bit of an apples-to-wax apple comparisonBy “wax apple”, I mean an inedible apple-lookalike made out of wax, not the wax apple fruit.
because there is no implementation of Everr,
so there is no real world evidence that these ideas
are workable without major changes.
All I have to offer is indirect evidence of “success” in
the form of usage of all the languages that Everr
shamelessly ~~copies~~ borrows ideas from.
With that big caveat, let’s proceed.
Error declarations
Languages using exceptions as the primary mode of error handling generally lack a native way to enable exhaustiveness checking: idiomatic error handling is non-exhaustive by construction.
In contrast, newer pre-1.0 languages such as Zig, Roc, Hare etc. lack native support for non-exhaustive errors.
Older languages with algebraic data types such as Haskell and OCaml do not have native support for non-exhaustiveness annotations on data types.
Swift supports marking structs and enums as non-exhaustive, but this is coupled to ‘Library Evolution’, which is Swift’s overarching feature for maintaining ABI-compatibility guarantees.
Rust supports marking structs and enums as non-exhaustive, and extending them with fields and cases respectively when these annotations are used. However, case refinement requires a priori defining separate structs for individual enum cases.
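As a concrete sketch of Rust's `#[non_exhaustive]` attribute (the error type and its cases here are hypothetical): within the defining crate, matching stays exhaustive, but downstream crates are forced to write a wildcard arm, which lets the library add new cases later without a source-breaking change.

```rust
// #[non_exhaustive] is real Rust; FetchError and describe are made up.
#[non_exhaustive]
#[derive(Debug)]
pub enum FetchError {
    Timeout,
    Dns { host: String },
}

fn describe(e: &FetchError) -> &'static str {
    match e {
        FetchError::Timeout => "timed out",
        FetchError::Dns { .. } => "dns failure",
        // In a *downstream* crate, this wildcard arm would be mandatory,
        // even though all current variants are covered above.
        #[allow(unreachable_patterns)]
        _ => "unknown error",
    }
}

fn main() {
    assert_eq!(describe(&FetchError::Timeout), "timed out");
    assert_eq!(describe(&FetchError::Dns { host: "db.local".to_string() }), "dns failure");
}
```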
In languages with native support for algebraic data types, generally sum types and product types are both first-class, and sum types cannot carry shared fields. This sometimes leads to soft recommendations of using single-field structs instead of native enums to maintain optionality in case one wants to ever add a field common to all cases without breaking backward compatibility.
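The "single-field struct wrapping an enum" recommendation can be sketched in Rust as follows (all names hypothetical): the struct keeps room to later add fields shared by every case without disturbing existing match sites.

```rust
// Hypothetical sketch of the wrapper pattern described above.
#[derive(Debug)]
enum StoreErrorKind {
    NotFound,
    Conflict,
}

#[derive(Debug)]
struct StoreError {
    kind: StoreErrorKind,
    // This shared field could be added in a later version without
    // breaking callers that only match on `kind`:
    retryable: bool,
}

fn main() {
    let e = StoreError { kind: StoreErrorKind::NotFound, retryable: false };
    let msg = match e.kind {
        StoreErrorKind::NotFound => "not found",
        StoreErrorKind::Conflict => "conflict",
    };
    assert_eq!(msg, "not found");
    assert!(!e.retryable);
}
```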
Scala should technically support case refinement very well because case classes are first-class types that can be inherited from. However, my understanding is that this is highly frowned upon.
Scala supports exhaustiveness annotations
along the cases axis using the sealed
keyword.
However, the Scala docs do not mention support for non-exhaustiveness
annotations for fields of a case class.
Like Rust, Everr supports exhaustiveness annotations for fields and cases.
Like Cap’n Proto, Everr allows evolving a non-exhaustive enum to a struct by adding shared fields without a source-level breaking change.
Everr supports case refinement, because cases are represented using
structs, and exhaustivity annotations from the outer type are propagated
to the structs for individual cases,
which allows addition of sub-cases using a case
field.
Error propagation - Explicit marking
Earlier, in the first sub-point under the bullet for ‘Error propagation’, I wrote:
- Explicit marking: The ability to force code to be written in a way such that:
  - Possible errors from primitive operations are indicated with explicit marks.
  - Propagating errors from an invoked operation to one’s own caller requires an explicit mark.

  The absence of explicit marks must cause a localized static error.
Let’s discuss the second bit first.
When exceptions are used for error handling, the most common approach is that exceptions are silently propagated, and that nearly any function may throw any exception. This approach relies on programmers’ diligence in reading, writing and maintaining doc comments.
Some languages which support finer control over exception propagation are Java, D, Ada, Nim and C++.
My understanding is that the usage of checked exceptions in Java is uncommon due to the added syntactic overhead, and the inability to change code over time to return new error cases without jumping through extra hoops.
Go and Rust have a panicking mechanism which is similar to exceptions, and the typical recommendation is that these should only be used for “exceptional” or “catastrophic” situations.
The Rust ecosystem has a well-known “linker hack” which allows static enforcement over panic propagation in optimized builds.
Swift, Pony, Zig, Roc and Odin do not have native support for any exception-like mechanism, and require explicit propagation of errors (or crashing the program).
Rust, Swift and Zig have dedicated syntax for explicit error propagation.
- Rust has a postfix `?` operator for short-circuiting control flow; this is primarily used for error propagation. Rust has an unstable feature `try_blocks` to limit the scope of `?`.
- Swift has a prefix `try` keyword which will bubble any errors out of the enclosing function. This works at the level of individual statements as well as sub-expressions. Swift also supports `try?` for converting an error to `nil` (an `Optional`), as well as `try!` for triggering a program crash on encountering an error.
- Zig has the `catch` and `try` keywords as explained earlier.
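To make the first bullet concrete, here is a minimal sketch of Rust's postfix `?` (the `sum_csv` function is hypothetical): `?` either unwraps the `Ok` value or returns the `Err` to the caller immediately.

```rust
use std::num::ParseIntError;

// `?` propagates the first parse failure to our own caller,
// converting the error type via `From` when necessary.
fn sum_csv(line: &str) -> Result<i64, ParseIntError> {
    let mut total = 0;
    for field in line.split(',') {
        total += field.trim().parse::<i64>()?; // explicit propagation mark
    }
    Ok(total)
}

fn main() {
    assert_eq!(sum_csv("1, 2, 3"), Ok(6));
    assert!(sum_csv("1, two, 3").is_err());
}
```

The mark is what matters here: every call site that can fail is visibly annotated, unlike silently propagating exceptions.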
Haskell’s `do` notation also provides similar functionality, but it can require additional boilerplate depending on the way effects are being propagated (e.g. using effect types or monad transformers).
Some languages have on-going or recent work in a similar vein:
- Go has an ongoing proposal for dedicated postfix `?` sugar for error propagation. Apart from the sugar, programmers are recommended to explicitly attach context to errors using various standard library functions.
- The OCaml docs note the trend towards increased use of explicit propagation. The OCaml standard library also offers more APIs now with explicit errors.
- OCaml’s sugar for error propagation helps in writing complex logic without needing repeated explicit pattern matching.
- C++23 added support for `std::expected`, analogous to `Result` in other languages.
Error propagation in Everr is in some sense a blend of existing languages:
- It is flexible in allowing `try` in different places, like Swift.
- It offers a third choice between function-wise propagation and program termination, similar to exceptions.
- It allows marking code as “cannot panic”, like C++’s `noexcept`.
Additionally, panicking being a capability, and capabilities supporting different levels means that different levels of rigor can be used in different contexts.
Error propagation - Primitive operations
For overflow errors in integer operations, languages that tend to offer control do so in the form of compiler flags and standard library APIs rather than dedicated syntax.
Most languages use wrapping semantics for integer overflow, with alternate semantics provided through standard library functions, built-ins and/or types, if at all.
Some notable exceptions:
- Swift and Roc trigger a crash on overflow.
- Rust triggers a panic on overflow in debug mode. This can be overridden using compiler flags.
- C++ treats signed integer overflow as undefined behavior – this can be overridden to wrapping/trapping semantics using compiler flags.
- In Zig, integer overflow triggers:
  - Undefined behavior in the `ReleaseFast` and `ReleaseSmall` build modes.
  - A panic in the `Debug` and `ReleaseSafe` build modes (i.e. crash by default, but overridable).
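The "standard library APIs rather than dedicated syntax" point can be illustrated with Rust's integer methods, which expose each overflow semantics as an explicit call:

```rust
fn main() {
    let max = u8::MAX; // 255
    assert_eq!(max.checked_add(1), None);          // report failure as Option
    assert_eq!(max.wrapping_add(1), 0);            // modular arithmetic
    assert_eq!(max.saturating_add(1), 255);        // clamp at the bound
    assert_eq!(max.overflowing_add(1), (0, true)); // value plus overflow flag
    // A plain `max + 1` would panic in debug builds and wrap in release
    // builds, unless overridden with the `-C overflow-checks` flag.
}
```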
Everr is somewhere between Swift and Rust in some sense; it always triggers a panic, regardless of build mode.
For heap allocation, most languages will happily let you create very large structures in memory(EDIT: Mar 9, 2025) - Wording changed to focus on memory usage spike instead of heap exhaustion, based on observation by Alex Kladov in Lobste.rs thread.
– such as when attempting to accidentally process a very large file entirely in memory – triggering program termination when heap allocation fails.I recognize that on a shared system, the operating system is free to kill a process for consuming too much memory, and how much memory is “too much” can only be determined dynamically. However, at least in some contexts, such as servers, one often knows the amount of memory available at build time or program initialization time.
Zig and Odin are different from other languages here; allocators
are passed down ~everywhere as parameters (explicitly in Zig, and often
implicitly in Odin using the hidden context
parameter).
This allows handling allocation failure at various levels.It would be interesting to examine what fraction of applications written in Zig and Odin actually have code paths for dedicated handling for out-of-memory errors, instead of just propagating it up the call stack and terminating the program. It would also be interesting to know what sub-fraction of that group has tests for the out-of-memory error handling code path, and how good that test coverage is.
Zig and Odin do not have any standard source-level markers for code that is known to be memory safe vs code that is not.
Such discipline is possible in C++, Rust and other languages to varying extents, but is less common.
- C++ standard types such as `std::string` and `std::vector` take an optional type parameter for customizing the allocator.
- Rust has an unstable allocator_api feature, where the discussion originally started in 2016. Rust also has a competing storage API proposal.
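On stable Rust, one narrow slice of recoverable allocation failure is already available through `Vec::try_reserve`, which returns an error value instead of aborting (most other allocation sites still abort on OOM):

```rust
fn main() {
    let mut buf: Vec<u8> = Vec::new();
    // An absurd capacity request fails with an error value instead of
    // crashing the program, so the caller can degrade gracefully.
    assert!(buf.try_reserve(usize::MAX).is_err());
    // A reasonable request succeeds as usual.
    assert!(buf.try_reserve(4096).is_ok());
}
```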
Ada is somewhat unusual in that it allows returning dynamically sized data through a secondary stack, which can avoid the need for heap allocation in certain cases.
Older languages using tracing GCs have evolved to have features which make it easier to write code free of heap allocations. Unboxed types are now available to varying extents in Java, C#, Haskell, OCaml etc.
Since I have specified neither the memory management strategy nor the type system for Everr, it doesn’t make sense to compare Everr with other languages on this particular point.
I believe Rust’s current approach of requiring you to use different collection types (e.g. `Vec`, `bumpalo::Vec`, `smallvec::SmallVec`) provides an OK sweet spot by putting the complexity of dealing with different memory management strategies onto the users who need them, rather than on everyone.
See Appendix A4 for more details on some interesting research in this space.
I do not know of any non-research language which can statically guarantee the absence of stack overflow.
The GNAT compiler for Ada supports static stack usage analysis, which allows gathering stack usage data. Based on some searching, it is not clear if SPARK can statically guarantee the absence of stack overflow.
Technically, one can employ techniques such as dynamic stack probing, and then growing the stack using the heap if it’s likely that one might need more space. This approach is taken by the Rust stacker crate.
Using the capability mechanism, by marking recursion and indirect calls as capabilities, Everr can reduce the risk of stack overflow, likely significantly so. Similar to Ada, an Everr compiler can thus provide concrete upper bounds on stack usage for functions where these capabilities are turned off or not used.
See Appendix A5 for one potential idea on how a language can rule out stack overflow statically while maintaining modularity.
Error propagation - Structured metadata attachment, error combination and erasure
Strictly speaking, structured metadata attachment, error combination and erasure can be achieved in any mainstream language; the question is really about how much boilerplate is needed.
For structured metadata attachment, in a language without inheritance, adding more data to a type requires:
- Duplicating the original type definition.
- Updating the duplicate type definition to add a new field (or case).
- Writing a conversion function from the old type to the new type.
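The three steps above can be spelled out in Rust as follows (all type and field names are hypothetical): duplicate the type, add the new field, and write the conversion.

```rust
// 1. The original (upstream) error type.
#[derive(Debug)]
struct DbError {
    code: u32,
}

// 2. The duplicated definition with an extra field of context.
#[derive(Debug)]
struct DbErrorWithContext {
    code: u32,
    query: String, // newly attached metadata
}

// 3. The conversion function from the old type to the new type.
impl DbError {
    fn with_query(self, query: &str) -> DbErrorWithContext {
        DbErrorWithContext { code: self.code, query: query.to_string() }
    }
}

fn main() {
    let e = DbError { code: 42 }.with_query("SELECT 1");
    assert_eq!(e.code, 42);
    assert_eq!(e.query, "SELECT 1");
}
```

Note that every field of the original must be repeated by hand in both the duplicate and the conversion, which is exactly the boilerplate that metaprogramming or delegation aims to remove.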
Depending on which type-aware compile-time metaprogramming facilities are available, it should be possible to cut down on boilerplate for these kinds of operations significantly. Languages such as Zig and Nim fall into this bucket.
When using metaprogramming for extending the types, programmers may encounter difficulties in debugging metaprograms. Metaprogramming also poses language design challenges by requiring a clear design for how metaprograms interact with abstraction boundaries. For example:
- Are private fields possible? As of Mar 2025, Zig says no whereas Nim says yes.
- If private fields are possible, is the metaprogram allowed to inspect private fields?
- If the metaprogram can inspect private fields, how can the owner of the type definition retain the ability to remove the private field?
- If the metaprogram cannot inspect private fields, then how can it duplicate a type definition?
In theory, support for extensible records (similar to TypeScript and Elm) would solve this problem cleanly. However, naively using structural extensible records introduces compatibility hazards across library boundaries, requires exposing the full structure in type signatures (against encapsulation), and increases type system complexity.
In Everr, you can “extend” a sum type defined in an upstream context such as:
@exhaustive
enum ImageProcessingError {
| DownloadingError { ... }
| ProcessingError { ... }
| StoringError { dbError: PgError }
// Suppose 'PgError' represents a Postgres error which is a complex
// data type with many fields and methods
}
by defining a new type with specific information of interest:
@exhaustive
struct DetailedStoringError {
@delegate
base: ImageProcessingError.StoringError,
dbName: Str,
dbURL: URL,
}
and creating a replacement for the outer enum type:
@exhaustive
enum DetailedImageProcessingError {
| type ImageProcessingError.DownloadingError
| type ImageProcessingError.ProcessingError
| type DetailedStoringError
}
this still requires duplicating the cases from the full enum definition, but not any inline fields specified in it.Remapping the shared cases still needs to be done “by hand” in a separate function, or by relying on implicit injection from `T` to `A:union[.T | ...]`.
This relies on:
- The ability to share case types across enums.
- The delegation mechanism to avoid duplicating the type definition as well as explicit forwarding of methods.
In languages with support for both union types and delegation (e.g. via inheritance), such as Scala and Pony, the same could be achieved with similar or less boilerplate.
Everr’s modeling can be copied over exactly to Odin, because its tagged unions build on top of structs, and it supports delegation.
In a language with sum types but without first-class enum case types (e.g. Rust, Swift etc.) this requires making sure that you’re defining dedicated structs for each enum case instead of defining fields inline.
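A sketch of that discipline in Rust (all names hypothetical, loosely mirroring the Everr example above): because each case is a standalone struct, a downstream module can wrap one case with extra context and assemble a replacement enum that reuses the other case types unchanged.

```rust
// Each enum case is a dedicated struct rather than inline fields.
#[derive(Debug)]
struct DownloadingError { url: String }
#[derive(Debug)]
struct StoringError { db_code: u32 }

#[allow(dead_code)]
#[derive(Debug)]
enum ImageError {
    Downloading(DownloadingError),
    Storing(StoringError),
}

// Downstream: refine just the storing case with more information...
#[derive(Debug)]
struct DetailedStoringError {
    base: StoringError,
    db_name: String,
}

// ...and reuse the untouched case types in a replacement enum.
#[derive(Debug)]
enum DetailedImageError {
    Downloading(DownloadingError),
    Storing(DetailedStoringError),
}

fn main() {
    let e = DetailedImageError::Storing(DetailedStoringError {
        base: StoringError { db_code: 23505 },
        db_name: "images".to_string(),
    });
    match e {
        DetailedImageError::Storing(s) => {
            assert_eq!(s.base.db_code, 23505);
            assert_eq!(s.db_name, "images");
        }
        DetailedImageError::Downloading(d) => panic!("unexpected: {:?}", d),
    }
}
```

Unlike Everr's `@delegate`, Rust offers no built-in forwarding here, so methods on `StoringError` would need to be re-exposed by hand on the wrapper.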
One way to get a fully static native solution with minimal boilerplate would be to add support for defining new types using “diffs” from existing types. While this would be independently useful (e.g. simultaneously supporting multiple versions of a data format or API), this introduces more complexities of its own, and so I’ve chosen to omit that from this presentation. For a sketch of how that could look, and the complexities such a system would have to deal with, see Appendix A6.
Error handling
Error handling has four sub-criteria: exhaustiveness checking, structured metadata extraction, error projection, and unerasure.
The latter three essentially amount to specific library calls and support for field projection and method call syntax, which are supported by ~most languages nowadays, including languages which don’t identify themselves as “object-oriented”.
For exhaustiveness checking, most newer languages support it in some form. The flexibility of pattern matching syntax (or “switching” syntax) varies heavily based on language. Depending on the language, the following features may have varying levels of support:
- Nested patterns
- Or patterns
- Pattern guards (i.e. using arbitrary functions in a branch)
Everr’s use of the Ultimate Conditional Syntax generalizes all of these mechanisms, and allows more compact expression of pattern matching by:
- Reducing the need for explicit temporaries by supporting “splits”
- Allowing immediate inline use of bindings following `and`, without nesting, generalizing `if let` style bindings in Rust and Swift.
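For readers less familiar with these features, here is how nested patterns, or-patterns, and pattern guards look in one Rust `match` (the `Shape` type is hypothetical):

```rust
#[derive(Debug)]
enum Shape {
    Circle { radius: i64 },
    Rect { w: i64, h: i64 },
}

fn classify(s: &Shape) -> &'static str {
    match s {
        // pattern guard: an arbitrary boolean condition on a branch
        Shape::Circle { radius } if *radius <= 0 => "degenerate",
        Shape::Circle { .. } => "round",
        // or-pattern combining two nested field patterns in one arm
        Shape::Rect { w: 0, .. } | Shape::Rect { h: 0, .. } => "degenerate",
        Shape::Rect { .. } => "boxy",
    }
}

fn main() {
    assert_eq!(classify(&Shape::Circle { radius: 0 }), "degenerate");
    assert_eq!(classify(&Shape::Circle { radius: 2 }), "round");
    assert_eq!(classify(&Shape::Rect { w: 3, h: 0 }), "degenerate");
    assert_eq!(classify(&Shape::Rect { w: 3, h: 4 }), "boxy");
}
```

The Ultimate Conditional Syntax generalizes this further by letting intermediate computations and bindings appear inline between the pattern tests, rather than forcing them into nested matches.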
Error conventions
Software development can be reduced to a single, iterative action. Almost everything we do in the course of a day — the pull requests, the meetings, the whiteboard diagrams, the hallway conversations — is an explanation. Our job is to explain, over and over, the meaning of our software: what it is, and what we expect it to become.
– Zach Tellman, Explaining Software
Language documentation generally takes a descriptive position on error handling, describing all the different ways in which error handling can be done, but avoids being prescriptive.
This means that projects often tend to follow either (1) the path the standard library does OR (2) the path which requires the least boilerplate. Sometimes, these are the same.
The responsibility of prescribing error conventions for different situations is generally left to style guides, managed by third parties.
Here’s a short, non-exhaustive list of plausible reasons for this phenomenon:
- Prescribing approaches is perceived to be “messy” as it requires deeply understanding and taking into account many different contexts of usage.
- Consensus building is both time-consuming and challenging, requiring strong communication skills and high emotional energy.
- Writing documentation is generally under-valued in practice compared to programming. For example, language release notes typically mention new features and APIs, not new docs.
As ecosystems evolve over time, introducing conventions later in time is more likely to face opposition, unless these conventions simply codify existing practices as “best practice.” This can be true even when evidence is presented in favor of newer conventions, due to various cognitive biases.
I believe there is much more room for languages to provide clearer guidance on appropriate contexts for using specific ways of defining, propagating and handling errors.
I further believe that it is valuable to provide prescriptive guidance earlier in a language’s lifespan than is common, and to encourage thinking about guidance as an evolving artifact grounded in evidence. For example, guidance can be accompanied by short summaries of past evidence showing positive/zero/negative results in relation to the guidance, as well as explicit invitations for collecting further evidence.
Tooling
In principle, given sufficient resources, almost any kind of tooling can be built for any language. The problem is that in practice, “resources” are generally never “sufficient”, so it makes sense to try to make it as easy as possible to build correct tools.
For this specific point,
I believe the Go ecosystem is a good example,Perhaps the point about “necessity is the mother of invention” applies here? From what I’ve heard, Java has excellent tools for heap profiling, whereas tooling for languages like Rust and C++ is much more lacking in comparison.
where even though several bits of functionality considered
table-stakes in other ecosystems – such as exhaustiveness checking
– are not natively supported by the Go compiler,
it is easy to create new linters,
and then integrate them with other linters
and build systems like Bazel.
Standardizing on error propagation mechanisms in particular across an ecosystem can also help motivate investment into deterministic refactorings for simplifying repetitive tasks. For example, in Everr, the language server could offer a refactoring to attach metadata to an error by handling the boilerplate of defining new error types etc.
That’s the end of the comparison between the error models of Everr and that of other languages.
While many languages offer different mechanisms for defining, propagating and handling errors, there is no language which is “strictly superior” to the others along all of these axes.
I believe a design similar to that of Everr can potentially help programmers express the different possibilities of error cases, and how to handle them, in a way that matches or improves upon most languages along most axes, while preserving the ability to maintain code over long periods of time.
The next section is more philosophical than all the ones so far, so if that’s not your cup of tea, you can stop reading here, no judgement. 😆
Section 7: Closing thoughts
In this section, I want to do something a bit different. Before sharing my own thoughts, let me ask you some questions.
Questions for you
Programmers accustomed to statically typed programming languages
are likely to raise an eyebrow if they encounter a codebase in the
same language where all functions return Any
(or equivalent) upon success.
And yet, the use of untyped errors along with need for down-casting is widespread across languages.
For example, in Rust, a common recommendation is to use the `anyhow` crate in applications, and in Go, functions which may fail return an `error` interface in the vast majority of cases.
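The "untyped errors plus downcasting" style can be shown in plain std Rust, without `anyhow` (the `QuotaExceeded` type and `upload` function are hypothetical): the signature erases the concrete error type, so recovering it requires a runtime downcast.

```rust
use std::error::Error;
use std::fmt;

#[derive(Debug)]
struct QuotaExceeded { limit: u64 }

impl fmt::Display for QuotaExceeded {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "quota exceeded (limit {})", self.limit)
    }
}
impl Error for QuotaExceeded {}

// The return type says only "some error", analogous to Go's `error`.
fn upload() -> Result<(), Box<dyn Error>> {
    Err(Box::new(QuotaExceeded { limit: 100 }))
}

fn main() {
    let err = upload().unwrap_err();
    // Callers who care about the specific failure must downcast,
    // exactly the dynamic typing pattern the section describes.
    let q = err.downcast_ref::<QuotaExceeded>().expect("not a quota error");
    assert_eq!(q.limit, 100);
}
```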
Q: Why do you think this discrepancy exists?
Much of the discourse around what consists of “good code” often avoids any detailed discussion of errors altogether. For example, in A Philosophy of Software Design,I’m discussing an example from A Philosophy of Software Design here because I’ve seen it widely recommended on forums such as Hacker News and Lobsters.
John Ousterhout writes:
A module’s interface represents the complexity that the module imposes on the rest of the system: the smaller and simpler the interface, the less complexity that it introduces. [..]
The mechanism for file I/O provided by the Unix operating system and its descendants, such as Linux, is a beautiful example of a deep interface. There are only five basic system calls for I/O, with simple signatures:
int open(const char *path, int flags, mode_t permissions);
ssize_t read(int fd, const void* buffer, size_t count);
ssize_t write(int fd, const void* buffer, size_t count);
off_t lseek(int fd, off_t offset, int referencePosition);
int close(int fd);
[..] A modern implementation of the Unix I/O interface requires hundreds of thousands of lines of code, which address complex issues [..] Deep modules such as Unix I/O and garbage collectors provide powerful abstractions because they are easy to use, yet they hide significant implementation complexity.
[..] If an interface has many features, but most developers only need to be aware of a few of them, the effective complexity of that interface is just the complexity of the commonly used features.
One of the elephants in the roomI have a deeper critique of the Unix file I/O API, but there is not enough space in this margin to write it. I may write a more detailed review of Ousterhout’s book some time later this year.
that Ousterhout ignores here is errors.
Let’s just consider `open`. If you look at man7.org, `open` can have the following different error cases:
EACCES, EBADF, EBUSY, EDQUOT,
EEXIST, EFAULT, EFBIG, EINTR,
EINVAL, EISDIR, ELOOP, EMFILE,
ENAMETOOLONG, ENFILE, ENODEV, ENOENT,
ENOMEM, ENOSPC, ENOTDIR, ENXIO,
EOPNOTSUPP, EOVERFLOW, EPERM, EROFS,
ETXTBUSY, EWOULDBLOCK
That’s 26 different error cases. Out of these, let’s look at when `EACCES` can be hit:
The requested access to the file is not allowed, or search permission is denied for one of the directories in the path prefix of pathname, or the file did not exist yet and write access to the parent directory is not allowed. (See also path_resolution(7).)
Where O_CREAT is specified, the `protected_fifos` or `protected_regular` sysctl is enabled, the file already exists and is a FIFO or regular file, the owner of the file is neither the current user nor the owner of the containing directory, and the containing directory is both world- or group-writable and sticky. For details, see the descriptions of `/proc/sys/fs/protected_fifos` and `/proc/sys/fs/protected_regular` in `proc_sys_fs(5)`.
If a kernel needs to return an EACCES
error for any of these problems, it must detect them. Which means, at the time of detecting the error, it must have some contextual information about what failed. However, because of the API signature, the kernel cannot return this information to the caller. This means that either the kernel and caller need to separately cooperate on having a “side channel” for passing extra metadata, or the kernel can just drop the information.
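The same information-dropping shape is visible one layer up, in Rust's wrapper over these syscalls: `std::io::Error` carries an error kind but not, by itself, the path or which path component failed, so the caller must re-attach that context manually. A small sketch (the path is a placeholder assumed not to exist):

```rust
use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let path = "/this/path/should/not/exist/hopefully.txt";
    let err = File::open(path).unwrap_err();
    assert_eq!(err.kind(), ErrorKind::NotFound);
    // The failing path is not inside the error value; if we want it in
    // the message, we have to attach it ourselves at the call site.
    let with_context = format!("opening {}: {}", path, err);
    assert!(with_context.contains(path));
}
```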
Q: Have you ever hit an `EACCES` error when opening a file? What did you do to debug it the first time you hit it? If you could go back to your past self, in the middle of debugging this for the first time, and tell them that a well-regarded book called A Philosophy of Software Design considers this a “beautiful example of a deep API”, what do you think your past self’s reaction would be? How much does that matter?
The aesthetics and pragmatics of PL design
There are five essential components to learning a language:
- Syntax: [..]
- Semantics: By semantics, we mean the rules that define the behavior of programs. [..] The dynamic semantics define the run-time behavior of a program as it is executed or evaluated. The static semantics define the compile-time checking that is done to ensure that a program is legal, beyond any syntactic requirements. The most important kind of static semantics is probably type checking: the rules that define whether a program is well typed or not.
- Idioms: [..]
- Libraries: [..]
- Tools: [..]
– Cornell CS 3110 program course notes
In some sense, program semantics has a sense of timelessness – they talk about a program and a semantics. But our programs (and their semantics) do change over time!
This leads to two different notions, which I’m going to call evolution semantics and migration semantics respectively.I believe there are no standard terms for these in the literature but happy to be corrected!
Here are some loose definitions:
Evolution semantics: This covers the interoperability (or lack thereof) between program fragments and their static semantics as they evolve over time (e.g. as different versions). Some examples of work which would fall in this bucket:
- cargo-semver-checks which programmatically analyzes Rust crates for SemVer violations.
- Swift’s Library Evolution feature.
- Russ Cox’s writing on Go & versioning.
Migration semantics: This covers the interoperation (or lack thereof) of running program fragments and their dynamic semantics as they evolve over time (e.g. as different versions). Some examples of work which would fall in this bucket:
- Various SQL constructs such as ALTER TABLE, and more generally, database migrations.
- Debuggers which allow modification of control flow and/or data.
- Hot reloading functionality found in many game and UI frameworks, and natively supported in Erlang.
I believe both of these areas are worth studying in their own right, because reasoning about time is hard, but reasoning about programs and their data over time is increasingly important as our codebases and databases grow older and larger.
“That’s all well and good,” you say, “but what does time have to do with error models?” Please hold that thought for a moment.
Perhaps while you’ve been reading this post, you might’ve had the reaction that there is something unaesthetic, perhaps deeply so, about Everr’s choice to make product types first-class and sum types second-class.
On the other hand, you might think of ML family languages as being more aesthetic,
putting sum types and product types on equal footing, mostly.I have to say “mostly” because a purist version of this would prevent inline record syntax for the cases of a sum type, such as in pre-4.03 OCaml.
I get that. To be honest, I had a similar reaction
when I first read about Cap’n Proto’s decision
to make records first-class and (tagged) union types second-class
(🧠: “Surely, there has to be a better way!”).
I also had a similar reaction when I first learnt
about the support for IF EXISTS
in various SQL DDL
commands (🧠: “You can’t just extend syntax in ad-hoc ways like that!
That doesn’t seem orthogonal to other features!”).
There’s also something unaesthetic about having
implicit projection for a special field name case
(🧠: “Why should one field name be privileged over others!”).
The bounded union type syntax also has an unaesthetic quality (🧠: “The presence of the type name ruins the symmetry present in typical union type syntax!”). If we accept the frame of “programmers should be able to read/write Everr code in their favourite editing environments, which have text editing at the heart” as a premise, then there needs to be some way of representing type syntax using ASCII characters, and I do not have the audacity to suggest that programmers write something like:
Dessert
/----- union -----\
.Cake | .IceCream
Each of these decisions fulfills specific needs, but many of them still feel weird, perhaps with the exception of the Ultimate Conditional Syntax (which I take zero credit for, because I literally just copied it from a paper).
Perhaps moreso than these individual decisions, you might’ve felt an overall unaesthetic quality about the kind of reasoning used throughout this post. Much of the reasoning is by case analysis, instead of using induction or by providing a handful of core primitives and combining them in elegant ways to address every challenge posed.
Everr’s language design itself reflects a pattern of accreting up and sanding down, more like seeing an igloo or a sand castle being built over time, and less like walking into a museum and having a Michelangelo sculpture come into view.
Can an igloo built by a novice be as beautiful as a Michelangelo sculpture? Does the answer to that depend on whether you’re looking for shelter from an on-going blizzard?
I believe the “lumpiness” in the landscape of computing – be it heterogeneity in hardware, the variations in the lifecycles of packages and the interpersonal dynamics of package maintainers, or all the “weird error cases” which defy easy classification – is an omnipresent backdrop for the design of general-purpose PLs.
Which parts of this lumpy backdrop should be treated as a canvas to paint on, and which parts should be treated as something to be smoothed over – that is a question of both one’s emotions and one’s values.
If we would like to paint certain parts of this canvas that are high up in the clouds,
and time is of the essence, perhaps, we need to be willing to stand on the shoulders
of Giants rather than insisting on climbing our way up all by ourselves.
Perhaps, it makes sense to befriend as many Giants as possible and try to cajole them into forming a Giant pyramidGrammar note: A Giant pyramid is necessarily a giant pyramid, because Giants are giant by definition, but a giant pyramid may not be a Giant pyramid, because it may be made of many individually small objects.
we can climb on.
Two such Giants which I believe are under-utilized in PL design today are:
- Special-purpose languages such as Cap’n Proto, where the design involves a fair number of considerations around time and evolution.
- Tools which can help us formally reason about complex systems, such as:
- Model checkers such as Alloy and TLA+. In particular, my initial experiments with Alloy make me believe that Alloy can help with modeling complex relationships across packages, and potentially across time, without writing code.
- Proof assistants.
- Special-purpose tools such as Turnstile(+) for type-checking and herdtools for memory models.
While many of these tools are academic in origin, and thus have a steep usability curve, I think there is value in exploring them, as well as coming up with more user-friendly versions of them.
If you want to start somewhere and have a design problem, I think Alloy is likely a good place to start. Its syntax is easy to understand, and it comes with a visualizer out-of-the-box. You can also save the raw XML for counter-examples somewhere and visualize it using your own code.
Zooming back in to errors specifically, we now arrive at the final question.
What are we to do?
All happy families are alike; each unhappy family is unhappy in its own way.
– Leo Tolstoy, Anna Karenina
Pithy statements such as “only use exceptions for exceptional cases”(EDIT: Mar 10, 2025): Removed the mention of “Let it crash” here. I originally meant it in the sense of how people casually recommend “just follow the Let it crash philosophy” in forum discussions, without considering the trade-offs involved in different systems. However, I did not write this out in full. My mention of “Let it crash” here was (understandably) interpreted as a criticism of Joe Armstrong’s work. I have nothing but the deepest respect for Joe Armstrong. As I wrote in the introduction, Joe Armstrong’s thesis is one of the most holistic pieces of writing on errors, grounded in deep experience. (Thanks to user namaria on Hacker News for providing critical feedback.)
while catchy, do not do justice to the complexities
that programmers need to deal with when deciding how
to define, propagate, handle and reason about errors.
The fundamental nature of errors is that they are often multi-faceted, and complex to reason about by virtue of being multi-faceted. The wide variety of socio-technical contexts in which software is created, delivered and used adds further complexity.
It is important that we are able to reason about how software works, but also about how it doesn’t quite work. Reasoning enables better decision-making when we’re making a choice on whether we should tweak, fix, disable or delete the software, or maybe even just keep it as-is.
If we are to reason about and debug software when it doesn’t work as expected, our languages must empower us to do so. While a programming language cannot magically help fix all software issues at any scale, I believe languages can help move the needle, by changing the often-implicit frame of reference for how we think about errors.
I believe that it behooves us as an industry to try to move the needle, as increasingly more parts of the world rely on software working correctly.
Appendix
Appendix A1: Everr type system discussion
Everr’s enums are closer to Scala’s case classes rather than enums/sum types in languages like Rust, Swift, Haskell and OCaml.
Similar to union types in other languages like Scala, Pony and TypeScript, Everr’s union types (both the exhaustive and non-exhaustive variants) satisfy commutativity, associativity and idempotence. Additionally, an exhaustive union type is a subtype of the corresponding non-exhaustive union type.
However, unlike Scala, Pony and TypeScript, Everr’s union types are more limited. The elements of a union with a certain top type must each be either:
- Base case: A type representing a case of the top type.
- Recursive case: Another union type with the same top type (interpreted via coalescing cases).
Since Everr has no universal top type, arbitrary types cannot be unioned together. This allows for an efficient run-time representation (the same as enums) without needing whole-program analysis or a JIT, while still providing the benefits of being able to deal with arbitrary subsets.
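The algebraic laws above can be illustrated with a tiny model where a union type over a fixed enum is just the set of cases it covers. A sketch (the `union` helper and the case names are made up for illustration; this is not Everr syntax):

```python
# Model: a union type over a fixed top type is the set of cases it covers.
# Sets give commutativity, associativity and idempotence for free.
def union(*cases):
    return frozenset(cases)

Cake, IceCream, Pie = "Cake", "IceCream", "Pie"

# Commutativity: .Cake | .IceCream == .IceCream | .Cake
assert union(Cake, IceCream) == union(IceCream, Cake)
# Associativity: (.Cake | .IceCream) | .Pie == .Cake | (.IceCream | .Pie)
assert union(Cake, IceCream) | union(Pie) == union(Cake) | union(IceCream, Pie)
# Idempotence: .Cake | .Cake == .Cake
assert union(Cake, Cake) == union(Cake)

# Subtyping as the subset relation: a union covering fewer cases is usable
# where a union covering at least those cases is expected.
def is_subtype(sub, sup):
    return sub <= sup

assert is_subtype(union(Cake), union(Cake, IceCream))
assert not is_subtype(union(Cake, Pie), union(Cake, IceCream))
```

Because the cases are always drawn from one declared top type, the set of possible elements is finite and known up front, which is what enables the enum-like run-time representation.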
Everr’s choice of making enum cases first-class and enums themselves second-class is distinct from Rust’s proposed enum variant types, where enums continue to stay first-class but enum cases have second-class status.
Here are the major differences:
- In Everr, enum cases can separately carry type parameters, but Rust’s enum variant types cannot.
- In Everr, enum cases can separately implement different traits, but Rust’s enum variant types cannot.
- Going from an individual case type to the enum type requires attaching a tag in Everr (but the T:union[] form shares the same layout as the enum). In Rust, the enum variant type shares the same layout as the enum.
- Everr supports arbitrary subsets of cases; the initial Rust proposal does not support this. However, adding this would be compatible with the Rust proposal.
- Everr’s design allows composition of cases across enums, but Rust does not.
Everr’s delegation mechanism is similar to inheritance, but without any association with subtyping, and correspondingly there is no notion of up-casting or down-casting.
Imperative languages such as Go and Rust provide similar functionality using different techniques: Go has struct embedding, and Rust has Deref coercions.
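Delegation without subtyping can be sketched in a dynamic setting with attribute forwarding. A minimal Python model (the wrapper class and field names here are invented, loosely echoing the earlier error-type examples):

```python
# Sketch: forward unknown attribute lookups to a delegate, similar in spirit
# to a @delegate field or Go's struct embedding. Crucially, there is no
# subtype relationship: the wrapper is not usable where the base is required.
class StoringError:
    def __init__(self, path):
        self.path = path

class DetailedStoringError:
    def __init__(self, base, db_name):
        self.base = base
        self.db_name = db_name

    def __getattr__(self, name):
        # Only invoked when normal lookup fails; delegate to `base`.
        return getattr(self.base, name)

err = DetailedStoringError(StoringError("/tmp/img.png"), "imagedb")
assert err.path == "/tmp/img.png"          # forwarded to the delegate
assert err.db_name == "imagedb"            # own field
assert not isinstance(err, StoringError)   # no subtyping, so no up-casting
```

The last assertion is the point of the design: the wrapper reuses behavior without entering a subtype hierarchy, so there is nothing to up-cast or down-cast.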
Appendix A2: Niche optimizations with second-class enums
Rust can do type layout optimization (also called “niche optimization”) by considering excluded states. For example, the type Option&lt;NonZeroU8&gt; will take 8 bits, because the 0 state is reused by None.
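The niche optimization can be made concrete with a byte-level sketch in Python (the `encode`/`decode` helpers are hypothetical illustrations, not actual rustc output): since a NonZeroU8 payload never uses the bit pattern 0, None can reuse it, and the whole Option fits in one byte.

```python
# Sketch of Option<NonZeroU8> packed into a single byte: the payload's
# excluded value (0) doubles as the tag for None, so no extra tag byte
# is needed.
def encode(opt):
    if opt is None:
        return 0              # None reuses the niche (the forbidden 0)
    assert 1 <= opt <= 255    # NonZeroU8 invariant
    return opt                # Some(n) is represented as n itself

def decode(byte):
    return None if byte == 0 else byte

assert decode(encode(None)) is None
assert decode(encode(42)) == 42
# Round-trips for every legal NonZeroU8 value:
assert all(decode(encode(n)) == n for n in range(1, 256))
```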
In Everr, going from a type MyEnum.MyCase to MyEnum:union[.MyCase] with zero or more other cases requires the compiler to generate a function which attaches the corresponding enum tag. For the Optional type discussed, one needs two such functions for the two different cases, which will have the following signatures:
fn _none_to_union[A](v: Optional.None) -> Optional[A]:union[.None] {
// implementation elided
}
fn _some_to_union[A](v: Optional.Some[A]) -> Optional[A]:union[.Some[A]] {
// implementation elided
}
Since the generation of these functions is entirely under the compiler’s control, it is possible for the compiler to directly generate specialized versions of these functions for different types with different niches during monomorphization, instead of attempting to generate one version.
This means that when a compiler sees an implicit conversionThis is syntax-directed implicit conversion rather than being a subtyping rule, because it requires a change to the run-time representation. If it were a subtyping rule, and Everr were to support covariant and contravariant type parameters, that would require potentially-arbitrarily-expensive bridging conversions, like Swift.
from None to Optional[A]:union[.None], it can specialize the conversion based on what A is instantiated to.
Appendix A3: Hare type system discussion
The Hare docs state that tagged union types are “commutative, associative, and reductive” and that “the order of types, inclusion of the same type more than once, or use of nested tagged unions has no effect on the final type.”
So Hare’s tagged unions implement union types in the type-theoretic sense, not sum types, unlike most other languages with tagged unions. Hare does not support parametric polymorphism, so the choice to implement union type semantics via tagged unions:
- Offers a high degree of flexibility, like Scala, Pony and TypeScript, by not requiring upper bounds (unlike Everr) and allowing the union of arbitrary types (unlike Everr).
- Does not require a uniform representation, like Everr but unlike Scala, Pony and TypeScript.
- Does not increase type system complexity significantly, because there is no need for subtyping rules involving type inference and polymorphism.
Appendix A4: Safe and modular memory management
The research work on Verona has surfaced one potential direction, described in Reference Capabilities for Flexible Memory Management:
Verona is a concurrent object-oriented programming language that organises all the objects in a program into a forest of isolated regions. Memory is managed locally for each region, so programmers can control a program’s memory use by adjusting objects’ partition into regions, and by setting each region’s memory management strategy.
The overall type system is more flexible than Rust’s in some ways,
but if you look closely, the paper points out that
field accesses may throw an exception if the region
corresponding to the field is already open. If I understand
correctly, this is similar to Rust’s RefCell type, which does dynamic borrow-checking.
Entering a region borrows and/or buries the variable or field referencing the bridge object. In the case of a stack variable, the variable is buried to prevent the region from being multiply opened.
In the case of a field, we instead resort to a dynamic check of the region’s state.
I have not been able to absorb the paper deeply, but such a design seems potentially concerning.
Technically, this could be replaced with a more local form of returning an error, but it’s unclear how widespread the need for this would be in typical programs.
Appendix A5: Modular static analysis for stack usage
I believe it should be possible to support modular static analysis for controlling stack usage without requiring the elimination of indirect calls, which are useful for basic operations like map, filter etc.
The problem with an indirect call is that the stack usage for it will be unknown. So the most direct “fix” is to equip calls with stack usage information.
Specifically, function types could be equipped with two kinds of stack usage budgets:
- Self budget: The maximum memory the function is allowed to use for temporaries.
- Calls budget: The maximum stack usage for calls inside the function.
When a function body is lowered to an IR which makes temporaries explicit, after some set of relatively stable baseline optimizations (e.g. sparse conditional constant propagation and copy propagation, but no inlining), a compiler can check the following:
- Is the self budget at least the sum of stack usage for all temporaries, assuming no lifetime contraction/further optimization?
- Is the calls budget at least the total stack usage budget for each called function (these calls may be indirect)?
Finally, after inlining and other optimizations but before generating assembly, one could perform a validation check only for the total budget (but not for the self and calls budgets, because of inlining).
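To make the two checks concrete, here is a small Python sketch of the modular budget check over a call graph (the function table, budget numbers and field layout are invented for illustration):

```python
# Each function declares: a self budget (bytes for its own temporaries),
# a calls budget (bytes available to any single call it makes), its measured
# temporaries, and its callees. The check is modular: a function is validated
# against its callees' *declared* total budgets, never their bodies, so
# indirect calls only need a budget-annotated function type.
FUNCS = {
    #  name:  (self_budget, calls_budget, temporaries, callees)
    "main":   (64,  192, 48, ["parse", "run"]),
    "parse":  (128, 0,   96, []),
    "run":    (32,  128, 24, ["parse"]),
}

def total_budget(name):
    self_b, calls_b, _, _ = FUNCS[name]
    return self_b + calls_b

def check(name):
    self_b, calls_b, temps, callees = FUNCS[name]
    # Check 1: temporaries fit within the self budget.
    if temps > self_b:
        return False
    # Check 2: every callee's declared total budget fits in the calls budget.
    return all(total_budget(c) <= calls_b for c in callees)

assert all(check(f) for f in FUNCS)
assert total_budget("main") == 256  # bound on main's whole call tree
```

Because check 2 only consults declared budgets, changing a callee’s body cannot invalidate its callers so long as the callee still passes its own check, which is the modularity property being aimed for.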
In such a system, if you annotate main with a stack budget, then you’d essentially trigger errors for each function call inside main, and so on, until you’ve added stack budgets for every function in the call graph. Yes, this would necessitate writing your own stack-usage-aware minimal standard library.
I believe such a system should be “workable” in practice in the limited sense that compiler optimizations typically do not increase the stack usage of call trees, and the number of temporaries generally goes down as the optimization crank is turned more. So I suspect that the final validation check should rarely fail.
Depending on the desired properties about where errors should be handled (e.g. is it OK to emit errors after monomorphization?), and which language features need to be supported in concert with stack budgets (e.g. is it OK to only allow setting budgets on monomorphic functions?), one could:
- Potentially have the budget checks run on a polymorphic IR instead of the post-monomorphization IR
- Allow the budget to not just be an integer, but a more general expression, allowing references to stack budgets of parameters, some basic numeric operations like +, etc.
I suspect that it’s not really possible to have a much simpler solution than what I’ve described above, unless one is willing to (1) give up on modularity or (2) make the check dynamic.
Of course, one might ask: is this much complexity worth it “just” for statically preventing stack overflows in a modular fashion?
For that, the answer is: I don’t know. If we go by existing languages, the answer seems largely to be No.
Appendix A6: Defining new types using diffs
Say Everr supported type diffs. This could help reduce the boilerplate involved in extending a type defined upstream from:
@exhaustive
struct DetailedStoringError {
@delegate
base: ImageProcessingError.StoringError,
dbName: Str,
dbURL: URL,
}
@exhaustive
enum DetailedImageProcessingError {
| type ImageProcessingError.DownloadingError
| type ImageProcessingError.ProcessingError
| type DetailedStoringError
}
to something like:
@exhaustive
enum DetailedImageProcessingError diff ImageProcessingError {
| ...
* | base: StoringError -> DetailedStoringError {
...,
+ dbName: Str,
+ dbURL: URL,
}
}
Specifics of concrete syntax aside, it would technically be possible to parse the diff syntax and desugar it to the expanded form above it. This may be particularly useful if there are lots of cases in the upstream type being extended.
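As a rough illustration of the desugaring, here is a Python sketch that applies a diff to a dict-based description of an enum (the representation, helper name and field names are all made up; a real implementation would operate on a typed AST):

```python
# An enum is modeled as {case_name: {field: type}}. A diff carries the
# unchanged cases over ("| ..." in the sketched syntax) and rewrites the
# selected cases, renaming them and adding fields.
def apply_diff(base_enum, case_renames, added_fields):
    out = {}
    for case, fields in base_enum.items():
        new_case = case_renames.get(case, case)
        new_fields = dict(fields)
        if case in case_renames:
            new_fields.update(added_fields[case])
        out[new_case] = new_fields
    return out

image_processing_error = {
    "DownloadingError": {"url": "URL"},
    "ProcessingError": {"stage": "Str"},
    "StoringError": {"path": "Path"},
}

detailed = apply_diff(
    image_processing_error,
    case_renames={"StoringError": "DetailedStoringError"},
    added_fields={"StoringError": {"dbName": "Str", "dbURL": "URL"}},
)

assert detailed["DetailedStoringError"] == {
    "path": "Path", "dbName": "Str", "dbURL": "URL"}
assert "DownloadingError" in detailed  # unchanged cases carried over
```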
The Everr language server could show the desugared version as a large multi-line
“inlay hint” inside the editor.However, this runs into the further issue of how to represent the upstream type if the package containing the definition of the upstream type is not pinned to a specific version.
However, because this implicitly adds a @delegate attribute to preserve the ability to do field projections and method calls, chaining such type definitions makes it easy to have multiple levels of delegation, which can be confusing to debug (similar to deep inheritance hierarchies). The interaction with the implicit projection for .case also needs to be thought through, and may be confusing.
On the other hand, forbidding multiple levels of type diffs may be too restrictive if the target use case also covers code which needs to support 3+ versions of a data type over time.
This is only considering additive diffs. Subtractive diffs are likely simpler – because @delegate cannot be involved without breaking abstraction – but also less useful, since it’s more common to want to create extended versions of third-party types in practice.
Appendix A7: Optimism
Recently at work, I discovered that my assumption that our DBs were configured so that queries which were running fast would continue running fast was proven wrong, as the DBs went through a Postgres major version upgrade, and the loss of statistics – despite autovacuum being turned on – contributed to an incident.
The incident was resolved late on a Thursday night; I was due to go on vacation the subsequent week. While on vacation, when I had some time to kill, I spent it digging through mailing list threads, Postgres source code and blog posts, and asking Claude, attempting to answer the question: “What are all the different possible situations under which Postgres can have statistics that are woefully out-of-date despite autovacuum being turned on, and how can those be detected?”
When I came back, I noticed that nobody had really asked the same question, at least in public. At first, I was puzzled: “Surely, people are not just hoping that this doesn’t happen again, right?” Then I realized I was operating with a different mental model altogether. My trust in Postgres’s ability to maintain statistics had gone from high to low (and the scarce documentation did nothing to allay that fear), whereas my colleagues believed that this was likely a one-off issue specific to major version upgrades.
Both of these beliefs make some sense in different ways.
The more optimistic point-of-view assumes that, given that we did not have any such statistics-related issues earlier when running Postgres 12, and that Postgres upgrades generally bring improvements, autovacuum in Postgres 16 was likely at least as good as, and less buggy than, in Postgres 12.
The more pessimistic point-of-view assumes that, given the presence of one undesirable behavior in the autovacuum daemon that hadn’t yet been fixed despite Postgres being one of the most mature and widely used DB systems in existence, it was possible that more undesirable behaviors in the same area were still lurking, just waiting to be hit.
When I was thinking about this, I was reminded of The Mythical Man Month, where in the titular chapter, Brooks devotes the very first section to discussing programmers’ optimism.
All programmers are optimists. [..]
In a single task, the assumption all will go well has a probabilistic effect on the schedule. It might indeed go as planned, for there is a probability distribution for the delay that will be encountered, and “no delay” has a fine probability. A large programming effort, however, consists of many tasks, some chained end-to-end. The probability that each will go well becomes vanishingly small.
Here, Brooks is discussing optimism in the context of planning. Overall, the section is largely anecdotal and speculative about causes. This point about optimism in the context of planning has been repeated by Kent Beck and Jeff Atwood.
However, the planning fallacy is common across professions.
I’m curious: is there research showing programmers tend to be more optimistic than other professions? (🧠: “Am I the weird one?”) And does this optimism apply when reasoning about error cases, and to how programmers adjust their trust levels in particular bits of code as they discover bugs? I searched for a bit but didn’t find much, but if you’re reading this and know some research in this area, please let me know. 😃
Appendix A8: Resumable exceptions, continuations and algebraic effects
(EDIT: Mar 9, 2025): This section was added based on discussion in Lobste.rs and Hacker News threads.
A condition system like that of Common Lisp may be thought of as implicitly passing a structure mapping prompts to handlers to various functions; at the point of a resumable exception, the appropriate handler must be identified based on the provided prompt and which handlers are available.
Since Common Lisp is dynamically typed, the choice for which handler is invoked must happen at run-time.
My understanding (potentially wrong) is that this is not too dissimilar from algebraic effects, except that in the case of algebraic effects, one can optimize this further, such as not needing to search through the evaluation context (e.g. as in Generalized Evidence Passing for Effect Handlers).
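The “implicitly passed handlers, resolved at run-time” framing can be sketched in Python with a dynamically scoped handler stack. This is a toy model only (real condition systems like Common Lisp’s also offer restarts, interactive recovery, and more); all names here are invented:

```python
import contextlib

# Toy condition system: `signal` searches the dynamically scoped handler
# stack for a handler matching the prompt, and the handler's return value
# resumes the computation at the signalling site (no unwinding happens).
_handlers = []  # stack of (prompt, handler) pairs, innermost last

@contextlib.contextmanager
def handler(prompt, fn):
    _handlers.append((prompt, fn))
    try:
        yield
    finally:
        _handlers.pop()

def signal(prompt, payload):
    # Search innermost-first, mirroring a search through the evaluation context.
    for p, fn in reversed(_handlers):
        if p == prompt:
            return fn(payload)  # resume with the handler's answer
    raise RuntimeError(f"no handler for {prompt!r}")

def parse_int(s):
    try:
        return int(s)
    except ValueError:
        # Ask the dynamic context what to do, then continue from here.
        return signal("bad-int", s)

with handler("bad-int", lambda s: 0):
    assert parse_int("42") == 42
    assert parse_int("oops") == 0  # handler supplied a replacement value
```

Since the handler lookup happens by scanning `_handlers` at run-time, nothing in `parse_int`’s signature records that it may signal `"bad-int"`, which is exactly the kind of tracking an effect system would add statically.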
Based on the Rust PR that removed the condition system, it seems like the original condition system in Rust did not track “which handlers are needed in this context”. Technically, this particular problem might have been solvable using the recent research on algebraic effects.
In the absence of passing type information to the point of resumption, some form of dynamic type checking is necessary. This presents two broad options in terms of language design:
- Allow data types containing borrowed references to be passed to a condition handler (i.e. non-'static values in Rust). This may require reifying lifetime/ownership/region information at runtime for run-time type-checking, depending on the exact rules of run-time type-checking.
- Require that resumption functions only be able to deal with owned values or references guaranteed to live arbitrarily long.
Depending on threading/task structure, further restrictions may be needed. For example, can the resumption function capture “owned” values? If so, what if it destroys them after/during the first invocation? Can the resumption function be invoked concurrently?
Rust has several traits for expressing various kinds of functions – Fn, FnMut, FnOnce. Function invocation under concurrency further involves the Send and Sync traits.
In essence, if one desires type safety and memory safety, and we consider the structure containing the “handlers” as described at the start of this section, the types for the functions involved would need to be checked either dynamically (à la Common Lisp or pre-1.0 Rust) or statically (à la algebraic effects) at the point of resumption.I believe this point applies even if the exact implementation is not based on literally passing an extra hidden parameter.
However, given the complexities of function types in both Rust and Swift (as two examples of languages which ensure data race freedom while targeting low-level use cases), it seems that implementing either of these options is unlikely to be palatable to programmers.
As far as I know, there are no widely deployed programming languages integrating algebraic effects with lifetimes/ownership/borrowing.
The closest work I know related to this is that on Granule, but I do not know enough about it to comment on its limitations. As of Mar 2025, the project homepage mentions that the type-checker uses the Z3 SMT solver, which is not a common practice outside of theorem provers.