The successful execution of a piece of software rests upon a tower of dependable abstractions; each layer, building on the previous, provides a more logical, less physical platform on which to realize computation. From a bird's eye view, we have (expanding upon [1]):
| Domain | Topics of Concern | Abstractions Exposed |
|---|---|---|
| Particle Physics | Electromagnetic fields | physical interactions / differential equations |
| Materials science | Building up chemicals / objects with desirable physical properties for electronic systems | Material spec sheets for composition / manufacturing |
| Electrical engineering | Practical electronic systems | Basic electrical components (like logic gates) with known propagation delays, impedances, acceptable voltages, ... |
| Hardware engineering | CPU / peripheral designs | ISA, physical and data link layer protocols, ... |
| Operating systems | Sensibly organizing hardware into bootable machines with process-level abstractions[a] | (Virtual) address space, scheduler policy, thread library, filesystem, ... |
| Software translation | Collection of work/tooling around converting "code" in high level languages into executable code | Compilers, linkers, programming language semantics, ... |
| Algorithms | Procedures operating atop these abstractions to compute | Guarantees on the data computed given suitable inputs |
These are only some of the basic abstractions typically used to build up programs that usefully compute[b]. When it comes to networking, the OSI model provides a rough guide of layered abstraction to provide a separation of concerns [c].
The platform we have today is remarkable, the result of thousands of lifetimes spent specializing, building, and iterating so that the combined platform may evolve and improve. The degree of specialization at each layer is so great that the full picture is beyond the comprehension of any one person. Progress comes from the interplay of individuals who only fully grasp a small portion of the tower.
As we move down the table, the "stuff" exposed becomes less physical and increasingly mental (logical) -- residing more in the realm of thought-stuff. And it is here that changes are easier and faster to make, and that ad-hoc abstractions are more readily introduced (hence, the soft in software).
Amongst these, I have to admit to having acquired a kind of sense, a taste for what works well... and what doesn't.
One aspect of this is how "clean" the proposed abstraction is, and how rarely, ideally, I need to cut through the veil and peer into the black box to understand it.
What I mean is, some abstractions are so good that I don't need to think about them most of the time.
What makes an abstraction "better"? What makes it easier to use?
Four unscientific criteria come to mind:
These goals are in tension, and there may not be a satisfactory answer in all cases.
Certainly, there are different users of an abstraction, with different goals and use cases.
There are times when it is practically better to consolidate on one (suboptimal) answer/decision for an abstraction "question" than to fragment[d], and times when fragmenting is the right decision[e].
All abstractions describe an interface of what they produce. Loosely speaking, an interface is the shape of the thing produced.
Abstractions differ in the style of description of the interface[f]:
These should be thought of as lying on a sort of spectrum -- the cutoff is subjective.
While you might surmise that hewing more declarative is "better", this isn't necessarily the case. A declarative abstraction might be less effective, say because it is too inefficient, or because it declares irrelevant details.
Some examples illustrate this general philosophy:
Perhaps the canonical introductory algorithms problem, with good reason. Given some order-comparable input data \(a_1,a_2,\ldots, a_n\), a sorted transformation is a permutation \(\sigma\) such that the data is weakly ascending after applying it. That is, for \(i,j \in [n]\), $$ \sigma : [n] \to [n] \\ i \neq j \Rightarrow \sigma(i) \neq \sigma(j) \\ i \leq j \Rightarrow a_{\sigma(i)} \leq a_{\sigma(j)} $$
Like many algorithms topics, the problem is well defined declaratively, and there are many solutions. Quicksort, mergesort, bubblesort, ... all solve the problem (with different tradeoffs).
We can add requirements, like stability, which constrain us further (eg: in-place quicksort is the canonical violator).
What makes the sorting problem so nice is that it has a simple interface (give me a list of data, maybe some custom comparator), and the sometimes-intricate details of the actual implementation can be often ignored. On top of that, there is room for variation in implementation within the same interface when that is merited.
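This declarative interface is easy to exercise from any language with a sort primitive. A minimal sketch in Python (the records and names are illustrative):

```python
# Sorting through its declarative interface: data in, permuted data out.
records = [("carol", 2), ("alice", 1), ("bob", 2), ("dave", 1)]

# Sort by the numeric field via a custom key. Python's sorted() is
# guaranteed stable, so records with equal keys keep their relative order.
by_count = sorted(records, key=lambda r: r[1])
print(by_count)  # [('alice', 1), ('dave', 1), ('carol', 2), ('bob', 2)]

# The permutation sigma from the definition can be recovered directly:
a = [30, 10, 20]
sigma = sorted(range(len(a)), key=lambda i: a[i])
print(sigma)                  # [1, 2, 0]
print([a[i] for i in sigma])  # [10, 20, 30], weakly ascending
```

Note that the caller never learns (or cares) which algorithm ran underneath; only the declared properties of the output are promised.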
A class of languages that sit in a sweet spot of human comprehensibility while being sufficiently orderly to be useful are the context-free languages. Unfortunately, the full class of context-free languages has some pathological examples that can't be parsed in linear time, though these pathologies are somewhat rare in practical applications.
A variety of algorithms working on useful subsets of the context-free languages come from the family of LR parsers. They are generally highly efficient, able to parse through an input file in a single pass. Unfortunately, the exact description of what grammars a given variant (like SLR, LALR(1), GLR, ...) can be used for is basically "whatever an SLR/LALR(1)/GLR/... parser accepts". There is no convenient shorthand. The interface is procedural—accept if the table drives you to acceptance—and that makes it harder to reason about coverage or errors without looking at the mechanics.
These parsers also typically fall short in error handling. When they fail to parse, they often lack sufficient context about what went wrong to meaningfully diagnose the failure. They often cannot recover and meaningfully parse the remainder of the document, rendering them of limited use in incremental parsing. The somewhat inscrutable "shift/reduce conflict" messages generated are only useful if you know a fair bit about how the parsers work, and how to look up the actual parser table/state (hopefully the parsing tool helps). This limits their relevance.
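To make the table-driven mechanics (and the opacity of failures) concrete, here is a minimal SLR(1)-style parser for the toy grammar E → E '+' n | n. The states and tables are hand-constructed for this sketch:

```python
# A hand-constructed SLR(1) parser for the toy grammar:
#   (1) E -> E '+' n
#   (2) E -> n
# The "interface" is purely procedural: push states, consult the table,
# and accept only if the table drives you to acceptance.

ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 2), (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1), (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1)}  # lhs, length of rhs

def parse(tokens):
    stack, toks, i = [0], list(tokens) + ["$"], 0
    while True:
        act = ACTION.get((stack[-1], toks[i]))
        if act is None:
            # All we can report is the raw state/token pair -- the
            # characteristic opacity of table-driven error messages.
            return False, f"no action in state {stack[-1]} on {toks[i]!r}"
        kind, arg = act
        if kind == "shift":
            stack.append(arg)
            i += 1
        elif kind == "reduce":
            lhs, rhs_len = PRODUCTIONS[arg]
            del stack[-rhs_len:]
            stack.append(GOTO[(stack[-1], lhs)])
        else:  # accept
            return True, "ok"

print(parse(["n", "+", "n", "+", "n"]))  # (True, 'ok')
print(parse(["n", "n"]))  # (False, "no action in state 2 on 'n'")
```

Note that on failure all the machinery can say is which state/token pair had no table entry; recovering a useful diagnosis from that is exactly the work left to the user.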
As a result, for contemporary applications like language servers, these language parsers have fallen into relative disuse.
With sufficient assumptions on the randomness of keys/hashes, the asymptotic analysis of hash table behavior is well understood for various hashing and collision handling schemes including separate chaining and various forms of probing. Even though obtaining a concrete arrangement of slots is essentially "perform all inserts in order," the inherently procedural behavior still admits effective analysis.
All the same, a common problem is that keys are often not sufficiently random. In some applications, tail latencies (say, a hash lookup that hits many collisions or a long probe sequence) are material.
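A small sketch of separate chaining shows how insufficiently random keys turn into tail latency; the class and the adversarial key pattern here are illustrative:

```python
# Minimal separate-chaining hash table (illustrative sketch).
class ChainedTable:
    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def probes(self, key):
        # Entries inspected to find `key` -- a proxy for lookup latency.
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                return i + 1
        return len(bucket)

t = ChainedTable(nbuckets=8)
for k in range(0, 256, 8):  # adversarial keys: all congruent to 0 mod 8
    t.insert(k, None)
print(t.probes(248))  # 32: every key landed in one bucket, so the last
                      # key inserted scans the entire chain
```

With uniformly random keys the expected chain length here would be about four; the structured keys push the worst case to the full table size, which is exactly the tail-latency failure mode described above.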
As it relates to efficiency, when the distribution of data or the set of keys is well-known, pre-processing can reduce or even eliminate the possibility of collisions[g].
Nonetheless, due to their workhorse utility and applicability in so many scenarios, hash tables are practically useful and highly effective. Imagine life without them!
Doing computation over a numeric type that behaves like the reals is useful across many domains. The trouble is that digital computers can't store or retrieve real numbers. Balancing the physical reality of limited memory, the needs for reasonable accuracy and precision, general error handling semantics, and efficient performance (chip area, instruction latency, throughput, energy / op) is a difficult problem. Add to that the reality that most authors of numerical programs will not be experts in numerical methods, but nonetheless aim to be productive, and the challenge only multiplies.
Various approximate numeric schemes have been considered over time: fixed-point and decimal numbers, rational numbers, "bignums", and (many) incompatible versions of floating point numbers. However, the dominant format is IEEE-754, first standardized in 1985. While alternatives continue to exist and have uses in specific cases[h], IEEE-754 floats are undeniably ubiquitous and a general default for "real" calculations.
The actual specification (IEEE 754-2008) is divided into sections on representation/book-keeping, rounding rules, the actual operations (eg: arithmetic), special values (infinities/NaNs), errors and error propagation, recommended elementary functions to implement, and evaluation rules (eg: when to round a result). For such a procedural topic, the specification is remarkably descriptive. At the same time, the principles underlying the description show a clear awareness of hardware implementability while resolving ambiguities present in other formats of the era.
For end-users, the specification is based on a reasonable "psychologism" of how reals "should" behave in many common cases. It is both usable by non-experts, while creating clear enough boundaries for specialists to continue to innovate (eg: through the development of efficient elementary and special functions[2] in both software and hardware).
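Python's float is an IEEE-754 binary64 on essentially all platforms, so a few of these behaviors are directly observable:

```python
import math

# IEEE-754 double behavior, visible from Python (floats are binary64).
print(0.1 + 0.2 == 0.3)  # False: 0.1 and 0.2 are rounded binary fractions,
print(0.1 + 0.2)         # and their sum rounds to 0.30000000000000004

inf = float("inf")
nan = float("nan")
print(1.0 / inf)              # 0.0: arithmetic with infinity is well defined
print(nan == nan)             # False: NaN compares unequal to everything
print(math.isnan(inf - inf))  # True: indeterminate forms propagate as NaN
```

The point is the "psychologism": most of these results match what a non-expert would want (graceful infinities, sticky error values), and the surprising ones (0.1 + 0.2) follow from one clearly specified rounding rule rather than from implementation whim.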
POSIX signals are a famously procedural abstraction. The interface surface is small integers mapped to dispositions, but most of the semantics leak from the implementation details of a given kernel and libc.
Delivery ordering is unspecified (see signal(7)), yet applications tend to "fit" to actual observed behavior. Multi-threaded delivery is tacked onto the basic APIs, and by default, signals sent to the process are delivered to an arbitrarily selected thread that does not have the signal blocked (see pthreads(7))... handling this is an exercise left to the reader.
The lack of queuing for most standard signals means bursts can collapse into one pending delivery, and which signal survives is procedural.
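The coalescing behavior is observable from Python on a typical Linux system (this sketch assumes a Unix environment where SIGUSR1 is available):

```python
import os
import signal

count = 0

def handler(signum, frame):
    global count
    count += 1

signal.signal(signal.SIGUSR1, handler)

# Block SIGUSR1, send it three times, then unblock. Standard (non-realtime)
# signals do not queue: the three sends collapse into one pending delivery.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
for _ in range(3):
    os.kill(os.getpid(), signal.SIGUSR1)
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})

print(count)  # 1 on Linux: the burst coalesced into a single delivery
```

Had these been realtime signals (SIGRTMIN and above) the kernel would queue all three; for the standard signals the "how many did I get" question has no declarative answer at all.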
The net result is an API that looks simple but forces users to reason operationally about interleavings, handler reentrancy, and hidden global state, rather than giving a clean declarative model to build on.
These are workhorses of program assembly. Most users are able to lean on their default behavior most of the time.
... But when there is an error, the tools are usually opaque about why it occurred.
More on this in the sequel.
The criteria are not clear; many are competing. Nonetheless, with experience using, building, and living with abstractions or design patterns over time, a sense of "good taste" emerges. As Potter Stewart wrote in Jacobellis v. Ohio, "I know it when I see it."
These notes hopefully put some meat on the bones of those intuitions and what to look out for when evaluating a proposed abstraction pattern.
[a] To actually create these abstractions involves the use of tools/practices from later levels (like compilers to generate the operating system kernel program, algorithms to compute memory swapping, ...).
We should all be grateful for the hard work that came before which allows us to bootstrap/iterate at all layers of abstraction more effectively now. Back to text
[b] Not strictly necessary: for example, circuits doing useful computation can be built up by hand, or programs may execute in an environment without operating system support. Back to text
[c] Although there is some disagreement about the particulars... Back to text
[d] eg: the consolidation of focus on Linux makes it better suited for generic applications than a myriad of alternatives, each receiving a fraction of the investment. Back to text
[e] eg: there are a myriad of specialized operating systems that are better suited to specific applications than Linux (say, hard realtime control in automotive or avionics, or low power draw).
And often you don't even want an OS/microkernel in embedded settings. Back to text
[f] If the terms seem overly particular, it may be helpful to compare the terminology for analogous concepts in other domains:
| Conceptual Axis | "Descriptive" side | "Procedural" side |
|---|---|---|
| Language theory | Declarative | Procedural / Operational |
| Semantics | Denotational | Operational |
| Logic / Math | Extensional | Intensional |
| Formal methods | Specification | Implementation |
| Relational model | Relational | Functional / Constructive |
[g] A simple perfect hash example: if your keys are the integers \(K := [100, 200) \cup [250, 300)\), let
\begin{align} h : K &\to [0,150) \\ x &\mapsto x - 100 - \max(0, x - 200) + \max(0, x - 250). \end{align}
For \(100 \le x < 200\), both max terms vanish and \(h(x) = x - 100 \in [0, 100)\); for \(250 \le x < 300\), the first max contributes \(x-200\) and the second adds \(x-250\), so \(h(x) = x - 150 \in [100, 150)\). Thus \(h : K \to [0,150)\) is a perfect hash without branching (godbolt). Back to text
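A few lines of Python (illustrative) confirm the footnote's claim that \(h\) is injective on \(K\) and lands in \([0,150)\):

```python
# Verifying the perfect hash from the footnote: keys [100,200) U [250,300)
# map injectively into [0,150), with no branches in h itself.
def h(x):
    return x - 100 - max(0, x - 200) + max(0, x - 250)

keys = list(range(100, 200)) + list(range(250, 300))
images = [h(k) for k in keys]

assert len(set(images)) == len(keys)      # injective: no collisions
assert all(0 <= v < 150 for v in images)  # lands in [0, 150)
print(min(images), max(images))           # 0 149: the range is fully used
```

In fact the 150 keys map exactly onto \(\{0,\ldots,149\}\), so this is a minimal perfect hash for \(K\).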
[h] At the time of writing, there is enormous investment in machine learning and large language model design to increase scale. The memory-pressure of representing large models can be reduced by using more compact floating point representations such as tf32, fp16, bfloat16, fp8, fp4, and more exotic variants in the limit. (Eventually the model architecture will simply degenerate...)
There are also occasional attempts to come up with a genuinely broadly applicable alternative, eg posits (though they haven't really caught on). Back to text