The Most Elegant of Arrays

~sirodoht published at April 1, 2022

Abstract

We show that it is Scheme, of the Lisp family, the language that deterministically emerges when one opts for elegance.

Problem

Had someone asked us to create the most elegant programming language that could ever exist, what would it look like?

Solution

Let’s start with the fundamentals: logic. We need logic operators such as and, or, not. We also need equality and inequality checks: = and < and >.

Nextly, we’d love to do arithmetic operations; addition, subtraction, multiplication, etc. We would definitely have functions as well. However, in the name of elegance, arithmetic operations can also be considered functions, so maybe we could find a unified way for these two.

The problem is that arithmetic operations are infix notated, eg 4 + 5, while function calls are prefix notated, eg calc(a, b).

In the name of elegance, let’s choose one of these notations and do both operations with it. Since we might have more than two arguments, infix doesn’t make sense. 4 + 5 6 makes less sense than + 4 5 6.

What about the function call parentheses? Less is more. No parentheses. Unless it’s to indicate operation precedence, of course.

Now, some languages use one keyword to define functions and another to define something else, say a variable. We don’t need more than one. Everything is simply defined with one keyword; which one? the simplest: define.

In this elegant language of ours, we’d surely need some data structures as well. Would it be possible to only define one simple data structure and then built up everything else from there?

If such a data structure would exist, it’d probably be an array. It’s the simplest one—yet with one intricacy. The length of an array can be any integer. A language of simplicity and elegance would never allow such imprecision. Such a language would not allow mutability either. It is well known that mutability is the root of all evil. Changing something implies its creation was less than ideal. In our elegant language, there is no such creation.

We have arrays with no ability for user-defined length—they have a fixed length. Which of all the lengths, though? Surely, a length that is a prime member—anything else wouldn’t be elegant enough. Which prime number, if not the first, then?

All arrays in our chic language are length 2.

Actually, arrays are too mainstream for our language. We will call them constructs, instead. How would one create a new 2-length construct? By the simplest way possible. Just write construct item-1 item-2.

Although—elegant keywords are short keywords. Let's do cons instead of construct.

Now that our fashionable constructs exist, we just need to figure out how to get items out of them, and then our language is done!

Let’s have two built-in functions. One that returns the first element and one that returns the second item. The harder question to answer now is: what would be the name of these functions?

Surely, it should be something like cons-first-element. We should elegantify futher, though. Maybe something like cons-a-element, as in a, the first letter of the alphabet. Now, element is too common. How about visualising our elements in a registry? Then, we have cons-a-registry. That looks nice.

Onto the second function, and due to the ingenious way we have named our first, we can only name our second: cons-b-registry.

Although—here's a thought—does anyone else think the latin alphabet is too mainstream?

I envisage our imaginary registry of construct elements in a mountain slope. Mountains are nature’s elegance. In that case, the construct would gaze the summit. The first element would ascend the mountain. At the same time, the second element would—inevitably—face the polar opposite and—again inevitably—have the notion of descension in their mind.

This conceptualisation necessitates the following changes: cons-ascension-registry and cons-descension-registry, owning to their alpine biography.

With this, the need for succinctness becomes more clear than ever. We can now finalise their name only as such:

car
cdr

Complex constructs

Now that we can create elegant two-item constructs, we can compose them with themselves and reach celestial abundance. What I mean is: what if inside the first element of a construct, there was another construct?

With this revolutionary idea we can create construct-chains of arbitrary (!) length—and of immense elegance at the same time.

Given that we cannot update a construct’s contents, we can only create complicated constructs from the end towards the beginning. In other words, we can create a construct with another construct as content if and only if the second construct already exists. This approach—in which we arrive deterministically—also implies that one element of our constructs will always be a non-construct item, while the second element will be either a construct or—and that’s how our complex constructs’ boundaries are marked—null.

Now that we have assembled this magnificent superstructure there is one practice—that cannot be law but must always be followed—that occurs. We must agree that in all chained constructs it will always be the same element position that will hold the item and the other one that holds the chained construct.

It would be utterly graceless to have the chained construct in the beginning.

Indexing

Our complex constructs are in essence an imitation of a classic array. It would be nice to have a way to refer to elements inside the imitated array through their index, as one would do in an ordinary language.

For example:

(define a (cons 21 (cons 42 (cons 69 null))))

This way we build this complex construct:

[ 21 [ 42 [ 69 null ] ] ]

This can also be visualised like so, with X being any element:


[  X          X         ]
[ 21 [ .............. ] ]
[ 21 [ 42 [ ....... ] ] ]
[ 21 [ 42 [ 69 null ] ] ]

This, in essence, imitates this array a = [21 42 69] but by using only 2-length arrays:

Practically: a[0] is 21, a[1] is 42, a[2] is 69. What we want is to have a similar indexing operation in which we can get an element given the index number. We don’t want special new syntax, such as the brackets (eg. a[x]), though. We should be able to achieve everything with the building blocks we already have. Given that we can get the first and the second element of our simple constructs, we can build a recursive function to get any single one of them!

Let’s name our function list-ref. Its arguments will be a complex construct, items, and an index number, the position, pos, of the element :

(define (list-ref items pos))

Since it will be recursive we need a base case; the simplest case, i.e. the element at index zero. We can get that element (i.e. the first element) with our elegant car function.

(define (list-ref items pos)
  (if (= pos 0)
      (car items))

In the rare case that we are not asked for the zero-th positioned element, we can call ourselves (i.e. our list-ref function) again with arguments the complex construct except for the first element (which, luckily for us, is just the second element—since it includes a cascade of all other elements) and the integer/position (which is the the initial position minus 1—since the updated complex construct is our initial array minus one (the first) element):

(define (list-ref items pos)
  (if (= pos 0)
      (car items)
      (list-ref (cdr items) (- pos 1))))

Eventually, list-ref will be called with pos = 0 and the first base case will resolve.

Examples of using our new function:

(= (list-ref a 0) 21)
(= (list-ref a 1) 42)
(= (list-ref a 2) 69)

Length

It would be great the have the length of our complex constructs as well!

We should definitely be able to build it with everything we have. We just need to count our nested constructs until the second element of one is null. This indicates the complex construct’s end.

Recursively, we can set a base case of a construct with null elements as length zero. Then, we just call the length function with the argument of our complex construct’s second element—which, again, luckily, is the cascade of all the other constructs—isn’t that magnificent?

(define (length items)
  (if (null? items)
      0
      (+ 1 (length (cdr items)))))

Example usage:

(length a)
(= (length a) 3)

Append

Wouldn’t it be nice to append one complex construct to another?

In order to do that—we can use one construct as is, as we can just inject it in the end of our first. But, to construct the first, we need to break it down first (in a recursive way), so that we can build it with injecting the second first.

What’s the base case? Probably one construct is empty/null. Then we can just consider the second construct as the merge of both:

(define (append a b)
  (if (null? a)
      b))

In the alternative case, surely we want to cons:

(define (append a b)
  (if (null? a)
      b
      (cons X Y)))

The first part of our new cons is probably the first element of the first a: (car a):

(define (append a b)
  (if (null? a)
      b
      (cons (car a) Y)))

The second part should be where the recursion happens—since we cannot just say add all the a elements and then the b elements. So, we need to call append again:

(define (append a b)
  (if (null? a)
      b
      (cons (car a) (append W Z))))

W should be the rest of the a complex construct (which will be broken down recursively when going down). Z can always be just b, since we can inject it as is in the end! Thus, we arrive at:

(define (append a b)
  (if (null? a)
      b
      (cons (car a) (append (cdr a) b))))

Example:

; a is [21 42 69]
; b is [1 2 3]
(define c (append a b))
; c is [21 42 69 1 2 3]

Last element

Since we have car to get the first element, we’d love a symmetrical last-elem function to get the last one as well.

Predictably, we would start like this:

(define (last-elem items)
  (if (null? (cdr items))

We don’t want to start like this:

(define (last-elem items)
  (if (null? items)

because that would mean that at that point items is already empty. We want to have the base case’s construct populated, so that we can do car items. So:

(define (last-elem items)
  (if (null? (cdr items))
      (car items)

For the other if branch we recursively call ourselves with just the remainder (aka the second element) of our construct:

(define (last-elem items)
  (if (null? (cdr items))
      (car items)
      (last-elem (cdr items))))

Reversal

This one is the hardest; it cannot work recursively as the others above. We—inevitably—start with something like this (which doesn’t work):

; wrong
(define (reverse items)
  (if (null? (cdr items))
      (car items)
      (cons (reverse (cdr items)) (car items))))

This makes sense when one first writes it. But then one realises that they do not follow the HEPOC (Holy Elegant Practice of Constructs): the cascade constructs in a complex construct should always be in the second position. The combination of the position reversal along with recursion makes everything blurry.

We bargain. What if we exchange the cons positions?

; wrong
(define (reverse items)
  (if (null? (cdr items))
      (car items)
      (cons (car items) (reverse (cdr items)))))

Right? Unfortunately, no. In the first iteration, the first cons will just place the already first element in the first position.

At this point we realise: this cannot work in the style of our previous functions—just because of how recursion works. But we can cheat. What if we use our beautiful, already built, append?

; yes!
(define (reverse s)
  (if (null? s)
      null
      (append (reverse (cdr s)) (cons (car s) null))))

This works equally beautifully! But who could be happy with reverse using append? Can it not be more self-contained? Can it not be more elegant?

(define (reverse items)
  (define (iter rest result)
    (if (null? rest)
        result
        (iter (cdr rest) (cons (car rest) result))))
  (iter items null))

That’s probably elegance ceiling for reversal. We have to use an iter function (which the trio Harold / Gerald / Julie calls iterative, although iter still calls itself).

The most interesting line is the else branch. We call iter with the rest/cdr of our items, while “aggregating” the result in the second argument.

So simple yet so complex. Nothing less than utmost elegance.

Epilogue

This text was written a few hours after finishing the magnificent Structure and Interpretation of Computer Programs.