
Binary lambda calculus




Binary lambda calculus (BLC) is a minimal, purely functional programming language invented by John Tromp in 2004,[1] based on a binary encoding of the untyped lambda calculus in De Bruijn index notation.

Background

BLC is designed to provide a very simple and elegant concrete definition of descriptional complexity (Kolmogorov complexity), where the complexity of an object is the length of its shortest description.

This is made precise by identifying a description method with a computable function that transforms bitstrings (descriptions) into objects. Objects are usually also just bitstrings, but can have additional structure as well, e.g., pairs of strings.

Originally, Turing machines, the best-known formalism for computation, were used for this purpose, but they are somewhat lacking in ease of construction and composability. Another classical computational formalism, the lambda calculus, offers distinct advantages in ease of use. BLC is the result of incorporating a notion of binary I/O into lambda calculus, so as to turn it into an effective description method.

Binary strings in BLC

BLC represents bits 0 and 1 as the standard lambda booleans B0 = True and B1 = False:

True = λxλy.x
False = λxλy.y

which can be seen to directly implement the if-then-else operator.
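
As a minimal sketch of this if-then-else behaviour, the two booleans can be modelled as curried Python functions. The names TRUE, FALSE and if_then_else are illustrative choices for this sketch, not part of BLC itself.

```python
# Church booleans as curried Python functions (illustrative model only).
TRUE = lambda x: lambda y: x    # B0: selects its first argument
FALSE = lambda x: lambda y: y   # B1: selects its second argument


def if_then_else(cond, then_value, else_value):
    """Applying a boolean to two values picks one of them."""
    return cond(then_value)(else_value)


if __name__ == "__main__":
    print(if_then_else(TRUE, "then", "else"))
    print(if_then_else(FALSE, "then", "else"))
```

Because Python evaluates arguments eagerly, both branches are computed before one is selected; in the lambda calculus only the chosen branch needs to be reduced.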

The standard pairing function

$\langle\,,\rangle = \lambda x \lambda y \lambda z.\, z\,x\,y$

applied to two terms M and N

$\langle M,N\rangle = \lambda z.\, z\,M\,N$

can be applied to a boolean to yield the desired component of choice.

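BLC represents a string $s = b_0 b_1 \ldots b_{n-1}$ by repeated pairing as

$\langle B_{b_0}, \langle B_{b_1}, \ldots \langle B_{b_{n-1}}, z\rangle \ldots \rangle\rangle$, which is denoted as $s{:}z$.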

The term z acts as a list continuation: it can be a nil list (ending the string) or another string (which is thereby appended to the original string).
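
The repeated pairing can be sketched in Python as follows. This is a toy model under stated assumptions: the helper names encode and decode are hypothetical, NIL is taken to be False (the delimiter BLC uses for delimited input), and decode detects the end of the list with a Python identity test, which is a shortcut outside the lambda calculus.

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
NIL = FALSE  # assumption: the list continuation is the Nil term


def pair(m, n):
    """The pairing <M,N> = lambda z. z M N."""
    return lambda z: z(m)(n)


def encode(bits, tail=NIL):
    """Build <B_b0, <B_b1, ... <B_b(n-1), tail>...>> from a string of '0'/'1'."""
    result = tail
    for b in reversed(bits):
        result = pair(TRUE if b == "0" else FALSE, result)
    return result


def decode(term):
    """Read back a NIL-terminated list built by encode (Python-level shortcut)."""
    bits = []
    while term is not NIL:
        head = term(TRUE)    # applying a pair to True yields its first component
        term = term(FALSE)   # applying a pair to False yields its second component
        bits.append(head("0")("1"))
    return "".join(bits)


if __name__ == "__main__":
    print(decode(encode("0110")))
    # Using another string as the continuation appends it.
    print(decode(encode("01", encode("1"))))
```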

Delimited versus undelimited

Descriptional complexity comes in two distinct flavors, depending on whether the input is considered to be delimited.

Knowing the end of your input makes it easier to describe objects. For instance, you can just copy the whole input to output. This flavor is called plain or simple complexity.

But in a sense it is additional information. A file system for instance needs to separately store the length of files. The C language uses the null character to denote the end of a string, but this comes at the cost of not having that character available within strings.

The other flavor is called prefix complexity, named after prefix codes, where the machine needs to figure out, from the input read so far, whether it needs to read more bits. We say that the input is self-delimiting. This works better for communication channels, since one can send multiple descriptions, one after the other, and still tell them apart.

In the I/O model of BLC, the flavor is dictated by the choice of z. If z is kept as a free variable and required to appear as part of the output, the machine must work in a self-delimiting manner. If, on the other hand, z is a lambda term specifically designed to be easy to distinguish from any pairing, the input becomes delimited. BLC chooses False for this purpose but gives it the more descriptive alternative name Nil. Dealing with lists that may be Nil is straightforward: since

$\langle x,y\rangle\, M\, N = M\, x\, y\, N$, and
$\mathrm{Nil}\, M\, N = N$

one can write functions M and N to deal with the two cases, the only caveat being that N will be passed to M as its third argument.
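
A small sketch of this case analysis, again modelling terms as curried Python functions: the hypothetical helper length applies a list to a handler for the non-empty case (playing the role of M) and a value for the empty case (playing the role of N). Note how the handler accepts, and here ignores, N as its third argument.

```python
TRUE = lambda x: lambda y: x
FALSE = lambda x: lambda y: y
NIL = FALSE


def pair(m, n):
    return lambda z: z(m)(n)


def length(lst):
    """Count list elements using only application: lst M N."""
    # M: receives head, tail, and then N as a third (unused) argument.
    on_pair = lambda head: lambda tail: lambda _n: 1 + length(tail)
    # N: the result for the empty list.
    on_nil = 0
    return lst(on_pair)(on_nil)


if __name__ == "__main__":
    print(length(NIL))
    print(length(pair(TRUE, pair(FALSE, NIL))))
```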

Universality

One can find a description method U such that for any other description method D, there is a constant c (depending only on D) such that no object takes more than c extra bits to describe with method U than with method D. BLC is designed to make these constants relatively small. In fact the constant will be the length of a binary encoding of a D-interpreter written in BLC, and U will be a lambda term that parses this encoding and runs this decoded interpreter on the rest of the input. U won't even have to know whether the description is delimited or not; it works the same either way.

BLC not only represents bitstrings as lambda calculus terms, but the other way around as well.

Lambda encoding

First, lambda terms are written in a particular notation using what is known as De Bruijn indices. The encoding is then defined recursively as follows:

$\widehat{\lambda M} = 00\,\widehat{M}$
$\widehat{M\,N} = 01\,\widehat{M}\,\widehat{N}$
$\widehat{i} = 1^{i}0$

For instance, the pairing function λxλyλz.zxy is written λλλ.132 in De Bruijn format, which has encoding 00 00 00 01 01 10 1110 110.
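
The recursive definition translates almost directly into code. The sketch below assumes a hypothetical term representation (a Python int for a De Bruijn index, and tuples ("lam", body) and ("app", function, argument) for the other two cases) and assumes well-formed input when decoding.

```python
def encode(term):
    """Encode a De Bruijn term as a string of '0' and '1' characters."""
    if isinstance(term, int):            # variable i  ->  1^i 0
        return "1" * term + "0"
    if term[0] == "lam":                 # abstraction ->  00 body
        return "00" + encode(term[1])
    if term[0] == "app":                 # application ->  01 function argument
        return "01" + encode(term[1]) + encode(term[2])
    raise ValueError("unknown term: %r" % (term,))


def decode(bits, pos=0):
    """Parse one term starting at pos; return (term, position after it)."""
    if bits[pos] == "1":                 # variable: count the leading ones
        index = 0
        while bits[pos] == "1":
            index += 1
            pos += 1
        return index, pos + 1            # skip the terminating zero
    tag = bits[pos:pos + 2]
    if tag == "00":
        body, pos = decode(bits, pos + 2)
        return ("lam", body), pos
    function, pos = decode(bits, pos + 2)
    argument, pos = decode(bits, pos)
    return ("app", function, argument), pos


# The pairing function in De Bruijn form: three lambdas around (1 3) 2.
PAIR = ("lam", ("lam", ("lam", ("app", ("app", 1, 3), 2))))

if __name__ == "__main__":
    print(encode(PAIR))
    print(decode(encode(PAIR)))
```

Since decode returns the position where parsing stopped, a program can be read from the front of a longer bit string and the remaining bits treated as its input.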

A closed lambda term is one in which all variables are bound, i.e. without any free variables. In De Bruijn format, this means that an index i can only appear within at least i nested lambdas. The number of closed terms of size n bits is given by sequence A114852 of the On-Line Encyclopedia of Integer Sequences (OEIS).
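
The count of closed terms can be sketched with a short recurrence that follows the three encoding cases. The function name count and its parameter k (the number of enclosing lambdas available to bind indices) are assumptions of this sketch; it is meant to follow the size definition given here, and the small values it produces (for example one term of 4 bits, the identity, and two terms of 8 bits) can be checked by hand.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count(n, k=0):
    """Number of terms of exactly n bits whose free indices are all <= k.

    With k = 0 this counts closed terms.
    """
    if n < 2:
        return 0
    # Variable: index n-1 is encoded in n bits and is allowed only if n-1 <= k.
    total = 1 if n - 1 <= k else 0
    # Abstraction: 00 followed by a body that may use one more index.
    total += count(n - 2, k + 1)
    # Application: 01 followed by two terms of sizes i and n-2-i (each >= 2).
    total += sum(count(i, k) * count(n - 2 - i, k) for i in range(2, n - 3))
    return total


if __name__ == "__main__":
    print([count(n) for n in range(9)])
```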

The shortest possible closed term is the identity function, $\widehat{\lambda 1} = 0010$. In delimited mode, this machine just copies its input to its output.

The universal machine U in BLC is then, in De Bruijn format (all indices are single digit):

(λ11)(λλλ1(λλλλ3(λ5(3(λ2(3(λλ3(λ123)))(4(λ4(λ31(21))))))
(1(2(λ12))(λ4(λ4(λ2(14)))5))))(33)2)(λ1((λ11)(λ11)))

This is in binary:

0101000110100000000101011000000000011110000101111110011110
000101110011110000001111000010110110111001111100001111100
0011100110111101111100111101110110000110010001101000011010
This encoding is only 232 bits (29 bytes) long.

A detailed analysis of machine U may be found in Tromp's paper.[1]

BLC Complexity

In general, the complexity of an object can be conditional on several other objects that are provided as additional arguments to the universal machine. BLC defines plain (or simple) complexity KS and prefix complexity KP by

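$KS(x \mid y_1,\ldots,y_k) = \min\{\ell(p) \mid U\,(p{:}\mathrm{Nil})\,y_1 \ldots y_k = x\}$
$KP(x \mid y_1,\ldots,y_k) = \min\{\ell(p) \mid U\,(p{:}z)\,y_1 \ldots y_k = \langle x,z\rangle\}$

Here $\ell(p)$ denotes the length of $p$ in bits.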

Basic Theorems

The identity program λ1 proves that

$KS(x) \le \ell(x) + 4$

The program

λλ1((λ11)(λλλλ2(44)(λλ32(32(2(51(21)))))))(λλλ1(3((λ11) (λλλλ1(λ55(λλ356(λ1(λλ612)3))(λλ5(λ143)))(31))(λλ1(λλ2)2)(λ1))(λλ1))2)

proves that

$KP(x \mid \ell(x)) \le \ell(x) + 188$

The program

(λ11)(λλλ1(λ1(3(λλ1))(44(λ1(λλλ1(λ4(λλ52(52(31(21))))))4(λ1)))))(λλλ1(3((λ11) (λλλλ1(λ55(λλ356(λ1(λλ612)3))(λλ5(λ143)))(31))(λλ1(λλ2)2)(λ1))(λλ1))2)

proves that

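$KP(x) \le \ell(\overline{x}) + 338$

where $\overline{x}$ is the Levenshtein code for x, defined by

$\overline{0} = 0, \qquad \overline{n+1} = 1\,\overline{\ell(n)}\,n$

in which we identify numbers and bitstrings according to lexicographic order. This code has the nice property that for all k,

$\ell(\overline{n}) \le \ell(n) + \ell(\ell(n)) + \cdots + \ell^{k-1}(n) + O(\ell^{k}(n))$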

Furthermore, it makes lexicographic order of delimited numbers coincide with numeric order.

Number  String   Delimited
0       (empty)  0
1       0        10
2       1        110 0
3       00       110 1
4       01       1110 0 00
5       10       1110 0 01
6       11       1110 0 10
7       000      1110 0 11
8       001      1110 1 000
9       010      1110 1 001
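
The table can be reproduced with a few lines of Python. In this sketch, to_string and delimit are hypothetical helper names: to_string maps a number to its bitstring in lexicographic order (the binary form of n+1 with the leading 1 removed), and delimit follows the recursive definition of the code, without the spaces the table uses for readability.

```python
def to_string(n):
    """Bitstring identified with n in lexicographic order: '', '0', '1', '00', ..."""
    return bin(n + 1)[3:]     # bin() yields '0b1...'; drop '0b' and the leading 1


def delimit(n):
    """Self-delimiting code: code(0) = 0, code(n+1) = 1 code(len(n)) n."""
    if n == 0:
        return "0"
    s = to_string(n - 1)      # the string identified with the predecessor
    return "1" + delimit(len(s)) + s


if __name__ == "__main__":
    for n in range(10):
        print(n, repr(to_string(n)), delimit(n))
```

Sorting the resulting code words as strings gives the same order as sorting the numbers, which illustrates the order-preserving property for this range.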

Halting probability

The halting probability of the prefix universal machine is defined as the probability it will output any term that has a closed normal form (this includes all translated strings):

$\Omega_\lambda = \sum_{U\,(p:z) = \langle x,z\rangle,\ x \in \mathrm{NF}} 2^{-\ell(p)}$

With some effort, we can determine the first 4 bits of this particular number of wisdom:

$\Omega_\lambda = 0.0001\ldots_2$

where probability $0.0001_2 = 2^{-4}$ is already contributed by programs 00100 and 00101 for terms True and False.

BLC8: byte sized I/O

While bit streams are nice in theory, they fare poorly in interfacing with the real world. The language BLC8 is a more practical variation on BLC in which programs operate on a stream of bytes, where each byte is represented as a delimited list of 8 bits in big-endian order.
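
As a rough illustration of the byte representation, the sketch below expands each input byte into its 8 bits, most significant bit first. The function name bytes_to_bit_lists is hypothetical, and the output uses plain Python lists of integers; in BLC8 itself each such list would be a Nil-terminated list of booleans, with the byte stream being a list of those lists.

```python
def bytes_to_bit_lists(data):
    """Expand each byte into a list of 8 bits, most significant bit first."""
    return [[int(bit) for bit in format(byte, "08b")] for byte in data]


if __name__ == "__main__":
    print(bytes_to_bit_lists(b"A"))
```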

BLC in the IOCCC 2012

An implementation of both BLC and BLC8 in the C programming language won the "Most Functional" award in the 2012 edition of the International Obfuscated C Code Contest.

References

  1. John Tromp, Binary Lambda Calculus and Combinatory Logic, in Randomness and Complexity, from Leibniz to Chaitin, ed. Cristian S. Calude, World Scientific Publishing Company, October 2008. (The last reference, to an initial Haskell implementation, is dated 2004.) PDF version archived March 4, 2016, at the Wayback Machine.


