Zero-based numbering
Zero-based numbering is a way of numbering in which the initial element of a sequence is assigned the index 0, rather than the index 1 as is typical in everyday non-mathematical or non-programming circumstances. Under zero-based numbering, the initial element is sometimes termed the zeroth element,[1] rather than the first element; zeroth is a coined ordinal number corresponding to the number zero. In some cases, an object or value that does not (originally) belong to a given sequence, but which could be naturally placed before its initial element, may be termed the zeroth element. There is not wide agreement regarding the correctness of using zero as an ordinal (nor regarding the use of the term zeroth) as it creates ambiguity for all subsequent elements of the sequence when lacking context.
Numbering sequences starting at 0 is quite common in mathematical notation, in particular in combinatorics, though programming languages for mathematics usually index from 1. In computer science, array indices usually start at 0 in modern programming languages, so computer programmers might use zeroth in situations where others might use first, and so forth. In some mathematical contexts, zero-based numbering can be used without confusion, when ordinal forms have a well-established meaning with an obvious candidate to come before first; for instance, a zeroth derivative of a function is the function itself, obtained by differentiating zero times. Such usage corresponds to naming an element not properly belonging to the sequence but preceding it: the zeroth derivative is not really a derivative at all. However, just as the first derivative precedes the second derivative, so also does the zeroth derivative (or the original function itself) precede the first derivative.
Computer programming
Origin
Martin Richards, creator of the BCPL language (a precursor of C), designed arrays to start at 0 as the natural position from which to access the array contents, since the value of a pointer p used as an address accesses the position p + 0 in memory.[2][3] Canadian systems analyst Mike Hoye asked Richards about the reasons for choosing that convention. BCPL was first compiled for the IBM 7094; the language introduced no indirection lookups at run time, so the indirection optimization provided by these arrays was done at compile time.[3] The optimization was nevertheless important.[3][4]
Edsger W. Dijkstra later wrote a pertinent note, Why numbering should start at zero,[5] in 1982. He analyzed the possible designs of array indices by enclosing them in a chained inequality, with strict and non-strict inequalities combining into four possibilities, and argued that zero-based arrays are best represented by non-overlapping index ranges that start at zero, alluding to open, half-open and closed intervals as with the real numbers. In detail, Dijkstra's criteria for preferring this convention are that it represents empty sequences in a more natural way (a ≤ i < a) than closed "intervals" do (a ≤ i ≤ (a−1)), and that with half-open "intervals" of naturals, the length of a sub-sequence equals the upper bound minus the lower bound (a ≤ i < b gives (b−a) possible values for i, with a, b, i all integers).
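The four ways of bounding an index range can be compared directly. The following is a minimal sketch rather than anything taken from Dijkstra's note: the function name indices and the convention strings are hypothetical, chosen only for illustration.

```python
def indices(a, b, convention):
    """Return the integers i selected by one of the four range conventions."""
    if convention == "a <= i < b":      # half-open, lower bound included
        return list(range(a, b))
    if convention == "a < i <= b":      # half-open, upper bound included
        return list(range(a + 1, b + 1))
    if convention == "a <= i <= b":     # closed
        return list(range(a, b + 1))
    if convention == "a < i < b":       # open
        return list(range(a + 1, b))
    raise ValueError(convention)

# Half-open with the lower bound included: the length is simply b - a.
half_open = indices(2, 6, "a <= i < b")

# The empty sequence is written a <= i < a ...
empty_half_open = indices(3, 3, "a <= i < b")
# ... whereas the closed form needs an upper bound below the lower bound.
empty_closed = indices(3, 2, "a <= i <= b")
```

With the half-open convention, the number of selected indices equals the difference of the bounds, and the empty range needs no bound smaller than a.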
Usage in programming languages
This usage follows from design choices embedded in many influential programming languages, including C, Java, and Lisp. In these three, sequence types (C arrays, Java arrays and lists, and Lisp lists and vectors) are indexed beginning with the zero subscript. Particularly in C, where arrays are closely tied to pointer arithmetic, this makes for a simpler implementation: the subscript refers to an offset from the starting position of an array, so the first element has an offset of zero.
Referencing memory by an address and an offset is represented directly in computer hardware on virtually all computer architectures, so this design detail in C makes compilation easier, at the cost of some human factors. In this context using "zeroth" as an ordinal is not strictly correct, but a widespread habit in this profession. Other programming languages, such as Fortran or COBOL, have array subscripts starting with one, because they were meant as high-level programming languages, and as such they had to have a correspondence to the usual ordinal numbers which predate the invention of the zero by a long time.
Pascal allows the range of an array to be of any ordinal type (including enumerated types). APL allows setting the index origin to 0 or 1 programmatically at run time.[6][7] Some recent languages, such as Lua and Visual Basic, have adopted one-based indexing for the same reason.
Zero is the lowest unsigned integer value, one of the most fundamental types in programming and hardware design. In computer science, zero is thus often used as the base case for many kinds of numerical recursion. Proofs and other sorts of mathematical reasoning in computer science often begin with zero. For these reasons, in computer science it is not unusual to number from zero rather than one.
Hackers and computer scientists often like to call the first chapter of a publication "Chapter 0", especially if it is of an introductory nature. One of the classic instances was in the First Edition of K&R. In recent years this trait has also been observed among many pure mathematicians, where many constructions are defined to be numbered from 0.
If an array is used to represent a cycle, it is convenient to obtain the index with a modulo function, which can result in zero.
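As an illustration of the cyclic case, the sketch below steps around a seven-element cycle. The names days, next_index and next_index_one_based are hypothetical; the one-based variant is included only to show the extra shift it requires.

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def next_index(i, n):
    """Zero-based successor on a cycle of n positions (indices 0..n-1)."""
    return (i + 1) % n

def next_index_one_based(i, n):
    """One-based successor (indices 1..n): the modulo result is shifted by one."""
    return i % n + 1

# Stepping past the last element wraps around to index 0.
after_sunday = days[next_index(6, len(days))]
```

In the zero-based form the result of the modulo operation is already a valid index, including the wrap-around value 0.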
Numerical properties
With zero-based numbering, a range of n elements can be expressed as the half-open interval [0, n), as opposed to the closed interval [1, n]. Empty ranges, which often occur in algorithms, are tricky to express with a closed interval without resorting to awkward conventions like [1, 0]. Because of this property, zero-based indexing potentially reduces off-by-one and fencepost errors.[5] On the other hand, the repeat count n is calculated in advance, making counting from 0 to n − 1 (inclusive) less intuitive. Some authors prefer one-based indexing, as it corresponds more closely to how entities are indexed in other contexts.[8]
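One place where the half-open form helps is when a range is split into consecutive pieces. The following sketch assumes a hypothetical helper, chunk_bounds, that cuts [0, n) into chunks of at most a given size; it is an illustration, not a standard library function.

```python
def chunk_bounds(n, size):
    """Split the half-open range [0, n) into consecutive half-open chunks."""
    return [(start, min(start + size, n)) for start in range(0, n, size)]

# Each chunk ends exactly where the next one starts, so no +1 or -1
# adjustment is needed at the boundaries.
bounds = chunk_bounds(10, 4)

# An empty range produces no chunks and needs no special case.
no_chunks = chunk_bounds(0, 4)
```

Because every end bound doubles as the next start bound, the chunk lengths (end minus start) add up to n without any fencepost correction.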
Another property of this convention is in the use of modular arithmetic as implemented in modern computers. Usually, the modulo function maps any integer modulo N to one of the numbers 0, 1, 2, ..., N − 1, where N ≥ 1. Because of this, many formulas in algorithms (such as that for calculating hash table indices) can be elegantly expressed in code using the modulo operation when array indices start at zero.
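A hash table bucket lookup shows the fit between the modulo operation and zero-based indices. In this sketch toy_hash is a deliberately simple, hypothetical stand-in for a real hash function, and the bucket helpers are likewise illustrative.

```python
def toy_hash(key):
    """Deterministic stand-in for a real hash function (illustrative only)."""
    return sum(ord(ch) for ch in key)

def bucket_zero_based(key, n_buckets):
    """The modulo result 0..n_buckets-1 is used directly as the index."""
    return toy_hash(key) % n_buckets

def bucket_one_based(key, n_buckets):
    """With indices 1..n_buckets the result has to be shifted by one."""
    return toy_hash(key) % n_buckets + 1

buckets = [[] for _ in range(8)]
for word in ["zero", "one", "two"]:
    buckets[bucket_zero_based(word, len(buckets))].append(word)
```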
Pointer operations can also be expressed more elegantly on a zero-based index due to the underlying address/offset logic mentioned above. To illustrate, suppose a is the memory address of the first element of an array, and i is the index of the desired element. To compute the address of the desired element, if the index numbers count from 1, the desired address is computed by this expression:
- a + s × (i − 1)
where s is the size of each element. In contrast, if the index numbers count from 0, the expression becomes:
- a + s × i
This simpler expression is more efficient to compute at run time.
However, a language wishing to index arrays from 1 could adopt the convention that every array address is represented by a′ = a – s; that is, rather than using the address of the first array element, such a language would use the address of a fictitious element located immediately before the first actual element. The indexing expression for a 1-based index would then be:
- a′ + s × i
Hence, the efficiency benefit at run time of zero-based indexing is not inherent, but is an artifact of the decision to represent an array with the address of its first element rather than the address of the fictitious zeroth element. However, the address of that fictitious element could very well be the address of some other item in memory not related to the array.
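The three address expressions above can be checked against each other with plain integer arithmetic. This is a sketch in which addresses are modelled as ordinary integers; the function names and the sample values of a and s are assumptions made for the example.

```python
def address_zero_based(a, s, i):
    """a + s * i, with i counting from 0."""
    return a + s * i

def address_one_based(a, s, i):
    """a + s * (i - 1), with i counting from 1."""
    return a + s * (i - 1)

def address_one_based_shifted(a_prime, s, i):
    """a' + s * i, where a' = a - s is the address of the fictitious element."""
    return a_prime + s * i

a, s = 1000, 4          # hypothetical base address and element size
a_prime = a - s         # address of the fictitious zeroth element

# The first element lives at address a under every scheme.
first_addresses = (
    address_zero_based(a, s, 0),
    address_one_based(a, s, 1),
    address_one_based_shifted(a_prime, s, 1),
)
```

The shifted one-based form performs the same single multiply-and-add as the zero-based form; the subtraction is paid once, when a′ is computed.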
Superficially, the fictitious element doesn't scale well to multidimensional arrays. Indexing multidimensional arrays from zero makes a naive (contiguous) conversion to a linear address space (systematically varying one index after the other) look simpler than when indexing from one. For instance, when mapping a three-dimensional array A to a linear array L[M⋅N⋅P], both with M⋅N⋅P elements, the index r in the linear array that accesses a specific element with L[r] = A[z][y][x] in zero-based indexing, i.e. [0 ≤ x < M], [0 ≤ y < N], [0 ≤ z < P], and [0 ≤ r < M⋅N⋅P], is calculated by r = z⋅M⋅N + y⋅M + x. Organizing all arrays with 1-based indices ([1 ≤ x′ ≤ M], [1 ≤ y′ ≤ N], [1 ≤ z′ ≤ P], [1 ≤ r′ ≤ M⋅N⋅P]), and assuming an analogous arrangement of the elements, gives r′ = (z′ − 1)⋅M⋅N + (y′ − 1)⋅M + (x′ − 0) to access the same element, which arguably looks more complicated. Of course, r′ = r + 1, since [z = z′ − 1], [y = y′ − 1], and [x = x′ − 1]. A simple everyday example is positional notation, which the invention of the zero made possible. In positional notation, tens, hundreds, thousands and all other digits start with zero; only units start at one.[9]
Zero-based indices (the table content represents the index r):
y \ x    0    1    2   ..    8    9
0       00   01   02   ..   08   09
1       10   11   12   ..   18   19
2       20   21   22   ..   28   29
..
8       80   81   82   ..   88   89
9       90   91   92   ..   98   99

One-based indices (the table content represents the index r′):
y′ \ x′  1    2    3   ..    9   10
1       01   02   03   ..   09   10
2       11   12   13   ..   19   20
3       21   22   23   ..   29   30
..
9       81   82   83   ..   89   90
10      91   92   93   ..   99  100
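The two linearization formulas can be compared in a short sketch. It assumes that x ranges over M values, y over N and z over P, which is the arrangement under which r = z⋅M⋅N + y⋅M + x covers every linear index exactly once; the function names are hypothetical.

```python
def linear_index_zero_based(x, y, z, M, N):
    """r = z*M*N + y*M + x, with x, y, z counted from 0."""
    return z * M * N + y * M + x

def linear_index_one_based(x1, y1, z1, M, N):
    """r' = (z' - 1)*M*N + (y' - 1)*M + x', with x', y', z' counted from 1."""
    return (z1 - 1) * M * N + (y1 - 1) * M + x1

M, N, P = 4, 3, 2   # assumed extents: x < M, y < N, z < P

# Visiting the elements with x varying fastest enumerates r in order.
zero_based = [linear_index_zero_based(x, y, z, M, N)
              for z in range(P) for y in range(N) for x in range(M)]
one_based = [linear_index_one_based(x, y, z, M, N)
             for z in range(1, P + 1) for y in range(1, N + 1) for x in range(1, M + 1)]
```

With M = N = 10 and a single layer, the same functions reproduce the entries of the two number grids, for example 89 for zero-based x = 9, y = 8 and 90 for one-based x′ = 10, y′ = 9.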
This situation can lead to some confusion in terminology. In a zero-based indexing scheme, the first element is "element number zero"; likewise, the twelfth element is "element number eleven". The index therefore no longer matches the ordinal position or the count of the objects numbered: the highest index among n objects is n − 1, and it refers to the nth element. For this reason, the first element is sometimes referred to as the zeroth element, in an attempt to avoid confusion.
Science
In mathematics, many sequences of numbers or of polynomials are indexed by nonnegative integers, for example the Bernoulli numbers and the Bell numbers.
In both mechanics and statistics, the zeroth moment is defined, representing total mass in the case of physical density, or total probability, i.e. one, for a probability distribution.
The zeroth law of thermodynamics was formulated after the first, second, and third laws, but considered more fundamental, thus its name.
In biology, an organism is said to have zero order intentionality if it shows "no intention of anything at all". This would include a situation where the organism's genetically predetermined phenotype results in a fitness benefit to itself, because it did not "intend" to express its genes.[10] In the similar sense, a computer may be considered from this perspective a zero order intentional entity as it does not "intend" to express the code of the programs it runs.[11]
In biological or medical experiments, initial measurements made before any experimental time has passed are said to be taken on day 0 of the experiment.
In genomics, both 0-based and 1-based systems are used for genome coordinates.
Patient zero (or index case) is the initial patient in the population sample of an epidemiological investigation.
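For the genome coordinate systems mentioned above, a common pairing is a 0-based half-open interval (as used by formats such as BED) and a 1-based closed interval (as used by formats such as GFF). The sketch below converts between the two; the function names are hypothetical and no particular library is assumed.

```python
def half_open_to_closed(start0, end0):
    """Convert a 0-based half-open interval [start0, end0) to 1-based closed."""
    return start0 + 1, end0

def closed_to_half_open(start1, end1):
    """Convert a 1-based closed interval [start1, end1] to 0-based half-open."""
    return start1 - 1, end1

# The first ten bases of a sequence in both systems.
first_ten_closed = half_open_to_closed(0, 10)
```

Only the start coordinate changes; the feature length is end − start in the 0-based half-open system and end − start + 1 in the 1-based closed one.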
Other fields
The year zero does not exist in the widely used Gregorian calendar or in its predecessor, the Julian calendar. Under those systems, the year 1 BC is followed by AD 1. However, there is a year zero in astronomical year numbering (where it coincides with the Julian year 1 BC) and in ISO 8601:2004 (where it coincides with the Gregorian year 1 BC) as well as in all Buddhist and Hindu calendars.
In many countries, the ground floor in buildings is considered as floor number 0 rather than as the "1st Floor", the naming convention usually found in the United States of America. This makes a consistent set with underground floors marked with negative numbers.
While the ordinal of 0 mostly finds use in communities directly connected to mathematics, physics, and computer science, there are also instances in classical music. The composer Anton Bruckner regarded his early Symphony in D minor to be unworthy of including in the canon of his works, and he wrote 'gilt nicht' on the score and a circle with a crossbar, intending it to mean "invalid". But posthumously, this work came to be known as Symphony No. 0 in D minor, even though it was actually written after Symphony No. 1 in C minor. There is an even earlier Symphony in F minor of Bruckner's that is sometimes called No. 00. The Russian composer Alfred Schnittke also wrote a Symphony No. 0.
In some universities, including Oxford and Cambridge, "week 0" or occasionally "noughth week" refers to the week before the first week of lectures in a term. In Australia, some universities refer to this as "O Week", which serves as a pun on "orientation week". As a parallel, the introductory weeks at university educations in Sweden are generally called "nollning" (zeroing).
The United States Air Force starts basic training each Wednesday, and the first week (of eight) is considered to begin with the following Sunday. The four days before that Sunday are often referred to as "Zero Week."
24-hour clocks and the international standard ISO 8601 use 0 to denote the first (zeroth) hour of the day.
King's Cross station in London, Edinburgh Haymarket, and stations in Uppsala, Yonago, Stockport and Cardiff have a Platform 0.
Robert Crumb's drawings for the first issue of Zap Comix were stolen, so he drew a whole new issue which was published as issue 1. Later he re-inked his photocopies of the stolen artwork and published it as issue 0.
The Brussels ring road in Belgium is numbered R0. It was built after the ring road around Antwerp, but Brussels (being the capital city) was deemed deserving of a more basic number. Similarly the (unfinished) orbital motorway around Budapest in Hungary is called M0.
Zero is sometimes used in street addresses, especially in schemes where even numbers are one side of the street and odd numbers on the other. A case in point is the landmark Christ Church Cambridge on Cambridge, Massachusetts's Harvard Square, whose address is 0 Garden Street.
In Formula One, when a defending world champion does not compete in the following season, the number 1 is not assigned to any driver; instead, one driver of the world champion team carries the number 0, and the other the number 2. This happened in both 1993 and 1994, with Damon Hill carrying the number 0 in both seasons, as defending champion Nigel Mansell quit after 1992 and defending champion Alain Prost quit after 1993.
A chronological prequel of a series may be numbered as 0, such as Ring 0: Birthday or Zork Zero.
The Swiss Federal Railways number certain classes of rolling stock from zero, for example, Re 460 000 to 118.
In the realm of fiction, Isaac Asimov eventually added a Zeroth Law to his Three Laws of Robotics, essentially making them four laws.
References
- Seed, Graham M. (2001). An Introduction to Object-Oriented Programming in C++ with Applications in Computer Graphics (2nd ed.). Springer. p. 391. ISBN 1852334509. Retrieved 11 February 2020.
- Martin Richards (1967). The BCPL Reference Manual (PDF). Massachusetts Institute of Technology. p. 11.
- Mike Hoye. "Citation Needed". Retrieved 28 January 2014.
- Tom Van Vleck (1995). "The IBM 7094 and CTSS". Retrieved 28 January 2014.
- Dijkstra, Edsger Wybe (May 2, 2008). "Why numbering should start at zero (EWD 831)". E. W. Dijkstra Archive. University of Texas at Austin. Retrieved 2011-03-16.
- Brown, Jim (December 1978). "In Defense of Index Origin 0". ACM SIGAPL APL Quote Quad. 9 (2): 7. doi:10.1145/586050.586053. S2CID 40187000.
- Hui, Roger. "Is Index Origin 0 a Hindrance?". jsoftware.com. JSoftware. Retrieved 19 January 2015.
- Marshall, Donis. Programming Microsoft® Visual C#® 2005.
- Sal Khan. Math 1st Grade / Place Value / Number grid. Khan Academy. Retrieved July 28, 2018. YouTube title: Number grid / Counting / Early Math / Khan Academy.
- Byrne, Richard W. "The Thinking Ape: Evolutionary Origins of Intelligence". Retrieved 2010-05-18.
- Dunbar, Robin. "The Human Story - A new history of mankind's Evolution". Retrieved 2010-05-18.
- This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.