C is everywhere and in everything. C powers the Mars Curiosity rover, every computer operating system, every mobile OS, the Java Virtual Machine, Google Chrome, ATMs, the computers in your car, the computers in your robot surgeon, the computers that designed the robot surgeon, the computers that designed those computers, and, eventually, C powers itself as its own implementation language.
When techno-human civilization has finally collapsed, perhaps as the result of a nuclear war programmed in C or of a bacterial superstrain isolated by software implemented in C, and we have been returned to our caves to gnaw bones and fight over rotten meat, there will still be a program written in C executing somewhere.

All of this isn't just because a lot of people really like coding in C, though it's been estimated that almost 20 percent of all coders use the language (see below). C is far deeper than what we normally think of when we think of "programming language." There are languages that we consider to be more or less foundational—Java, Python, Ruby, Lisp, etc.—the very general-purpose languages. C is also a general-purpose programming language, but the difference is that C has become the de facto language of machines themselves, whether the machine is a five-dollar microcontroller or a deep-space probe.

A single Know Your Language treatment of C would be dangerously incomplete. A 10-minute read on Processing should equate to about a six-hour read on C, at least. Even across two or maybe three installments, I still won't be providing quite that much, but I'll at least be getting closer to a proper primer on computing's unassuming alpha language.

So, we should start at the beginning. Where did C come from? How did it become the force of nature that it is and almost certainly will continue to be?
Dumb luck? Timing? Prescience?

Need, mostly.
Moving Unix
The result offered a simple command-line interface along with a compiler and interpreter. The coup of this Unix version was that it was self-supporting: software could be written and debugged on the target computer. Previously, the Unix-less PDP-7 required engineers to program on a different variety of computer, the GE-635, print it all out on paper tape, and haul it to the target computer for testing and verification.

Now, the PDP-7 could be used to develop its own programs and, what's more, to continue development on its own Unix operating system.

"Thompson's PDP-7 assembler outdid even DEC's in simplicity," Ritchie recalls. "It evaluated expressions and emitted the corresponding bits. There were no libraries, no loader or link editor: the entire source of a program was presented to the assembler, and the output file with a fixed name that emerged was directly executable."

Almost immediately, Malcolm Douglas McIlroy, a Bell Labs engineer who would go on to become one of the crucial figures in Unix development, had written a high-level language for Unix, called TMG. It was a short-lived tool intended as a "compiler-compiler," i.e., a tool meant to write new compilers, the crucial meta-programs that convert higher-level languages (like Fortran, at the time) into assembly language.

TMG inspired Thompson to create a system programming language (SPL) for the then still-unnamed Unix. An SPL is meant to program the software that interfaces with the machine itself: operating systems, utilities, drivers, compilers. It's an intermediary between any other programming language and the machine's actual guts.
A system programming language can be imagined as existing one level of abstraction above assembly and, below that, machine instructions. But, unlike an assembly language, it gets to be machine-agnostic. As an abstraction, it operates on and within an idealized computer system, a rough schematic or sketch of what a computer is and does. This isn't the same thing as a virtual machine, à la Java; it's an actual machine-language correspondence, but one that doesn't have to specify the details of the machine, which can be filled in as the system compiles the system language into assembly language.

You could say that C is a language of computation itself: just the right amount of abstraction to maintain universality, but with the ability to communicate with hardware.

It is the why of C—its elegance, persistence, and ubiquity—that looks a lot like the why of computers themselves, and we'll be looking at it in much more detail later on.
A, B, C
In C, however, a programmer can specify whether this or that unit of data needs just a single byte of storage or up to eight or more bytes. This was a new need, as newer computers were able to manipulate data at the level of individual bytes rather than the two-byte packages known as words. C was there at just the right time for the transition.

Thrilling, right? Data types.

It's a bit dry, but also completely crucial to what C has become. B viewed memory as a collection of uniformly sized "cells," each equating to a hardware machine "word," or two bytes of data. Newer and more capable machines emerging circa 1970, particularly the PDP-11, made this idea increasingly silly, requiring elaborate procedures for editing and manipulating strings of related data. The first step forward came in 1971 via NB, or "new B," a fleeting language crafted by Ritchie that added a small handful of data types.

These were "int" and "char," used to store whole numbers and whole numbers corresponding to characters (like letters), respectively. NB also allowed programmers to specify not just individual letters and numbers, but lists of them. So, if I wanted to store this blog post in memory, I could declare a single variable with a single name, but make that variable correspond to a list of many thousands of consecutive characters. These variable names corresponded to memory addresses within the machine itself, with the result allowing a flexible yet machine-literal layer of data organization.
A few other things happened in the transition from B to C—more generalizable data types, direct translatability to assembly language—and it became clear that Ritchie had gone beyond B and "NB" into something very different. C.

Once the language had been named, things happened very fast and C started to look much more like the C we use today, including the addition of boolean operators like && and ||. These are basically just tests, where I can specify two different expressions and join them with a boolean operator and, in the case of &&, ask the computer to return true if both expressions are true and false if not. With ||, I can ask the computer to return true if one or both are true. These are extremely crucial programming building blocks for not just C but most any language since.

Also around this time, the concept of a "preprocessor" arose. Now, a programmer could include entire other files, and thus libraries full of code, with a simple "#include" statement. They could also define macros, which in the beginning were just ways of writing short dictionaries of sorts, where one could #define some name to correspond to a small fragment of code. The preprocessor is conceptually just a way of telling the compiler to do some extra junk before actually compiling the program.

Again: a crucial and very general programming tool. It was a level of organization ready for the eye-crossing sprawls of code that software would become.
The White Book
In the coming decade, C compilers were written for mainframe computers, minicomputers, microcomputers, and, eventually, the IBM PC. In 1983, the American National Standards Institute (ANSI) started working on a standard specification for C, with the result finally ratified in 1989 as ANSI C. This basically enshrined a One True C, where programmers could be assured that the language would behave the same no matter how or where it was implemented. ANSI C was the first definitive C.

A definitive C looks a lot like the abstract definition of a computer. I could write a program right now that manipulates the individual bits within a memory location of an actual machine, whatever the machine, using standard C operators. I can say with some reasonable certainty how my C code will be translated into machine instructions, and I can use that knowledge to write faster and leaner code, or I can use it to write entirely new programming languages. An alphabet that writes new alphabets: A, B, C.

Read more Know Your Language.