What is Unix?

What you think is going on, is going on.

The course, CSC 322, is about C and Unix. Unix is the most influential operating system ever. At over 30 years of age, it is still the tip of the avant garde, the one to beat. It has absorbed all developments with scant change to its basic structure and architecture. It blazed the way into the Internet, produces movies like Toy Story, and defends our national cyber borders from attack.

Remember, whatever your first impression, the truth is that Unix is a simple operating system. Whatever you think is going on, is going on. By extrapolation, whatever you think should work, will work. Whereas other operating systems are products with owner's manuals to be read and warranties to be voided under any number of conditions, Unix is a natural phenomenon, a wave to be ridden.

When Joe Condon, the owner of the PDP-7 that Thompson first used for UNIX, started using UNIX himself, he asked a co-worker how to do a certain function. "What do you think is the reasonable thing to do, Joe?" he was asked in return.

"That was a very interesting clue to the philosophy of how UNIX worked," Condon later said. "The system worked in a way which is easy to understand. It wasn't a complex function, hidden in a bunch of rules and verbiage and what not."

"Cognitive engineering" is what Condon called it, "...that the black box should be simple enough such that when you form the model of what's going on in the black box, that's in fact what is going on in the black box."

See http://www.bell-labs.com/history/unix/goingon.html

The Unix Kernel

An operating system is a manager of hardware resources, and provides them to a user as convenient abstractions. It makes sure the resources are well used, fairly shared, and safely operated. The first layer of software around the hardware resources is called the kernel. Among other things, Unix is a kernel built according to a certain architecture, providing a certain set of hardware abstractions.

The unix kernel provides for, among other things, processes, memory, a hierarchical file system, devices, interprocess communication such as pipes, and networking.

The kernel is a piece of software, the first bit of the operating system loaded into memory when the operating system boots. It wraps around the hardware and becomes the center of the remaining activities of the operating system. For this reason it is called the kernel.

For all its centrality, the kernel has no life of its own. It has no agenda, no action that it feels compelled to undertake. Left to itself, the kernel would not consume a single CPU cycle. It provides for linkage between programs and hardware, and between programs; it allows them to get on with their work. It is more like the meeting of roads than anything more substantial. It is crucial, and it is nothing at all.

Unix has a monolithic kernel: all the functionality is compiled into one big chunk of code, including substantial implementations for networking, file systems, and devices. Microkernels, by contrast, attempt to eliminate networking and file systems from the kernel, loading them in as auxiliary software elements after boot. Ironically, microkernels have turned out to be much larger and more complicated than the Unix monolithic kernel, and hence have not yet displaced it.

The unix kernel is a program stored, as is any program, in the file system. In FreeBSD, it is easily located at the root of the file system, in a file aptly named kernel. On my machine it is about 2.5 megabytes in size. Other variants of unix keep the kernel in a slightly different location, and it might be more or less dynamic in terms of run-time construction from modules (despite its monolithic heritage!).

The kernel provides services to running programs through a set of service entries called syscalls, short for system calls. The core of these is common to all unices, but each unix differs in the details. In FreeBSD there are 212 different syscalls; in Linux there are 323.
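From C, each syscall appears as an ordinary function supplied by the C library, a thin wrapper that traps into the kernel. As a small illustrative sketch (the function emit below is a made-up helper, not a standard routine; write and getpid are the POSIX names, though the syscall numbers behind them differ between FreeBSD and Linux):

```c
#include <unistd.h>   /* write(), getpid(): thin C wrappers over kernel syscalls */
#include <string.h>

/* Ask the kernel, via the write(2) syscall, to send msg out on file
 * descriptor fd. Returns how many bytes the kernel accepted. */
ssize_t emit(int fd, const char *msg)
{
    return write(fd, msg, strlen(msg));
}
```

Calling emit(1, "hello\n") hands six bytes to the kernel for file descriptor 1, the process's standard output; the kernel, not the program, does the actual I/O.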

You will learn more about kernels in your operating systems course. For the purposes here, kernels provide a certain set of abstractions by wrapping around hardware. A unix operating system includes a unix kernel providing the unix defined abstractions.

The Unix distribution

Typically we interact with the computer at a command line interface, typing commands which have various effects. In unix, each command is a program. Even the software we are interacting with in order to run commands is itself a command, the shell. What we perceive as the operating system is in fact the shell program. The shell takes each line and finds a program whose name matches the first word of the command. It then runs that program, passing as parameters the rest of the command line.
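The shell's dispatch step can be sketched in C, assuming the command line has already been split into words. The shell forks, the child execs the program named by the first word, and the parent waits. The name run_command is hypothetical, invented for this sketch:

```c
#include <sys/wait.h>  /* waitpid(), WIFEXITED, WEXITSTATUS */
#include <unistd.h>    /* fork(), execvp(), _exit() */
#include <stdio.h>     /* perror() */

/* The heart of a shell, reduced to one already-split command:
 * run the program named by argv[0], passing argv as its parameters,
 * wait for it, and return its exit status (-1 on failure). */
int run_command(char *argv[])
{
    pid_t pid = fork();               /* duplicate this process          */
    if (pid == 0) {                   /* child: become the command       */
        execvp(argv[0], argv);        /* search PATH for the program     */
        perror(argv[0]);              /* only reached if exec failed     */
        _exit(127);
    }
    int status;
    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return -1;                    /* parent: wait for the child      */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A real shell adds prompting, word-splitting, and redirection around this core, but the fork/exec/wait cycle is the whole of the dispatch.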

The second answer to "What is unix?" is therefore: unix is the collection of programs, and particularly a shell program, which make up the personality of unix as seen by the command-line user.

In the first version of unix there were about 60 commands, that is, 60 programs. Most had two-letter names. All the originals are still with us today. Also, any program that you write becomes a command: after all, the shell simply looks for a file with a name matching the first word of the command and runs that file. However, there is a customary core of commands which are considered crucial for the system to "be" unix. They are found in the file system in the directories /bin and /usr/bin. These include sh, ls, cp, rm, mv, ln, cat, man, chdir, mkdir, rmdir and chmod. One of your first tasks is to become familiar with all these commands, and many more.

Besides simply being capable, Unix utilities do things in a consistent way and embody a particular philosophy. Joining commands together into value-added chains by pipes is a Unix philosophy. This is a natural result of deeper kernel choices to treat files and input/output in a consistent, stream-of-characters model. All files in unix are considered streams of bytes, often with each byte a single character of text. Unix prefers all its files to be human-readable text. Devices such as the keyboard and the console are also files. The keyboard is a file which delivers a stream of characters, one character each time the user presses a key. The console accepts characters and draws them on the screen.

The abstraction of all input and output as character streams, and devices as acceptors or generators of such streams, was an important innovation of unix.
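A sketch of this abstraction in C: one copy loop serves as a rudimentary cat, and the very same loop works whether its descriptors are wired to the keyboard, a disk file, or a pipe. copy_stream is an illustrative helper invented here, not a standard function:

```c
#include <unistd.h>   /* read(), write() */

/* A minimal "cat": copy bytes from fd_in to fd_out until the stream
 * ends. Because unix presents keyboard, files, and pipes alike as
 * byte streams, this one loop handles them all.
 * Returns the total bytes copied, or -1 on error. */
long copy_stream(int fd_in, int fd_out)
{
    char buf[4096];
    long total = 0;
    ssize_t n;
    while ((n = read(fd_in, buf, sizeof buf)) > 0) {
        if (write(fd_out, buf, (size_t)n) != n)
            return -1;                /* short write: report failure */
        total += n;
    }
    return n < 0 ? -1 : total;        /* n == 0 means end of stream  */
}
```

Notice the program never asks what kind of thing the descriptor refers to; the kernel hides that behind the stream abstraction.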

This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

Peter Salus, A Quarter Century of Unix

Each unix command, each of these programs, is expected to follow a common pattern of use.

Commands read from stdin and write to stdout (and stderr), and by default the shell connects these files to the keyboard and console devices just before invoking the command. But the shell is easily instructed to connect these files otherwise. The three characters >, < and the pipe |, indicate redirection.

Example:

   ls > file-name
will send the output of the ls command to the file file-name rather than to the console. The command:
   ls | wc 
will make the output of the ls command the input of the wc command. This is how you count the number of files in a directory: ls writes one name per line into the pipe, so the first number wc reports, the line count, is the number of files.
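Under the hood, redirection is a small trick the shell performs in the child, after fork but before exec: open the target file and splice it in over the standard descriptor with dup2. The following sketch generalizes the > case to any descriptor number; redirect_fd is a hypothetical helper for illustration:

```c
#include <fcntl.h>    /* open(), O_WRONLY, O_CREAT, O_TRUNC */
#include <unistd.h>   /* dup2(), close() */

/* What the shell does for "cmd > path", generalized: open path for
 * writing (creating or truncating it) and make target_fd refer to
 * it. For ">" the shell uses target_fd = 1, stdout.
 * Returns 0 on success, -1 on failure. */
int redirect_fd(const char *path, int target_fd)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (fd == target_fd)              /* already landed on the target  */
        return 0;
    if (dup2(fd, target_fd) < 0) {    /* target_fd now names the file  */
        close(fd);
        return -1;
    }
    close(fd);                        /* the copy on target_fd remains */
    return 0;
}
```

After this, the command exec'd by the child writes to "stdout" exactly as before, never knowing its output now lands in a file; this obliviousness is what makes redirection work for every command uniformly.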

Other Unix Things

A Unix system also refers to the C language and the program development tools in which Unix itself is written, which are provided to developers to write programs which use unix. This includes a project builder called make, and a few source code control systems. Standard APIs for Unix, adapted to C, are found in the normal man pages, and the API definitions have a standard, almost wired-in location in the directory hierarchy.

The directory hierarchy can also be said to be part of unix. All unix systems have a core, standard set of directories where things are found. These locations have evolved over time, and the evolution is driven by the fact that there are variants of unix which, in effect, propose new conventions. Spirited fights break out over the virtues and drawbacks of the ideas, and eventually, with usage, the correct solution becomes clear. Slowly, all variants adopt the victorious solutions, and the battle line moves on to the current set of problems.

In addition, there is a standard set of processes controlling Unix. These are not built into the kernel, but some are as old as the kernel, such as the init process. These processes, the problems they solve, and how they solve them, are also part of Unix. Their solution style is typically Unix: small, flexible, obvious in its working, economical in extraneous innovation.

This gives a very broad answer to the question: what is unix? Perhaps it is better to be more conservative and say that it is the kernel, a shell supporting re-directable stream I/O, and a set of utilities having a consistent user interface.

Variants of Unix and Open Source

See various web locations for a history of early unix releases. The most popular unix release today is Linux. Linux is defined by a specific kernel: a unix kernel implemented by Linus Torvalds, who to this day supervises every modification. Various other sources provide the remaining, user-level code. That code is inherited from various places and brought together as a distribution. For example, Red Hat is a specific distribution bundled with a Linux kernel.

FreeBSD, OpenBSD and NetBSD are other popular unix releases. We use FreeBSD. It has a kernel descended from an implementation of Unix at the University of California, Berkeley. The kernel differs from Linux. Memory management is different. The syscalls are somewhat different, although compatibility packages can be installed so that Linux software will compile and run natively on FreeBSD. The politics of FreeBSD are different as well. By politics, I mean the group organization of who is responsible. The FreeBSD kernel and distribution come from a single source of volunteers. They are organized with some formality, various people having specific jobs, and sometimes they retire, seeking replacements. Decisions are argued over and voted on, and the outcome is adhered to.

Both Linux and FreeBSD are part of the Open Source Movement. This movement believes that the crucial computer code of the operating system should not be hidden. It can be owned, but it cannot be hidden. Much has been written about why open source is good. Traditional companies worry that open source allows people to steal their great ideas, or that it will simply make it impossible to make money.

It remains to be seen.