Using unix


There are so many good how-to unix books and web sites. Please google for the most recent web sites. I prefer O'Reilly books, such as the Nutshell series. Some people think these are just the man pages printed out. That might be so, but they add some organization and you get to read it like a book, which has advantages.

The hardest part is getting started, because it seems like you can't know X before knowing Y, and Y before knowing X. So you just have to be mildly confused at times, and the trick is that the confusion be released in manageable doses.

She sells c-shells,

Pick a shell:

It gets confusing, a bit. But to understand the scripts you see on a unix machine you should be able to read Bourne shell. If you write scripts, you are best off writing them in Bourne as well. They will always run. Like 10 years from now, they will probably run.

All shells have wildcards, I/O redirection, variables, and are programmable. To get work done, you will want more than just that. The new shells also have,

There are built-in commands to a shell, but most things done at the shell are really separate programs. The shell finds and runs a program with the same name as the command requested. It looks for the program in the directories named in the PATH variable. All shells agree on this sort of thing. Having found the program implementing the command, the shell runs it, turning over its input and output streams to the program for the duration of its run, and then reclaiming control when the command exits.
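The search through PATH can be sketched by hand. This is only an illustration, assuming a Bourne-style shell: the name find_in_path is made up for this example, and real shells add refinements such as remembering where a command was found. The built-in command -v asks the shell itself the same question.

```shell
# find_in_path NAME: print the first executable file called NAME in $PATH.
# (find_in_path is a hypothetical helper written for this example.)
# The body runs in a subshell, so the change to IFS does not leak out.
find_in_path() (
    name=$1
    IFS=:                        # PATH is a colon-separated list of directories
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.   # an empty entry means the current directory
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
)

find_in_path ls     # where our sketch finds the ls program
command -v ls       # the shell's own answer to the same question
type cd             # cd is built in to the shell, not a separate program
```

The two answers for ls should normally agree. The cd command has to be built in, since a separate program could only change its own current directory, not the shell's.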

To help the user, all commands (programs) follow a standard usage. The better the development community defines and adheres to standards, the more productive the users are. Unix also has a certain way of doing things. Commands have outputs intended to be read by other programs, as much as by users. For this reason, unix command output is terse and stylized.

Who are you?

When you log in, type who to see the list of users currently on the system. Read the man page for a description of the output and who options. Type id to see your user ID. Your login name is associated permanently with a user number, the uid, user ID. uid's are small integers. You are also part of a group, which has a name and an integer identifier gid, the Group ID. Original unix had a user in one and only one group. Now a user can be in multiple groups, and you can see that in the output of the id command.
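The commands just mentioned can be tried as follows. The variable names are made up for this example; the -u, -g and -G options of id are standard, but the exact layout of the plain who and id output varies a little from system to system.

```shell
who                 # one line per login session currently on the system
id                  # your uid, gid, and every group you belong to

uid=$(id -u)        # your numeric user ID
gid=$(id -g)        # the numeric ID of your primary group
all_gids=$(id -G)   # the numeric IDs of all your groups, primary included
echo "uid=$uid gid=$gid groups=$all_gids"
```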

As a user, you have associated with you, thanks to your entry in the /etc/passwd file, a home directory, a preferred shell, and some personal information, to be used by finger. Look at your entry in the passwd file and identify your shell and home directory. Does it agree with the output of pwd?
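Here is one way to pick a passwd entry apart. The entry for "alice" is invented for illustration; a real entry has the same seven colon-separated fields. The last command looks up your own entry and assumes your account is listed in the local file, which may not be the case if accounts are served over the network.

```shell
# A made-up passwd entry. The seven fields are:
# name:password:uid:gid:personal information:home directory:shell
entry='alice:x:1001:1001:Alice Example:/home/alice:/bin/sh'

home=$(printf '%s\n' "$entry" | cut -d: -f6)
shell=$(printf '%s\n' "$entry" | cut -d: -f7)
echo "home directory: $home"    # prints "home directory: /home/alice"
echo "shell: $shell"            # prints "shell: /bin/sh"

# The same two fields for your own account, if it is in the local file:
awk -F: -v user="$(id -un)" '$1 == user { print $6, $7 }' /etc/passwd
```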

The super user has ID 0, and can do anything. Any running program has a variety of uid's: real, effective, saved (this is an implied ID) and file system (this is special to Linux). The point is that SUID programs can be used for one user to impersonate another. A SUID program will have an effective UID of the owner of the program, rather than of the runner of the program, which would more explicitly be a copy of the uid of the invoking process. However, the invoker's uid is not forgotten, and the program can toggle between having the euid or the real uid control access decisions.

What's in a name?

The unix filesystem is a hierarchical name space of several types of objects, the most notable being: regular files, directories, and devices. There are a few more sorts of objects including links and mount points.

The unix file system is a hierarchical file system which supports mountable file systems. Hierarchical means that the filesystem supports the abstraction of folders, where files and other folders can be "inside" folders. This gives rise to the idea of a pathname: the location of a file in a filesystem includes a list of containing folders. The unix convention is that the name list is slash-delimited. An absolute path is the all-inclusive list of folders starting from the base of the file system, called the root, denoted by one lonely slash. A relative path starts from the current location, defined in context, and can descend or ascend in tracing its route to the destination.
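A small experiment with absolute and relative paths, done in a scratch directory so nothing of yours is touched. The directory names a and b and the variable names are arbitrary choices for this sketch.

```shell
top=$(mktemp -d)    # a scratch directory to play in
mkdir -p "$top/a/b"

cd "$top/a/b"       # absolute path: starts from the root, /
here=$(pwd -P)
cd ..               # relative path: ascend one level, to .../a
up=$(pwd -P)
cd b                # relative path: descend again, into .../a/b
back=$(pwd -P)
echo "$here"
echo "$up"
echo "$back"

cd /                # step out before removing the scratch directory
rm -rf "$top"
```

The first and third lines printed are the same, since both routes end in the same directory.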

The files which implement folders are called directories. Although it has the appearance that files themselves are in the directory, this is only appearance. Directories are in the file system just as anything else is. A directory entry is only a pairing of a name with an inode (an ID number for a file), and the type of the file. That is all.

This simplicity has consequences. A file has only one owner. That is part of the file. If ownership were stored in the directory, the same file could have different owners, depending on which directory entry was used to refer to it.

However, the same file can have different names! The inode is the file; the name is the contents of a directory. Having entries in multiple directories for a single file is called a link, and in particular a hard link. You will probably not use hard links for a while. Soft links are more useful, and are the equivalent of Windows Links.
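The difference between the two kinds of link shows up in a short experiment. This is a sketch in a scratch directory; the file names are arbitrary, and it assumes the scratch filesystem allows hard links, as nearly all unix filesystems do.

```shell
top=$(mktemp -d)
cd "$top"

echo "hello" > original
ln original hardlink        # a second name for the same inode
ln -s original softlink     # a small file that holds the name "original"

ls -li                      # the first column is the inode number
inode_a=$(ls -i original | awk '{print $1}')
inode_b=$(ls -i hardlink | awk '{print $1}')
echo "original: $inode_a  hardlink: $inode_b"

rm original                 # remove one name...
survivor=$(cat hardlink)    # ...and the file lives on under its other name
[ -e softlink ] || echo "softlink now dangles: the name it points to is gone"

cd /
rm -rf "$top"
```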

Filenames in unix can contain any character except the slash, "/", and the null byte. However, you might want to keep in mind certain things for portability. Windows has a difficult time with upper and lower case. Old Windows (DOS) did not distinguish upper and lower case, and NT does but tries to accommodate DOS users. It used to be that unix users kept it simple by using all lower case and no spaces. But now Java requires upper case file names, and it is just too hard not to have spaces.

Furthermore, traditionally the type of a file is distinguished by the extension, a little code following the rightmost dot "." in the file name. In Windows, the extension is a special piece of the filename. It inherited this behavior both from DOS and from VMS. Unix files have extensions by custom only. That is, there is nothing magic about the extension, and no different treatment deep in the operating system for the extension. But everyone expects that it will be handled a certain way.

Also customary is that a filename beginning with a dot is "hidden". This means that standard tools such as ls will treat such a file differently, making it seem hidden. The kernel has no notion of hidden files; it is not a property of a file, just a convention about a file name.

Two dot-started filenames are always found in every directory: dot and dot-dot. Dot is the name of this very directory, and dot-dot is the name of the parent directory.
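To see the convention at work, compare ls with and without the -a option in a scratch directory. The file names are arbitrary. One hedge: some BSD systems show dot files to the super user even without -a, so the plain listing assumes an ordinary user or a GNU-style ls.

```shell
top=$(mktemp -d)
cd "$top"
touch visible .hidden

plain=$(ls)         # lists only: visible
all=$(ls -a)        # lists: . .. .hidden visible
echo "$plain"
echo "$all"

cd /
rm -rf "$top"
```

Note that dot and dot-dot appear in the second listing even though nobody created them.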

Originally, unix filenames were limited to 14 characters. Now the limit is longer (commonly 255 on modern file systems).

Do you have permission?

Unix security on the file system amounts to matching your uid and gid against an owner and group attached to the file. The world of relationships between user and file has only three kinds: you own the file, you are in the file's group, or you are neither.

There are only three types of permissions you can have, depending on your relationship with the file: read, write, and execute. There are therefore 9 elements of security: the three permissions the owner grants himself, these permissions granted to group members, and these permissions granted to all others. These are visually represented by a string, or are often talked about as a mode, a 9 bit number written in octal, base 8. Octal used to be cool, when the PDP-8 was around. But then some people started using base 16, hexadecimal. After some fights, octal became extinct except for the usage in unix permission bits.

The terminology uses user for owner, and other for those neither owner nor in the group.

The display of the permissions is a string of nine characters, in the order read, write, execute, abbreviated r, w and x, and appearing as triplets in the order user, group, other. For example, all permissions to everyone is written,

   rwxrwxrwx
and no permission to anyone is written,
   ---------
Usually a file is read/write for the user (owner!), read only for group and world, giving,
   rw-r--r-- 
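
The octal mode and the nine-character string are two spellings of the same nine bits. As a worked example, assuming the usual weights of r = 4, w = 2 and x = 1 within each triplet, the following sets a mode with chmod and reads it back with ls -l. The file name notes.txt is just an example.

```shell
top=$(mktemp -d)
file="$top/notes.txt"       # notes.txt is just an example name
: > "$file"                 # create an empty file

# r = 4, w = 2, x = 1; add them up within each triplet:
#   rw- = 4+2 = 6    r-- = 4    r-- = 4    giving mode 644
chmod 644 "$file"
usual=$(ls -l "$file" | cut -c2-10)     # skip the leading file-type character
echo "$usual"                           # prints rw-r--r--

# rwx = 4+2+1 = 7    r-x = 4+1 = 5    r-x = 5    giving mode 755
chmod 755 "$file"
runnable=$(ls -l "$file" | cut -c2-10)
echo "$runnable"                        # prints rwxr-xr-x

rm -rf "$top"
```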

Go with the flow.

Unix organizes files and almost all environment interactions along the concept of a stream of characters, also known as stream I/O. A stream of characters is a sequence of characters either read or written in the order of the sequence. The stream can be pulled from any file or device open for read, and the stream can be written into any file or device open for write.

This stream concept is used by the command line. The idea in Unix is to make simple programs and chain them together to build up the command that you want. It is impossible to predict what someone will want. So Unix just tries to give the most useful and simple building blocks. The output of one program, its output stream, can be collected into a file by the > symbol. A file can be the source of input by the < symbol. Program A can feed its output to the input of program B with the pipe operator: A | B .
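The three operators can be seen together in a few lines. This is a small sketch using the standard sort, uniq and wc programs as building blocks; the file name and its contents are made up.

```shell
top=$(mktemp -d)

# > collects a program's output stream into a file
printf 'pear\napple\npear\nfig\n' > "$top/fruit"

# < makes a file the source of a program's input stream
sorted=$(sort < "$top/fruit")

# | feeds the output of one program to the input of the next
distinct=$(sort < "$top/fruit" | uniq | wc -l)

echo "$sorted"
echo "distinct lines: $distinct"

rm -rf "$top"
```

None of the three programs knows about the others. Each just reads a stream and writes a stream, and the shell does the plumbing.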

The stream model does not fit all purposes. There are random access files, files which are an array of bytes. You can go to any byte at any time and read it or write over it, in any order. This cannot be well understood using a stream model of I/O. However, often a stream model of the file is sufficient.

Devices also show departures from the stream model. Devices have all sorts of properties, specific to the device. Serial ports have baud rates, and so forth. For these there are ioctls, I/O controls, a general purpose call that can be made against a device to manipulate its properties. Nevertheless, devices as files are found in the filesystem, in the /dev directory. Do a df to see how the file system is constructed. It gives the device names of the hard drives.
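The simplest device to experiment with is /dev/null, which is present on essentially every unix system. The sketch below treats it exactly like a file; the variable names are made up for the example.

```shell
ls -l /dev/null     # the leading "c" in the listing marks a character device

# Devices take part in stream I/O like any file:
echo "discarded" > /dev/null    # /dev/null swallows whatever is written to it
count=$(wc -c < /dev/null)      # and reading it yields no bytes at all

if [ -c /dev/null ]; then
    kind="character device"
else
    kind="something else"
fi
echo "/dev/null is a $kind; bytes read from it: $count"
```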

The Unix file system is a mountable system. While the idea of a hierarchical system is well known, mountable is not so universal. In Windows, there are drive letters, and each file system gets a drive letter. Which makes you wonder what happens after the 26 letters have been used up. Something like a cdrom has its own file system, so it has its own drive letter. In Unix there is no such thing. Whatever the various devices attached to the box, their disparate filesystems are glued together into one unified view. This includes devices such as cdroms, external disks, and also network file systems. Do a df to see how the mounts are arranged for your system. Look at /etc/fstab to see what mounts are requested at system boot.

Kill 9.

All operating systems support processes. Unix has some specific processes which solve operating system problems, just as it has some specific directories in its filesystems to solve problems. Being familiar with these forms a certain knowledge about unix.

Peculiar to unix is its reliance on programmable programmability. Rather than write a program called "how to boot", unix writes a program called "how to read the configuration file that describes booting". Similar to how the shell sends all the real work to programs which implement the commands, most of unix's process work is done by shells which invoke other programs under the direction of configuration files, generally written in plain text.

Each process has an integer ID, with the first process being 1. Sometimes process 1 is the swapper process and init is process 2; otherwise process 1 is /sbin/init. In any case, init is the first true process. It refers to configuration files in /etc, particularly /etc/rc and friends, for further guidance in the boot. These are shell scripts, and invoke a wide variety of commands, using many of the shell's programming possibilities.

As described in unix documentation, all processes are fork-cloned from other processes, except of course the first process, which can't be cloned from anything else. So after init finishes the boot it begins cloning itself into various other processes. Some listen for logins from users. Others turn into services. Just as unix is init and this particular process of bringing up the system and the collection of processes that results, unix is the set of services that are provided after boot.
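Your own shell clones itself the same way every time it runs a command. The sketch below starts a child in the background, looks at its process ID, and then ends it with signal 9. It assumes a Bourne-style shell, where a child ended by signal N is reported with status 128 + N.

```shell
echo "this shell is process $$"

sleep 30 &                  # the shell forks a child, which becomes sleep
child=$!                    # $! holds the process ID of that child
echo "started child $child"

kill -9 "$child"            # signal 9, SIGKILL, cannot be caught or ignored
status=0
wait "$child" || status=$?  # collect the child's exit status
echo "child ended with status $status"    # 128 + 9 = 137
```

Some shells also print a "Killed" notice of their own when the child is collected.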

These services are called daemons. They are connected to no user. They are available almost immediately after boot. They tend to be strongly interconnected with network services. For instance, a web server is a form of daemon. A web server, as in a piece of hardware, is a unix machine whose primary purpose is to run the web server daemon.


Burton Rosenberg
Last Update: 8 Feb 2005