My-Tiny.Net :: Networking with Virtual Machines



Command Redirection



Command redirection is used frequently in shell scripts, and looks cryptic until you get used to it.

stdin, stdout, stderr

File descriptors are numbers that index a per-process data structure in the kernel that records which I/O channels (file, device, socket, pipe) a process has open. Each I/O system call (read, write, etc.) takes a file descriptor that indicates which channel the call should operate on.

By convention, the first entry in the table has index 0 and is called standard input (stdin), 1 is the standard output (stdout) and 2 is the standard error (stderr) channel. This is just a convention; as far as the kernel is concerned, there is nothing special about these numbers.

On that note, we also need to recognise that whether a program writes to stdout or stderr (or both), and what it writes there, is entirely up to the programmer. Good programming practice dictates that error messages go to stderr and normal output to stdout, but you will find sloppy programs that mix the two or ignore the convention altogether.

When you log in and are working from the command line, standard input (stdin, FD0) is taken from your terminal keyboard and both standard output (stdout, FD1) and standard error (stderr, FD2) are sent to your terminal screen. In other words, the shell expects to be getting its input from the keyboard and showing the normal output and any error messages on the terminal screen.

Shells will allow you to set up connections between these file descriptors (bash lets you refer to descriptors 0 through 9 directly) and specific files, devices, or even pipelines to other processes before your process starts. Some of the possible manipulations are rather clever; the implementation that makes it possible is also rather clever.

Alongside the device nodes in /dev/ these links should exist on all systems:
      /dev/fd      points at  /proc/self/fd
      /dev/stdin   points at  /proc/self/fd/0
      /dev/stdout  points at  /proc/self/fd/1
      /dev/stderr  points at  /proc/self/fd/2
Remember that in the Linux /proc filesystem, those aren't real files, but tightly controlled gateways to kernel information. /dev/stdin is a symlink to /proc/self/fd/0 which refers to the first file descriptor that the program doing the open call (the "currently running program") has open. So, what is pointed to by /dev/stdin will change from program to program, because /proc/self/ is really just a fast and generic way to say /proc/{process_id}/
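You can watch this per-process resolution happen. A minimal sketch (Linux-specific, since it relies on /proc):

```shell
# "self" is resolved by whichever process does the open - here, ls itself -
# so this lists the file descriptors that ls has open while it runs
ls -l /proc/self/fd

# /dev/stdin resolves the same way: cat opens it and lands on its own FD 0,
# which the pipe has connected to echo's stdout
echo "hello" | cat /dev/stdin    # prints: hello
```

Run `ls -l /proc/self/fd` twice and the FD pointing at the directory may differ; that is the point, each process sees its own table.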

Pipes

Shells allow you to set up connections between process file descriptors.

Perhaps the most commonly used character is |, which is referred to as "pipe". This enables you to pass the output of one command to the input of another. Essentially, when this operator is present on the command line, the shell points the left side command's stdout (FD1) to the right side command's stdin (FD0). This connection applies only to this particular process, and lasts only as long as the process is running.

So, for example, if we run the command ls -l | more
the output (stdout) of the ls command will be "piped through more". In detail, this command tells the shell to
1. create a process for more
2. create a process for ls
3. connect ls stdout (FD1) to more stdin (FD0)
Another example: the first step to compiling a package is often    make configure
which shows a lot of output while it checks for libraries and versions. To avoid seeing all of this, pipe the stdout of the command through grep with the -v ("invert match", i.e. discard matching lines) option    make configure | grep -v Checking
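The plumbing described above can be verified with two tiny pipelines (the sample text is made up for illustration):

```shell
# wc -l reads its stdin, which the pipe has connected to printf's stdout
printf 'alpha\nbeta\ngamma\n' | wc -l
# prints: 3

# grep -v discards matching lines and passes everything else through
printf 'Checking libs\nDone\n' | grep -v Checking
# prints: Done
```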

Redirects

The other characters that are used quite often are < to redirect stdin and > or >> to redirect stdout.

Redirecting stdout to a file is very common, for example the command    ls -1 /bin >myfile
means
1. create a process for ls
2. point ls stdout (FD1) to a file in the current directory named myfile
Using > will create the file if needed, and overwrite (truncate) it if it already exists
Using >> will append to the end of an existing file, or create a new one
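The difference between > and >> is easy to demonstrate; this sketch uses mktemp for a scratch file so nothing in the current directory is touched:

```shell
tmpfile=$(mktemp)            # scratch file, removed at the end

echo "first"  > "$tmpfile"   # > creates (or truncates) the file
echo "second" >> "$tmpfile"  # >> appends: file now has two lines
echo "third"  > "$tmpfile"   # > truncates again: only "third" survives

cat "$tmpfile"               # prints: third
rm "$tmpfile"
```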
Redirecting stdin is not so common, but can be quite useful. For example,    wc -w inputfile.txt
will output the word count and file name, something like 2468 inputfile.txt

However, when wc doesn't know the filename, it simply outputs the count.
You can hide the file name from it by using input redirection:    wc -w <inputfile.txt
which will:
1. create a process for wc
2. point wc stdin (FD0) to inputfile.txt
wc doesn't know the file name because the shell never passes inputfile.txt to wc as a command line argument. Note that the shell does not "open the file and send the contents", which implies a data copy that does not happen - wc reads inputfile.txt just like it would read from the device file that represents the keyboard.
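A quick way to convince yourself that wc never sees the name (again using a mktemp scratch file rather than a real inputfile.txt):

```shell
tmp=$(mktemp)
printf 'one two three\n' > "$tmp"

wc -w "$tmp"     # filename passed as an argument: prints count AND name
wc -w < "$tmp"   # stdin redirected: prints the count only
# prints: 3

rm "$tmp"
```

In the second form the shell opens the file, attaches it to FD0, and removes `< "$tmp"` from the argument list before wc ever starts.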

Another example: cat opens a file and writes the content to stdout. grep searches stdin and writes all of the lines containing the search string to stdout. We have to use  -e  to protect a pattern that begins with  -  because grep would otherwise parse the pattern as a command line option; quoting alone does not help, since the shell removes the quotes before grep ever sees the argument.

So     cat /var/log/dnsmasq.log | grep -e "-dhcp"
and       grep -e "-dhcp" < /var/log/dnsmasq.log
are two different ways to get the same output: all of the lines in the file that have -dhcp someplace.

/dev/null

If you direct output to /dev/null it disappears into the "bit bucket" (or cyberspace). This device is used a lot in shell scripts where you do not want to see any output or error messages. You can redirect stdout and/or stderr to /dev/null and the messages vanish.

The redirect operators are evaluated from left to right, and the current settings are used whenever a descriptor is duplicated. So,

Output and error messages go to the same place: command >/dev/null 2>&1
1. create a process for some command
2. point command stdout (FD1) to /dev/null
3. point command stderr (FD2) to where command FD1 currently points ( /dev/null )
Or, stderr goes to the screen while stdout goes to the file specified: command 2>&1 >/dev/null
1. create a process for some command
2. point command stderr (FD2) to where command FD1 currently points ( the screen )
3. point command stdout (FD1) to /dev/null
This can be useful - for example, what is the effect of this? make configure 2>&1 >/dev/null
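To make the ordering visible, here is a sketch with a hypothetical helper function that writes one line to each stream (the function name and messages are made up):

```shell
# hypothetical helper: one line to stdout, one to stderr
both() { echo "to stdout"; echo "to stderr" >&2; }

both >/dev/null 2>&1   # FD1 to /dev/null, then FD2 to where FD1 points:
                       # prints nothing

both 2>&1 >/dev/null   # FD2 to where FD1 points (the screen), then FD1
                       # to /dev/null: prints only "to stderr"
```

So `make configure 2>&1 >/dev/null` shows only the error messages, with the normal chatter discarded.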

Pipes and Redirection

Redirection and pipes can be combined in various ways.
For example, to save all of the lines from the process list with sh someplace to a file called shellprocs
  ps | grep sh > shellprocs
1. create a process for grep
2. create a process for ps
3. connect ps stdout (FD1) to grep stdin (FD0)
4. point grep stdout (FD1) to the file shellprocs in the current directory
or, to see the first ten lines of the sorted file on the screen
   sort < mylist | head
1. create a process for sort
2. create a process for head
3. connect sort stdout (FD1) to head stdin (FD0)
4. point sort stdin (FD0) to the file mylist in the current directory
The important thing to remember is that the pipe is always set up first, and then the redirect operators on each side are evaluated from left to right, with the current settings being used whenever a descriptor is duplicated. This matters: since the pipe was set up first, FD1 on the left side and FD0 on the right side have already been changed, and any duplication of these will reflect that.

So, with command >/dev/null 2>&1 | grep 'something'

all stdout and stderr messages from "command" go to /dev/null. Nothing goes to the pipe, and thus "grep" will not display anything on the screen.

But with command 2>&1 >/dev/null | grep 'something'

all output that "command" writes to its FD2 (stderr) makes its way to the pipe and is read by "grep" on the other side. All output that "command" writes to its FD1 (stdout) makes its way to /dev/null.
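The two orderings can be checked with another hypothetical helper function (names and messages are made up for illustration):

```shell
# hypothetical helper: one line to each stream
noisy() { echo "normal output"; echo "error output" >&2; }

# pipe first, so FD1 already points at the pipe; 2>&1 sends stderr there
# too, then >/dev/null silences the original stdout
noisy 2>&1 >/dev/null | grep error
# prints: error output

# here both streams end up in /dev/null before the pipe sees anything,
# so grep finds no match (and exits with status 1)
noisy >/dev/null 2>&1 | grep error
```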

Named Pipes

Technically, a Unix "named pipe" is a First In First Out (FIFO) interprocess communication mechanism. In contrast to "unnamed pipes" (represented by  |  between two commands) a named pipe appears as a special file in the filesystem, which can be accessed by independent processes that were not spawned by the same parent process. Using a named pipe is straightforward: write something to it in one process and read from it in another. A write will not complete until some process opens the pipe for reading; once both ends are open, data passes through the kernel's pipe buffer. The writing process exits when it is done and closes its end, which delivers EOF; the reading process sees the EOF and exits.

Let's try it out:

In one virtual terminal, create a named pipe like this (you only need to do this once): mkfifo /tmp/piper

See what it looks like with ls -l /tmp/piper
    prw-r--r--  1 root  root  0 Aug 11 16:59 /tmp/piper
The p in the left column of the output indicates that the file is a pipe. You can change the permission bits as with any other file.

Now start a process and write to the pipe: echo "logged in as: $(whoami)" >/tmp/piper

Nothing will happen until you switch to another virtual terminal with, for example, Alt F2 and read from the pipe: cat /tmp/piper

When you switch back to the "echo" virtual terminal you will see the command prompt is back.
Try it again: use Up Arrow and change the command to echo "current directory: $PWD" >/tmp/piper

Switch to the "cat" virtual terminal and use Up Arrow to run the same command as before: cat /tmp/piper

Persistent Listener Loop

Now let's run a command to "listen" on the named pipe. This command strings together bash commands to create a loop that reads whatever arrives and prints it to stdout. Run this in the "cat" virtual terminal:

while read LINE </tmp/piper; do echo "$LINE"; done

Switch to the "echo" virtual terminal and use Up Arrow to retrieve the last command and run it a few times; when you switch back to the other virtual terminal you will see the output for each one. Use Ctrl c to stop listening.

Here are two shell scripts (as a bonus) for further experimentation.

The listener script for one VT:
  #!/bin/bash
  while read LINE </tmp/piper; do
    if [ "$LINE" == "quit" -o "$LINE" == "stop" ]; then
      break
    fi
    echo "$LINE"
  done
  echo "Listener exiting"

The sender script for another VT:
  #!/bin/bash
  if [ "$1" != "" ]; then
  # send the argument
    echo "$1" >/tmp/piper
  else
  # send current process number
    echo "Hello from $$" >/tmp/piper
  fi


Persistent Listener Using exec and a File Descriptor

Creating a new file descriptor is as easy as picking a number, keeping in mind that file descriptors 0, 1 and 2 are reserved for stdin, stdout and stderr respectively, and numbers above 9 could conflict with file descriptors used internally by the shell.

The bash built-in exec command is generally used to replace the shell with another process; when given only redirections, it applies those redirections to the current shell process itself. That may sound a bit tricky, but it is very useful in practice.

Attaching the fifo to a file descriptor opened read-write keeps a writing end of the pipe open at all times. Without it, any write to the fifo blocks until something reads what is written. With the descriptor held open, written data sits in the kernel's pipe buffer, where it can be read later, or read continuously until an "End Of File" condition. EOF is a condition, not a character or an "event" that can be sent to the pipe - the only way to generate an EOF on the reading end of a pipe/fifo is to close all open handles to its writing end.

In the "echo" virtual terminal, we attach to the fifo a file descriptor like this:
(Careful! There is only one space!)     exec 4<>/tmp/piper

Switch to the "cat" virtual terminal and use Up Arrow to run the same command as before:    cat /tmp/piper

Switch back, use Up Arrow, and write to the pipe using either the pipe name or the file descriptor
echo "logged in as: $(whoami)" >/tmp/piper    or    echo "current directory: $PWD" >&4

cat will keep reading until the file descriptor is closed in the "echo" virtual terminal with   exec 4>&-
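The whole exchange can also be sketched non-interactively in a single shell (the fifo path here is made up; any writable location works):

```shell
fifo=/tmp/demo_fifo
mkfifo "$fifo"

exec 4<>"$fifo"            # open the fifo read-write on FD 4;
                           # writes no longer block waiting for a reader
echo "buffered line" >&4   # data waits in the kernel pipe buffer
read -r line <&4           # read it back later from the same descriptor
echo "$line"               # prints: buffered line

exec 4>&-                  # close FD 4
rm "$fifo"
```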

If you are wondering about a practical application, holding a descriptor open so a reader can keep consuming data as it arrives is the same idea behind tools like tail -f.