Cover V02, I04
Article
Listing 1
Listing 2
Listing 3
Sidebar 1

jul93.tar


Using the UNIX Pipe in C

Ed Schaefer

Virtually all UNIX system administrators exploit the power and convenience offered by pipes when they are working from the command line or writing shell script. By connecting general purpose filters with pipes, the system administrator can quickly generate tidy custom reports and analytic tools without the overhead and housekeeping problems associated with myriad temporary files.

The pipe with all its advantages is also available in C (using the standard library functions popen() and pipe()), but, typically system administrators are less inclined to use pipes in their C programs. In this article I'll first explain how to use popen() from within a C program to read directly from the output of another program. Then I'll present two more general C functions - one for reading from the pipe and one for writing to the pipe - and use these functions to illustrate how pipes are used from within a C program.

The popen() Example

The popen() function allows a C program to spawn a task and then attach itself to the input or output of that task - which is most commonly a set of pipelined commands. In its simplest application, the popen() function allows you to connect a special purpose C program to a sequence of standard commands without writing a separate piece of shell script. For example, if a special usage report required you to analyze a sorted list of current users, you could either write a shell script to generate and sort the list and pipe the results into a special C program which performs the analysis:

who | sort | analyze

or you could avoid the separate script file by using popen(). The call

input_stream = popen( "who | sort", "r")

executes the pipelined task "who|sort" and connects its output (in read mode - r) to the pipe input_stream. The calling program (analyzed in this example), can now read sort's output directly from the resulting pipe.

Like fopen(), popen() returns a pointer to a FILE structure, but with popen() the opened object is a command line to the shell rather than a file. Since pipes are first-in, first-out constructs, the program may not use fseek() to adjust the next read or write position. Calling popen() with a w parameter opens the pipe for writing - popen() will execute the command string and attach the resulting task's input to the calling program's output.

The example program in Listing 1 uses popen() in the read mode. This do-nothing program uses popen() to read the output from a fairly complex pipeline. The program reads the pipeline's output, stores the output in memory, and then prints the saved result.

To simplify memory management, the program copies strings read from the shell to an array buffer and stores a pointer to the beginning of each string in an array-of-pointers. Both of these arrays and the piped string are passed to the main function, run_popen().

The run_popen() function is just an open, a for loop (which reads and saves each string), and a close. This function has several points worth mentioning: the new-line character that fgets() returns is removed (line 48), and a pointer is alternately saved and updated by the length of the returned string plus one as long as fgets() doesn't return NULL (lines 51 to 53). The only error conditions it traps are not being able to open the pipe and trying to copy a string beyond the end of the buffer. The function ends by returning the number of strings read from the pipe.

Besides creating the variables needed and calling the function, main() prints the output from the pipe saved in the array. Notice that since (\) is a special character it must be escaped (\\) when used literally.

Skipping the Shell

A popen() call implies substantial overhead, since popen() must invoke a full shell to process the command string used to specify the communicating task. If your C program only needs to communicate with a single other command, you can avoid this overhead by using pipe() and fork() instead of popen().

The approach involves creating another process with fork(), taking control of standard input and output, and executing a child process with one of the exec() functions. In his book Using C on the Unix System, David Curry refers to this as doing your own plumbing.

What allows one process to talk to another is that open file descriptors are kept open across exec() calls. A file descriptor is an integer and standard input, standard output, and standard error have the values 0, 1, and 2 respectively. The key to working with pipes is coupling standard input or output to the reading or writing end of the pipe by means of another file descriptor. The next example, run_rpipe(), does this by having a child process write to the standard output and the parent read the standard input.

The Read Pipe() Example

Listing 2, the run_rpipe() function, corresponds to this pseudocode:

1) Create a pipe with pipe().

2) Create another process with fork().

3) Couple standard output to the writing end of the pipe.

4) Close file descriptors.

5) Execute the command string.

6) Close the write side of the pipe.

7) Open the read side of the pipe using fdopen().

8) Read to the end of the pipe.

9) Close the file pointer.

Creating a Pipe

The pipe() library call, used to create a pipe, takes as an argument an array of two integers which identifes the pipe. Given that p[2] is the array, p[0] identifies the read descriptor and p[1] the write.

Creating a Child Process

Once a pipe has been created, a child process is created by a call to fork(). If the fork() call is successful, it returns the process ID of the child (line 55). While the child is executing, the wait() call suspends execution of the parent (line 85). Without the wait() call, the parent would continue execution without waiting for the child, effectively running the child in the background.

Coupling the Standard Output to the Write Side of the Pipe

While the standard descriptors 0, 1, and 2 may not be changed or reassigned, they can be duplicated, which has the effect of giving another descriptor the same file pointer as one of the standard descriptors. Since the child process will be writing to the pipe, I close the standard output (line 60).

I use the dup() command to create another file descriptor (line 61). Since the dup() call returns the lowest descriptor not in use, closing the standard output, or descriptor 1, guarantees that the write side of the pipe, p[1], is coupled to the standard output.

Closing File Descriptors

In the child process, once a pipe has been coupled to standard input or output it no longer has to be open. It should be closed since there is a limit to the number of file descriptors a process may have open (lines 62 and 63).

Executing the Command String

Once the coupling to standard output is completed, I execute the command string with a call to the shell using execlp() (line 64).

Closing the Write Side of the Pipe

After the child process finishes executing, I close the write side of the pipe since the parent isn't writing (line 67).

Opening the Read Side of the Pipe

After the child has finished writing to the pipe, the parent opens the pipe for reading using fdopen() and the reading file descriptor p[0]. Fdopen() is analogous to fopen(), except that it uses a low-level file descriptor rather than a file. The rest of the function (lines 71 to 83) is the same for loop for gathering output lines described in the fopen() example.

The Write Pipe() Example

Writing to the pipe is possible as well as reading. The write coupling is the reverse of the read. Couple the read side of the pipe, p[0], to the standard input. Fdopen() is opened for writing.

Listing 3, run_wpipe(), is a function that receives an array-of-pointers to strings, the number of the strings, and the command string to be written to. This example writes three strings to an ascii file.

Conclusion

By mastering the basics of interprocess communications in C, system administrators can exploit the advantages of pipes from within their C programs. In particular, C programs which create pipes directly can reduce the need for trivial shell script files and temporary files.

References

Curry, David A. Using C on the UNIX System. Sebastopol, CA: O'Reilly & Associates, 1989.

Wood, Patrick and Stephen Kochan. UNIX System Security. Hayden Books, 1987.

Kochan, Stephen and Patrick Wood. Topics in C Programming. Hayden Books, 1987.

About the Author

Ed Schaefer is an Informix software developer and UNIX system administrator at jeTECH Data Systems of Moorpark, CA, where he develops Time and Attendance Software. He has been involved with UNIX and Informix since 1987 and was previously head of software development at Marie Callendar Pie Shops of Orange, CA.