Using the UNIX Pipe in C
Ed Schaefer
Virtually all UNIX system administrators exploit the
power and convenience
offered by pipes when they are working from the command
line or writing
shell scripts. By connecting general-purpose filters
with pipes, the
system administrator can quickly generate tidy custom
reports and
analytic tools without the overhead and housekeeping
problems associated
with myriad temporary files.
The pipe, with all its advantages, is also available in
C (through the library functions popen() and pipe()), but
system administrators are typically less inclined to
use pipes in
their C programs. In this article I'll first explain
how to use popen()
from within a C program to read directly from the output
of another
program. Then I'll present two more general C functions
- one
for reading from the pipe and one for writing to the
pipe - and
use these functions to illustrate how pipes are used
from within a
C program.
The popen() Example
The popen() function allows a C program to spawn a task
and
then attach itself to the input or output of that task
- which
is most commonly a set of pipelined commands. In its
simplest application,
the popen() function allows you to connect a special-purpose
C program to a sequence of standard commands without
writing a separate
piece of shell script. For example, if a special usage
report required
you to analyze a sorted list of current users, you could
either write
a shell script to generate and sort the list and pipe
the results
into a special C program which performs the analysis:
who | sort | analyze
or you could avoid the separate script file by using
popen(). The call
input_stream = popen( "who | sort", "r")
executes the pipelined task "who | sort" and connects
its output (in read mode - "r") to the stream input_stream.
The calling program (analyze in this example) can
now read sort's
output directly from the resulting pipe.
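As a minimal sketch of that call in context (this is not the author's
Listing 1), the helper below opens a command in read mode and counts
the lines it prints. The name count_output_lines() and the 256-byte
buffer are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: run `command` through popen() in read mode and
 * count the lines it prints. Returns -1 if the pipe cannot be opened. */
int count_output_lines(const char *command)
{
    FILE *input_stream;
    char chunk[256];
    int count = 0;
    int partial = 0;   /* nonzero while inside a line with no newline yet */

    input_stream = popen(command, "r");
    if (input_stream == NULL)
        return -1;

    while (fgets(chunk, sizeof chunk, input_stream) != NULL) {
        size_t len = strlen(chunk);

        if (len > 0 && chunk[len - 1] == '\n') {
            count++;          /* a complete line has been read */
            partial = 0;
        } else {
            partial = 1;      /* long line or missing final newline */
        }
    }
    if (partial)
        count++;              /* last line had no trailing newline */

    pclose(input_stream);     /* popen() streams are closed with pclose() */
    return count;
}
```

Calling count_output_lines("who | sort") would report how many users
are logged in, which is the kind of figure the analyze program might
start from.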
Like fopen(), popen() returns a pointer to a FILE
structure, but with popen() the opened object is a command
line to the shell rather than a file. Since pipes are
first-in, first-out
constructs, the program may not use fseek() to adjust
the next
read or write position. Calling popen() with a "w" parameter
opens the pipe for writing: popen() executes the command
string and attaches the resulting task's standard input to
the stream returned to the calling program.
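Write mode can be sketched the same way. The helper below is an
illustration only; the name send_lines() and its parameters are
assumptions, not part of the article's listings.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: open `command` with popen() in write mode and
 * send it `count` strings, one per line. Returns the value of
 * pclose() (0 when the command exits successfully), or -1 if the pipe
 * cannot be opened. */
int send_lines(const char *command, const char *lines[], int count)
{
    FILE *output_stream;
    int i;

    output_stream = popen(command, "w");
    if (output_stream == NULL)
        return -1;

    for (i = 0; i < count; i++)
        fprintf(output_stream, "%s\n", lines[i]);

    /* pclose() flushes the stream, closes the pipe so the command
     * sees end-of-file, and waits for the command to finish. */
    return pclose(output_stream);
}
```

For example, send_lines("sort > report.txt", names, 3) would hand three
strings to sort and leave the sorted result in report.txt (a file name
chosen here only for illustration).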
The example program in Listing 1 uses popen() in the
read mode.
This do-nothing program uses popen() to read the output
from
a fairly complex pipeline. The program reads the pipeline's
output,
stores the output in memory, and then prints the saved
result.
To simplify memory management, the program copies strings
read from
the shell to an array buffer and stores a pointer to
the beginning
of each string in an array-of-pointers. Both of these
arrays and the
piped string are passed to the main function, run_popen().
The run_popen() function is just an open, a for
loop (which reads and saves each string), and a close.
This
function has several points worth mentioning: the new-line
character
that fgets() returns is removed (line 48), and a pointer
is
alternately saved and updated by the length of the returned
string
plus one as long as fgets() doesn't return NULL (lines
51 to 53). The only error conditions it traps are not
being able to
open the pipe and trying to copy a string beyond the
end of the buffer.
The function ends by returning the number of strings
read from the
pipe.
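Since Listing 1 is not reproduced here, the following is only a sketch
of a function in the spirit of run_popen(), built from the description
above. The name collect_lines(), its parameter list, and the 512-byte
line buffer are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the run_popen() described in the text.
 * Copies each line of `command`'s output into `buffer` and stores a
 * pointer to the start of each copy in `strings`. Returns the number
 * of strings stored, or -1 if the pipe cannot be opened or a string
 * would not fit. */
int collect_lines(const char *command, char *buffer, size_t buffer_size,
                  char *strings[], int max_strings)
{
    FILE *in;
    char line[512];
    char *next = buffer;      /* where the next string will be copied */
    int count = 0;

    in = popen(command, "r");
    if (in == NULL)
        return -1;

    while (fgets(line, sizeof line, in) != NULL) {
        size_t len = strlen(line);

        /* Remove the newline that fgets() keeps. */
        if (len > 0 && line[len - 1] == '\n')
            line[--len] = '\0';

        /* Refuse to copy past the end of either array. */
        if (count >= max_strings ||
            len + 1 > buffer_size - (size_t)(next - buffer)) {
            pclose(in);
            return -1;
        }

        memcpy(next, line, len + 1);   /* copy string and terminator */
        strings[count++] = next;       /* save pointer to this string */
        next += len + 1;               /* advance by length plus one */
    }

    pclose(in);
    return count;
}
```

A caller would declare a character buffer and an array of pointers,
pass both in along with the command string, and then print strings[0]
through strings[count - 1].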
Besides creating the variables needed and calling the
function, main()
prints the output from the pipe saved in the array.
Notice that since the backslash (\) is a special character
in C strings, it must be escaped (\\)
when used literally.
Skipping the Shell
A popen() call implies substantial overhead, since popen()
must invoke a full shell to process the command string
used to specify
the communicating task. If your C program only needs
to communicate
with a single other command, you can avoid this overhead
by using
pipe() and fork() instead of popen().
The approach involves creating another process with
fork(),
taking control of standard input and output, and executing
a child
process with one of the exec() functions. In his book
Using
C on the UNIX System, David Curry refers to this as
doing your
own plumbing.
What allows one process to talk to another is that open
file descriptors
are kept open across exec() calls. A file descriptor
is a small integer; standard input, standard output, and
standard error have
the values 0, 1, and 2, respectively. The key to working
with pipes
is coupling standard input or output to the reading
or writing end
of the pipe by means of another file descriptor. The
next example,
run_rpipe(), does this by having a child process write
to its
standard output while the parent reads from the read end
of the pipe.
The Read Pipe() Example
Listing 2, the run_rpipe() function, corresponds to
this pseudocode:
1) Create a pipe with pipe().
2) Create another process with fork().
3) Couple standard output to the writing end of the
pipe.
4) Close file descriptors.
5) Execute the command string.
6) Close the write side of the pipe.
7) Open the read side of the pipe using fdopen().
8) Read to the end of the pipe.
9) Close the file pointer.
Creating a Pipe
The pipe() call, used to create a pipe, takes
as its
argument an array of two integers, which it fills in with
the pipe's file descriptors. Given
that p[2] is the array, p[0] is the read descriptor
and p[1] the write descriptor.
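To see the two descriptors at work before any second process is
involved, here is a small sketch that sends a short message through a
pipe within a single process. The function name pipe_roundtrip() is an
assumption for illustration.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: write `msg` into a pipe and read it back in the
 * same process. Keep `msg` short; a write larger than the pipe's
 * capacity would block because nobody is reading yet. Returns the
 * number of bytes read, or -1 on error. */
int pipe_roundtrip(const char *msg, char *out, size_t out_size)
{
    int p[2];
    ssize_t n;

    if (out_size == 0 || pipe(p) == -1)
        return -1;

    /* p[1] is the write descriptor. */
    if (write(p[1], msg, strlen(msg)) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    close(p[1]);              /* no writers left: reader can see EOF */

    /* p[0] is the read descriptor. */
    n = read(p[0], out, out_size - 1);
    close(p[0]);
    if (n < 0)
        return -1;

    out[n] = '\0';
    return (int)n;
}
```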
Creating a Child Process
Once a pipe has been created, a child process is created
by a call
to fork(). If the fork() call is successful, it returns
the process ID of the child to the parent and 0 to the
child (line 55). While the child
is executing,
the wait() call suspends execution of the parent (line
85).
Without the wait() call, the parent would continue execution
without waiting for the child, effectively running the
child in the
background.
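The fork() and wait() pairing can be sketched on its own, without any
pipe. In this illustration the child simply exits with a given code;
the function name fork_and_wait() is an assumption.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: create a child that exits with `code`, suspend
 * the parent in wait() until that child is done, and return the
 * child's exit status (or -1 on error). */
int fork_and_wait(int code)
{
    pid_t pid, done;
    int status = 0;

    pid = fork();
    if (pid == -1)
        return -1;            /* fork() failed: no child was created */

    if (pid == 0)
        _exit(code);          /* child: fork() returned 0 */

    /* Parent: fork() returned the child's process ID. */
    do {
        done = wait(&status);
    } while (done != pid && done != -1);

    if (done == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```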
Coupling the Standard Output to the Write Side of the Pipe
The numbers of the standard descriptors 0, 1, and 2 are
fixed, but what each one refers to is not: closing a
descriptor frees its number, and dup() makes the lowest
free number refer to the same open file as the descriptor
passed to it. Since the
child process will be writing to the pipe, I close the
standard output
(line 60).
I then call dup() on p[1] to create another file descriptor
(line
61). Since the dup() call returns the lowest descriptor
not
in use, having just closed the standard output, descriptor 1,
guarantees that
the write side of the pipe, p[1], is coupled to the
standard
output.
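The close-then-dup idiom can be tried without exec(). In the sketch
below (an illustration, not Listing 2) the child couples descriptor 1
to the pipe and writes to it directly, while the parent reads the
other end. The name child_stdout_via_pipe() is an assumption, and the
code assumes descriptor 0 is already open so that 1 is the lowest free
number after the close.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child writes `msg` to descriptor 1 after
 * coupling it to the pipe; the parent collects the text in `out`.
 * Returns the number of bytes read, or -1 on error. */
int child_stdout_via_pipe(const char *msg, char *out, size_t out_size)
{
    int p[2];
    pid_t pid;
    ssize_t n;
    size_t total = 0;

    if (out_size == 0 || pipe(p) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    if (pid == 0) {                       /* child */
        close(1);                         /* free descriptor 1 */
        if (dup(p[1]) != 1)               /* lowest free number is 1 */
            _exit(127);
        close(p[0]);                      /* originals no longer needed */
        close(p[1]);
        if (write(1, msg, strlen(msg)) == -1)
            _exit(1);
        _exit(0);
    }

    close(p[1]);                          /* parent is not writing */
    while (total < out_size - 1 &&
           (n = read(p[0], out + total, out_size - 1 - total)) > 0)
        total += (size_t)n;
    close(p[0]);
    out[total] = '\0';

    wait(NULL);                           /* collect the child */
    return (int)total;
}
```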
Closing File Descriptors
In the child process, once an end of the pipe has been
coupled to standard input
or output, the original pipe descriptors no longer have
to be open. They should be
closed since there
is a limit to the number of file descriptors a process
may have open
(lines 62 and 63).
Executing the Command String
Once the coupling to standard output is completed, I
execute the command
string with a call to the shell using execlp() (line
64).
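As an isolated sketch of this step, the function below hands a command
string to the shell with execlp() in a child process and reports the
exit status. The "sh -c" invocation and the name run_shell_command()
are assumptions; the article's listing may pass different arguments.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: execute `command` through the shell in a child
 * process. Returns the command's exit status, or -1 on error. */
int run_shell_command(const char *command)
{
    pid_t pid;
    int status;

    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* execlp() searches PATH for "sh"; the argument list must end
         * with a null pointer. On success it never returns. */
        execlp("sh", "sh", "-c", command, (char *)NULL);
        _exit(127);           /* reached only if execlp() failed */
    }

    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because descriptors stay open across exec(), anything coupled to
descriptor 0 or 1 before the execlp() call is what the command will
read from or write to.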
Closing the Write Side of the Pipe
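Back in the parent process, I close
the write side
of the pipe since the parent isn't writing (line 67).
This close is required, not just tidy: the parent's read
loop sees end-of-file only after every copy of the write
descriptor has been closed.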
Opening the Read Side of the Pipe
The parent then opens
the read side of the pipe as a stream using fdopen() and
the reading
file descriptor
p[0]. fdopen() is analogous to fopen(), except
that it takes an already open low-level file descriptor
rather than
a file name. The rest
of the function (lines 71 to 83) is the same for loop
for gathering
output lines described in the popen() example.
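Putting the nine steps of the pseudocode together gives something like
the sketch below. It is not Listing 2: the name read_pipe_lines(), the
fixed-width line array, and the "sh -c" invocation are all assumptions
made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RPIPE_LINE_LEN 128    /* hypothetical maximum line length */

/* Hypothetical sketch of a run_rpipe()-style function. Stores up to
 * `max_lines` lines of `command`'s output (longer lines are split)
 * and returns the number stored, or -1 on error. */
int read_pipe_lines(const char *command, char lines[][RPIPE_LINE_LEN],
                    int max_lines)
{
    int p[2];
    pid_t pid;
    FILE *in;
    int count = 0;

    if (pipe(p) == -1)                        /* 1) create a pipe */
        return -1;

    pid = fork();                             /* 2) create a process */
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    if (pid == 0) {                           /* child */
        close(1);                             /* 3) couple stdout to */
        if (dup(p[1]) != 1)                   /*    the write end    */
            _exit(127);
        close(p[0]);                          /* 4) close descriptors */
        close(p[1]);
        execlp("sh", "sh", "-c", command, (char *)NULL);   /* 5) */
        _exit(127);                           /* exec failed */
    }

    close(p[1]);                              /* 6) close write side */
    in = fdopen(p[0], "r");                   /* 7) open read side */
    if (in == NULL) {
        close(p[0]);
        wait(NULL);
        return -1;
    }

    while (count < max_lines &&               /* 8) read the pipe */
           fgets(lines[count], RPIPE_LINE_LEN, in) != NULL) {
        lines[count][strcspn(lines[count], "\n")] = '\0';
        count++;
    }

    fclose(in);                               /* 9) close file pointer */
    wait(NULL);                               /* collect the child */
    return count;
}
```

Reading before calling wait() matters in this arrangement: a child
that produces more output than the pipe can hold would block until the
parent starts reading.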
The Write Pipe() Example
Writing to a pipe is just as possible as reading from one.
The write coupling
is the reverse of the read: the child couples the read
side of the pipe, p[0],
to its standard input, and the parent calls fdopen() on
p[1] in write mode.
Listing 3, run_wpipe(), is a function that receives
an array-of-pointers
to strings, the number of strings, and the command
string to be
written to. This example writes three strings to an
ASCII file.
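A write-side counterpart might look like the sketch below. Again, this
is not Listing 3; the name write_pipe_lines() and the "sh -c"
invocation are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical sketch of a run_wpipe()-style function. Sends `count`
 * strings, one per line, to the standard input of `command`. Returns
 * the command's exit status, or -1 on error. If the command exits
 * without reading, the parent may receive SIGPIPE; this sketch does
 * not handle that case. */
int write_pipe_lines(const char *command, const char *strings[],
                     int count)
{
    int p[2];
    pid_t pid;
    FILE *out;
    int i, status;

    if (pipe(p) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    if (pid == 0) {                       /* child */
        close(0);                         /* couple stdin to the */
        if (dup(p[0]) != 0)               /* read end of the pipe */
            _exit(127);
        close(p[0]);
        close(p[1]);
        execlp("sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                       /* exec failed */
    }

    close(p[0]);                          /* parent is not reading */
    out = fdopen(p[1], "w");              /* open write side */
    if (out == NULL) {
        close(p[1]);
        waitpid(pid, NULL, 0);
        return -1;
    }

    for (i = 0; i < count; i++)
        fprintf(out, "%s\n", strings[i]);

    fclose(out);      /* flush and close so the child sees end-of-file */

    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling write_pipe_lines("cat > outfile", strings, 3) would write three
strings to a file, much as the article's example does (the file name
outfile is chosen here only for illustration).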
Conclusion
By mastering the basics of interprocess communication
in C, system
administrators can exploit the advantages of pipes from
within
their C programs. In particular, C programs which create
pipes directly
can reduce the need for trivial shell script files and
temporary files.
References
Curry, David A. Using C on the UNIX System.
Sebastopol, CA: O'Reilly & Associates, 1989.
Wood, Patrick and Stephen Kochan. UNIX System
Security. Hayden Books, 1987.
Kochan, Stephen and Patrick Wood. Topics in C
Programming. Hayden Books, 1987.
About the Author
Ed Schaefer is an Informix software developer and UNIX
system
administrator at jeTECH Data Systems of Moorpark, CA,
where he develops
Time and Attendance Software. He has been involved with
UNIX and Informix
since 1987 and was previously head of software development
at Marie
Callendar Pie Shops of Orange, CA.