Cover V11, I03

Article

mar2002.tar

Creating Black Box Functions in the Korn and Bash Shells

Ed Schaefer

The article "Creating Global Functions with the Korn Shell" by Rainer Raab (Sys Admin, March 2001, http://www.sysadminmag.com/documents/sam0103f/) describes using and autoloading global Korn shell functions. Although global functions are useful and definitely serve a purpose, good software design dictates using loosely coupled, "black box", user-defined functions.

Loosely coupled functions are defined as having flexible relationships to other objects depending little, if at all, on other functions. My design philosophy is to limit the use of globals. Steven McConnell, in his book Code Complete, states that using one global in a function as being acceptable, but accessing ten globals is pathological.

User-defined functions are "black box" if they do one thing and only one thing and limit accessing globals. Also, if the function does that one thing well, a programmer should only have to know what the function does, not how it does it.

Building on Rainer's work, in this article I present function examples using local variables, passing parameters to functions, returning values from functions, and setting the exit code from functions. The function examples presented are:

  • yesorno -- Identifies user input parameter as 1 or 0.
  • user_exist -- Identifies whether user id parameter is assigned to the OS.
  • down_shift -- Down shifts the input parameter to lower case.
  • new_down_shift -- Down shifts the input parameter to lower case, but handles positional parameters containing white space.
  • is_digit -- Determines whether the input parameter is a number.
  • is_regex_set -- Matches an input parameter and regular expression parameter.
  • readkey -- Returns a single keystroke eliminating the carriage return.
  • recursion_test -- Tests at what level recursion fails in the shell.

Also, I discuss dealing with positional parameters containing white space and conclude by commenting on function recursion in the shell.

The Development Environment

The functions in this article run under the Korn shell with SCO Open Server V, the Korn shell with Solaris 7 on Intel Architecture, and Red Hat 7.1 Bash shell on Intel Architecture.

The yesorno Function Example

In an interactive script requiring yes or no answers, do you want a consistent method for processing user input? The following function, yesorno, satisfies this requirement and demonstrates a simple black box function:

# yesorno:  This function takes one argument and if the
# value is Y or y returns 1 else it returns 0.  Pressing
# <CR> returns 0;
# usage: $(yesorno $answer)
function yesorno {
   case $1 in
      y|Y) echo 1;;
      *)
      echo 0;;
   esac
} # end yesorno
Think of a user-defined function as a mini shell script with the function parameters emulating the shell positional parameters. Thus, function parameter 1 is $1, function parameter 2 is $2, etc. Also, special characters $#, argument count, and $@, positional parameter list, are set for the function. More on the special characters later.

The echo command in the function serves to return the value. Instead of being sent to standard output, it is trapped and returned to the calling function where, optionally, it's assigned to a shell variable.

The following code block prompts the user for an answer, reads the input, calls the yesorno function with the answer as an argument, and returns 1 or 0, depending on the user's choice:

printf "Answer the question: Y/N: "
read answer
ret_val=$(yesorno $answer)
echo $ret_val
Calling the function in an if statement eliminates the need for the "ret_val" variable:

if [ $(yesorno $answer) -eq 0 ]
then
    echo "answered No"
else
    echo "answered Yes"
fi
Since the shell traps the output of a called function, command substitution works as an alternative to the $() function syntax:

retval='yesorno $answer' # grave mark or back-tick, not single quote
echo $retval
Command substitution syntax works adequately, but consider it obsolete because a future shell upgrade may eliminate its use.

The user_exist Function Example

Besides the output trapping method, shell functions may return integer values 0 to 255 via the UNIX exit code $?. Using the return keyword, return an integer value from the function and immediately check the exit code. Returning values greater than 255 or less than 0 yields inconsistent and inaccurate results.

The user_exist function determines whether the user id parameter has been system assigned by checking the beginning of the colon-delimited /etc/passwd file. Return 1 if the user exists or return 0 if not:

# user_exist:  This function takes an argument which is the
# user id, greps the first field of /etc/passwd and sets
# the exit code, 1 if user exists and 0 if not.
# sample the exit code $? from calling program.
function user_exist
{
    if grep "^$1:" /etc/passwd > /dev/null 2>&1
    then return 1
    else return 0
    fi
} # end user_exist
Sample the exist code immediately after returning from the user_exist function:

loginame=alexb
user_exist $loginame
if [ "$?" -eq 0 ]
then printf "User %s does NOT exist\n" $loginame
else printf "User %s exists\n" $loginame
fi
The down_shift Function Example

Use the UNIX typeset command to force a variable to lower or upper case:

typeset -l x   # x defined always lower case
typeset -u y   # y defined always upper case
If your shell's typeset command doesn't support the upper- and lowercase arguments (Red Hat 7.1's Bash shell doesn't), use the down_shift function:

# downshift: this function return the argument as downshifted string
# usage: cmmd=$(down_shift $cmmd)
function down_shift {

# alternate translate command
#echo $1 | tr "[:upper:]" "[:lower:]"
echo $1 | tr '[A-Z]' '[a-z]'
} # end down_shift

string="ALLLOWERCASE"
string=$(down_shift $string)
printf "%s\n" $string
Executing the down_shift function test:

string="ALLLOWERCASE"
string=$(down_shift $string)
echo $string
The down_shift function declares string variable to be local. This local declaration guarantees any other invocation of string in the script is not affected.

Dealing with Multiple Positional Parameters

If a parameter contains white space, the shell interprets the argument as multiple parameters, one parameter for each word.

Here's an example given a variable comprising of five words:

newstring="STRING OF LOWER CASE WORDS"
Executing the down_shift function with newstring as a parameter returns only the down-shifted word string, the first parameter. Since the shell evaluates newstring as five parameters, use the special character $@ to down shift all the parameters:

# new_down_shift: This function down shifts all positional
# parameters to lowercase and returns the string
function new_down_shift {

echo $@ | tr '[A-Z]' '[a-z]'
} # end new_down_shift

string="STRING OF LOWER CASE WORDS"
string=$(new_down_shift $string)
echo $string
The is_digit Function Example

The is_digit function uses a regular expression to determine whether an argument contains all digits:

# is_digit:  This function matches an all digits regular expression
# against an argument and returns 1 if the argument is digits else 0
# if isn't.
# Note the use of nawk.  Use gawk for the Bash shell.
function is_digit
{
echo $1|nawk ' { if ($0 ~ /^[0-9]+$/)
                    print 1
                  else
                    print 0 } '
} # end is_digit

num="345"
ret_val=$(is_digit $num)
echo $ret_val
The is_regex_set Function Example

Because of the white space problem, I seldom write functions that require more than one argument. The is_regex_set function is an exception. This function, a modification of is_digit, matches the first argument against the second argument regular expression:

# is_regex_set:  This function takes two arguments:
# argument 1 is the object to check
# argument 2 is a regular expression to match arg 1 against.  
# arg 2 is passed to the awk interpreter as a variable.
# if the match is true return 1 else return 0.
# Check for the argument count not equal 2.
function is_regex_set {
local pattern

if [ "$#" -ne 2 ]
then
   printf "ARGCNTNE2:"
fi

echo $1|nawk ' { if ($0 ~ pattern)
                    print 1
                 else
                    print 0 } ' pattern="$2"
} # end is_regex_set
To test is_regex_set, use a pattern that is a decimal with optional sign and fraction:

pattern="^[+-]?[0-9]+[.]?[0-9]*$"
number="-345.03"
ret_val=$(is_regex_set $number $pattern)
echo $ret_val
Remember that white space in either variable number or pattern can break the above function call. In a function where the parameter count is an issue, I often enter a warning string if the count isn't what it should be.

The readkey Function Example

The readkey function returns a single keystroke from standard input (i.e., the keyboard, without pressing the carriage return):

# readkey:  This function saves the current terminal settings and
# sets the terminal to raw mode with the Unix stty command.  Retrieve
# 1 character with the dd command and reset original terminal settings
# with stty before returning the character to calling script.
function readkey {
local anykey oldstty

   oldstty='stty -g'
   stty -icanon -echo min 1 time 0
   anykey='dd bs=1 count=1 <&0 2>/dev/null'
   stty $oldstty
   echo $anykey
} # end read_key
The following code block prompts the user for one character. Pressing uppercase "Y" tests whether anychar variable is really set:

printf "Get one character from the keyboard: "
anychar=$(readkey)
echo $anychar
if [ $anychar = 'Y' ]
then
   printf "Pressed Upper case Y\n"
fi

The readkey function returns a single character by manipulating the UNIX terminal driver. For more information, see my article "Returning a Single Character in a UNIX Shell Script" (Sys Admin, April 1997, http://www.sysadminmag.com/documents/sam9704f/).

Recursive Functions and the Shell

The shell supports recursion, but not very deeply. The following recursion test adds one to a local variable until it reaches an iteration level value passed on the command line:

iteration_level=$1 # How many times to call recursion_test

# recursion_test: This function adds one to it's argument and keeps
# calling itself until it equals the global iteration_level value
# or the function fails.
function recursion_test {
local x

   x=$(($1+1))
   if [[ x -ge $iteration_level ]]
   then
      echo $x
      exit 1
   fi
   x=$(recursion_test $x)
   echo $x
} # end recursion_test

x=0
recursion_test $x
This recursion test doesn't fail gracefully. Expect a "process fork failing", an "out of swap space", "too many open files", or an "out of memory" error. My minimal testing produces failures at these levels:

                         iteration_level
    Open Server 5 Korn:        429
        Solaris 7 Korn:        440
Red Hat 7.1 Linux Bash:       3500
Conclusion

This discussion has presented seven user-defined shell functions illustrating passing parameters and returning values. Another shell function tested the recursion limits of the shell.

Some UNIX professionals might complain that shell functions hinder performance. Possibly, but is not a small sacrifice in system performance worth easing shell script maintenance later?

References

McConnell, Steven C. 1993. Code Complete. Redmond, WA: Microsoft Press.

O'Brien, Dennis and David Pitts 2001. Korn Shell Programming by Example. Indianapolis, IN: Que.

Rosenblatt, Bill 1993. Learning the Korn Shell. Sebastopol, CA: O'Reilly & Associates, Inc.

Ed Schaefer is a frequent contributor to Sys Admin. He is a Software Developer and DBA for Intel's Factory Integrated Information Systems, FIIS, in Aloha, Oregon. Ed also hosts the UnixReview.com monthly Shell Corner column. He can be reached at: olded@ix.netcom.com.