Monday, March 30, 2009

CSH scripting basics

Introduction
Rather than going into a "how to program" tutorial, we'll simply show a few examples of scripts, what they do, and how to modify them. In general, a shell script is a program just like any other and thus may have bugs and "features" that you did not expect. Try to keep scripts simple: put the "real" programming into your executable, not the script. Use the script to set up data files, pre- and post-process data, and sequence operations. We'll show csh syntax, although any other shell can be used; the conditionals and loop constructs may be slightly different.

#!/bin/csh
#
cd data
./runme

Note that lines beginning with a # (pound) character are comments. The first line specifies which shell is to be used, csh in this example. While it is technically a comment line, the system always looks at the first line for this "#!" construct when the script is executed, and that tells it which program should interpret the script.

As with the csh shell itself, variables may be referenced with a dollar sign. So after we "set p = 2", we can refer to variable p by "echo $p", for example. Note that variables are assumed to be strings unless explicitly operated on with a mathematical operation using the @ functionality (at-sign, see below).

Loops
Here's a simple loop example:

#!/bin/csh
#
cd data
foreach l ( a b c 123 456 red blue green )
    runprog < datafile.$l.in > datafile.$l.out
end

The foreach line loops through all entries in the paren list, setting the variable l to each one and then executing the loop body. In this case, the loop will be executed 8 times. During the first trip through the loop, variable l will have the value a. The second trip will have l equal to b, etc. When the loop executes with l equal to 123, remember that it is the string "123" and not a number.

Note that we are using input and output redirection, that is, we are using a file to simulate "keyboard input" to the program and we are capturing any screen output to a second file.

If..Then

#!/bin/csh
#
cd data
foreach l ( a b c 123 456 red blue green )
    if ( ! -e datafile.$l.out ) then
        runprog < datafile.$l.in > datafile.$l.out
    endif
end

Inside the loop, we now test a conditional. The '!' means logical negation and the '-e' means "does the file exist". So the end result is: if a file called datafile.$l.out does not exist, then we run the program runprog (and any other statements until we hit the endif). There are a number of tests you can do in the 'if' clause. Logical and file-test operations are described in the next sections.

Finally, the script executes the program, using input and output redirection with the angle brackets (less-than and greater-than signs). Note that if you have done "set noclobber", then you cannot redirect output to an existing file with a plain >; an error will occur. You can use >> to append the current output to an existing file. If you want to force the overwriting of an existing file, you can also use >!. To redirect the "standard error" output together with the standard output, use >& or >>&.

There is an option for an 'else' section, which will execute if the test condition is false. For several 'If..Then' clauses in a row, you may use 'else if' statements in between, then close the whole group with a single 'endif'.

Note that the idea of the above script is that the exact same script can be submitted over and over again and it will ONLY run jobs that have not been completed yet. Once a job has run, it will have produced an output file called "datafile.$l.out" and thus the "if" clause will stop it from being executed again.

Logical Operators
The exclamation point signifies logical NOT, i.e. "! -e file" is true if the file does NOT exist. Logical AND is performed with && (double ampersands), logical OR is done with || (double bar).

You can compare strings using '==' (equal) and '!=' (not equal), as well as '=~' and '!~' which allow for wildcard characters ('*' and '?') on the right-hand side of the test. The '*' wildcard will match any number of any character, the '?' will match exactly one character. You can also use square brackets to allow a match to one-of-N choices, e.g. '[ABC]' will match only one character of 'A', 'B', or 'C'.

For numeric values, you can use '>', '>=', '<=', and '<'.

File-test Operators
There are a number of operators you can use to test different attributes of a file:

-e file    true if file exists
-o file    true if file exists and is owned by the user
-r file    true if file exists and is readable
-w file    true if file exists and is writable
-x file    true if file exists and is executable
-z file    true if file exists and is zero size
-d dir     true if dir exists and is a directory

Doing Math
To illustrate some math in a script, we'll use the following example:

#!/bin/csh
#
cd data
#
foreach p ( 1 2 4 8 16 )
    @ s = $p * 25
    foreach l ( a b c )
        prog -p=$p -size=$s
    end
end

Again, recall that all variables are assumed to be strings. The '@' (at sign) indicates that we want to interpret the string as a number and then do some math with it.

So in this script, the variable "$p" is a string, but we temporarily interpret it as a number, multiply it by 25, and then set that result (as a string) into variable "$s".
