Along with this document are some "helper" REXX scripts that your main REXX (CGI) script may call to facilitate dealing with a CGI interface. These helper scripts are based upon the cgi-lib.rxx REXX library of functions, modified to suit the Reginald REXX interpreter for Windows.
For more information on Server Side CGI, see the WWW Virtual Library. Also, you may wish to peruse the Web Development Center.
Since there are security and other risks associated with executing user scripts in a WWW server, you may wish to first view a document providing information on a SLAC Security Wrapper for users' CGI scripts. Besides improving security, this wrapper also simplifies the task of writing a CGI script for a beginner.
Before embarking on writing a script, you may also want to check out some rough notes on SLAC Web Utilities Provided by CGI Scripts.
The CGI is an interface for running external programs, or gateways, under an information server. Currently, the supported information servers are HTTP (the Transport Protocol used by WWW) servers.
Gateway programs are executable programs which can be run by themselves (though you wouldn't want to, except for debugging purposes). They have been made executable to allow them to run under various (possibly very different) information servers interchangeably. The CGI can run such a gateway program, pass it data from a form on a web page, and have the gateway program return content to be sent to the client.
To be able to run a REXX script as a gateway program, you need to tell the CGI to run Reginald's script launcher as the gateway program (and also pass the name of the script that you wish to run). A special version of Reginald's script launcher has been included, called REGINALD.EXE. You should use this to run your script as a gateway program (rather than using RXLAUNCH.EXE).
To determine how to set up your CGI server to run REGINALD.EXE and pass it the name of the script to run, consult the documentation with your CGI server.
QUERY_STRING
The CGI may set the environment variable named QUERY_STRING to any text which follows the first ? in the URL used to access your gateway. Such text could be added by an HTML ISINDEX document, or by an HTML Form (with the GET method). It could also be manually embedded in an HTML hypertext link, or anchor, which references your gateway. This text will usually be an information query (e.g. what the user wants to search for in databases) or perhaps the encoded results of your feedback Form. The text can be retrieved in REXX as follows:
string = GETENV('QUERY_STRING')
This text is encoded in the standard URL format, which changes spaces to + and encodes special characters as %xx hexadecimal escapes. You will need to decode it in order to use it. The included REXX function CgiDeWeb() shows how to decode those special characters.
If your server is not decoding results from a Form, the CGI will also pass the query string (decoded for you) onto the command line. This means that your REXX script can get the query string via a PARSE ARG instruction.
For example, if you have the URL...
http://www.slac.stanford.edu/cgi-bin/foo?hello+world
...and you use the REXX instruction...
PARSE ARG Arg1 Arg2
...then Arg1 will contain hello and Arg2 will contain world (i.e. the + sign is replaced with a space, so PARSE ARG will break up the argument at that space, into Arg1 and Arg2). If you choose to use PARSE ARG to retrieve the input, you need to do less processing on the data before using it because it is already decoded.
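The URL decoding described above for QUERY_STRING can be sketched roughly as follows. This is only an illustration of the technique, not the code of the supplied CgiDeWeb() helper. The function name DeWeb and the sample string are assumptions made for this example, and the result shown assumes an ASCII system.

```rexx
/* Sketch of URL decoding. DeWeb is a hypothetical name used for  */
/* illustration; it is not the supplied CgiDeWeb() helper.        */
SAY DeWeb('hello+world%21')      /* displays: hello world! */
EXIT

DeWeb: PROCEDURE
  PARSE ARG encoded
  hexDigits = '0123456789abcdefABCDEF'
  /* Change + to a space first, so that %2B still decodes to a real + */
  encoded = TRANSLATE(encoded, ' ', '+')
  decoded = ''
  DO FOREVER
    p = POS('%', encoded)
    IF p = 0 THEN LEAVE
    decoded = decoded || LEFT(encoded, p - 1)
    hex = SUBSTR(encoded, p + 1, 2)   /* padded with blanks if too short */
    IF VERIFY(hex, hexDigits) = 0 THEN DO
      /* A valid %xx escape: convert the two hex digits to one character */
      decoded = decoded || X2C(hex)
      encoded = SUBSTR(encoded, p + 3)
    END
    ELSE DO
      /* Not a valid escape: keep the % as it is and carry on after it */
      decoded = decoded || '%'
      encoded = SUBSTR(encoded, p + 1)
    END
  END
  RETURN decoded || encoded
```

Translating + before handling the %xx escapes matters: done the other way round, a literal plus sign sent as %2B would be turned into a space.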
PATH_INFO
Much of the time, you will want to send data to your gateways which the client shouldn't muck with. Such information could be the name of the Form which generated the results they are sending.
CGI allows for extra information to be embedded in the URL for your gateway, which can be used to transmit extra context-specific information to the scripts. This information is usually made available as "extra" information after the path of your gateway in the URL. This information is not encoded by the server in any way. The CGI sets the environment variable named PATH_INFO to this extra text, and your REXX script can retrieve it as follows:
string = GETENV('PATH_INFO')
To illustrate, let's say I have a REXX script which is accessible to my server with the name foo. When I access foo from a particular document, I want to tell foo that I'm currently in the English language directory, not the Pig Latin directory. In this case, I could access my script in an HTML document as:
<A HREF="http://www/cgi-bin/foo/language=english">foo</A>
When the server executes foo, it will set PATH_INFO to /language=english, and my script can retrieve this with GETENV(), decode this and act accordingly.
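As a rough sketch of what "act accordingly" might look like, the extra path text from the example above can be split into a name and a value with a single PARSE instruction. The variable names are assumptions made for this example. The PATH_INFO text is assigned directly here so that the fragment runs on its own; in a real gateway it would come from GETENV('PATH_INFO').

```rexx
/* Sketch: split extra path text such as /language=english into   */
/* a name and a value. In a real gateway the text would come from */
/* GETENV('PATH_INFO'); it is assigned directly here for the demo.*/
pathInfo = '/language=english'

/* Skip the leading slash, then split at the first equal sign */
PARSE VAR pathInfo '/' name '=' value

SAY 'Setting' name 'is' value    /* displays: Setting language is english */
```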
The PATH_INFO and the QUERY_STRING may be combined. For example, consider the URL:
http://www/cgi-bin/htimage/usr/www/img/map?404,451
The above URL will cause the server to run the script called htimage. It would pass the remaining path information "/usr/www/img/map" to htimage in the PATH_INFO environment variable, and pass "404,451" in the QUERY_STRING variable. In this case, htimage is a script for implementing active maps supplied with the CERN HTTPD.
Standard Input
If a web page has METHOD="POST" in its FORM tag, your REXX script will receive the encoded Form input via the standard input stream. (That is, the CGI will write the data to standard input before your script starts, and that data will be waiting for your script to read via CHARIN(). You omit the filename argument to CHARIN() in order to read from the standard input stream.) The CGI will also set the environment variable named CONTENT_LENGTH to the number of characters in the data.
Here's how you would read the data:
data = CHARIN(, 1, GETENV('CONTENT_LENGTH'))
Note: Once you read that data via CHARIN(), you can no longer retrieve it with another call to CHARIN(). So, if you need to call another script that requires access to that data, you can pass the data to the other script, and the other script can use a USE ARG (or PARSE ARG, or ARG()) instruction to access it.
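Because CONTENT_LENGTH arrives as plain text from the environment, a script may want to check it before using it as a length for CHARIN(). The following is a minimal sketch of such a check. PostLength is a hypothetical helper name invented for this example, not one of the supplied scripts, and the sample values are made up.

```rexx
/* Sketch: sanity-check a CONTENT_LENGTH value before using it    */
/* as a CHARIN() length. PostLength is a hypothetical helper      */
/* name, not one of the supplied scripts.                         */
SAY PostLength('27')       /* displays: 27 */
SAY PostLength('')         /* displays: 0  */
SAY PostLength('abc')      /* displays: 0  */
EXIT

PostLength: PROCEDURE
  PARSE ARG raw
  raw = STRIP(raw)
  /* A missing value or anything that is not a whole number counts as 0 */
  IF \DATATYPE(raw, 'W') THEN RETURN 0
  IF raw < 0 THEN RETURN 0
  RETURN raw + 0           /* normalise, so 027 becomes 27 */
```

In a gateway, the checked value could then be used in place of the raw environment variable, for instance as data = CHARIN(, 1, PostLength(GETENV('CONTENT_LENGTH'))).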
You can review the REXX script testinput.rex for an example of how to read the various forms of input into your script.
The helper functions CgiReadForm.rex and CgiReadPost() may be used to simplify the task of reading input from a Form.
Form data is a stream of name=value pairs separated by the ampersand (&) character. Each name=value pair is URL encoded (i.e. spaces are changed into plus signs and some characters are encoded into hexadecimal). To decode the Form data, you must first parse the Form data block into separate name=value pairs tossing out the ampersands. Then you must parse each name=value pair into the separate name and value. Use the first equal sign you encounter to split the data. (If there is more than one, then something is wrong with the data). Toss out the equal sign. Finally, undo the URL encoding of each name and value. The helper function CgiGetVariables.rex can perform this task and stuff the results into stem variables of your choosing.
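The parsing steps just described can be sketched as follows. This is an illustration only and is not the code of CgiGetVariables.rex. The stem names fieldName. and fieldValue., the function name DeWeb and the sample Form data are all assumptions made for this example.

```rexx
/* Sketch: split URL-encoded Form data into decoded names and     */
/* values. The stems, the DeWeb function and the sample data are  */
/* hypothetical; this is not the supplied CgiGetVariables.rex.    */
formData = 'name=Jane+Doe&comment=50%25+done'
count = 0
DO WHILE formData \== ''
  /* Peel off one name=value pair, tossing out the ampersand */
  PARSE VAR formData pair '&' formData
  IF pair == '' THEN ITERATE
  /* Split at the first equal sign only, tossing it out */
  PARSE VAR pair name '=' value
  count = count + 1
  /* Undo the URL encoding of the name and of the value */
  fieldName.count = DeWeb(name)
  fieldValue.count = DeWeb(value)
END
fieldName.0 = count

DO i = 1 TO count
  SAY fieldName.i '=' fieldValue.i
END
/* displays: name = Jane Doe      */
/*           comment = 50% done   */
EXIT

DeWeb: PROCEDURE
  PARSE ARG encoded
  encoded = TRANSLATE(encoded, ' ', '+')   /* + means a space */
  decoded = ''
  DO FOREVER
    p = POS('%', encoded)
    IF p = 0 THEN LEAVE
    decoded = decoded || LEFT(encoded, p - 1)
    hex = SUBSTR(encoded, p + 1, 2)
    IF VERIFY(hex, '0123456789abcdefABCDEF') = 0 THEN DO
      decoded = decoded || X2C(hex)        /* valid %xx escape */
      encoded = SUBSTR(encoded, p + 3)
    END
    ELSE DO
      decoded = decoded || '%'             /* leave a stray % alone */
      encoded = SUBSTR(encoded, p + 1)
    END
  END
  RETURN decoded || encoded
```

Each pair is split before it is decoded, so an encoded ampersand or equal sign inside a value (%26 or %3D) cannot be mistaken for a separator.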
When parsing the name and value information in the script, you need to be aware that:
To send data back to the server, you simply use the SAY instruction (followed by the data you wish to send). Each time you use SAY, another line of data is sent to the CGI interface (and ultimately, the client).
In order to tell the CGI interface what kind of document you are sending back, you must first send back a "header". This consists of two lines that you must SAY (i.e. you'll use two SAY instructions to send the header).
The first line must indicate the MIME type of the document you will be outputting. Typically, there is a content type, followed by a slash, and then a sub-type.
Some common MIME types are:
Content-type: type/subtype
where type/subtype is the MIME type and subtype for your output, as listed above.
The second line should be blank (i.e. you use a lone SAY instruction, with nothing after it). Once the CGI interface retrieves this line, it knows that you have finished describing your document type, and that you will now begin SAY'ing the actual content of your document. If you skip this second line, the CGI interface will attempt to parse your output trying to find further information about your document type and you will become very unhappy.
For example, if you wish to send back an HTML document, your header would be sent as so:
SAY 'Content-type: text/html'
SAY
/* Here you would SAY the actual HTML contents starting with an <HTML> tag */
The helper function CgiPrintHeader.rex can assist in outputting a header.
After these two lines have been SAY'ed, anything more you SAY will be included in the document sent to the client. This output must be consistent with the Content-type header. For example, if the header specified Content-type: text/html then the following lines you SAY must include HTML formatting, such as <BR> or <P> for starting new lines, or <PRE> to turn off HTML's automatic formatting.
For example, here we write a simple web page that says "Hello world":
MyTitle = 'Hello World'   /* the page title; assign it before it is used */
SAY 'Content-type: text/html'
SAY
SAY "<HTML><HEAD><TITLE>"
SAY MyTitle
SAY "</TITLE></HEAD><BODY><H1>Hello World</H1></BODY></HTML>"
If your script encounters errors (e.g. no input provided when you need it, invalid characters found in the input, requested an invalid command to be executed, invalid syntax or undefined variable encountered in the REXX script), your script should provide detailed information on what is wrong etc, and SAY this to the CGI interface so that the information is relayed to the client.
CgiError.rex demonstrates writing out an HTML document with an error message.
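A minimal sketch of such an error page is shown below. It is not the code of the supplied error script. The routine names ErrorPage and EscapeHtml and the sample message are assumptions made for this example. The message is escaped before it is sent, so that any <, > or & characters taken from user input are displayed as text rather than being treated as HTML.

```rexx
/* Sketch: report an error to the client as a small HTML page.    */
/* ErrorPage and EscapeHtml are hypothetical names; this is not   */
/* the supplied error script.                                     */
CALL ErrorPage 'Invalid character < found in the input'
EXIT

ErrorPage: PROCEDURE
  PARSE ARG message
  SAY 'Content-type: text/html'
  SAY                                  /* the blank line ends the header */
  SAY '<HTML><HEAD><TITLE>Error</TITLE></HEAD><BODY>'
  SAY '<H1>Error</H1>'
  SAY '<P>' || EscapeHtml(message) || '</P>'
  SAY '</BODY></HTML>'
  RETURN

EscapeHtml: PROCEDURE
  PARSE ARG text
  out = ''
  /* Replace the characters that have a special meaning in HTML */
  DO i = 1 TO LENGTH(text)
    ch = SUBSTR(text, i, 1)
    SELECT
      WHEN ch == '&' THEN out = out || '&amp;'
      WHEN ch == '<' THEN out = out || '&lt;'
      WHEN ch == '>' THEN out = out || '&gt;'
      OTHERWISE out = out || ch
    END
  END
  RETURN out
```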
On a Unix system, a script must be marked as executable before the server can run it, for example:
chmod o+x /u/sf/cottrell/bin/cgi1.rxx
chmod u+x /u/sf/cottrell/bin/cgi1.rxx
The Windows operating system does not require you to do anything special to mark a script as capable of being executed.
The Web-Master will want to ensure that the Security Aspects of your script have been addressed before adding your script to the Rules file.
Function | Owner | Group | Comment |
---|---|---|---|
minimal.rex | cottrell | sf | A simple example of a Form CGI Script |
testinput.rex | Mwww | oh | An example to show processing of input |
CgiCleanQuery.rex | cottrell | sf | Removes all occurrences of unassigned variables from a CGI query string |
CgiError.rex | cottrell | sf | Sends an error HTML page to the CGI interface |
CgiDelQuery.rex | cottrell | sf | Removes an item from a CGI query string |
CgiDeweb.rex | cottrell | sf | Converts ASCII Hex coded %XX to ASCII characters |
CgiFullUrl.rex | cottrell | sf | Returns the complete CGI query URL |
CgiHtmlBot.rex | cottrell | sf | Returns the HTML tags at the end of a page |
CgiHtmlTop.rex | cottrell | sf | Returns the HTML title and h1 tags at the top of a page |
CgiHTtab | cottrell | sf | Converts a tab delimited file to an HTML table |
CgiMethGet.rex | cottrell | sf | Returns true if the form is using METHOD="GET" |
CgiMethPost.rex | cottrell | sf | Returns true if the form is using METHOD="POST" |
CgiMyUrl.rex | cottrell | sf | Adds the URL of the script to the page |
CgiPrintHeader.rex | cottrell | sf | Returns the Content-type header (to SAY) |
CgiPrintVariables.rex | cottrell | sf | Adds a listing of the Form name=value& variables to the page |
CgiReadForm.rex | cottrell | sf | Reads a Form's "GET" or "POST" input and returns it decoded |
CgiReadPost | cottrell | sf | Reads the standard input for a form with METHOD="POST" |
CgiStripHtml.rex | cottrell | sf | Removes HTML tags from a string |
CgiWebify | cottrell | sf | Encodes special characters in hex ASCII %XX form |
REXX Routines to Manipulate CGI input
cottrell@slac.stanford.edu
http://www.slac.stanford.edu/~cottrell.html/cottrell.html
These routines are modelled on a set of Perl routines from S.E.Brenner@bioc.cam.ac.uk, with some additions suggested by "Gateway Programming I: ..." in "HTML and CGI Unleashed" by John December and Mark Ginsberg, published by Sams/Macmillan.
For more information on Steve's functions, see:
http://www.bio.cam.ac.uk/web/form.html
http://www.seas.upenn.edu/~mengwong/forms/
For more information on "HTML and CGI Unleashed" see:
http://www.rpi.edu/~decemj/works/wdg.html
This document and/or portions of the material and data furnished herewith, was developed under sponsorship of the U.S. Government. Neither the U.S. nor the U.S.D.O.E., nor the Leland Stanford Junior University, nor their employees, nor their respective contractors, subcontractors, or their employees, makes any warranty, express or implied, or assumes any liability or responsibility for accuracy, completeness or usefulness of any information, apparatus, product or process disclosed, or represents that its use will not infringe privately-owned rights. Mention of any product, its manufacturer, or suppliers shall not, nor is it intended to, imply approval, disapproval, or fitness for any particular use. The U.S. and the University at all times retain the right to use and disseminate same for any purpose whatsoever.
Copyright (c) Stanford University 1995, 1996.
Permission granted to use and modify this library so long as the copyright above is maintained, modifications are documented, and credit is given for any use of the library.