THE FORMAT OF A .CACHE FILE
version 2.18

Starting with release 1.0, the internal format of a .cache file has changed. The change was necessary because the file must contain more information to support HTTP/1.0. The 1.0 server can read .cache files created by earlier versions of mkcache. Nevertheless, it is important to run mkcache on your menu files (or mkcache -r) when upgrading. This is particularly true if you have used the 1.0beta version of mkcache, since some remote links will not work if the 1.0 server is used with .cache files created by a 1.0beta mkcache.

For the interested user, here are some details about the format of a .cache file. The file consists of primary lines, each followed by some number (currently 1) of secondary lines with additional information about the item referred to in the primary line. A primary line has the form

    XMenu title of the item<tab>selector<tab>host<tab>port

The first character is the single-character gopher type designator, e.g. '0' for text, '1' for menus, '9' for binary, etc. This is followed by the title which will appear on the client menu. Next a <tab> separates the title from the selector or Path= entry in the menu file. Then come a <tab>, the fully qualified host name, a <tab>, and the port.

Following this primary line there is an optional secondary line. The most common secondary line has the format

    <tab>content_type<tab>suffix<tab>encoding<tab>attribute

The leading <tab> indicates this is a secondary line. The content_type is the MIME content type, e.g. text/plain, image/gif, etc. The suffix is a string which is non-empty if the item referred to is a file whose name ends in '.' followed by 1 to 4 characters, in which case suffix equals the characters after (but not including) the '.', converted to lower case. The "encoding" field is empty or one of "x-compress" or "x-gzip", indicating that the file in question has been compressed with the UNIX compress utility or with the GNU compression utility gzip. The "attribute" field is the contents of the menu Attribute= field converted to lower case.
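
As an illustration of the layout just described, here is a minimal Python sketch that reads primary and secondary lines into records. It is not taken from the gn or mkcache sources. It assumes that the fields are separated by single tab characters, that a secondary line is recognized by its leading tab, and that missing trailing fields on a secondary line can be treated as empty. The names CacheEntry and parse_cache are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CacheEntry:
    # Fields of the primary line.
    gopher_type: str
    title: str
    selector: str
    host: str
    port: int
    # Fields of the optional secondary line (empty when absent).
    content_type: Optional[str] = None
    suffix: str = ""
    encoding: str = ""
    attribute: str = ""


def parse_cache(text: str) -> List[CacheEntry]:
    """Parse the text of a .cache file into a list of entries.

    Assumption: fields are tab-separated and a line starting with a tab
    is a secondary line belonging to the preceding primary line.
    """
    entries: List[CacheEntry] = []
    for line in text.splitlines():
        if not line:
            continue
        if line.startswith("\t"):
            if not entries:
                continue  # secondary line with no primary line: ignore
            fields = line[1:].split("\t")
            fields += [""] * (4 - len(fields))  # pad missing fields
            entry = entries[-1]
            entry.content_type = fields[0]
            entry.suffix = fields[1]
            entry.encoding = fields[2]
            entry.attribute = fields[3]
        else:
            head, selector, host, port = line.split("\t")[:4]
            # The first character is the type, the rest is the title.
            entries.append(
                CacheEntry(head[0], head[1:], selector, host, int(port))
            )
    return entries
```

A primary line with fewer than four fields raises ValueError in this sketch; a real reader would probably want to report the offending line instead.
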
The only currently allowable values for the attribute field are invisible, gopheronly, httponly, and nosearch. (The value "gnlink" is used in the menu but not here -- the gn_link content type is used instead, see below.)

There are two special values which the content_type can assume that are not, in fact, MIME content types. These are "gopher_link" and "gn_link", and they indicate to the server that the item referred to is a remote link to a gopher server or a gn server respectively. Since these are links to remote items it is not possible (and fortunately not necessary) to know their true content type. The gn_link and gopher_link types are only used internally and are not transmitted in an HTTP header.

HOW GN PARSES AN HTTP REQUEST
(new with version 2.08)

Here is what GN will do with some URLs from HTTP clients (behavior with gopher clients should be essentially unchanged from earlier versions).

Given the URL

    http://host/1/dir/file

GN looks for an item with Path=1/dir/file. As it searches it records the first partial match (if any). A partial match means something like 0/dir/file or 9/dir/file, i.e. the type differs but the directory and file match exactly. If there is no exact match, GN assumes that the first partial match is the document wanted and returns it. This much should make relative links in HTTP documents work.

Given the URL

    http://host/dir1/dir2/file

GN will first assume (wrongly) that dir1 is the type part of the Path= field. It will likely find no matches (unless dir1 has a name like "0", for example). Similarly there are not likely to be partial matches. At this point GN assumes the client wants

    http://host/?/dir1/dir2/file

where ? is some unknown type, and it looks for a partial match of this form and returns it if one is found.

The downside of this is that GN is trying to return a document with incomplete information specifying it. Sometimes the "wrong" document can be returned.
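
The lookup order described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not GN's actual code: the Path= entries are given as a flat list in menu order, the type part of a Path is taken to be everything before the first '/', and the function names split_type and lookup are hypothetical.

```python
from typing import List, Optional, Tuple


def split_type(path: str) -> Tuple[str, str]:
    """Split 'T/rest' into (T, rest) at the first '/'."""
    head, _, rest = path.partition("/")
    return head, rest


def lookup(request_path: str, cache_paths: List[str]) -> Optional[str]:
    """Return the Path= entry chosen for the path part of a URL.

    Order of preference:
      1. an exact match;
      2. the first partial match (same directory and file, other type);
      3. the first entry matching the whole request with any type.
    """
    req = request_path.lstrip("/")
    _, req_rest = split_type(req)

    first_partial = None
    for path in cache_paths:
        if path == req:
            return path  # exact match wins immediately
        _, rest = split_type(path)
        if first_partial is None and req_rest and rest == req_rest:
            first_partial = path  # remember only the first one
    if first_partial is not None:
        return first_partial

    # Last resort: the request carried no type at all (?/dir1/dir2/file).
    for path in cache_paths:
        _, rest = split_type(path)
        if rest == req:
            return path
    return None
```

With the entries 0/dir/file and 9/dir/file in that order, both /1/dir/file and /dir/file resolve to 0/dir/file in this sketch, which shows how a second entry with the same file path can be shadowed by the first.
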
In particular, if a menu contains two entries with the same file path, then the only way to access the second one is with the old style URL including the leading 0, 1, 7g, or whatever. However, there should be no security problem because gn will only return items listed in a .cache file.

IMPORTANT NOTE: To use this new GN feature you don't make any changes in your menu files or .cache files! They stay the same. Also, the URLs produced by GN in menus will not change; they will still look like http://host/0/dir/file and not http://host/dir/file.

Another thing to keep in mind is that the URLs http://host/dir1/dir2 and http://host/dir1/dir2/ represent the same menu to GN but will behave differently for relative links. A link with HREF="file" will refer to /dir1/file for the first and to /dir1/dir2/file for the second.

DECOUPLING THE GN HIERARCHY FROM THE FILE SYSTEM HIERARCHY

Normally the gn menu hierarchy is closely linked to the hierarchy of the file system containing the files being served, and this is as it should be. However, in certain circumstances it may be desirable to decouple these two hierarchies. One might, for some reason, want all the files to be in a single directory but still have an extensive hierarchy of menus. One reason for doing this would be to facilitate using gn as a front end for some other programs which produce data. This document describes a mechanism for completely separating the gn menu hierarchy from the filesystem hierarchy.

Recall that a primary entry in the .cache file looks like

    0Menu Title<tab>selector<tab>host<tab>port

The first character (0 in this prototype) is one of the single character standard gopher types (see the gopher protocol from the University of Minnesota for more information), i.e. it is "0" for a file, "1" for a directory, "7" for a search, "I" for an image, etc. The phrase "Menu Title" or anything replacing it (which must not contain a <tab>) is what will appear on the client's menu.
The host is the fully qualified hostname and the port is the port at which the server is running. The selector is allowed by the gopher protocol to be more or less any string used by the server to identify the file or menu which the client wants.

The normal transaction is as follows: the server sends the contents of a cachefile to the client, which displays a menu based on it. The user picks an item and the client sends the selector to the server. This tells the server to send a document to the client if that was what the user chose, or a new menu (i.e. another cachefile) if the user chose that.

For gn the selector in a cachefile entry produced by mkcache is precisely the contents of the "Path" entry in the menu file. Thus a menu file which contains

    Name=My favorite file
    Type=0
    Path=0/dir1/dir2/filename

will result in a primary cachefile line like

    0My favorite file<tab>0/dir1/dir2/filename<tab>host.edu<tab>70

or, for a directory, a menu which contains

    Name=My favorite directory
    Type=1
    Path=1/dir1/dir2

will result in a primary cachefile line like

    1My favorite directory<tab>1/dir1/dir2<tab>host.edu<tab>70

The secondary cachefile lines for these entries (as described above) would give the content types text/plain and text/html respectively, indicating that the first is a plain text file and the second, though a menu, will be converted to an HTML document for HTTP clients. Neither has a suffix, so this entry is empty in the secondary line.

When the gn server receives a selector like 0/dir1/dir2/filename it knows it should send the file rootdir/dir1/dir2/filename to the client, and it knows this is a text file (because of the leading type 0). However, before sending it the gn server checks the cachefile which contained the entry with this selector, to make sure the selector is a legitimate one and not one that an unscrupulous client has produced to get access to some private file on the server's host.
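
The correspondence between a menu file entry and its primary cachefile line, as described above, can be sketched in a few lines of Python. This is an illustration only and not how mkcache is implemented. It assumes tab-separated fields, that lines starting with '#' in a menu file are comments, and that when no Type= line is present the type is the first character of the Path. The function names are hypothetical.

```python
from typing import Dict


def parse_menu_entry(entry_text: str) -> Dict[str, str]:
    """Collect the Key=value lines of one menu entry into a dict."""
    fields: Dict[str, str] = {}
    for line in entry_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: '#' lines are comments
        key, sep, value = line.partition("=")
        if sep:
            fields[key] = value
    return fields


def primary_line(fields: Dict[str, str], host: str, port: int) -> str:
    """Build the primary cachefile line for one menu entry.

    The selector is exactly the contents of the Path= entry.
    """
    path = fields["Path"]
    # Assumption: fall back to the first character of the Path
    # when the entry has no explicit Type= line.
    gopher_type = (fields.get("Type") or path)[:1]
    return "\t".join([gopher_type + fields["Name"], path, host, str(port)])
```

For the first example entry above, primary_line(parse_menu_entry(text), "host.edu", 70) yields the type character 0 followed by the title, then the Path, the host and the port, each separated by a tab.
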
Gn will send the file rootdir/dir1/dir2/filename if and ONLY if it finds the selector 0/dir1/dir2/filename in an entry in the appropriate cachefile, which by default is the file rootdir/dir1/dir2/.cache. The point is that gn constructs the name of the cachefile by taking the file path, deleting the file name and tacking on .cache. (There is one exception to this for structured files [type 1m]. See the installation guide.) This requires that the cachefile containing an entry for a file be in the same directory as the file, and that a cachefile containing an entry for a directory be in the parent directory of that directory. This is what ties the filesystem and menu hierarchies together.

To decouple the menu and filesystem hierarchies we use an alternate syntax for the selector which gn versions 0.5 and later recognize. In this form the selector for the file above would be

    0/dir1/dir2/filename(/dir1/dir2/.cache)

We have placed the name of the cache file containing this item in parentheses and appended it to the selector. When gn receives this selector it knows what file is requested and that it is a text file, and it knows the name of the cachefile in which to check, for security purposes, for an entry containing the selector 0/dir1/dir2/filename. (This security check only checks the selector up to the parentheses, since that is all that is necessary to know the file is legitimately being offered by the server.)

In this form there is no need for the cachefile to be located in any particular place relative to the files its entries reference, nor any need for the cachefile to have the name ".cache". Thus the selector

    0/dir1/dir2/filename(/dir3/cfile)

is fine provided the file /dir3/cfile is a legitimate cachefile containing an entry whose selector is 0/dir1/dir2/filename, so that gn can check this is a valid file to be sent to a client.

There is one other very important difference in the syntax of selectors of this type.
Items of gopher type 1 are menus to the client and normally directories to the server, but in this scheme for gn they no longer correspond to directories but to cachefiles which will present a menu to the client. Thus the selector 1/dir1/dir2 in the example above will translate into the selector

    1/dir1/dir2/.cache(/dir1/.cache)

in the alternate scheme. In other words, to specify a menu (even if it corresponds to a directory) you must specify the cachefile with the menu entries. This is really different from the usual form, where to refer to a directory the selector is 1/dir1/dir2 -- the path of the *directory*. Now to refer to the same directory the selector contains the path of the *cachefile* containing the items in that directory. It is the presence of the parentheses which allows gn to distinguish these two different syntaxes.

Notice that in parentheses we put the cachefile *containing* this entry, which is different from the cachefile we want to send, i.e. gn checks /dir1/.cache to make sure that it is ok to send /dir1/dir2/.cache. Of course, as with files, the cachefiles can have any name and be located anywhere. The selector

    1/dir1/dir2/cfile(/dir3/another_cache)

is fine if cfile is a cachefile containing the items in dir2 (or items in any directory for that matter) and another_cache is a cachefile containing an entry with the selector above.

TWO FINAL NOTES:

1. The cachefiles listed in parentheses are used only for security checking -- to make sure permission is given to send the file. Thus they don't have to be "real" cachefiles. You could, for example, have a single file "masterlist" which contained cachefile entries for all the selectors for your whole server and put this in parentheses at the end of each selector, like

    1/dir1/dir2/cfile(/masterlist)

The masterlist file would be used only for security checks and never actually be a menu.
Moreover, the security check looks only at the selector in a cachefile entry, which it takes to be everything between the first and second <tab> character on the line. Thus masterlist could be a list of all valid selectors for the server, each on a line preceded and followed by a <tab> with nothing else on that line. I should point out that I haven't actually tried this and it might be inefficient if masterlist is large.

2. To do grep type searches of the files in a menu, the search menu item's selector should look like

    7g/dir1/dir2/cfile(/dir3/anothercache)

where cfile contains the entries you want to be grepped and anothercache contains either the entry with this selector or an entry with selector

    1s/dir1/dir2/cfile(something_else)

either of which indicates that the files listed in cfile are permitted to be searched. The point is that the "g" in the "7g" at the beginning is optional with the old syntax (for backwards compatibility), but required for the new "parenthesis" syntax.

USING DECOUPLING WITH ACCESS CONTROL

One interesting thing which can be done with the decoupling feature of gn is to make your root menu (or any other menu) appear different to different hosts. Here is an example that sets up a root menu which shows one item to all hosts, an additional directory to any host in one selected group, and an additional file to any host in a second selected group.

In your root directory make the subdirectories "hostgroup1" and "hostgroup2", which contain the files and directories that you want to be visible only to a certain group of hosts. In this example we assume that there is one file, /hostgroup2/privatefile, and one directory, /hostgroup1/dir. In each of the directories /hostgroup1 and /hostgroup2 put a .access file containing the hosts permitted to see the information in that directory.
The menu file for the root directory would look like

    ##########################################
    Name=Public Stuff
    # This is visible to any host
    Path=1/public

    Name=Directory for Group 1 only
    # This is invisible except for group1
    Path=1/hostgroup1/dir/.cache(/hostgroup1/.cache)

    Name=File for Group 2 only
    # This is invisible except for group2
    Path=0/hostgroup2/privatefile(/hostgroup2/.cache)
    ##########################################

while the menu file in hostgroup1 would contain

    ##########################################
    Name=Directory for Group 1 only
    Path=1/hostgroup1/dir/.cache(/hostgroup1/.cache)
    ##########################################

and the one for hostgroup2 would contain

    ##########################################
    Name=File for Group 2 only
    Path=0/hostgroup2/privatefile(/hostgroup2/.cache)
    ##########################################

Directories below /hostgroup1/dir would be ordinary, i.e. they need not use the decoupling mechanism.
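
To tie together the selector rules from the decoupling section, here is a hedged Python sketch of how a selector in either form might be split into the part that names the item and the cachefile to be consulted, followed by the security check itself. It is not gn's code. It assumes paths relative to the server root, tab-separated cachefile lines, that the parenthesized cachefile name is the last thing in the selector, and that the check compares only the part of each selector before the parentheses. The names split_selector, listed_selectors and is_permitted are hypothetical, and the exception for structured files (type 1m) is ignored.

```python
import posixpath
from typing import Iterator, Tuple


def split_selector(selector: str) -> Tuple[str, str]:
    """Return (bare selector, cachefile path relative to the root).

    'T/a/b/name(/x/cfile)' -> ('T/a/b/name', '/x/cfile')
    'T/a/b/name'           -> ('T/a/b/name', '/a/b/.cache')
    """
    if selector.endswith(")") and "(" in selector:
        bare, _, cachefile = selector[:-1].rpartition("(")
        return bare, cachefile
    # Default form: drop the type, delete the last path component
    # and tack on .cache.
    _type, _, path = selector.partition("/")
    return selector, posixpath.join("/", posixpath.dirname(path), ".cache")


def listed_selectors(cache_text: str) -> Iterator[str]:
    """Yield the bare selector of every line in a cachefile.

    The selector is taken to be everything between the first and
    second tab, so a line needs at least two tabs to count.
    """
    for line in cache_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 3:
            yield split_selector(fields[1])[0]


def is_permitted(selector: str, cache_text: str) -> bool:
    """True if the bare part of the selector is listed in cache_text."""
    bare, _ = split_selector(selector)
    return any(bare == listed for listed in listed_selectors(cache_text))
```

A caller would first use split_selector to find which cachefile to read, then pass that file's contents to is_permitted. Because only the field between the first two tabs is examined, a "masterlist" file whose lines consist of a tab, a selector and a tab passes the same check as a real cachefile in this sketch.
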