^% DUPFIL %^
% by James A. Jarboe IV %

The ^DUPFIL^ utility is designed to assist in the process of reclaiming
disk space by locating ^duplicate files^ by ^file name^ or by
^hash code^ totals. ^DUPFIL^ provides information about file
duplications so that the system operator can make an intelligent
decision on which duplicate file to remove to ^free^ up disk space.

^DUPFIL^ outputs to a file a listing of ^identical^ or ^similar^ files.
^Identical^ files are those that have the same hash code totals and may
or may not have the same file name. ^Similar^ files are those that have
the same file name but may or may not have different hash code totals.

Not all ^file duplication^ is bad. Depending upon your situation and
system setup, it may be necessary to ^have^ duplicate files or
different versions of certain programs located in different accounts.
The culprit we are after is the ^unnecessary^ duplication of files.
\
^% DUPFIL Utility %^ Page 2 of 13

Usage:  ^DUPFIL^ {listfile=} wild-spec(s) {switches}

Where:

 ^listfile^ = The file specification to which the resulting
              information is output. The default output is to the
              ^DUPFIL.LST^ file.

 ^wildspec^ = The wildcard file specification(s) defining which files
              are to be examined (just like DIR).

              Examples: DSK0:*.LIT[1,4],DSK3:*.LIT[1,4]
                        WIN0:*.WRT[]
                        ALL:*.*[]
                        DSK0:[], CPU2:DSK3:[]

 ^switches^ = Optional operation switches as explained below.
\
^% DUPFIL Switches %^ Page 3 of 13

The following switches are supported:

/^BYNAME^   This ON/OFF switch controls file name and extension
            sensitivity. The default is ON (/BYNAME), which reports
            files with the same name. If this switch is turned OFF
            (/NOBYNAME), then the /BYHASH switch is automatically
            turned on.

/^BYHASH^   This ON/OFF switch controls hash total sensitivity. When
            it is ON, files with the same hash totals are reported.
            When the /BYHASH switch is used, the /HASH switch is
            automatically implied.
            If this switch is turned OFF (/NOBYHASH), then the /BYNAME
            switch is automatically turned on.

/^HASH^     In conjunction with /BYNAME, this switch controls file hash
            code sensitivity. It outputs the names of files that are
            duplicated along with the hash total of each file. The
            default is /NOHASH (off). /HASH is automatically on with
            the /BYHASH option.

The combined action of /BYNAME and /HASH is best described as a table:
^% DUPFIL Utility %^ Page 4 of 13

            ^/BYNAME  /HASH    Action^
            ======   =======  ===============================
Default->   ^ON^       OFF      Identically named files reported.
            ^ON       ON^       Same name files reported w/hash.
            OFF      OFF      Impossible combination.
            OFF      ^ON^       Files w/same hash code reported.

/^VERSION^  This ON/OFF switch controls reporting of version numbers.
            If /VERSION is specified, the version number of each
            reported file is given (if available). The default is
            /VERSION (on).

/^BLOCKS^   This ON/OFF switch controls the reporting of file sizes
            (in blocks). If /BLOCKS is specified, the size of each
            reported file in blocks is given. The default is /BLOCKS
            (on).

/^TOTAL^    Controls reporting of file and block totals at the end of
            each matching set and at the end of the listing. The
            default is /TOTAL (on).
^% DUPFIL Utility %^ Page 5 of 13

/^ASCEND^   Controls reporting sequence. /ASCEND causes files of the
            same name or the same hash totals to be reported in
            ascending size sequence. /NOASCEND causes files to be
            reported in descending size sequence. The default is
            /ASCEND (on).

/^TYPE^     Controls the reporting of the file's type, (^R^)andom or
            (^S^)equential. The default is /TYPE (on). Use /NOTYPE to
            disable the reporting of file types. When the /TYPE switch
            is used, the letter ^R^ is output for a Random file and
            the letter ^S^ is output for a Sequential file.

/^MIN:^n    Reports only those files of (^n^) size or greater. The
            default value is ^0^. Any file with a block size smaller
            than "^n^" will be ignored.
\
^% DUPFIL Utility %^ Page 6 of 13

/^MAX:^n    Reports only those files of (^n^) size or less. The default
            value is the largest possible size of any AMOS file (2.x
            extended directory limit). Any file with a block size
            greater than "^n^" will be ignored.

/^KILL^     Deletes the output file if it already exists. If you
            specify an output file specification that already exists,
            ^DUPFIL^ will ^not^ delete the existing file unless the
            /KILL switch is set. The default is /NOKILL (off).

/^WAIT^     Waits for files locked via LOKSER to become unlocked
            before processing the file. This option is only valid when
            the /BYHASH, /HASH or /VERSION option is used. If a file is
            ^locked^ for access, ^DUPFIL^ will pause until the file is
            unlocked so that it can process the hash total or version
            number. The default is /NOWAIT. In the /NOWAIT mode,
            ^locked^ files will be ^bypassed^ and indicated on the
            terminal screen. If the /WAIT
\
^% DUPFIL Utility %^ Page 7 of 13
            switch is used, ^DUPFIL^ will halt processing until the
            file is ^unlocked^ or until the user presses a
.H -
            -^C-, in which case the current locked file will be
            bypassed and processing will resume with the next file
            specification.
.H ^

/^EXT^      This ON/OFF switch controls the sensitivity of checking a
            file's extension. The default is ON (/EXT). This is the
            normal state of file name comparison. There may be
            instances where one might want to check only the file name
            and not the file's extension. With the switch OFF (/NOEXT),
            only a file's name will be compared for a match and the
            file's extension will be ignored.

/^CREATE^   Outputs the file's creation date. (Available only on 2.x
            extended devices.)

/^UPDATE^   Outputs the file's last update date. (Available only on
            2.x extended devices.)
\
^% DUPFIL Utility %^ Page 8 of 13

/^SHOW^     Displays file names as they are processed. Normally,
            ^DUPFIL^'s progress is represented by a statistical
            progress display. Using the /SHOW switch will force
            ^DUPFIL^ to display each file name as it is processed.
            Displaying each file name during processing slows ^DUPFIL^
            down, because every name must be output to the terminal.
            The default is /NOSHOW (off).

/^?^        This switch will list on the terminal all available
            switches and their functions.
\
^% DUPFIL Utility %^ Page 9 of 13

Only enough of the switch specification to be unique needs to be
provided. For example, the following are valid switch specs:

     /^BYN^     (Same as /^BYNAME^)
     /^BYH^     (Same as /^BYHASH^)
     /^NOBYN^   (Same as /^NOBYNAME^)
     /^TO^      (Same as /^TOTAL^)
     /^NOT^     (Same as /^NOTOTAL^)
     /^MI:4^    (Same as /^MIN:4^)

Switches may be specified anywhere after the program name. ^All^
switches are ^global^ in their action. For example, the following two
commands are functionally identical:

^    DUPFIL/MIN:50/V ALL:*.WRT[100,45], DSK2:*.T[]
    DUPFIL ALL:*.WRT[100,45]/V, DSK2:*.T[]/MIN:50^
\
^% DUPFIL Utility %^ Page 10 of 13

Examples:

-> ^DUPFIL DSK0:*.TXT[],DSK1:*.TXT[]/HAS^
   Output to DUPFIL.LST a list of all .TXT files with the same name
   which are on DSK0: or DSK1:. The hash code of each file is also
   included.

-> ^DUPFIL TEST.LST=ALL:*.WRT[]/V/HASH^
   Output to TEST.LST a list of all AlphaWRITE files with the same
   name which are anywhere on the system. Report the version number
   and the hash total of each file.

-> ^DUPFIL ALL:*.RUN[]/BYHASH^
   Output to DUPFIL.LST a list of all .RUN files with the same hash
   total, irrespective of the file names.

-> ^DUPFIL ALL:*.RUN[]/NOBYNAME^
   Output to DUPFIL.LST a list of all .RUN files with the same hash
   code, irrespective of the file names.
\
^% DUPFIL Utility %^ Page 11 of 13

Internally, the operation of ^DUPFIL^ is quite simple. It creates a
unique working file containing the complete file specification, size,
and type of each file. If the /VERSION switch and/or the /HASH or
/BYHASH switch is selected, then the file's version number and/or hash
code total are also output to the unique working file.
Once this task is completed, the unique file is sorted in ascending or
descending order depending upon the switch setting on the command line.
When the sorting is complete, ^DUPFIL^ will scan the unique file for
file name matches or hash code total matches, depending upon the switch
setting, and output the desired information to the ^DUPFIL.LST^ file or
to the file named on the command line as the output file specification.

In order to complete this task, ^DUPFIL^ requires disk space as a work
area during its operation. The more files ^DUPFIL^ needs to examine,
the more disk work space is required. The ^DUPFIL^ program will check
whether there is enough ^available free^ disk space to process the
selection and will inform the user if there is NOT enough available
free disk space to complete the task.

^DUPFIL^ will always create the working file and sort the file on
\
^% DUPFIL Utility %^ Page 12 of 13
the device that the job invoking ^DUPFIL^ is currently logged on to.
The resulting output file can be routed to a different device as
provided in the output file specification on the ^DUPFIL^ command
line. It is advised that you keep the wild-spec as concise as possible
or ensure you are logged to a disk with sufficient ^free^ space. All
disk work space is returned when DUPFIL finishes.

The /^HASH^ switch or the /^BYHASH^ switch causes ^DUPFIL^ to do a
considerable amount of extra ^disk access^ and work to acquire hash
totals and will slow the operation down. To a much lesser extent,
/^VERSION^ also causes extra disk accesses.

^DUPFIL^ is compatible with all versions of AMOS 1.3, AMOS 1.4, and
AMOS 2.x releases. ^DUPFIL^ is also compatible with all versions of
^AMSORT.SYS^ on AMOS 1.3, AMOS 1.4, and AMOS 2.x releases. ^DUPFIL^
knows which version of ^AMSORT.SYS^ is on the system and will sort
correctly based on the current version you are using.
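
The collect, sort, and scan sequence described on the preceding pages
can be illustrated with a short Python sketch. This is not DUPFIL's own
code: DUPFIL works through a disk working file and AMSORT.SYS, while the
sketch keeps everything in memory and uses Python's built-in sort. The
names FileRecord, find_duplicates and records are hypothetical, and the
hash totals shown are made-up strings standing in for real AMOS hash
totals.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class FileRecord:
    """One entry of the (hypothetical) working file."""
    spec: str        # complete file specification
    name: str        # file name and extension only
    blocks: int      # file size in blocks
    hash_total: str  # placeholder for the AMOS hash total


def find_duplicates(records, by_hash=False, ascend=True):
    """Return groups of records sharing a name (or a hash total).

    by_hash mimics /BYHASH, ascend mimics /ASCEND and /NOASCEND.
    """
    def match_key(rec):
        return rec.hash_total if by_hash else rec.name

    # Step 1: order by size, ascending or descending.
    ordered = sorted(records, key=lambda r: r.blocks, reverse=not ascend)
    # Step 2: stable sort on the match key, so the size order from
    # step 1 is preserved inside each run of equal keys.
    ordered.sort(key=match_key)

    # Step 3: scan adjacent records and keep runs of two or more.
    groups = []
    for _, run in groupby(ordered, key=match_key):
        members = list(run)
        if len(members) > 1:
            groups.append(members)
    return groups


# Made-up sample data for illustration only.
records = [
    FileRecord("DSK0:MEMO.TXT[1,4]", "MEMO.TXT", 12, "111-222"),
    FileRecord("DSK1:MEMO.TXT[7,0]", "MEMO.TXT", 8, "333-444"),
    FileRecord("DSK1:NOTE.TXT[7,0]", "NOTE.TXT", 12, "111-222"),
    FileRecord("DSK0:SOLO.TXT[1,4]", "SOLO.TXT", 3, "555-666"),
]

if __name__ == "__main__":
    for group in find_duplicates(records, by_hash=True):
        for rec in group:
            print(rec.spec, rec.blocks, rec.hash_total)
        print()
```

Sorting first means matching files always end up next to each other, so
a single pass over the sorted data is enough to find every matching set.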
\
^% DUPFIL Utility %^ Page 13 of 13

To convert this to an ^AMOS^ style help file, use ^DSH2AM.LIT^,
available on the AMUS Network.

Any ^comments^ or ^suggestions^ for ^DUPFIL^ should be made to:

          ^James A. Jarboe IV^
          Educational Video Network, Inc.
          1401 19th Street
          Huntsville, Texas 77340
          (^409^)^295^-^5767^

Send Email to ^GR^/^AM^ on the ^AMUS Network^.
.