GenBank is the databank of all published nucleic acid sequences. You can use the GenBank questions to search for and fetch entries based on Accession number (a number unique to each entry), Description, Locus name, Keywords, Source, Organism, Authors, and Title of the journal article. The currently installed GenBank is described in the "genbank-release-brief" and "genbank-release-doc" entries, including its release number and its number of loci and bases. The GenBank databank is maintained and distributed from NCBI at the National Library of Medicine, at ncbi.nlm.nih.gov. The version installed for searching at IUBio is updated weekly with current data, and as new full releases come out.

The Swiss-Protein databank is a databank of protein sequences translated from the EMBL nucleotide sequence database. It is maintained by Amos Bairoch and colleagues at the University of Geneva and the European Bioinformatics Institute. Entries can likewise be fetched from the copy at this archive by searching for words in the descriptive part of each entry, including Id, description, keywords, taxonomic source and reference information.

Items in this folder:
-----------------------

Search GenBank
Search GenBank (gopher form)
Search Genbank (html form)

These items all let you search the GenBank databank, including weekly updated entries to the most current release. The first item provides a very simple search dialog where you enter a query on one line. It is compatible with the basic gopher0 protocol. The other two items are gopher+ and html forms that prompt you for more options, including output formats.

The default search finds records matching all words in your query. Use 'and', 'not', and 'or' boolean terms and grouping symbols '{' '}' to refine a complex search. Use '*' (as in gen*) to match partial words. Use single quotes (') or double quotes (") to match a "literal phrase". Use 'ref={pace and brown}' to match words in a specific field.
Fields for GenBank searches are ref=references, source=taxonomy, def=definitions, key=keywords, (... later date)

genbank-release-brief.txt
genbank-release-doc.txt

These items describe the most current release of the GenBank databank.

Search Swiss-Protein
Search Swiss-Protein (gopher form)
Search Swiss-Protein (html form)

These items let you search the Swiss-Protein databank. You can write complex queries, as with the GenBank searches, to narrow your search down. The Boolean terms, grouping symbols "{}", wild-card "*", and quotes also work for these searches. Fields that are defined for Swiss-Protein data include ref=references, source=taxonomy, def=definitions, key=keywords, acc=accession number, id=id field, gene=gene, link=cross-referenced data, medline=medline number, date=date of entry/update in YRMODA numeric format (940715 is 15 July 1994). Use date>910101 and date<950515 to select a range.

swiss-prot.manual

This item describes the current release of the Swiss-Protein databank.

Output Formats
--------------

Several features of these searches are accessible if you use gopher+ or html client software. These include a choice of output formats for the sequence queries. Of these formats, text/plain returns the native, original format of the databank. For GenBank, this is the same as the biosequence/genbank format, and for Swiss-Protein, this is the same as biosequence/embl. The other biosequence formats are produced by the readseq software. This software strips out most of the documentation of the entries, preserving the sequence data. Use these options if you prefer specific biosequence formats and do not need the documentation. The application/rtf format is like text/plain, but includes codes that display a bit more nicely in common word processors that understand rtf format. The text/html format is, or will be, formatted for html browsers.

   1. text/plain (default)
   2. application/rtf
   3. biosequence/genbank
   4. biosequence/fasta
   5. biosequence/gcg
   6. biosequence/embl
   7. biosequence/nbrf
   8. biosequence/phylip
   9. biosequence/msf
  10. biosequence/paup
  11. biosequence/ig
  12. biosequence/asn1
  13. biosequence/pir
  14. text/html

More on how to use the search questions:

The default WAIS and Gopher search provides an implicit "or" between words, so that all items matching any of the words in a query are returned, but those items with the most matches are at the top of the list. Only full words are matched. Case of letters is ignored. By default, all symbols are ignored and treated as word breaks. Only letters and digits are considered word parts.

New features for searches thru IUBio Gopher and WAIS include boolean operators "and" and "not", partial word matches, literal phrase matches, and an extended number of results.

Boolean searches: The terms "and" and "not" are effective in modifying the query. For example,

  Query:  red and green not blue
  Result: just those records with both the words "red" and "green", excluding all records with the word "blue".

Partial words: The asterisk (*) applied at the end of a partial word will match all documents with words that start with the partial word. For example,

  Query:  hum*
  Result: all records with "hum", "hummingbird", "human", "humbug", etc.

Literal phrases: If single quotes (') or double quotes (") surround a phrase, the search will match that phrase exactly. For example,

  Query:  "red rooster-39"
  Result: only those records with the full string 'red rooster-39' will be matched.

There are some practical limits on this. The first part of a literal must be a word that is otherwise indexed. Thus your literal cannot start with a symbol or other word delimiter. Within quotes, the boolean operators and the partial word key are not active.

These features can generally be mixed in a query, for example:

  Query:  "Df(32)-[34]red" and hum* not Brown

Results limit: The maximum number of results returned for a query is 100 by default.
You can set a higher or lower maximum by using the "maxrec=" field, followed by a value, in your query. For example,

  Query:  brown and cow* or "red rooster" maxrec=300
  Result: up to 300 matches will be returned.

These GenBank searching modifications are available to others. See the main entry "This Gopher/" for more info.

-- Don Gilbert, gilbertd@bio.indiana.edu
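
The query syntax described above (field restriction with "{}", literal phrases, the YRMODA date field, and "maxrec=") can be composed mechanically. The following is a minimal Python sketch that only builds query strings to paste into a search dialog. It is not part of the IUBio software, and the helper names (yrmoda, field, phrase, date_range, with_limit) are hypothetical. It assumes that field terms and boolean terms can be combined as in the examples above, which this document does not spell out for every field.

```python
from datetime import date


def yrmoda(d: date) -> str:
    """Format a date as six-digit YRMODA, e.g. 15 July 1994 -> 940715.

    Assumption: the two-digit year is all the index stores, so dates from
    2000 onward would not compare correctly against earlier ones.
    """
    return d.strftime("%y%m%d")


def field(name: str, expr: str) -> str:
    """Restrict an expression to one field, e.g. ref={pace and brown}."""
    return f"{name}={{{expr}}}"


def phrase(text: str) -> str:
    """Wrap text in double quotes so it is matched as a literal phrase."""
    return f'"{text}"'


def date_range(start: date, end: date) -> str:
    """Build the date>... and date<... pair used to select a range."""
    return f"date>{yrmoda(start)} and date<{yrmoda(end)}"


def with_limit(query: str, maxrec: int) -> str:
    """Append maxrec= to raise or lower the default limit on results."""
    return f"{query} maxrec={maxrec}"


if __name__ == "__main__":
    print(field("ref", "pace and brown"))
    # ref={pace and brown}
    print(date_range(date(1991, 1, 1), date(1995, 5, 15)))
    # date>910101 and date<950515
    print(with_limit(f"brown and cow* or {phrase('red rooster')}", 300))
    # brown and cow* or "red rooster" maxrec=300
```

The helpers do no validation of field names, so a field that is not defined for the databank being searched will still produce a string. Check the field lists above before using one.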