Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Free Software Foundation.
| LOCAL | Terms indexed in this way can be found only by searching inside a specific field |
|---|---|
| GLOBAL | Terms indexed in this way can be found by a global search, similar to the one performed by old WAIS |
| BOTH | Terms indexed in this way can be found either by searching inside a specific field or by a global search |
It is up to the System Manager to decide whether the content of a field is indexed as LOCAL, GLOBAL or BOTH. Moreover, some fields can be left unindexed to save space. For example, in a phonebook it makes no sense to index the e-mail or address fields.
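The three indexing modes can be pictured with a small model. The sketch below is only an illustration, not freeWAIS-sf code: the field names, the `Mode` enum and the helper functions are hypothetical, and the phonebook layout is an assumption built on the example above.

```python
from enum import Enum


class Mode(Enum):
    LOCAL = "LOCAL"    # reachable only through a field search
    GLOBAL = "GLOBAL"  # reachable only through a global search
    BOTH = "BOTH"      # reachable through either kind of search


# Hypothetical phonebook: field name -> indexing mode, None = not indexed.
PHONEBOOK_FIELDS = {
    "name": Mode.BOTH,
    "department": Mode.LOCAL,
    "notes": Mode.GLOBAL,
    "email": None,    # left unindexed to save space
    "address": None,  # left unindexed to save space
}


def found_by_field_search(mode):
    """True if terms in a field with this mode answer a field search."""
    return mode in (Mode.LOCAL, Mode.BOTH)


def found_by_global_search(mode):
    """True if terms in a field with this mode answer a global search."""
    return mode in (Mode.GLOBAL, Mode.BOTH)


def searchable_fields(fields, global_search):
    """Return the sorted names of the fields a given kind of search can reach."""
    check = found_by_global_search if global_search else found_by_field_search
    return sorted(name for name, mode in fields.items() if check(mode))


print(searchable_fields(PHONEBOOK_FIELDS, global_search=True))
print(searchable_fields(PHONEBOOK_FIELDS, global_search=False))
```

In this model the unindexed fields never appear in either result, which mirrors the space-saving choice described above.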
| Characteristics | Description |
|---|---|
| Terms length | |
| Non-indexed terms | |
| Indexed characters | |
| Maximum number of hits | The maximum number of hits one can retrieve is NOT clearly defined. In the words of Norbert Goevert: "I don't know the limits. They seem to be deep in the freeWAIS-sf stuff:-(". From a practical point of view the maximum seems to be around 250 hits. |
| Languages | SFgate supports Boolean operators and messages in the following languages: English, French, German, Italian, Spanish, Dutch, Swedish, and Portuguese. As a consequence, some words of these eight languages are reserved. English and Italian are supported at BioPD. |
| Lowercase/Uppercase | For reasons explained in the document Boolean Searching in freeWAIS-sf, it is always better to write queries using lowercase terms. |
freeWAIS-sf uses the original WAIS Protocol. This protocol was designed only to transport a free text query, that is, a list of searchable terms separated by spaces. U. Pfeifer, the author of freeWAIS-sf, deliberately chose to keep this old protocol so that all existing clients could use the new features.
He therefore had to encode a new and richer query semantics in the query string itself. This means that the query has to obey a certain syntax, and consequently a user might get a syntax error when submitting a query. Pfeifer's goal was to make the syntax as easy as possible and, in particular, to keep simple free text queries valid.
Within a query, the category to be searched had to be selectable
for each term. To keep the original queries valid (and to support casual users),
Pfeifer provided a default category, which is used if no category is
specified in the query.
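The idea of a default category can be sketched as follows. This is a minimal illustration, not the freeWAIS-sf parser: the function name, the regular expression and the `<default>` label are assumptions, and only the plain `term`, `field=term` and `field=(term term)` forms are handled (Boolean operators are simply skipped, and numeric comparisons are not covered).

```python
import re

DEFAULT_CATEGORY = "<default>"    # hypothetical label for the default category
OPERATORS = {"and", "or", "not"}  # skipped by this sketch, not interpreted

# field=(term term ...)  |  field=term  |  bare term
TOKEN = re.compile(r"(\w+)=\(([^)]*)\)|(\w+)=(\S+)|(\S+)")


def assign_categories(query):
    """Return (category, term) pairs; terms without a field get the default."""
    pairs = []
    for match in TOKEN.finditer(query):
        if match.group(1):    # field=(term term ...)
            field, terms = match.group(1), match.group(2).split()
        elif match.group(3):  # field=term
            field, terms = match.group(3), [match.group(4)]
        else:                 # bare term: no category given
            field, terms = DEFAULT_CATEGORY, [match.group(5)]
        pairs.extend((field, t) for t in terms if t not in OPERATORS)
    return pairs


for q in ("molecular biology", "ti=molecular biology", "ti=(molecular biology)"):
    print(q, "->", assign_categories(q))
```

A plain free text query passes through unchanged, with every term falling into the default category, which is why old-style queries stay valid.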
An outline of the query language is given here.
Please note that these examples refer to classic clients
(waissearch, waisq, etc.), NOT to
SFgate ones.
You can find LIVE examples for SFgate by choosing documents in the
HOWTO section.
| Query | Meaning |
|---|---|
| molecular biology | Free text query in the global index. Find all documents containing terms molecular OR biology in ALL the fields indexed as GLOBAL or BOTH |
| molecular or biology | Same as above |
| ti=molecular biology | Find all the documents which have the term molecular in the field TITLE OR the term biology in a field indexed as GLOBAL or BOTH |
| ti=(molecular biology) | Find all the documents which have the term molecular OR the term biology in the field TITLE |
| ti=(molecular or biology) | Same as above |
| ti=(molecular and biology) | Find all the documents which have the term molecular AND the term biology in the field TITLE |
| ti=(molecular not biology) | Find all the documents which have the term molecular but not the term biology in the field TITLE. The term biology may still be present in any other field indexed as GLOBAL or BOTH |
| py==1990 | Find all the documents which have the field PUBLICATION YEAR numerically equal to 1990 |
| py<=1990 | Find all the documents which have the field PUBLICATION YEAR numerically less than or equal to 1990 |
| py>1990 | Find all the documents which have the field PUBLICATION YEAR numerically greater than 1990 |
| ed<19930101 | Find all the documents which have the field EDITION DATE older than January 1, 1993. This is in fact a date search whose format is yyyymmdd (four-digit year, two-digit month, two-digit day) |
| au=(soundex salatan) | This is a soundex search. It matches terms which sound like salatan, e.g. "Salton" |
| ti="molecular biology" | This is a phrasal (literal) search. Find all the documents containing the term molecular immediately followed by the term biology in the field TITLE. Please note the use of quotes to delimit a literal search |
| mol* | This is a global wild-card search. Find all the documents containing terms that have mol as a stem in any of the fields indexed as GLOBAL or BOTH |
| (molecular w/10 biology) | This is a proximity search. With this feature, you could search for all the documents that have terms molecular and biology within 10 terms of each other. The order of the terms is not important, i.e., molecular can precede or follow the term biology. It is NOT implemented at BioPD |
| (molecular pre/10 biology) | This is a proximity search too. In this case the order of the terms is important. In fact this search will find all the documents that have molecular up to 10 terms before biology. It is NOT implemented at BioPD |
| ti=(molecular w/2 biology) | Another case of proximity search. In this case proximity works within the field TITLE. It will find all the documents which have in the TITLE the term molecular within 2 terms of biology. Please note that you must use parentheses around the terms you want to look for in the field. It is NOT implemented at BioPD |
| (atleast/10 biology) | This is an "at least" search, a particular case of proximity search. It finds every document that has at least 10 occurrences of biology. The atleast condition has to be all lower-case and there cannot be any spaces between 'at', 'least' and the number, i.e. at least 10 biology will not work. It is NOT implemented at BioPD |
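Query strings like those in the table can be assembled programmatically before being passed to a client. The helpers below are a hedged sketch: the function names are hypothetical and nothing here calls freeWAIS-sf itself. They only build strings in the forms shown above, lowercase the terms as recommended, and accept only the comparison operators that appear in the table.

```python
from datetime import date

BOOLEAN_OPERATORS = ("or", "and", "not")
COMPARISONS = ("==", "<=", "<", ">")  # only the forms shown in the table


def field_query(field, terms, operator="or"):
    """Build field=(t1 op t2 ...) with lowercase terms."""
    if operator not in BOOLEAN_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    joined = f" {operator} ".join(term.lower() for term in terms)
    return f"{field}=({joined})"


def date_query(field, comparison, day):
    """Build a date search such as ed<19930101 (format yyyymmdd)."""
    if comparison not in COMPARISONS:
        raise ValueError(f"unsupported comparison: {comparison!r}")
    return f"{field}{comparison}{day.strftime('%Y%m%d')}"


def atleast_query(count, term):
    """Build (atleast/N term); no spaces inside 'atleast/N', all lowercase."""
    return f"(atleast/{int(count)} {term.lower()})"


print(field_query("ti", ["Molecular", "Biology"], "and"))
print(date_query("ed", "<", date(1993, 1, 1)))
print(atleast_query(10, "Biology"))
```

Whether a given form is accepted still depends on the server: as noted in the table, the proximity and atleast searches are not implemented at BioPD.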