6. How to Perform a Search in Fields with Names and/or Surnames


A. Surnames

Surnames are hard to search because they can contain apostrophes, as in O'Neil, or accents, as in Menabò, and because they can be compound, like Dalla Libera, or double, like Conte Camerino.
For this reason, at BioPD we decided to index SURNAMES according to the rules shown in the following examples.

a. Simple Surname
Example: Saggin
No special rule applies in this case. Perform the search as in the LIVE example below.

Surname
Name

b. Surnames (or Names) with an Accent on the Last Character
Example: Menabò
In this case we indexed the term as Menabo', replacing the accented character with the unaccented letter followed by an apostrophe (ASCII 39).
The surname must therefore be enclosed in quotes, because it is subject to the apostrophe rules.
See also the document How to Perform a Search of Words Containing Apostrophes.
Try the following example.

Surname
Name
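The accent rule above can be sketched in Python. This is only an illustration, not BioPD's actual indexing code: the function names index_form and query_term are hypothetical, and the use of double quotes is an assumption, since the text says only that the surname goes "in between quotes".

```python
import unicodedata

APOSTROPHE = "'"  # ASCII 39

def index_form(surname: str) -> str:
    """Replace an accented final character with its base letter plus an apostrophe."""
    # Compose first, so that a base letter followed by a combining accent
    # is treated as one accented character.
    surname = unicodedata.normalize("NFC", surname)
    if not surname:
        return surname
    decomposed = unicodedata.normalize("NFD", surname[-1])
    base = decomposed[0]
    has_accent = any(unicodedata.combining(ch) for ch in decomposed[1:])
    if has_accent:
        return surname[:-1] + base + APOSTROPHE
    return surname

def query_term(surname: str) -> str:
    """Build the search term; quote it when it contains an apostrophe.

    Assumption: double quotes are the quoting character.
    """
    term = index_form(surname)
    if APOSTROPHE in term:
        return '"' + term + '"'
    return term

print(query_term("Menabò"))
print(query_term("Saggin"))
```

Under these assumptions, Menabò becomes the quoted term Menabo', while a simple surname such as Saggin is passed through unchanged. Any surname that already contains an apostrophe is quoted in the same way.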

c. Compound Surnames (1)
Example: Dalla Libera.
This is indexed as two separate terms: dalla and libera. At BioPD we chose AND as the default Boolean operator between terms in each field, so the search should be performed as in the following LIVE example.

Surname
Name
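The effect of a default AND between terms can be modelled with a small Python sketch. It is a toy model of the matching rule only, not the freeWAIS-sf tokenizer or query engine: the helpers index_terms and matches are hypothetical, and splitting on whitespace and lowercasing are assumptions.

```python
def index_terms(field_value: str) -> set[str]:
    """Split a field into separate lowercase terms (assumed tokenization)."""
    return set(field_value.lower().split())

def matches(query: str, field_value: str) -> bool:
    """With AND as the default operator, every query term must be present."""
    indexed = index_terms(field_value)
    return all(term in indexed for term in query.lower().split())

print(matches("Dalla Libera", "Dalla Libera"))
print(matches("Dalla Costa", "Dalla Libera"))
```

In this model a record is found only when both dalla and libera occur in the field, so a query for Dalla Costa does not match a record for Dalla Libera, while a query for Dalla alone matches every surname containing that term.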

d. Compound Surnames (2)
Example: Dell'Antone
As with accented surnames, the surname must be enclosed in quotes, because the apostrophe (ASCII 39) follows particular rules.
See also the document How to Perform a Search of Words Containing Apostrophes.
Try the following example.

Surname
Name

e. Complex (Double) Surnames
Example: Conte Camerino
This is indexed as two separate terms: Conte and Camerino. At BioPD we chose AND as the default Boolean operator between terms in each field, so the search should be performed as in the following LIVE example.


B. Surnames and First Letter of the Name

Another problem with fields containing SURNAMES is finding the subset of people who share the same surname and the same first letter of the name.
Suppose we want to find all people whose surname is Smith and whose name begins with J.
At present BioPD has no databases with this feature, but we plan to index fields containing both the surname and the initial of the name, joined by a - (minus, hyphen), because freeWAIS-sf does NOT allow one-letter words to be indexed.
We also plan to index this field with stemming on. Searches can then be performed in two ways:

Example #1


In this case only people called Smith J. are found.

Example #2


In this case all people whose surname stem is Smith are found.

By contrast, if the field contains the full surname and name, this rule does NOT apply.
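The planned surname-plus-initial field could be built along these lines. This is a minimal sketch of the joining step only: the function name surname_initial_key is hypothetical, the upper-case initial is an assumption, and the sketch says nothing about how freeWAIS-sf stems or matches the resulting term.

```python
def surname_initial_key(surname: str, name: str) -> str:
    """Join the surname and the first letter of the name with a hyphen.

    A one-letter word cannot be indexed on its own, so the initial
    is attached to the surname instead of being stored separately.
    """
    initial = name.strip()[:1].upper()  # assumption: initial stored upper-case
    if not initial:
        raise ValueError("name must contain at least one character")
    return f"{surname.strip()}-{initial}"

print(surname_initial_key("Smith", "John"))
```

For the example in the text, a person named John Smith would be stored under the single term Smith-J.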



THIS PAGE REFERENCES:
© 1996-97 BioPD - University of Padova - Author: Leopoldo Saggin
Mail to: lsaggin@civ.bio.unipd.it - Last Revision: August 21, 1997
Tested on Netscape 1.22 and higher