Tutorial – Guide to Effective Searching of the Internet

 


Part 6: Advanced Construction

This part builds on the Boolean operators and basic search concepts previously discussed to show how they can be combined into effective, complete queries. Much of the discussion concerns how to construct proper syntax. This part ends with a reprise of our sample search problem for Jan's mystery bird [see Topic 5].

Topic 19: Use of Parentheses
Topic 20: Combining Concepts for Power Searching
Topic 21: Punctuation and Capitalization
Topic 22: Multiple Queries and Query Refinements
Topic 23: Sample Information Problem Revisited


The guidance below should be generally applicable to most engines that support structured, Boolean syntax.

Topic 19: Use of Parentheses

Search services that support structured (Boolean) syntax do not always read from left to right as we do. Instead, they read "inside-out", in order of the nested levels of arguments set off by parentheses. Each bounded argument set off by parentheses is called a Boolean expression. (The entire query is also assumed to have parentheses around it, whether you put them in or not.) This is the same concept drummed home in high school math for evaluating an algebraic expression.

Learning how to construct this Boolean syntax structure is easy. You only need to remember four things:

  1. You define a Boolean expression through use of an open parenthesis ['('] to begin it, and a closed parenthesis [')'] to end it.
  2. Make sure the first search concept you want evaluated is at the inner-most level of your Boolean expressions, followed by subsequent expressions in your desired order.
  3. Make sure you have a balanced (equal) number of open and close parentheses in your entire query.
  4. Expressions at the same "level" are read in order, from left to right.

It is really worth your time to master these simple rules. Doing so adds immensely to your control over your queries and their ability to return the results you desire.
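Rules 1 and 3 above can be checked mechanically before you submit a query. The following is a minimal sketch, not any search engine's own logic; the function name `check_parentheses` is hypothetical, and the sketch assumes parentheses never appear inside quoted phrases.

```python
def check_parentheses(query: str) -> int:
    """Return the deepest nesting level, or raise ValueError if unbalanced."""
    depth = 0      # how many '(' are currently open
    deepest = 0    # the largest depth seen so far
    for position, char in enumerate(query):
        if char == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unmatched ')' at position {position}")
    if depth != 0:
        raise ValueError(f"{depth} unclosed '(' in query")
    return deepest


# Two sample queries; the second mirrors the idea of nested expressions.
balanced_depth = check_parentheses('("peregrine falcon*") AND (city OR cities)')
nested_depth = check_parentheses("THIRD (SECOND (FIRST))")
print(balanced_depth, nested_depth)  # prints: 1 2
```

A returned depth above three would be a hint that the query is more heavily nested than it needs to be.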

Though some search services support many layers of nested Boolean expressions, in practice the amount of nesting you need, or that is even desirable, is quite low: likely no more than three levels. To show a three-level example, consider the following dummy query: THIRD expression (SECOND expression (FIRST expression evaluated) evaluated) evaluated.

Note, you do not need to put parentheses around the entire query; the outermost layer is evaluated last in any case. But, even when you think the computer is going to do what you want, it is always safer to use parentheses if there is even a chance of confusion. Parentheses will also help you read your own searches.

Search Tip:
Don't heavily "nest" your parentheses. Remember, keep it simple!

In the absence of any nesting, or with expressions at equivalent levels, the order of query interpretation is from left to right. For example: FIRST expression AND SECOND OR THIRD AND FOURTH; or, (FIRST main subject) AND THIRD expression AND (SECOND expression).

AS A GENERAL RULE, YOU SHOULD ALWAYS PLACE YOUR MAIN SUBJECT TO BE EVALUATED FIRST. This is because many search engines determine the rank order of document results by relevance, with the query terms evaluated first ranked higher. This rule can be a bit tricky until you get used to it. For example, taking the last query example above, but forgetting the initial set of parentheses shown, produces the following: SECOND main subject AND THIRD expression AND (FIRST expression).

Using the form above, if you placed your main query subject first in your query expecting it to be evaluated first, you would get the unintended consequence of having it evaluated second.

Search Tip:
Don't assume an evaluation order. Specify the order you want by using parentheses.

Finally, Boolean operator precedence is enforced by most search engines, with AND and AND NOT being evaluated before OR. If you have doubts about operator precedence, consult the help system for the search engine being used. Our recommendation: eliminate ambiguity as to how a given engine treats operator precedence by explicitly putting your expressions into parentheses in the evaluation order you desire.

The OR operator should generally be used solely within nested expressions, and then mostly to capture synonyms.

For example, you may recall from our sample problem of Jan's mystery bird [Topic 5] that Jan wanted the concept of having seen the bird in the city as part of the query. Also recall there is a problem with picking up too many unwanted words when city is truncated as cit*. A good way to handle this problem is with a nested Boolean expression using OR. Thus, to capture both the singular and plural forms of city, Jan would write: (city OR cities). This expression now covers the singular and plural without inadvertently adding undesired words (such as 'citizen' or 'citrus') to the query term list.
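The difference between truncation and an explicit OR expression can be illustrated with a small word list. This is only a sketch: the word list is made up for illustration, and Python's standard `fnmatch` wildcard matching stands in for a search engine's truncation operator, which may behave differently.

```python
from fnmatch import fnmatchcase

# Hypothetical index terms that all begin with "cit".
words = ["city", "cities", "citizen", "citrus", "citation"]

# Truncation: cit* matches every word that starts with "cit".
truncated = [w for w in words if fnmatchcase(w, "cit*")]

# Nested OR expression: (city OR cities) matches only the two forms listed.
explicit = [w for w in words if w in {"city", "cities"}]

print(truncated)  # ['city', 'cities', 'citizen', 'citrus', 'citation']
print(explicit)   # ['city', 'cities']
```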

Whenever you mix Boolean operators in a query, you should always use parentheses to force the evaluation order you want. This helps avoid unintended consequences. For example, the following query (without parentheses): hawks AND eagles OR falcons AND owls OR vultures is actually evaluated as: (hawks AND eagles) OR (falcons AND owls) OR vultures.

The result of this expression is not very useful. The expression does not require any one term. You could end up with pages containing only vultures, or only owls and falcons, or only hawks and eagles. This is most likely not what you intended.
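To see why the unparenthesized query behaves this way, the sketch below parses a query with AND binding more tightly than OR and tests it against the words on a page. It is an illustration under stated assumptions, not how any particular engine is implemented: the helper names (`tokenize`, `parse`, `matches`) are hypothetical, only AND, OR and parentheses are handled, and the parenthesized "strict" query is one possible regrouping chosen for the example.

```python
import re


def tokenize(query):
    """Split a query into parentheses, operators and single-word terms."""
    return re.findall(r"\(|\)|\w+", query)


def parse(tokens):
    """Build a nested tuple tree in which AND binds more tightly than OR."""
    pos = 0

    def parse_or():
        nonlocal pos
        node = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            node = ("OR", node, parse_and())
        return node

    def parse_and():
        nonlocal pos
        node = parse_term()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            node = ("AND", node, parse_term())
        return node

    def parse_term():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = parse_or()  # innermost expression is resolved first
            pos += 1           # skip the closing ')'
            return node
        return token

    return parse_or()


def matches(node, page_words):
    """Evaluate the parsed query against the set of words on a page."""
    if isinstance(node, str):
        return node in page_words
    op, left, right = node
    if op == "AND":
        return matches(left, page_words) and matches(right, page_words)
    return matches(left, page_words) or matches(right, page_words)


loose = parse(tokenize("hawks AND eagles OR falcons AND owls OR vultures"))
strict = parse(tokenize("hawks AND (eagles OR falcons) AND (owls OR vultures)"))

vultures_only = {"vultures"}  # a page that mentions vultures and nothing else
loose_hit = matches(loose, vultures_only)
strict_hit = matches(strict, vultures_only)
print(loose_hit, strict_hit)  # prints: True False
```

The page that mentions only vultures satisfies the unparenthesized query but not the parenthesized one, which requires hawks plus one term from each group.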

Lastly, there are times when parentheses are not needed. This is when every operator in the query is the same: all AND, or all OR. For example, hawks AND eagles AND falcons AND owls AND vultures; or, hawks OR eagles OR falcons OR owls OR vultures.

The former requires all five types of bird to be included in a successful document; the latter only one. Additional examples of possible pitfall query syntax are shown in Topic 29.
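The all-AND and all-OR cases map directly onto "every term" and "at least one term". A minimal sketch, assuming a hypothetical page that mentions only hawks and owls:

```python
birds = ["hawks", "eagles", "falcons", "owls", "vultures"]
page_words = {"hawks", "owls"}  # hypothetical page contents

# hawks AND eagles AND falcons AND owls AND vultures: every term is required.
and_match = all(bird in page_words for bird in birds)

# hawks OR eagles OR falcons OR owls OR vultures: one term is enough.
or_match = any(bird in page_words for bird in birds)

print(and_match, or_match)  # prints: False True
```

Because the order of the terms does not change either result, no parentheses are needed in these two cases.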

Topic 20: Combining Concepts for Power Searching

A good rule of thumb when searching for relatively hard-to-find information on the Internet is to juxtapose three "concepts" in your query. The first concept should be your subject, defined at the proper level [Topic 10], with synonyms or phrases as appropriate to provide adequate yet accurate subject coverage. The other two concepts should correspond to two of the when, where, how and why concepts discussed in Topic 6.

Each of these concepts should be provided as a Boolean expression with the AND operator connecting all three. In the case of Jan's mystery bird example, the resulting query can be represented as:

[Figure 10: Example of AND NOT Operator]

Search Tip:
Try to link three concepts together in your queries, joining them with the AND operator.

Note how this acts to restrict your final results space. Posing this query to AltaVista in the form: ("peregrine falcon*") AND ("endangered species") AND (city OR cities) produces a results set of 1,721 documents, the first twenty of which (at least) directly respond to Jan's desired results [31]. The actual results from this search are discussed in Topic 23.

You should generally not need to exceed three concepts in a successfully constructed query; four is unusual. If you find you can't narrow them to two or three, double check to be sure all the concepts are necessary and all are at the right level.
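The three-concept pattern can be assembled mechanically: synonyms within a concept are joined with OR inside parentheses, and the concepts themselves are joined with AND. The helper names `concept` and `build_query` below are hypothetical, and the sketch only builds the query string; it does not submit it to any search service.

```python
def concept(*terms):
    """Join the synonyms for one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*concepts):
    """Join the concept expressions with AND."""
    return " AND ".join(concepts)


query = build_query(
    concept('"peregrine falcon*"'),    # the subject
    concept('"endangered species"'),   # second concept
    concept("city", "cities"),         # third concept, with its plural form
)
print(query)
# ("peregrine falcon*") AND ("endangered species") AND (city OR cities)
```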

Topic 21: Punctuation and Capitalization

Not all search engines handle punctuation equivalently. When in doubt, you should consult the help file of the search engine you are using.

Most search engines are insensitive to whether you use upper, lower or mixed case in your queries. If you use lower case, most engines will match on both upper and lower case, so lower case is the safest form for general searches. Where an engine does support upper or mixed case, using upper case characters tells the engine you want an exact match. Most engines also do not care whether you use upper or lower case for Boolean operators.
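For engines that do treat capitalization as significant, the matching rule described above can be sketched as follows. The function `term_matches` is hypothetical and reflects only the rule as stated here; individual engines may differ, so check the engine's help file.

```python
def term_matches(query_term: str, document_word: str) -> bool:
    """Apply the case rule: lower case matches any case, otherwise exact match."""
    if query_term.islower():
        # An all-lower-case query term matches regardless of the word's case.
        return query_term == document_word.lower()
    # Any upper case character in the query term requests an exact match.
    return query_term == document_word


print(term_matches("turkey", "Turkey"))  # True
print(term_matches("Turkey", "turkey"))  # False
print(term_matches("Turkey", "Turkey"))  # True
```

This is why capitalizing a proper name or place name can narrow results on such an engine, while a lower case query casts the wider net.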

For the few engines that do support capitalization, you can use this fact to advantage in finding proper names or place names. See Topic 38 for the capitalization features of major services.

Topic 22: Multiple Queries and Query Refinements

Strictly speaking, only one current Internet search tool supports multiple, simultaneous queries. However, a number of the search services support being able to pose additional queries to a previous results set [see Topic 38].

These can be very valuable techniques to you as a searcher. They enable you first to cast a fairly broad query, and then successively home in on desired results. With the search services, you can also use your browser's back arrow to try a search, evaluate, and, if you don't like the results, back up and start over again.
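The broad-then-narrow approach amounts to running a second query against the results of the first. The sketch below uses a made-up three-document collection and a hypothetical `search` helper that requires every term to appear as a whole word; real services rank and match in their own ways.

```python
# Hypothetical miniature document collection.
documents = {
    "doc1": "peregrine falcon nests on city ledges",
    "doc2": "peregrine falcon recovery in national parks",
    "doc3": "city pigeons and urban wildlife",
}


def search(terms, candidates):
    """Keep the candidates whose text contains every term (an implicit AND)."""
    return {
        name: text
        for name, text in candidates.items()
        if all(term in text.split() for term in terms)
    }


broad = search(["falcon"], documents)  # first, cast a fairly broad query
refined = search(["city"], broad)      # then pose a new query to those results

print(sorted(broad))    # ['doc1', 'doc2']
print(sorted(refined))  # ['doc1']
```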

As you first begin trying more advanced query techniques, we highly recommend that you start with those services that support query refinement. Doing so gives you a way to test out ideas and put into action some of the concepts discussed here.

Topic 23: Sample Information Problem Revisited

In Topic 5, we met Jan, who encountered a mystery hunting bird. Through successive refinement of the subject, Boolean expressions and query syntax, Jan found a listing of 1,721 Web documents, the most highly ranked of which met the desired results [Topic 20].

Here's what Jan discovered [32, 33, 34, 35]:

  • The mystery bird was a male peregrine falcon. Nearly lost to extinction, at least in the Eastern U.S., the bird was making a stunning comeback through a combination of breeding-and-release programs and a cleaner environment free of DDT
  • Peregrine falcons had found a natural home in downtown cities, where the building ledges gave them protection as their natural cliff habitats had, and where there were plenty of delectable pigeons to feed on
  • Breeding pairs of peregrine falcons were now found in such urban areas as Cincinnati, Dayton, Columbus, New York City, Cleveland, Toledo, Chicago, Milwaukee, Toronto, Montreal, Philadelphia, Wilmington, Baltimore, Washington, DC, Salt Lake City and Pittsburgh
  • From a base of zero in the 1970s, there are more than 1,000 breeding pairs now known east of the Rocky Mountains
  • Live-cams showing peregrine falcon nests on building ledges are now being beamed 24 hrs per day over the Internet from Toronto, Montreal, Columbus and Pittsburgh
  • Jan's sighting in Minneapolis was the first recorded in that city
  • Tremendous additional information was gained about great viewing sites for peregrine falcons at nature preserves and general information about the species.

Jan came to understand that the recovery of peregrine falcons was one of the great environmental success stories of the past two decades. Jan is presently setting up Minneapolis' own live-cam to monitor the new breeding pair in that city. Jan is also a local celebrity and resident authority on peregrine falcons.
