Tutorial – Guide to Effective Searching of the Internet
[ Previous | Home | Index | Next ]
Filters provide a different dimension or perspective by which you can "slice and dice" your search results. They are totally independent of the query. Filters determine the population to which a given query can apply.
Filters provide a useful complement to queries to target and restrict your results.
Most of the major search engines support filters to greater or lesser degrees. Some also offer filter capabilities unique to themselves. For certain specialty searches or needs, you can use these unique filter capabilities to great advantage. You may want to check out the comparison chart in Topic 38 to see how the major engines stack up and which unique capabilities they offer.
Topic 30: Site Filters
Topic 31: Size Filters
Topic 32: Date Filters
Topic 33: Specialty Filters and Search Options
Topic 30: Site Filters
| http:// | completeplanet | .com | .us | /works/howitworkscallout.asp |
| 1 | 2 | 3 | 4 | 5 |
- The http:// is a standard prefix to all Web site addresses. You may not even see it in all cases, because if it is lacking, your browser assigns the prefix to the URL. You should ignore it when using site filters (in other words, DO NOT enter it or use it!).
- The www.completeplanet is the subdomain name. It often has a www prefix (for World Wide Web), or it may not. You can generally ignore the www in any case with site filtering. The subdomain is all information that appears between the http:// and the major domain or country name (3 and 4). It can sometimes appear in multiple parts, especially for larger organizations that may have multiple servers accessing the Internet. For example, for an educational institution you might see bigserver1.mystateu shown as the subdomain name. Most often the subdomain contains identifying information about the organization (cornell, microsoft, ibm) that can be very useful to search on.
- The generic, major domain name (com, in this case) is shown in this field. This is one of the broadest and most useful site restrictions you can apply to specialty searches. The use of generic domains is heavily oriented to United States sites. Right now, the major domain names are:
com — companies and commercial sites
edu — educational institutions
gov — government organizations
mil — military organizations
net — Internet service providers and services
org — non-profit organizations
Proposed additions to these major domains are:
arts — entities emphasizing cultural and entertainment activities
firm — businesses, or firms
info — information service providers
nom — for those wishing individual or personal nomenclature
rec — emphasizing recreation/entertainment activities
store — businesses offering goods for purchase
web — entities emphasizing activities related to the World Wide Web
- Country domains (also known as geographical or ISO 3166 domains) are the top-level domains maintained by every country and territory in the world. These domains are organized by locality, and are useful to organizations and businesses that wish to operate overseas or that want to protect their company or brand identity. Like generic domains, country domains are accessible to any user of the Internet. Country domains have two-letter designators, e.g. .fr for France, .uk for the United Kingdom, .au for Australia, .us for the United States (not generally used), etc. There are over 230 top-level geographical domains, of which about 190 currently accept domain registrations. You may obtain a complete listing of these abbreviations from [37].
- All information prior to this point identifies how to get to the given physical location where your Web documents reside. Field 5 represents the path and specific Web pages at that location internal to that site. This field can contain useful information, such as howitworks, but is sometimes quite cryptic and often can be quite long. Note that absent a designation in this field you are generally directed to the home, index or main page of the given site. Also note that some engines that support site filtering do not allow you to search in this field.
Generally, fields 2, 3 and 4 are the most useful when restricting sites. Field 5 is subject to much variation and is not always supported. We recommend that you use it only when you know in advance the specific document(s) you are looking for.
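The five fields described above can be pulled apart programmatically. The following is a minimal sketch, assuming Python's standard urllib.parse module; the function name split_url_fields and the rule that a trailing two-letter label is a country domain are illustrative assumptions, not part of any search engine's behavior.

```python
from urllib.parse import urlsplit

# Generic (major) domains as listed in this tutorial.
GENERIC_DOMAINS = {"com", "edu", "gov", "mil", "net", "org"}

def split_url_fields(url):
    """Split a URL into the tutorial's five fields (hypothetical helper)."""
    # Browsers add the prefix when it is missing; do the same here.
    parts = urlsplit(url if "://" in url else "http://" + url)
    labels = parts.hostname.split(".")
    country = ""
    generic = ""
    # Field 4: assume a trailing two-letter label is a country domain.
    if len(labels) > 1 and len(labels[-1]) == 2:
        country = labels.pop()
    # Field 3: a trailing label taken from the generic domain list.
    if len(labels) > 1 and labels[-1] in GENERIC_DOMAINS:
        generic = labels.pop()
    return {
        1: parts.scheme + "://",   # standard prefix
        2: ".".join(labels),       # subdomain, as the tutorial uses the term
        3: generic,                # generic (major) domain
        4: country,                # country domain
        5: parts.path,             # path and page within the site
    }

fields = split_url_fields(
    "http://www.completeplanet.com.us/works/howitworkscallout.asp"
)
for number, value in fields.items():
    print(number, value)
```

Fields 2, 3 and 4 of the result are the values you would normally supply to a site filter; field 1 is dropped, and field 5 is used only when the engine supports it.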
When using site filters, be careful not to enter too broad a specification. For example, using 'com' as a site filter specification would match sites in the '.com' domain, but also sites such as commonplace.edu, commercial.net or markettips.org/commercialization.aspl. Attentive use of periods ('.') and slashes ('/') can help narrow your restrictions on those search engines that support the site filtering feature.
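The difference between a broad and a narrow specification can be illustrated with a small sketch. It assumes, purely for illustration, that a site filter behaves like a substring match on the address; real engines may match differently. The function names are hypothetical, and the first address in the list is made up to stand in for a genuine '.com' site.

```python
ADDRESSES = [
    "completeplanet.com/works/howitworkscallout.asp",
    "commonplace.edu",
    "commercial.net",
    "markettips.org/commercialization.aspl",
]

def substring_filter(spec, addresses):
    """Broad: keep any address that contains spec anywhere."""
    return [a for a in addresses if spec in a]

def domain_filter(domain, addresses):
    """Narrow: keep addresses whose host part ends with '.' + domain."""
    suffix = "." + domain
    kept = []
    for address in addresses:
        host = address.split("/", 1)[0]  # text before the first slash
        if host.endswith(suffix):
            kept.append(address)
    return kept

broad = substring_filter("com", ADDRESSES)
narrow = domain_filter("com", ADDRESSES)
print(len(broad), "addresses match 'com' as a bare substring")
print(len(narrow), "address matches '.com' as the domain of the host")
```

Under these assumptions the bare 'com' specification keeps all four addresses, while anchoring the match on the period and the end of the host keeps only the one in the '.com' domain.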
Topic 31: Size Filters
Presently, no major search services are known to filter documents by size.
Topic 32: Date Filters
Date filters can be especially useful when researching time-sensitive information. On engines that support this feature, you can restrict retrievals to documents modified since a certain date or within a range of dates.
Date filtering provides a good argument for keeping a record of your exact query and its date for very important searches. Then, should you want to see what results have been updated or added to the Internet since your last search, you can simply re-submit the initial query and select the appropriate date restriction.
There is a caveat to date filters, however. The dates used by the engines are (generally) the date the page was indexed, not the date it was created. (Date-created fields are available to Web developers, but not all use them, and not all engines read this field anyway.) Some search engines run days to weeks behind in indexing pages. To prevent possible gaps in your date searches, you may want to move the start date back by three weeks or so from the absolute date you want to filter.
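The record keeping and the three-week margin described above can be sketched as follows. This is a minimal illustration using Python's standard datetime module; the query text, the date and the names search_log and padded_start_date are hypothetical.

```python
from datetime import date, timedelta

# Safety margin for indexing lag: "three weeks or so" per the text above.
INDEX_LAG = timedelta(weeks=3)

def padded_start_date(last_search, lag=INDEX_LAG):
    """Return the start date to enter in the date filter."""
    return last_search - lag

# A record of the exact query and the date it was run (made-up values).
search_log = {"query": "site filters tutorial", "run_on": date(2024, 3, 22)}

start = padded_start_date(search_log["run_on"])
print("Re-submit:", search_log["query"])
print("Restrict to documents dated on or after:", start.isoformat())
```

Moving the start date back means some results from the earlier search will reappear, which is the price of not missing pages that were indexed late.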
Topic 33: Specialty Filters and Search Options
In the competitive race to provide more features, many search engines are providing specialty filters and search options. For a listing of these features by major service, see Topic 38. Here, we describe what options are available. Please note that these options are supported by only a limited number of services, and that different services may describe them slightly differently; consult their specific help files.
- People's Names - offered as a specific option only by Yahoo (which uses Four11); other services that support mixed capitalization can accomplish the same thing. Also, though it is not a specific option, AltaVista will search for any name entered in place of a URL. In addition, there are special engines on the Internet specifically for finding people, such as Switchboard. See the section on specialty engines, Topic 39.
- Depth - provides the ability to retrieve additional pages from a given site; 'depth' is the number of nested levels to retrieve
- Anchor - finds pages that contain the specified word or phrase within the text of a link. For example, 'Click here to download' could be text associated with a link. If that phrase is specified with this option, documents that contain it as link text are returned as results
- Applet - identifies documents with Java applets corresponding to the name provided
- Domain - finds documents restricted to the country or generic domain specified
- Host - finds documents on the specific computer specified for 'host'
- Image - identifies documents with images (graphics) corresponding to the filename specified
- Link - finds documents with links to the URL specified as the argument
- Title - identifies documents that contain the word or phrase specified in their titles
- URL - finds documents whose URLs match the word or phrase specified
- File/Media Types - identifies documents which contain the file or media type specified; useful, for example, in finding documents with audio or video
- Business Document Types - restricts retrieval of documents to those matching the document types of press releases, product reviews or job listings.
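Options such as Title, Domain or Link are typically combined with ordinary query terms. The sketch below assumes a generic keyword:value notation purely for illustration; it is not the syntax of any particular service, and the function name build_query is hypothetical. Check each engine's help files for the notation it actually accepts.

```python
def build_query(terms, **filters):
    """Join plain terms with keyword:value filter clauses (assumed syntax)."""
    clauses = list(terms)
    for keyword, value in filters.items():
        # Quote multi-word values so they are treated as one phrase.
        if " " in value:
            value = '"' + value + '"'
        clauses.append(keyword + ":" + value)
    return " ".join(clauses)

query = build_query(["filters"], title="search tutorial", domain="edu")
print(query)
```

Keeping the filter clauses separate from the plain terms makes it easy to re-run the same query with a different restriction, for example a different domain.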