(SIOUX FALLS, SD) - The World Wide Web just got 500 times larger. BrightPlanet,
an Internet content company, has completed the first-ever study
documenting the "deep" Web, a massive storehouse of databases and
information that is unseen by existing search engines.
In addition, BrightPlanet researchers today released unique technology
called the LexiBot™ that automatically identifies deep Web sites, and
then retrieves, qualifies, and classifies content from all relevant
sites with pinpoint accuracy. Nearly 20,000 of these deep Web
searchable sites are listed on the CompletePlanet Web site, also
released today.
"Others have
termed searchable databases the 'invisible Web,' a misnomer because the
content is only 'invisible' to search engines, not to our direct-query
search technology," Mike Bergman, BrightPlanet's Chairman, said. "But
frankly, what's been missed until now is the absolutely huge scale,
importance and quality of information within the deep Web."
The BrightPlanet study estimates there are more than 100,000
content-rich searchable databases publicly available within the deep
Web. Bergman said these sites collectively have information relevant to
any need, citing as examples IBM's patent site, 10KWizard's database of
SEC company filings, genome databases, the Costa Rica Supersite,
genealogy records, historical sports statistics, NIH PubMed biomedical
publications, and law cases and decisions.
Other findings from BrightPlanet's 41-page white paper, The "Deep" Web: Surfacing Hidden Value, include:
- The deep Web contains nearly 550 billion individual documents, compared to the 1 billion of the "surface" Web indexed by search engines
- The deep Web contains 7,500 terabytes of information, compared to 19 terabytes of information in the surface Web
- The deep Web is the fastest growing category of new information on the Internet
- Total quality content of the deep Web is at least 1,000 to 2,000 times greater than that of the surface Web
- Deep Web content is highly relevant to every information need, market and domain
- A full 95% of the deep Web is publicly accessible information - not subject to fees or subscriptions
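The ratios implied by the first two findings can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the list above; the variable names are illustrative.

```python
# Figures quoted in the white paper summary above.
deep_docs = 550_000_000_000    # nearly 550 billion documents in the deep Web
surface_docs = 1_000_000_000   # about 1 billion documents in the surface Web
deep_tb = 7_500                # terabytes of information in the deep Web
surface_tb = 19                # terabytes of information in the surface Web

doc_ratio = deep_docs / surface_docs
size_ratio = deep_tb / surface_tb

print(f"Document count ratio: {doc_ratio:.0f}x")
print(f"Storage size ratio: {size_ratio:.0f}x")
```

By document count the deep Web comes out at about 550 times the surface Web, in line with the "500 times larger" headline. By storage size the ratio is about 395 times.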
The reason the deep Web has been hidden in plain sight is today's
reliance on search engines for content discovery on the Web. Existing
search engines catalog the surface Web using spiders or crawlers that
follow links on static Web pages, akin to ripples spreading across a
pond.
The deep Web is made up of searchable databases, with results that are
only served up dynamically in answer to a direct query. Though search
engines may point to the doorways of these databases, they cannot find
or search the contents housed inside. Search engines can knock on the
door but not get in.
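The difference between following links and asking a direct query can be illustrated with a toy model. This is a minimal sketch, not BrightPlanet's implementation: the page names, the in-memory "database" and the functions `crawl` and `direct_query` are all hypothetical.

```python
from collections import deque

# Toy surface Web: page name -> pages it links to (all names hypothetical).
STATIC_LINKS = {
    "home": ["about", "patent-search-form"],
    "about": ["home"],
    "patent-search-form": [],  # the "doorway": a form with no links to records
}

# Toy deep Web database: records exist only as answers to a query.
PATENT_DB = {
    "sonar": ["Patent A: sonar-guided fishing line"],
    "net": ["Patent B: coarse trawling net"],
}

def crawl(start):
    """Breadth-first link following, as a spider does on static pages."""
    seen, queue = {start}, deque([start])
    while queue:
        page = queue.popleft()
        for target in STATIC_LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def direct_query(term):
    """Ask the database itself; results are produced per query."""
    return PATENT_DB.get(term, [])

crawled = crawl("home")
results = direct_query("sonar")
print(sorted(crawled))   # pages reachable by links, including the doorway
print(results)           # records reachable only by querying
```

In this model the crawler reaches the search form but none of the records behind it, because no static link leads to them.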
BrightPlanet's technology uniquely and automatically identifies deep
Web sites and retrieves their contents. The technology asks a direct
query to information sources: 'Do you have what I want?', in a distinct
language that the sources understand. BrightPlanet's direct-query
technology searches multiple sources simultaneously and then uses
proprietary computational linguistics techniques to automatically
qualify and organize only the most relevant results.
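The general pattern described here, querying several sources at once and then keeping only the most relevant results, can be sketched as follows. This is an illustration under stated assumptions, not BrightPlanet's method: the sources are in-memory stand-ins, and the term-overlap score is a deliberately naive substitute for the proprietary computational linguistics techniques, which the release does not describe.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for searchable databases.
SOURCES = {
    "patents": ["sonar guided fishing line", "solar panel mount"],
    "filings": ["annual report on fishing fleet", "quarterly sonar sales report"],
    "biomed": ["genome sequencing methods"],
}

def query_source(name, terms):
    """Return (source, document) pairs that mention any query term."""
    return [(name, doc) for doc in SOURCES[name]
            if any(t in doc.split() for t in terms)]

def score(doc, terms):
    """Naive relevance: number of distinct query terms found in the document."""
    words = set(doc.split())
    return sum(1 for t in terms if t in words)

def search(query, min_score=1):
    """Query all sources concurrently, then rank hits by the naive score."""
    terms = query.lower().split()
    with ThreadPoolExecutor() as pool:
        # One task per source; map() returns results in source order.
        batches = list(pool.map(lambda n: query_source(n, terms), SOURCES))
    hits = [hit for batch in batches for hit in batch]
    ranked = sorted(hits, key=lambda h: score(h[1], terms), reverse=True)
    return [h for h in ranked if score(h[1], terms) >= min_score]

results = search("sonar fishing")
for source, doc in results:
    print(source, "->", doc)
```

A real system would send network requests in each source's own query syntax. The thread pool here only shows how the queries can be issued in parallel.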
Thane Paulsen, BrightPlanet's General Manager, likens traditional
search engines to trawlers moving through the ocean, using coarse nets
that are wide, but only reach a few feet deep. By contrast, he compares
BrightPlanet's technology to multiple fishing lines precisely guided by
sonar to find, capture and pull up specific information from the deep
and surface Web.
"What this means in practical terms is that entire new realms of
information can now be made available through the use of our
technology," Paulsen said. "We're packaging our technology to provide
automated quality content to Web-enabled businesses, as a standalone
search tool for individual power searchers, and through our own Web
site as a service to the public."
"Our search technology can be easily customized to meet the specialized
needs of vertical markets," Paulsen said. "By pre-qualifying content
from the surface Web and industry-specific deep Web databases, we can
automatically build a superior server-side search function for B2B
portals or other content sites. We are now licensing this remarkable
technology to businesses seeking a way to differentiate their Web sites
in a competitive marketplace."
Professional Internet searchers can also use BrightPlanet's technology
through a standalone search tool called the LexiBot™, available to the
public for $89.95 (Currently $289.95 for Version 2.5) after a free
thirty-day trial. LexiBot customers can simultaneously search the
content of both surface Web search engines and deep Web databases to
obtain accurate, sorted results automatically. "We're also configuring
the LexiBot with market-specific enhancements as a private label
client-side search tool for business," Paulsen added.
BrightPlanet also unveiled today its CompletePlanet Web site, the
premier portal to search engines and databases, with nearly 20,000
search sites listed. "We intend CompletePlanet to remain the complete
source to all things search on the Internet," Paulsen said. "It is also
an example of how we automatically harvest, qualify, organize and
summarize relevant content for other markets."
BrightPlanet's deep Web white paper is available online or for download from http://www.completeplanet.com/Tutorials/DeepWeb/index.asp.
BrightPlanet's CompletePlanet access point to search engines and searchable databases is at http://www.completeplanet.com.
For more information or to set up interviews, contact Bryan Bjerke, Media Relations, at 605.331.6012.
***
BrightPlanet's Enterprise Services division is a premier provider of
Internet content infrastructure and search data to Web-enabled
businesses using automated technologies for content discovery,
retrieval, aggregation, qualification and classification. Its
Professional Services division provides search content tools and
authoritative search sites to individual information professionals.
BrightPlanet.com LLC is a privately held company founded in 1999 and based in Sioux Falls, SD. Its corporate Web site is at http://www.brightplanet.com.