BrightPlanet Unveils the 'Deep' Web: 500 Times Larger than the Existing Web
(SIOUX FALLS, SD) - The World Wide Web just got 500 times larger. BrightPlanet, an Internet content company, has completed the first-ever study documenting the "deep" Web, a massive storehouse of databases and information that existing search engines cannot see.

In addition, BrightPlanet researchers today released unique technology called the LexiBot™ that automatically identifies deep Web sites, and then retrieves, qualifies, and classifies content from all relevant sites with pinpoint accuracy. Nearly 20,000 of these deep Web searchable sites are listed on the CompletePlanet Web site, also released today.

"Others have termed searchable databases the 'invisible Web,' a misnomer because the content is only 'invisible' to search engines, not to our direct-query search technology," Mike Bergman, BrightPlanet's Chairman, said. "But frankly, what's been missed until now is the absolutely huge scale, importance and quality of information within the deep Web."

"Deep" Web Study

The BrightPlanet study estimates there are more than 100,000 content-rich searchable databases publicly available within the deep Web. Bergman said these sites collectively have information relevant to any need, citing as examples IBM's patent site, 10KWizard's database of SEC company filings, genome databases, the Costa Rica Supersite, genealogy records, historical sports statistics, NIH PubMed biomedical publications, and law cases and decisions.

Other findings from BrightPlanet's 41-page white paper, The "Deep" Web: Surfacing Hidden Value, include:

  • The deep Web contains nearly 550 billion individual documents, compared to the 1 billion documents of the "surface" Web indexed by search engines
  • The deep Web contains 7,500 terabytes of information, compared to 19 terabytes of information in the surface Web
  • The deep Web is the fastest growing category of new information on the Internet
  • Total quality content of the deep Web is at least 1,000 to 2,000 times greater than that of the surface Web
  • Deep Web content is highly relevant to every information need, market and domain
  • A full 95% of the deep Web is publicly accessible information - not subject to fees or subscriptions
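
As a quick arithmetic check on the size figures in the list above, the short Python sketch below computes the deep-to-surface ratios for document count and data volume. The variable names are illustrative only. The release does not state how the headline "500 times" figure was derived; the two ratios computed here simply fall on either side of it.

```python
# Figures quoted in the findings above (documents in billions, sizes in terabytes).
deep_docs_billion = 550      # "nearly 550 billion" deep Web documents
surface_docs_billion = 1     # surface Web documents indexed by search engines
deep_terabytes = 7_500       # deep Web data volume
surface_terabytes = 19       # surface Web data volume

# Ratio of deep Web to surface Web for each measure.
doc_ratio = deep_docs_billion / surface_docs_billion
size_ratio = deep_terabytes / surface_terabytes

print(f"Documents: deep Web is about {doc_ratio:.0f} times the surface Web")
print(f"Data volume: deep Web is about {size_ratio:.0f} times the surface Web")
```

By document count the ratio is 550 to 1; by data volume it is a little under 400 to 1.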

Direct-query Search Technology

The reason the deep Web has been hidden in plain sight is today's reliance on search engines for content discovery on the Web. Existing search engines catalog the surface Web using spiders or crawlers that follow links on static Web pages, akin to ripples spreading across a pond.

The deep Web is made up of searchable databases, with results that are served up dynamically only in answer to a direct query. Though search engines may point to the doorways of these databases, they cannot find or search the contents housed inside. Search engines can knock on the door but cannot get in.
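
The difference between link-following and direct querying can be sketched with a toy example. Everything in the Python snippet below is hypothetical (the page paths, the records, and the function names `crawl` and `direct_query`); it is not BrightPlanet's technology, only a minimal illustration of why a spider that follows links reaches the search page but none of the records behind it.

```python
from collections import deque

# Hypothetical toy site: static pages and the links each one contains.
STATIC_PAGES = {
    "/": ["/about", "/search"],
    "/about": ["/"],
    "/search": [],  # the "doorway": a query form with no links to individual records
}

# Hypothetical database behind the /search form; records exist only as query results.
RECORDS = {
    "patent-001": "method for sonar-guided fishing",
    "patent-002": "coarse trawler net design",
    "case-17": "law case on fishing rights",
}

def crawl(start):
    """Follow links breadth-first, the way a surface Web spider does."""
    seen, queue = {start}, deque([start])
    while queue:
        page = queue.popleft()
        for link in STATIC_PAGES.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen

def direct_query(term):
    """Ask the database directly; results are produced per query, not linked."""
    return sorted(key for key, text in RECORDS.items() if term in text)

crawled = crawl("/")
print(sorted(crawled))           # pages reachable by following links
print(direct_query("fishing"))   # records reachable only by querying
```

The crawl ends at the `/search` doorway because no static link leads to any record, while the query returns the matching records directly.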

BrightPlanet's technology uniquely and automatically identifies deep Web sites and retrieves their contents. The technology poses a direct query to information sources, 'Do you have what I want?', in a language that the sources understand. BrightPlanet's direct-query technology searches multiple sources simultaneously and then uses proprietary computational linguistics techniques to automatically qualify and organize only the most relevant results.

Thane Paulsen, BrightPlanet's General Manager, likens traditional search engines to trawlers moving through the ocean with wide, coarse nets that reach only a few feet deep. By contrast, he compares BrightPlanet's technology to multiple fishing lines precisely guided by sonar to find, capture and pull up specific information from the deep and surface Web.

Business and Consumer Applications

"What this means in practical terms is that entire new realms of information can now be made available through the use of our technology," Paulsen said. "We're packaging our technology to provide automated quality content to Web-enabled businesses, as a standalone search tool for individual power searchers, and through our own Web site as a service to the public."

"Our search technology can be easily customized to meet the specialized needs of vertical markets," Paulsen said. "By pre-qualifying content from the surface Web and industry-specific deep Web databases, we can automatically build a superior server-side search function for B2B portals or other content sites. We are now licensing this remarkable technology to businesses seeking a way to differentiate their Web sites in a competitive marketplace."

Professional Internet searchers can also use BrightPlanet's technology through a standalone search tool called the LexiBot™, available to the public for $89.95 (currently $289.95 for Version 2.5) after a free thirty-day trial. LexiBot customers can simultaneously search the content of both surface Web search engines and deep Web databases to obtain accurate, sorted results automatically. "We're also configuring the LexiBot with market-specific enhancements as a private label client-side search tool for business," Paulsen added.

BrightPlanet also unveiled today its CompletePlanet Web site, the premier portal to search engines and databases, with nearly 20,000 search sites listed. "We intend CompletePlanet to remain the complete source to all things search on the Internet," Paulsen said. "It is also an example of how we automatically harvest, qualify, organize and summarize relevant content for other markets."

Links and Contact

BrightPlanet's deep Web white paper is available online or for download from http://www.completeplanet.com/Tutorials/DeepWeb/index.asp.

BrightPlanet's CompletePlanet access point to search engines and searchable databases is at http://www.completeplanet.com.

For more information or to set up interviews, contact Bryan Bjerke, Media Relations, at 605.331.6012.

***

BrightPlanet's Enterprise Services division is a premier provider of Internet content infrastructure and search data to Web-enabled businesses using automated technologies for content discovery, retrieval, aggregation, qualification and classification. Its Professional Services division provides search content tools and authoritative search sites to individual information professionals.

BrightPlanet.com LLC is a privately held company founded in 1999 and based in Sioux Falls, SD. Its corporate Web site is at http://www.brightplanet.com.

 