The BrightPlanet SQSTR™ (pronounced "sequester") Text Engine is designed to ingest and index unstructured and semi-structured information (text and metadata) and to provide efficient search and retrieval over it. It is highly extensible and highly scalable, intended to manage both a diversity and a quantity of information unequaled by competing products.
The SQSTR Engine accepts input as XML, whether supplied by the BrightPlanet Harvest Engine or by any third-party information source. The information can come from a variety of sources, including text harvesters, streaming-media harvesters, or document-producing applications. Once input, the information is parsed, tokenized, and indexed in a proprietary, patented data store.
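As a rough illustration of the parse, tokenize, and index steps, the sketch below builds a generic positional inverted index from a small XML input. The SQSTR data store is proprietary, so nothing here reflects its actual format. The XML element names (`documents`, `doc`, `text`), the `id` attribute, and the function names are all assumptions made up for the example.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical input format; the real Harvest Engine XML schema is not described here.
SAMPLE_XML = """
<documents>
  <doc id="d1"><text>The engine indexes text and metadata</text></doc>
  <doc id="d2"><text>Metadata tags extend the index</text></doc>
</documents>
""".strip()

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def build_index(xml_string):
    """Map each token to {doc_id: [positions]} (a positional inverted index)."""
    index = defaultdict(lambda: defaultdict(list))
    root = ET.fromstring(xml_string)
    for doc in root.iter("doc"):
        doc_id = doc.get("id")
        text = doc.findtext("text", default="")
        for position, token in enumerate(tokenize(text)):
            index[token][doc_id].append(position)
    return index

index = build_index(SAMPLE_XML)
```

Keeping token positions, rather than only document identifiers, is one common way to support proximity operators and text reconstruction later on; whether SQSTR does this internally is not stated.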
The SQSTR Engine is a highly efficient component that requires, on average, only about 30 percent of the original document volume to store the information. Yet the structure is extremely powerful, allowing full reconstruction (which we call "re-hydration") of the original text, as well as sophisticated local data mining and search operations, including conventional and extended Boolean operators (such as NEAR, BEFORE, and AFTER). Scoring and ranking algorithms can be tailored to match application needs.
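To make the extended operators and re-hydration concrete, here is a minimal sketch over a per-document table of token positions. The exact semantics of SQSTR's NEAR, BEFORE, and AFTER are not given here, so the definitions below (including the `distance` parameter) are assumptions, and all function names are hypothetical. The reconstruction is also lossy in this toy version: it recovers the lowercased token sequence, not the original punctuation or case.

```python
import re

def positions(text):
    """Return {token: [positions]} for a single document."""
    table = {}
    for pos, tok in enumerate(re.findall(r"\w+", text.lower())):
        table.setdefault(tok, []).append(pos)
    return table

def near(table, a, b, distance):
    """True if a and b occur within `distance` tokens of each other, in either order."""
    return any(abs(i - j) <= distance
               for i in table.get(a, []) for j in table.get(b, []))

def before(table, a, b):
    """True if some occurrence of a precedes some occurrence of b."""
    pa, pb = table.get(a, []), table.get(b, [])
    return bool(pa and pb) and min(pa) < max(pb)

def after(table, a, b):
    """True if some occurrence of a follows some occurrence of b."""
    return before(table, b, a)

def rehydrate(table):
    """Rebuild the token sequence from the stored positions."""
    slots = {pos: tok for tok, plist in table.items() for pos in plist}
    return " ".join(slots[i] for i in range(len(slots)))

table = positions("Scoring and ranking algorithms can be tailored to match application needs")
print(near(table, "scoring", "ranking", 2))
print(before(table, "scoring", "needs"))
print(rehydrate(table))
```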
Additional sophisticated operations are supported, including extensive reporting, monitoring, and comparing of both individual and collected documents for changes over time. Metadata management and operations are fully supported and integrated with the full-text indexing, including textual metatags, numeric tags, hierarchical keys, and categorical keys. The metadata is extensible "on the fly": prior knowledge of the anticipated collection of metatags is not required.
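The idea of metadata that is extensible "on the fly" can be sketched with a schema-less index in which a tag comes into existence the first time a document uses it. This is only an illustration: the class, its methods, the `/`-separated convention for hierarchical keys, and the sample tag names are all assumptions, not part of the SQSTR API.

```python
from collections import defaultdict

class MetadataIndex:
    """Toy schema-less metadata index: tags are created when first seen."""

    def __init__(self):
        # tag name -> tag value -> set of document ids
        self._by_tag = defaultdict(lambda: defaultdict(set))

    def add(self, doc_id, **tags):
        """Attach any tags to a document; no tag needs to be declared beforehand."""
        for tag, value in tags.items():
            self._by_tag[tag][value].add(doc_id)

    def tags(self):
        """List every tag name seen so far."""
        return sorted(self._by_tag)

    def with_value(self, tag, value):
        """Textual or categorical lookup: exact match on a tag value."""
        return set(self._by_tag.get(tag, {}).get(value, set()))

    def in_range(self, tag, low, high):
        """Numeric lookup: documents whose tag value lies in [low, high]."""
        hits = set()
        for value, docs in self._by_tag.get(tag, {}).items():
            if isinstance(value, (int, float)) and low <= value <= high:
                hits |= docs
        return hits

    def under(self, tag, prefix):
        """Hierarchical lookup: keys equal to `prefix` or nested below it."""
        hits = set()
        for value, docs in self._by_tag.get(tag, {}).items():
            if isinstance(value, str) and (value == prefix or value.startswith(prefix + "/")):
                hits |= docs
        return hits

meta = MetadataIndex()
meta.add("d1", author="smith", year=2003, topic="science/biology")
# The 'language' tag appears here for the first time, with no prior declaration.
meta.add("d2", year=2005, topic="science/physics", language="en")
```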
The SQSTR Engine is highly scalable and, through a fully distributed architecture, is able to support individual indices of tens or hundreds of millions of documents. System performance for ingest and retrieval is unmatched.
Application Program Interface (API)
The SQSTR Engine API is a comprehensive set of calls, written in Java. The Text Engine can be distributed across as many servers as needed to accommodate the document volumes you anticipate for your application. Full documentation is provided; contact BrightPlanet to see a copy.