Search API features
With SeekStorm real-time search and indexing, a document becomes searchable the moment it is indexed.
The SeekStorm search API can be used to implement instant search, where results are retrieved and displayed while you are still typing the search query. SeekStorm uses extremely fast spelling correction and query completion to accomplish this complex task with sub-millisecond latency.
SeekStorm provides automatic spelling correction for queries. We use SymSpell, the Symmetric Delete spelling correction algorithm developed by us, to achieve spelling correction and fuzzy search that is 1 million times faster than other algorithms.
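The idea behind Symmetric Delete can be illustrated with a minimal sketch. It is not the SymSpell implementation: the class and function names are hypothetical, and candidates are verified with plain Levenshtein distance for brevity. Deletes of every dictionary word are precomputed at indexing time, so a lookup only has to generate deletes of the query term and intersect them with the index.

```python
from collections import defaultdict


def deletes(word, max_distance):
    """Return the word plus every string reachable by deleting up to max_distance characters."""
    results = {word}
    frontier = {word}
    for _ in range(max_distance):
        next_frontier = set()
        for w in frontier:
            for i in range(len(w)):
                next_frontier.add(w[:i] + w[i + 1:])
        results |= next_frontier
        frontier = next_frontier
    return results


def edit_distance(a, b):
    """Plain Levenshtein distance, used to verify candidates (an assumption of this sketch)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


class DeleteIndex:
    """Hypothetical dictionary that maps precomputed deletes to the words they came from."""

    def __init__(self, max_distance=2):
        self.max_distance = max_distance
        self.frequency = defaultdict(int)
        self.index = defaultdict(set)

    def add(self, word, count=1):
        self.frequency[word] += count
        for d in deletes(word, self.max_distance):
            self.index[d].add(word)

    def lookup(self, term):
        # Only deletes of the query term are generated, never inserts or replacements.
        candidates = set()
        for d in deletes(term, self.max_distance):
            candidates |= self.index.get(d, set())
        scored = []
        for c in candidates:
            dist = edit_distance(term, c)
            if dist <= self.max_distance:
                scored.append((dist, -self.frequency[c], c))
        # Closest match first, ties broken by higher frequency.
        return [c for _, _, c in sorted(scored)]


idx = DeleteIndex(max_distance=2)
for word, count in [("search", 10), ("storm", 5), ("seek", 3)]:
    idx.add(word, count)
print(idx.lookup("seach"))
```

The trade-off in this sketch is memory for speed: the index stores many delete variants per word, but lookups avoid enumerating every possible insertion, replacement, and transposition of the query term.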
SeekStorm provides auto-completion for queries. For incomplete queries (e.g. while typing), SeekStorm provides a list of suggestions with the most likely matching queries. The completion dictionary is automatically compiled during indexing from all indexed fields and is updated in real time with every newly indexed document. A separate dictionary is created for each index, ensuring language independence, support for user- and domain-specific vocabulary, and privacy.
SeekStorm supports automatic query rewriting. If the query is corrected or completed and the instant parameter is set to true, then the query is automatically rewritten and replaced with the best matching suggestion, and results are immediately returned for the new corrected query.
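A per-index completion dictionary that grows with every indexed document could look roughly like the following sketch. The class name, the tokenization rule, and the frequency-based ranking are assumptions made for illustration, not SeekStorm's actual data structure.

```python
import re
from collections import Counter


class CompletionDictionary:
    """Hypothetical per-index dictionary built from the text of all indexed fields."""

    def __init__(self):
        self.counts = Counter()

    def index_document(self, doc):
        # Every string field contributes its terms, so the dictionary
        # follows the vocabulary of the indexed content.
        for value in doc.values():
            if isinstance(value, str):
                self.counts.update(re.findall(r"\w+", value.lower()))

    def complete(self, prefix, limit=5):
        # Rank matching terms by frequency, then alphabetically.
        prefix = prefix.lower()
        matches = [(-n, t) for t, n in self.counts.items() if t.startswith(prefix)]
        return [t for _, t in sorted(matches)[:limit]]


dictionary = CompletionDictionary()
dictionary.index_document({"title": "Search engine", "text": "A search server for searching"})
print(dictionary.complete("sea"))
```

Because each index owns its own dictionary instance, suggestions never leak terms from another index, which is the privacy property described above.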
When enabled, key text extraction removes ads and navigation elements from the fetched web page before storing and indexing. This increases the relevance of the search results, indexing speed, and query speed, and reduces the index size. Key text extraction is enabled with keytextOnly=true in Create crawljob via the REST API.
Sometimes the title of a web page contains the same repetitive string on each page, as in "Crawler - Wikipedia". SeekStorm is able to detect and remove those strings. In this example "- Wikipedia" would be removed, while "Crawler" would remain and be rewritten as the title of this page in the index. Title rewriting improves both ranking and perceived relevance. Title rewriting is enabled with keytextOnly=true in Create crawljob via the REST API.
Besides full-text search, SeekStorm is able to restrict the search to specific fields within the indexed JSON documents, e.g. to title, URL or domain, author, or product category. This allows the user of the search API to implement a faceted search.
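One simple way to detect such repetitive strings is to count how often the same trailing segment occurs across the titles of a crawl. The sketch below assumes this heuristic; the separator list, the threshold, and the function names are hypothetical and not taken from SeekStorm.

```python
from collections import Counter

# Assumed separators between the page-specific part and the site-wide part.
SEPARATORS = (" - ", " | ")


def split_suffix(title):
    """Split a title at its last separator; return (head, suffix) or (title, '')."""
    best = -1
    for sep in SEPARATORS:
        best = max(best, title.rfind(sep))
    if best < 0:
        return title, ""
    return title[:best], title[best:]


def rewrite_titles(titles, min_share=0.5):
    """Strip a trailing segment that repeats on at least min_share of the pages."""
    suffix_counts = Counter(split_suffix(t)[1] for t in titles)
    suffix_counts.pop("", None)
    repetitive = {
        s for s, n in suffix_counts.items()
        if n > 1 and n / len(titles) >= min_share
    }
    rewritten = []
    for t in titles:
        head, suffix = split_suffix(t)
        # Keep the original title if stripping would leave nothing.
        rewritten.append(head.strip() if suffix in repetitive and head.strip() else t)
    return rewritten


print(rewrite_titles(["Crawler - Wikipedia", "Search engine - Wikipedia", "Main Page"]))
```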
Fused search results
SeekStorm enables the aggregation of information and data from different sources and different crawl jobs. It also allows adding auxiliary fields and values (e.g. genre='comedy', product-category='grocery', language='english') to the JSON document created for every crawled document. Those fields will then be returned together with the search results.
The SeekStorm search API supports field filters such as intitle, intext, inurl, site, allintitle, allinurl, and allintext. Additionally, for each field defined in the index there is a field filter of the same name available.
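Conceptually, fusing sources amounts to tagging each crawled JSON document with the auxiliary fields of its crawl job and collecting everything in one index. The sketch below shows only that idea with plain dictionaries; the function names are hypothetical, and the rule that crawled fields win over auxiliary fields on a name conflict is an assumption.

```python
def attach_auxiliary_fields(document, auxiliary):
    """Return a copy of the document with auxiliary fields added.

    Assumption: a field extracted by the crawler takes precedence over an
    auxiliary field of the same name.
    """
    return {**auxiliary, **document}


def fuse_crawl_jobs(jobs):
    """jobs is a list of (documents, auxiliary_fields) pairs, one pair per crawl job."""
    fused = []
    for documents, auxiliary in jobs:
        fused.extend(attach_auxiliary_fields(d, auxiliary) for d in documents)
    return fused


movies = [{"url": "https://example.com/movie/1", "title": "A movie"}]
products = [{"url": "https://example.org/item/7", "title": "An item"}]
fused = fuse_crawl_jobs([
    (movies, {"genre": "comedy", "language": "english"}),
    (products, {"product-category": "grocery", "language": "english"}),
])
for doc in fused:
    print(doc)
```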
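To show how field filters can be separated from free-text terms, here is a minimal client-side parser. It assumes a name:value syntax, which the list above does not specify, and it does not implement the special behavior of the all* filters; parse_query and its parameters are hypothetical names.

```python
import re

# Built-in filter names taken from the list above.
FIELD_FILTERS = {"intitle", "intext", "inurl", "site", "allintitle", "allinurl", "allintext"}

# Either name:value (value may be quoted) or a plain term or quoted phrase.
TOKEN = re.compile(r'(\w+):("[^"]*"|\S+)|("[^"]*"|\S+)')


def parse_query(query, extra_fields=()):
    """Split a query into (field, value) filters and remaining free-text terms.

    extra_fields lists index-specific field names that also act as filters.
    """
    known = FIELD_FILTERS | set(extra_fields)
    filters, terms = [], []
    for m in TOKEN.finditer(query):
        name, value, _plain = m.groups()
        if name and name.lower() in known:
            filters.append((name.lower(), value.strip('"')))
        else:
            # Unknown prefixes (e.g. a URL scheme) stay part of the free text.
            terms.append(m.group(0))
    return filters, terms


print(parse_query("intitle:crawler site:wikipedia.org web search"))
```

Passing the field names defined in an index through extra_fields mirrors the statement that every index field gets its own filter, for example parse_query('author:"Jane Doe" review', extra_fields=("author",)).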
SeekStorm search API supports the following boolean search operators: AND, NOT, PHRASE, Implicit PHRASE
SeekStorm is language-independent in crawling, indexing, searching, spelling correction, and query completion. SeekStorm also supports Chinese word segmentation to index Chinese content.