Search API features

Realtime search

With SeekStorm real-time search and indexing, a document becomes searchable the instant it is indexed, within the same millisecond.

Instant search

The SeekStorm search API can be used to implement instant search, where results are retrieved and displayed while you are still typing the search query. SeekStorm uses extremely fast spelling correction and query completion to accomplish this complex task with sub-millisecond latency.

Query correction

SeekStorm provides automatic spelling correction for queries. We use SymSpell, the Symmetric Delete spelling correction algorithm that we developed, to achieve spelling correction and fuzzy search that is 1 million times faster than other algorithms.
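The idea behind Symmetric Delete correction can be illustrated with a minimal sketch. It is not SeekStorm's or SymSpell's actual implementation: the class name `DeleteIndex` and its methods are hypothetical, and plain Levenshtein distance is used to verify candidates for brevity. The point it shows is that only deletions are generated, both when a dictionary word is added and when a query term is looked up, so candidates are found by hash lookups instead of by comparing the term against every dictionary word.

```python
from collections import defaultdict


def deletes(word: str, max_distance: int) -> set[str]:
    """Return the word plus every string reachable by deleting up to max_distance characters."""
    results = {word}
    frontier = {word}
    for _ in range(max_distance):
        next_frontier = set()
        for w in frontier:
            for i in range(len(w)):
                next_frontier.add(w[:i] + w[i + 1:])
        results |= next_frontier
        frontier = next_frontier
    return results


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, used only to verify and rank candidates."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


class DeleteIndex:
    """Hypothetical illustration of a delete-based spelling dictionary."""

    def __init__(self, max_distance: int = 2):
        self.max_distance = max_distance
        self.frequency: dict[str, int] = {}
        self.by_delete: dict[str, set[str]] = defaultdict(set)

    def add(self, word: str, count: int = 1) -> None:
        # Precompute the delete variants once per new dictionary word.
        if word not in self.frequency:
            for variant in deletes(word, self.max_distance):
                self.by_delete[variant].add(word)
        self.frequency[word] = self.frequency.get(word, 0) + count

    def lookup(self, term: str) -> list[str]:
        # Candidates are words sharing at least one delete variant with the term.
        candidates: set[str] = set()
        for variant in deletes(term, self.max_distance):
            candidates |= self.by_delete.get(variant, set())
        scored = []
        for candidate in candidates:
            distance = edit_distance(term, candidate)
            if distance <= self.max_distance:
                scored.append((distance, -self.frequency[candidate], candidate))
        # Closest first, then most frequent, then alphabetical.
        return [candidate for _, _, candidate in sorted(scored)]
```

Calling `lookup` with a misspelled term returns dictionary words ordered by edit distance and then by frequency, so the first entry can serve as the correction.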

Query completion

SeekStorm provides auto-completion for queries. For incomplete queries (e.g. while typing), SeekStorm provides a list of suggestions with the most likely matching queries. The completion dictionary is automatically compiled during indexing from all indexed fields and is updated in real time with every newly indexed document. A separate dictionary is created for each index, ensuring language independence, support of user- and domain-specific vocabulary, and privacy.
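A per-index completion dictionary that grows with every indexed document could look roughly like the sketch below. The class `CompletionDictionary` and its methods are hypothetical names for illustration, and ranking suggestions by term frequency is an assumption; the source does not say how suggestions are ranked or stored.

```python
from collections import Counter


class CompletionDictionary:
    """Hypothetical per-index dictionary built from the indexed documents themselves."""

    def __init__(self) -> None:
        self.counts: Counter[str] = Counter()

    def index_document(self, doc: dict) -> None:
        # Every string field contributes its terms, so the dictionary is
        # up to date as soon as the document has been indexed.
        for value in doc.values():
            if isinstance(value, str):
                self.counts.update(value.lower().split())

    def complete(self, prefix: str, limit: int = 5) -> list[str]:
        # Linear scan for clarity; a trie or sorted term list would scale better.
        prefix = prefix.lower()
        matches = [(-count, term) for term, count in self.counts.items()
                   if term.startswith(prefix)]
        return [term for _, term in sorted(matches)[:limit]]
```

Because each index owns its own dictionary instance, suggestions only ever contain vocabulary from that index's documents.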

Query rewriting

SeekStorm supports automatic query rewriting. If the query is corrected or completed and the instant parameter is set to true, the query is automatically rewritten and replaced with the best matching suggestion, and results are immediately returned for the new, corrected query.
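The decision described above can be sketched as a small function. The name `rewrite_query` and the assumption that suggestions arrive ordered best first are illustrative only; the `instant` argument mirrors the instant parameter mentioned in the text.

```python
def rewrite_query(query: str, suggestions: list[str], instant: bool) -> tuple[str, bool]:
    """Return the query to execute and whether it was rewritten.

    suggestions is assumed to be ordered with the best match first.
    """
    if instant and suggestions and suggestions[0] != query:
        return suggestions[0], True
    return query, False
```

With `instant` set to false, or when there is no suggestion, the original query is executed unchanged.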

Keytext extraction

When enabled, keytext extraction removes ads and navigation elements from the fetched web page before storing and indexing. This increases the relevance of the search results, indexing speed, and query speed, and reduces the index size. Keytext extraction is enabled with keytextOnly=true in Create crawljob via the REST API.

Title rewriting

Sometimes the title of a web page contains the same repetitive string on each page, as in "Crawler - Wikipedia". SeekStorm is able to detect and remove those strings. In this example "- Wikipedia" would be removed, while "Crawler" would remain and be rewritten as the title of this page in the index. Title rewriting improves both ranking and perceived relevance. Title rewriting is enabled with keytextOnly=true in Create crawljob via the REST API.
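One simple way to detect such a repeated string is to count the trailing segment of each title across the crawled pages and strip it when most pages share it. The sketch below assumes a " - " separator and an 80% threshold; both, along with the function names, are illustrative assumptions and not SeekStorm's actual detection method.

```python
from collections import Counter


def common_title_suffix(titles: list[str], separator: str = " - ",
                        min_share: float = 0.8) -> str:
    """Return the trailing segment (with separator) shared by most titles, or ''."""
    suffixes: Counter[str] = Counter()
    for title in titles:
        head, sep, tail = title.rpartition(separator)
        if sep and head:
            suffixes[sep + tail] += 1
    if not suffixes:
        return ""
    suffix, count = suffixes.most_common(1)[0]
    return suffix if count / len(titles) >= min_share else ""


def rewrite_title(title: str, suffix: str) -> str:
    """Strip the repeated suffix, but never reduce a title to an empty string."""
    if suffix and title.endswith(suffix) and len(title) > len(suffix):
        return title[: -len(suffix)]
    return title
```

Applied to a set of Wikipedia titles, `common_title_suffix` would find the shared site name and `rewrite_title` would leave only the page-specific part.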

Faceted search

Besides full-text search, SeekStorm is able to restrict the search to specific fields within the indexed JSON documents, e.g. to title, URL or domain, author, or product category. This allows the user of the search API to implement a faceted search.

Fused search results

SeekStorm enables the aggregation of information and data from different sources and different crawl jobs. It also allows adding auxiliary fields and values (e.g. genre='comedy', product-category='grocery', language='english') to the JSON document created for every crawled document. Those fields are then returned together with the search results.

Advanced operators

The SeekStorm search API supports field filters such as intitle, intext, inurl, site, allintitle, allinurl, and allintext. In addition, for each field defined in the index there is a field filter of the same name.
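How a client might separate field filters from plain terms can be sketched as follows. The `name:value` syntax is an assumption borrowed from common web search conventions, since the text does not spell out the exact query syntax, and `parse_query` is a hypothetical helper.

```python
# Filter names listed in the text; the colon syntax below is an assumption.
KNOWN_FILTERS = {"intitle", "intext", "inurl", "site",
                 "allintitle", "allinurl", "allintext"}


def parse_query(query: str, index_fields: tuple[str, ...] = ()) -> tuple[list[tuple[str, str]], list[str]]:
    """Split a query into (filter, value) pairs and remaining plain terms."""
    allowed = KNOWN_FILTERS | set(index_fields)
    filters: list[tuple[str, str]] = []
    terms: list[str] = []
    for token in query.split():
        name, sep, value = token.partition(":")
        if sep and value and name.lower() in allowed:
            filters.append((name.lower(), value))
        else:
            terms.append(token)  # not a recognized filter, keep as a search term
    return filters, terms
```

Passing the index's own field names as `index_fields` models the per-field filters mentioned above.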

Boolean search

The SeekStorm search API supports the following boolean search operators: AND, NOT, PHRASE, and implicit PHRASE.
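The semantics of AND, NOT, and PHRASE can be illustrated with a small matcher over a single document's text. This is a sketch with hypothetical function names; it does not cover implicit PHRASE, and it says nothing about the operator syntax the API expects.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def contains_phrase(tokens: list[str], phrase: list[str]) -> bool:
    """True if the phrase tokens occur consecutively and in order."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def matches(text: str, required=(), excluded=(), phrases=()) -> bool:
    """AND: every required term present. NOT: no excluded term present.
    PHRASE: every phrase present as consecutive words."""
    tokens = tokenize(text)
    vocabulary = set(tokens)
    return (
        all(term.lower() in vocabulary for term in required)
        and not any(term.lower() in vocabulary for term in excluded)
        and all(contains_phrase(tokens, tokenize(phrase)) for phrase in phrases)
    )
```

A real engine would evaluate the same conditions against an inverted index with term positions instead of scanning each document's text.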

Language independent

SeekStorm is language independent in crawling, indexing, searching, spelling correction, and query completion. SeekStorm also supports Chinese word segmentation to index Chinese content.