Overview

API Endpoints

SeekStorm is a Search as a Service and a Crawler as a Service: index and search your own documents, or build your own custom web and news search API. Documents and websites are indexed in real time, and the search API allows instant full-text search.

The SeekStorm REST API consists of three resources: indices, documents and crawljobs. We use resource-oriented URLs, accept JSON-encoded request bodies, return JSON-encoded responses, and utilize standard HTTP response codes and verbs. Regardless of the status code, each response contains a status object providing consistent information about the success or failure of the request. The REST API supports gzip, br (brotli), and deflate compression, both for requests and responses.
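
Since compressed request bodies are accepted, a client can gzip a JSON payload before sending it. The following is a minimal sketch that only prepares the body and headers without sending anything. It assumes the API honors the standard HTTP Content-Encoding header for requests, which the text above implies but does not spell out; the function name gzip_json_body is hypothetical.

```python
import gzip
import json


def gzip_json_body(payload):
    """Serialize payload to JSON and gzip it.

    Returns (body_bytes, headers). The header names are the standard HTTP
    ones; that the API expects exactly these is an assumption.
    """
    raw = json.dumps(payload).encode("utf-8")
    body = gzip.compress(raw)
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",  # tells the server the body is gzip-compressed
    }
    return body, headers


payload = {"title": "hello", "content": "world " * 100}
body, headers = gzip_json_body(payload)

# Round trip: decompressing the body yields the original JSON document.
restored = json.loads(gzip.decompress(body).decode("utf-8"))
```

Compression pays off mainly for large bodies, such as arrays of documents; for very small payloads the gzip overhead can outweigh the savings.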

Read our Quickstart guide to learn how to create a SeekStorm account and project, and how to use the dashboard and the REST API to index and query documents.

Create index

POST https://{server}.seekstorm.com/indices

Create a new index.
Create index either programmatically via REST API or manually in the dashboard.

Body-Parameters:

name string

A short name to describe the index's purpose. Purely informational.

fields object

Before indexing documents, the index structure has to be defined. fields is an object that describes the document fields to be indexed and/or stored: the keys are the field names and the values contain the field options.

The field options object has two keys:
1) store: true/false
Whether the field should be stored on disk or just used for indexing. If set to false, the field cannot be retrieved or included in the search results, but it is still indexed and searchable. Default=true

2) type: title, url, text, string, noindex

title: short text (only the first 1,024 words are indexed), highest relevance, bigram (fast search, high index space consumption, no frequent term capping); e.g. used for article title, book title, product name, person name. Only a single field of this type is allowed.

url: short text (only the first 1,024 words are indexed), high relevance; adds the field filter site:, which searches the domain only, as opposed to inurl:, which searches the whole URL including the path. Only a single field of this type is allowed.

text: long text (only the first 16,384 words are indexed), normal relevance. Only a single field of this type is allowed.

string: short text (only the first 1,024 words are indexed), normal relevance. Up to 13 fields of this type are allowed.

noindex: will not be indexed and not be searchable, but can be returned in results; Default=noindex. An unlimited number of fields of this type are allowed.

The number of stored fields per index is unlimited. The maximum number of indexed fields (all field types besides noindex) is 16. Of these, at most 1 field can be of type title, 1 field of type text, 1 field of type url, and 13 fields of type string. Additionally, the _docId field is always automatically generated and indexed. Only indexed fields are searchable, but all stored fields can be returned as results for a query.

createAutocompleteDictionary boolean default: true

Create an autocomplete dictionary from all terms of all indexed fields of all documents of this index (default=true). This works for all languages and for domain- or user-specific vocabulary.
If enabled, an individual autocompletion dictionary is created for this index. This dictionary is used for autocompletion in query documents when the parameter completion=true is set there.
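
The field limits above can be checked on the client before the index is created. The following is a minimal sketch, not an official client: it assembles a request body from the three body parameters described in this section and rejects field definitions that exceed the per-type limits. The helper names validate_fields and create_index_body and the sample field names are hypothetical.

```python
import json
from collections import Counter

# Per-type limits for indexed fields, as stated in the field type list.
MAX_PER_TYPE = {"title": 1, "url": 1, "text": 1, "string": 13}
VALID_TYPES = set(MAX_PER_TYPE) | {"noindex"}


def validate_fields(fields):
    """Return a list of problems; an empty list means the limits are respected."""
    problems = []
    counts = Counter()
    for name, options in fields.items():
        field_type = options.get("type", "noindex")  # noindex is the documented default
        if field_type not in VALID_TYPES:
            problems.append(f"{name}: unknown type {field_type!r}")
            continue
        counts[field_type] += 1
    for field_type, limit in MAX_PER_TYPE.items():
        if counts[field_type] > limit:
            problems.append(
                f"too many {field_type} fields: {counts[field_type]} > {limit}"
            )
    return problems


def create_index_body(name, fields, autocomplete=True):
    """Build the JSON body for the create index request."""
    problems = validate_fields(fields)
    if problems:
        raise ValueError("; ".join(problems))
    return json.dumps({
        "name": name,
        "fields": fields,
        "createAutocompleteDictionary": autocomplete,
    })


fields = {
    "title": {"type": "title", "store": True},
    "content": {"type": "text", "store": True},
    "url": {"type": "url", "store": True},
    "author": {"type": "string", "store": True},
    "thumbnail": {"type": "noindex", "store": True},
}
body = create_index_body("articles", fields)
```

The resulting string would be sent as the body of POST https://{server}.seekstorm.com/indices, together with whatever authentication your project requires.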

Get all indices

GET https://{server}.seekstorm.com/indices

Get all active indices belonging to the API key.
Get all indices either programmatically via REST API or manually in the dashboard.

This endpoint has no parameters

Get index

GET https://{server}.seekstorm.com/indices/{indexId}

Get all properties and stats of the targeted index.
Get index either programmatically via REST API or manually in the dashboard.

Path-Parameters:

indexId integer

ID of the targeted index

Delete index

DELETE https://{server}.seekstorm.com/indices/{indexId}

Delete the targeted index.
Delete index either programmatically via REST API or manually in the dashboard.

Path-Parameters:

indexId integer

ID of the targeted index

Create/Reset public key

POST https://{server}.seekstorm.com/indices/{indexId}/public-key

API keys are required to authorize users to carry out operations within their project. Main API keys are valid within your whole project, for all indices and all operations. Public API keys are valid only for a single index and have limited rights. They allow only two operations, "query document" and "get document(s)", and only for the single index they are assigned to. All other operations, especially those adding, changing, or deleting information, are available for main API keys only. Public API keys are distinguished by the prefix pub_ followed by a 32-char code.

While main API keys should always be kept confidential, public API keys can be used in end-user facing front-end code. A typical example is site search.

Every index can be assigned a single public API key. Per default, no public API key is assigned. Public API keys can be generated by any user possessing the main API key. With every request to create a public API key a new public API key is generated, while the previous one is invalidated.

All operations caused by the usage of a public API key are deducted from the contingent of your main API key (the chosen plan of your project).

Path-Parameters:

indexId integer

ID of the targeted index
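
A client can tell the two key kinds apart by the pub_ prefix described above. The following is a minimal sketch under stated assumptions: the allowed characters of the 32-char code are not specified here, so any non-whitespace character is accepted, and every key without the prefix is treated as a main key. The names is_public_key and key_allows are hypothetical, and the check does not cover which index a public key is assigned to.

```python
import re

# "pub_" followed by a 32-character code. The character set of the code is
# an assumption: any non-whitespace character is accepted.
PUBLIC_KEY_PATTERN = re.compile(r"pub_\S{32}")

# The two operations a public key may perform, per the description above.
PUBLIC_KEY_OPERATIONS = {"query document", "get document"}


def is_public_key(api_key):
    """True if the key has the public key shape: pub_ plus 32 characters."""
    return PUBLIC_KEY_PATTERN.fullmatch(api_key) is not None


def key_allows(api_key, operation):
    """Main keys allow every operation; public keys only the two read operations."""
    if is_public_key(api_key):
        return operation in PUBLIC_KEY_OPERATIONS
    return True
```

A check like this is only a convenience for failing early in client code; the server remains the authority on what a key may do.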

Index document(s)

POST https://{server}.seekstorm.com/indices/{indexId}/documents

Add a single document or an array of documents into the targeted index. Before indexing documents, the index and its fields have to be created with create index.

Path-Parameters:

indexId integer

ID of the targeted index

Body-Parameters:

oneOf

object

Document
A document is any valid JSON document, with any number of fields.
Every field is defined by a key-value pair, with the key as field name (string) and the value of any valid JSON type (string, number, array, boolean, object).

Returned documents contain the following additional auto-generated keys:
_docId - document ID, created at index time
_summary - KWIC (keyword in context) summary, created at query time. The document field from which _summary is derived is selected with the query parameter summary.
_indexDate - index date, created at index time: the number of milliseconds between the beginning of 1970 and the time of indexing (Unix time in milliseconds).

or

array

An array of documents
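
Since the body may be either one document or an array, larger collections can be sent in several requests. The following is a minimal sketch that builds the requests without sending them. The batch size of 2 is only for illustration (no limit is stated here), the server name myserver and index ID 42 are placeholders, and authentication headers are left to the caller because they are not described in this section.

```python
import json
import urllib.request


def batches(documents, size):
    """Yield successive lists of at most `size` documents."""
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


def index_request(server, index_id, documents, headers=None):
    """Build (but do not send) a POST request that indexes a document or a list."""
    url = f"https://{server}.seekstorm.com/indices/{index_id}/documents"
    body = json.dumps(documents).encode("utf-8")
    all_headers = {"Content-Type": "application/json"}
    all_headers.update(headers or {})  # caller supplies authentication headers
    return urllib.request.Request(url, data=body, headers=all_headers, method="POST")


docs = [{"title": f"Doc {i}", "content": "lorem ipsum"} for i in range(5)]
requests_to_send = [index_request("myserver", 42, batch) for batch in batches(docs, 2)]
```

Each prepared request can then be passed to urllib.request.urlopen once the required headers have been added.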

Query documents

GET https://{server}.seekstorm.com/indices/{indexId}/documents

Query documents in the targeted index.
Query documents either programmatically via REST API or manually in the dashboard.

Path-Parameters:

indexId integer

ID of the targeted index

Query-Parameters:

query string

Query to search documents by. Needs to be encoded/escaped before sending (C#: System.Uri.EscapeDataString or System.Web.HttpUtility.UrlEncode; JavaScript: encodeURIComponent). Use "" for a phrase and - for NOT. Additional field filters like site: and in{fieldname}: and allin{fieldname}: are allowed within the query string - see Create Index.
If the query is empty then all documents in the range [offset...offset+length] are returned, sorted by recency of indexDate.

length integer default: 10

Maximum number of results that should be returned

offset integer default: 1

Number of documents the result should be offset by - used for pagination

instant boolean default: false

whether to automatically rewrite the query with the best spelling correction and/or query completion suggestion and return the results for the modified query

correction boolean default: true

whether to return spelling correction suggestions

completion boolean default: false

whether to return query completion suggestions

result array default: {All fields will be returned}

defines fields which are returned in results

summary string

defines which field is used to create a KWIC (keyword in context) summary. If undefined no summary will be created.

suggestionslength integer

Maximum number of suggestions that should be returned
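
The encoding step for the query parameter can be sketched in Python as well, in addition to the C# and JavaScript functions named above. This is a minimal, hedged example: it uses urllib.parse.quote so that spaces become %20, similar to encodeURIComponent, and it serializes booleans as true/false. The function name query_url, the server name myserver, and the index ID 42 are placeholders.

```python
from urllib.parse import quote, urlencode


def query_url(server, index_id, query, **params):
    """Build the GET URL for a search; booleans are sent as "true"/"false"."""
    pairs = {"query": query}
    for key, value in params.items():
        pairs[key] = str(value).lower() if isinstance(value, bool) else value
    # quote_via=quote percent-encodes spaces as %20 instead of "+".
    query_string = urlencode(pairs, quote_via=quote)
    return f"https://{server}.seekstorm.com/indices/{index_id}/documents?{query_string}"


# Phrase search, a NOT term, and a site: field filter in one query.
url = query_url(
    "myserver", 42, '"red apple" -green site:example.com',
    length=20, offset=1, completion=True,
)
```

The result parameter is left out on purpose, because how an array is serialized in the query string is not described here.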

Update document by query

PATCH https://{server}.seekstorm.com/indices/{indexId}/documents

Update a single document in the targeted index. The document is specified with a query (like in query document). If the query matches no document or more than one document, an error is returned.

Path-Parameters:

indexId integer

ID of the targeted index

Query-Parameters:

query string

Query to find the desired document. Needs to be encoded/escaped before sending (C#: System.Uri.EscapeDataString or System.Web.HttpUtility.UrlEncode; JavaScript: encodeURIComponent). Use "" for a phrase and - for NOT. Additional field filters like "site:" and "in{fieldname}:" are allowed - see Create Index.

Delete documents by query

DELETE https://{server}.seekstorm.com/indices/{indexId}/documents

Delete one or more documents in the targeted index. The documents are specified with a query (like in query document). The maximum number of documents to be deleted can be specified with the parameter max (default=1).

Path-Parameters:

indexId integer

ID of the targeted index

Query-Parameters:

query string

Query to find the desired document(s). Needs to be encoded/escaped before sending (C#: System.Uri.EscapeDataString or System.Web.HttpUtility.UrlEncode; JavaScript: encodeURIComponent). Use "" for a phrase and - for NOT. Additional field filters like "site:" and "in{fieldname}:" are allowed - see Create Index.

max integer default: 1

Maximum number of documents matching the query that will be deleted
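
Because max defaults to 1, deleting several matches requires setting it explicitly. The following is a minimal sketch that builds the DELETE request without sending it. The function name delete_by_query_request and the example values are hypothetical, and authentication headers are left to the caller.

```python
import urllib.request
from urllib.parse import quote, urlencode


def delete_by_query_request(server, index_id, query, max_docs=1, headers=None):
    """Build (but do not send) a DELETE request for documents matching a query."""
    if max_docs < 1:
        raise ValueError("max_docs must be at least 1")
    params = urlencode({"query": query, "max": max_docs}, quote_via=quote)
    url = f"https://{server}.seekstorm.com/indices/{index_id}/documents?{params}"
    return urllib.request.Request(url, headers=headers or {}, method="DELETE")


# Delete at most 100 documents crawled from one domain.
request = delete_by_query_request("myserver", 42, "site:example.com", max_docs=100)
```

Keeping max as a required, visible argument in calling code makes it harder to delete more documents than intended.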

Get document

GET https://{server}.seekstorm.com/indices/{indexId}/documents/{documentId}

Get the targeted document

Path-Parameters:

indexId integer

ID of the targeted index

documentId string

ID of the targeted document. The ID is returned by "Index document". Alternatively, the document can be addressed by a query string.
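
Two small helpers illustrate working with a single document: building its URL and reading its _indexDate. This is a sketch under stated assumptions: documentId is percent-encoded because it is a string whose character set is not specified here, and _indexDate is taken to be Unix time in milliseconds as described under Index document(s). The function names are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def document_url(server, index_id, document_id):
    """Build the URL of one document; the ID is escaped before use in the path."""
    return (
        f"https://{server}.seekstorm.com/indices/{index_id}"
        f"/documents/{quote(str(document_id), safe='')}"
    )


def index_date(document):
    """Convert the _indexDate value (Unix time in milliseconds) to a UTC datetime."""
    return datetime.fromtimestamp(int(document["_indexDate"]) / 1000, tz=timezone.utc)


url = document_url("myserver", 42, "a/b 1")
when = index_date({"_docId": "a/b 1", "_indexDate": 1600000000000})
```

The same URL is used with GET, PATCH, and DELETE for the get, update, and delete document operations.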

Update document

PATCH https://{server}.seekstorm.com/indices/{indexId}/documents/{documentId}

Update the targeted document

Currently the old version of the document is only flagged as deleted and excluded from get document and query document, and a new version of the document is indexed in addition.
The old version of the document still remains in the index and occupies disk space.
For bulk updates it is recommended to delete the whole index and reindex the updated documents into a newly created index instead.

Path-Parameters:

indexId integer

ID of the targeted index

documentId string

ID of the targeted document. The ID is returned by "Index document". Alternatively, the document can be addressed by a query string.

Delete document

DELETE https://{server}.seekstorm.com/indices/{indexId}/documents/{documentId}

Delete the targeted document

Currently documents are only flagged as deleted and excluded from get document and query document.
The deleted document still remains in the index and takes up disk space.
For bulk deletions it is recommended to delete the whole index and reindex the remaining documents into a newly created index instead.

Path-Parameters:

indexId integer

ID of the targeted index

documentId string

ID of the targeted document. The ID is returned by "Index document". Alternatively, the document can be addressed by a query string.

Create Crawljob(s)

POST https://{server}.seekstorm.com/indices/{indexId}/crawljobs

Create crawljob(s) creates either a single crawljob or a list of crawljobs.

Before creating a crawljob you have to create an index with create index, where the crawljob will store the crawled documents. The following mandatory fields of specific field types are required for crawljob compatibility:

Field name    Field type    Storage
title         title         stored
content       text          stored
url           url           stored
domain        noindex       stored

Any auxiliaryFields defined in create crawljob must be defined in create index as well.
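
The compatibility requirements above can be verified against an index definition before a crawljob is created. The following is a minimal sketch: it takes a fields object of the shape used in create index and reports what is missing. Treating auxiliaryFields as a plain list of field names is an assumption, since their exact shape is not described here, and the function name crawl_field_problems is hypothetical.

```python
# Mandatory fields for crawljob compatibility: field name -> required type.
# All of them must be stored.
REQUIRED_CRAWL_FIELDS = {
    "title": "title",
    "content": "text",
    "url": "url",
    "domain": "noindex",
}


def crawl_field_problems(fields, auxiliary_fields=()):
    """Return a list of reasons why the index is not crawljob compatible."""
    problems = []
    for name, required_type in REQUIRED_CRAWL_FIELDS.items():
        options = fields.get(name)
        if options is None:
            problems.append(f"missing field {name!r}")
            continue
        if options.get("type", "noindex") != required_type:
            problems.append(f"field {name!r} must have type {required_type!r}")
        if not options.get("store", True):  # store defaults to true
            problems.append(f"field {name!r} must be stored")
    for name in auxiliary_fields:
        if name not in fields:
            problems.append(f"auxiliary field {name!r} is not defined in the index")
    return problems


good = {
    "title": {"type": "title", "store": True},
    "content": {"type": "text", "store": True},
    "url": {"type": "url", "store": True},
    "domain": {"type": "noindex", "store": True},
}
problems = crawl_field_problems(good, auxiliary_fields=["author"])
```

An empty list means the index satisfies the requirements listed above; here the list names the one auxiliary field that still has to be added to the index.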

Disclaimer: SeekStorm crawling is intended for consensual crawling only. You need to own or obtain all rights and licenses required to crawl the content, and you need to obey all restrictions and limits imposed by the website owner. SeekStorm does not allow, encourage, or provide tools (e.g. proxies) to break restrictions imposed by the website owner. Please see our terms of service for detailed information.

Path-Parameters:

indexId string

ID of the targeted index

Body-Parameters:

oneOf

object

Crawljob
Definition of crawljob parameters

or

array

A list of crawljobs

Get all Crawljobs

GET https://{server}.seekstorm.com/indices/{indexId}/crawljobs

Get all crawljobs retrieves information about all crawljobs of a specific index.

Path-Parameters:

indexId string

ID of the targeted index

Get crawljob

GET https://{server}.seekstorm.com/indices/{indexId}/crawljobs/{crawljobId}

Get crawljob retrieves information about a specific crawljob.

Path-Parameters:

indexId string

ID of the targeted index

crawljobId string

ID of the targeted crawljob

Delete crawljob

DELETE https://{server}.seekstorm.com/indices/{indexId}/crawljobs/{crawljobId}

Delete crawljob deletes a specific crawljob.

Path-Parameters:

indexId string

ID of the targeted index

crawljobId string

ID of the targeted crawljob
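
The four crawljob endpoints share one base path, which can be captured in a small routing helper. This is only a sketch: the operation names (create, get_all, get, delete) and the function name crawljob_route are hypothetical labels for the endpoints listed above, and the server name and IDs are placeholders.

```python
def crawljob_route(server, index_id, operation, crawljob_id=None):
    """Return (HTTP method, URL) for a crawljob operation."""
    base = f"https://{server}.seekstorm.com/indices/{index_id}/crawljobs"
    collection_ops = {"create": "POST", "get_all": "GET"}
    item_ops = {"get": "GET", "delete": "DELETE"}
    if operation in collection_ops:
        return collection_ops[operation], base
    if operation in item_ops:
        if crawljob_id is None:
            raise ValueError(f"{operation} needs a crawljob_id")
        return item_ops[operation], f"{base}/{crawljob_id}"
    raise ValueError(f"unknown operation {operation!r}")


method, url = crawljob_route("myserver", "42", "delete", crawljob_id="7")
```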