manticore

package
v0.0.0-...-1ea59d4
Published: Feb 10, 2024 License: Apache-2.0 Imports: 9 Imported by: 5

Documentation

Overview

Package manticore implements a Client to work with manticoresearch over its internal binary protocol. In many cases it may also be used to work with a sphinxsearch daemon. It implements the Client connector which may be used as

cl := NewClient()
res, err := cl.Query("hello")
...

The set of functions mostly imitates the API description of Manticoresearch for PHP, but with a few changes specific to Go as more effective and mainstream for that language (as, for example, error handling).

This SDK helps you to send different manticore API packets and parse the results. These are:

* Search (full-text and full-scan)

* Build snippets

* Build keywords

* Flush attributes

* Perform JSON queries (as via HTTP proto)

* Perform sphinxql queries (as via mysql proto)

* Set user variables

* Ping the server

* Look up server status

* Perform percolate queries

The percolate query is used to match documents against queries stored in an index. It is also called “search in reverse” as it works opposite to a regular search where documents are stored in an index and queries are issued against the index.

These queries are stored in a special kind of index, and they can be added, deleted and listed using INSERT/DELETE/SELECT statements, in a way similar to how it is done for a regular index.

Checking whether a document matches any of the predefined criteria (queries) is performed via the CallPQ function, or via the http /json/pq/<index>/_search endpoint. They return a list of matched queries, possibly with additional info such as the matching clause, filters, and tags.

Constants

const (
	AggrNone eAggrFunc = iota // None
	AggrAvg                   // Avg()
	AggrMin                   // Min()
	AggrMax                   // Max()
	AggrSum                   // Sum()
	AggrCat                   // Cat()
)
const (
	CollationLibcCi        eCollation = iota // Libc CI
	CollationLibcCs                          // Libc Cs
	CollationUtf8GeneralCi                   // Utf8 general CI
	CollationBinary                          // Binary

	CollationDefault = CollationLibcCi
)
const (
	FilterValues     eFilterType = iota // filter by integer values set
	FilterRange                         // filter by integer range
	FilterFloatrange                    // filter by float range
	FilterString                        // filter by string value
	FilterNull                          // filter by NULL
	FilterUservar                       // filter by @uservar
	FilterStringList                    // filter by string list
	FilterExpression                    // filter by expression
)
const (
	QueryOptDefault   eQueryoption = iota // Default
	QueryOptDisabled                      // Disabled
	QueryOptEnabled                       // Enabled
	QueryOptMorphNone                     // None morphology expansion
)
const (
	SphinxPort uint16 = 9312 // Default IANA port for Sphinx API
)

Variables

This section is empty.

Functions

func EscapeString

func EscapeString(from string) string

EscapeString escapes characters that are treated as special operators by the query language parser.

`from` is a string to escape. This function might seem redundant because it’s trivial to implement in any calling application. However, as the set of special characters might change over time, it makes sense to have an API call that is guaranteed to escape all such characters at all times. Returns escaped string.

Example
escaped := EscapeString("escaping-sample@query/string")
fmt.Println(escaped)
Output:

escaping\-sample\@query\/string

Types

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client represents a connection to the manticore daemon. It provides the set of public API functions.

func NewClient

func NewClient() Client

NewClient creates a default connector, which points to 'localhost:9312', has zero timeout and 8M maxalloc. Defaults may be changed later by invoking `SetServer()`, `SetMaxAlloc()` or `SetConnectTimeout()`.
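
For instance, a client for a non-default daemon might be configured as follows (a sketch; the host, port and buffer size here are placeholders):

cl := NewClient()
cl.SetServer("example.local", 9313)
cl.SetConnectTimeout(time.Second) // from package time
cl.SetMaxAlloc(16 * 1024 * 1024)  // allow the network buffer to grow up to 16M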

func (*Client) BuildExcerpts

func (cl *Client) BuildExcerpts(docs []string, index,
	words string, opts ...SnippetOptions) ([]string, error)

BuildExcerpts generates excerpts (snippets) of given documents for the given query. Returns nil on failure, an array of snippets on success. If necessary it will connect to the server before processing.

`docs` is a plain slice of strings that carry the documents’ contents.

`index` is an index name string. Different settings (such as charset, morphology, wordforms) from given index will be used.

`words` is a string that contains the keywords to highlight. They will be processed with respect to index settings. For instance, if English stemming is enabled in the index, 'shoes' will be highlighted even if the keyword is 'shoe'. Keywords can contain wildcards, which work similarly to the star-syntax available in queries.

`opts` is an optional struct SnippetOptions which may contain additional optional highlighting parameters; it may be created by calling “NewSnippetOptions()” and then tuned for your needs. If `opts` is omitted, defaults will be used.

The snippets extraction algorithm currently favors better passages (with closer phrase matches), and then passages with keywords not yet in the snippet. Generally, it will try to highlight the best match with the query, and it will also try to highlight all the query keywords, as made possible by the limits. In case the document does not match the query, the beginning of the document trimmed down according to the limits will be returned by default. You can also return an empty snippet instead in this case by setting the allow_empty option to true.

Returns nil and an error on failure, a plain array of strings with excerpts (snippets) on success.
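
A minimal usage sketch (the index name "lj" and the documents are placeholders; defaults are used since `opts` is omitted):

docs := []string{"this is my test text to be highlighted", "another document body"}
snippets, err := cl.BuildExcerpts(docs, "lj", "test text")
if err != nil {
	fmt.Println(err.Error())
} else {
	for _, snippet := range snippets {
		fmt.Println(snippet)
	}
}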

func (*Client) BuildKeywords

func (cl *Client) BuildKeywords(query, index string, hits bool) ([]Keyword, error)

BuildKeywords extracts keywords from the query using tokenizer settings for the given index, optionally with per-keyword occurrence statistics. Returns a slice of Keyword structures with per-keyword information. If necessary it will connect to the server before processing.

`query` is a query to extract keywords from.

`index` is a name of the index to get tokenizing settings and keyword occurrence statistics from.

`hits` is a boolean flag that indicates whether keyword occurrence statistics are required.

Example (WithHits)
cl := NewClient()

keywords, err := cl.BuildKeywords("this.is.my query", "lj", true)
if err != nil {
	fmt.Println(err.Error())
} else {
	fmt.Println(keywords)
}
Output:

[{Tok: 'this',	Norm: 'this',	Qpos: 1; docs/hits 1629922/3905279}
 {Tok: 'is',	Norm: 'is',	Qpos: 2; docs/hits 1901345/6052344}
 {Tok: 'my',	Norm: 'my',	Qpos: 3; docs/hits 1981048/7549917}
 {Tok: 'query',	Norm: 'query',	Qpos: 4; docs/hits 1235/1474}
]
Example (WithoutHits)
cl := NewClient()

keywords, err := cl.BuildKeywords("this.is.my query", "lj", false)
if err != nil {
	fmt.Println(err.Error())
} else {
	fmt.Println(keywords)
}
Output:

[{Tok: 'this',	Norm: 'this',	Qpos: 1; docs/hits 0/0}
 {Tok: 'is',	Norm: 'is',	Qpos: 2; docs/hits 0/0}
 {Tok: 'my',	Norm: 'my',	Qpos: 3; docs/hits 0/0}
 {Tok: 'query',	Norm: 'query',	Qpos: 4; docs/hits 0/0}
]

func (*Client) CallPQ

func (cl *Client) CallPQ(index string, values []string, opts SearchPqOptions) (*SearchPqResponse, error)

CallPQ performs a check whether a document matches any of the predefined criteria (queries). It returns a list of matched queries, possibly with additional info such as the matching clause, filters, and tags.

`index` determines the name of the PQ index you want to call into. It can be either local, or distributed and built from several PQ agents.

`values` is the list of documents. Each value is regarded as a separate document. Ordinal numbers of matched documents may then be returned in the resultset.

`opts` is the packed options. See the description of SearchPqOptions for details. In general, you need to make an instance of the options by calling NewSearchPqOptions(), set the desired flags and options, and then invoke CallPQ, providing the desired index, the set of documents and the options.

Since this function expects plain text documents, it will remove all JSON-related flags from the options, and also will not use IdAlias, if any is provided.

For example:

...
po := NewSearchPqOptions()
po.Flags = NeedDocs | Verbose | NeedQuery
resp, err := cl.CallPQ("pq", []string{"angry test", "filter test doc2"}, po)
...

func (*Client) CallPQBson

func (cl *Client) CallPQBson(index string, values []byte, opts SearchPqOptions) (*SearchPqResponse, error)

CallPQBson performs a check whether a document matches any of the predefined criteria (queries). It returns a list of matched queries, possibly with additional info such as the matching clause, filters, and tags.

It works very much like CallPQ, but expects documents in BSON form. With this function it makes sense to use flags like SkipBadJson, and the param IdAlias, which are not used for plain queries.

This function is not yet implemented in the SDK; it is a stub.

func (*Client) Close

func (cl *Client) Close() (bool, error)

Close closes a previously opened persistent connection. If no connection is active, it fires the error 'not connected', which is just informational and safe to ignore.

func (*Client) FlushAttributes

func (cl *Client) FlushAttributes() (int, error)

FlushAttributes forces searchd to flush pending attribute updates to disk, and blocks until completion. Returns a non-negative internal flush tag on success, or -1 and an error.

Attribute values updated using the UpdateAttributes() API call are kept in a memory-mapped file, which means the OS decides when the updates are actually written to disk. The FlushAttributes() call lets you enforce a flush, which writes all the changes to disk. The call will block until searchd finishes writing the data to disk, which might take seconds or even minutes depending on the total data size (.spa file size). All the currently updated indexes will be flushed.

The flush tag should be treated as an ever-growing magic number that does not mean anything. It's guaranteed to be non-negative. It is guaranteed to grow over time, though not necessarily in a sequential fashion; for instance, two calls that return 10 and then 1000 respectively are a valid situation. If two calls to FlushAttributes() return the same tag, it means that there were no actual attribute updates in between them, and therefore the current flushed state remained the same (for all indexes).

Usage example:

status, err := cl.FlushAttributes()
if err != nil {
	fmt.Println(err.Error())
}

func (*Client) GetLastWarning

func (cl *Client) GetLastWarning() string

GetLastWarning returns the last warning message, as a string, in human readable format. If there were no warnings during the previous API call, an empty string is returned.

You should call it to verify whether your request (such as Query()) was completed but with warnings. For instance, search query against a distributed index might complete successfully even if several remote agents timed out. In that case, a warning message would be produced.

The warning message is not reset by this call, so you can safely call it several times if needed. If you issued a multi-query by running RunQueries(), individual warnings will not be written to the client; instead check the Warning field in each returned result of the slice.
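
For example (a sketch; the index name "dist_index" and the warning text are placeholders and depend on your setup):

res, err := cl.Query("hello", "dist_index")
if err == nil {
	if warn := cl.GetLastWarning(); warn != "" {
		fmt.Println("completed with warning:", warn)
	}
	fmt.Println(res)
}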

func (*Client) IsConnectError

func (cl *Client) IsConnectError() bool

IsConnectError checks whether the last error was a network error on API side, or a remote error reported by searchd. Returns true if the last connection attempt to searchd failed on API side, false otherwise (if the error was remote, or there were no connection attempts at all).

func (*Client) Json

func (cl *Client) Json(endpoint, request string) (JsonAnswer, error)

Json performs a remote call of a JSON query, as if it were fired via an HTTP connection. It is intended for running updates and deletes, but sometimes works in other cases as well. The general rule: if the endpoint accepts data via POST, it will work via a Json call.

`endpoint` - is the endpoint, like "json/search".

`request` - the query. As in REST, expected to be in JSON, like `{"index":"lj","query":{"match":{"title":"luther"}}}`
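
For example, a search might be fired via the JSON endpoint like this (a sketch; the index "lj" is a placeholder):

ans, err := cl.Json("json/search", `{"index":"lj","query":{"match":{"title":"luther"}}}`)
if err != nil {
	fmt.Println(err.Error())
} else {
	fmt.Println(ans.Endpoint, ans.Answer)
}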

func (*Client) Open

func (cl *Client) Open() (bool, error)

Open opens a persistent connection to the server.
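
A sketch of wrapping several calls into one persistent connection:

cl := NewClient()
if _, err := cl.Open(); err == nil {
	res1, _ := cl.Query("hello")
	res2, _ := cl.Query("world")
	fmt.Println(res1, res2)
	cl.Close() // the 'not connected' error, if any, is informational
}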

func (*Client) Ping

func (cl *Client) Ping(cookie uint32) (uint32, error)

Ping just sends a uint32 cookie to the daemon and immediately receives it back. It may be used to measure average network response time, or to check whether the daemon is alive.
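
For example, to check liveness and measure one round trip (a sketch; the cookie value is arbitrary):

start := time.Now()
cookie, err := cl.Ping(0xdeadbeef)
if err == nil && cookie == 0xdeadbeef {
	fmt.Println("alive, round trip:", time.Since(start))
}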

func (*Client) Query

func (cl *Client) Query(query string, indexes ...string) (*QueryResult, error)

Query connects to the searchd server, runs the given simple search query string through the given indexes, and returns the search result.

This is a simplified function which accepts only one query string parameter and no options. Internally it will run with ranker 'RankProximityBm25', mode 'MatchAll', 'max_matches=1000' and 'limit=20'. It is good for a kind of demo run. If you want more fine-tuned options, consider using the `RunQuery()` and `RunQueries()` functions, which provide the full spectrum of possible tuning options.

`query` is a query string.

`indexes` is an index name (or names) string. The default value for `indexes` is "*", which means to query all local indexes. Characters allowed in index names include Latin letters (a-z), numbers (0-9) and underscore (_); everything else is considered a separator. Note that an index name should not start with the underscore character. Internally 'Query' just invokes 'RunQuery' with a default Search, where only the `query` and `index` fields are customized.

Therefore, all of the following sample calls are valid and will search the same two indexes:

cl.Query ( "test query", "main delta" )
cl.Query ( "test query", "main;delta" )
cl.Query ( "test query", "main, delta" )

func (*Client) RunQueries

func (cl *Client) RunQueries(queries []Search) ([]QueryResult, error)

RunQueries connects to searchd, runs a batch of queries, obtains and returns the result sets. Returns nil and an error message on general error (such as network I/O failure). Returns a slice of result sets on success.

`queries` is a slice of Search structures, each representing one query. You need to prepare this slice yourself before the call.

Each result set in the returned array is exactly the same as the result set returned from RunQuery.

Note that the batch query request itself almost always succeeds - unless there’s a network error, blocking index rotation in progress, or another general failure which prevents the whole request from being processed.

However, individual queries within the batch might very well fail. In this case their respective result sets will contain a non-empty `error` message, but no matches or query statistics. In the extreme case all queries within the batch could fail. There will still be no general error reported, because the API was able to successfully connect to searchd, submit the batch, and receive the results - but every result set will have a specific error message.
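
A sketch of a two-query batch (the index "lj" is a placeholder); note the per-result check of the Error field:

q1 := NewSearch("hello", "lj", "")
q2 := NewSearch("world", "lj", "")
results, err := cl.RunQueries([]Search{q1, q2})
if err != nil {
	fmt.Println(err.Error()) // general failure, nothing was executed
} else {
	for i, res := range results {
		if res.Error != "" {
			fmt.Printf("query %d failed: %s\n", i, res.Error)
		} else {
			fmt.Println(res)
		}
	}
}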

func (*Client) RunQuery

func (cl *Client) RunQuery(query Search) (*QueryResult, error)

RunQuery connects to searchd, runs a query, obtains and returns the result set. Returns nil and an error message on general error (such as network I/O failure). Returns a result set on success.

`query` is a single Search structure, representing the query. You need to prepare it yourself before the call.

The returned result set is exactly the same as one element of the slice returned from RunQueries.
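
For example, a fine-tuned query might be prepared like this (a sketch; the index "lj" and the attribute group_id are placeholders):

q := NewSearch("hello", "lj", "")
q.Limit = 50
q.AddFilter("group_id", []int64{1, 2, 3}, false)
res, err := cl.RunQuery(q)
if err != nil {
	fmt.Println(err.Error())
} else {
	fmt.Println(res)
}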

func (*Client) SetConnectTimeout

func (cl *Client) SetConnectTimeout(timeout time.Duration)

SetConnectTimeout sets the time allowed to spend connecting to the server before giving up.

Under some circumstances, the server can be delayed in responding, either due to network delays, or a query backlog. In either instance, this allows the client application programmer some degree of control over how their program interacts with searchd when it is not available, and can ensure that the client application does not fail due to exceeding the execution limits.

In the event of a failure to connect, an appropriate error code should be returned back to the application in order for application-level error handling to advise the user.

func (*Client) SetMaxAlloc

func (cl *Client) SetMaxAlloc(alloc int)

SetMaxAlloc limits the size of the client's network buffer. For sending queries and receiving results the client reuses a byte array, which can grow up to the required size. If the limit is reached, the array will be released and a new one will be created. Usually the API needs just a few kilobytes of memory, but sometimes the value may grow significantly - for example, if you fetch a big resultset with many attributes. Such a resultset will be properly received and processed; however, at the next query the backend array which was used for it will be released, and the occupied memory will be returned to the runtime.

`alloc` is the size, in bytes. A reasonable default value is 8M.

func (*Client) SetServer

func (cl *Client) SetServer(host string, port ...uint16)

SetServer sets searchd host name and TCP port. All subsequent requests will use the new host and port settings. Default host and port are ‘localhost’ and 9312, respectively.

`host` is either a URL (hostname or IP address), or a unix socket path (starting with '/').

`port` is optional; it makes sense only for tcp connections and is not used for unix sockets. Default is 9312.

Example (Tcpsocket)
cl := NewClient()
cl.SetServer("google.com", 9999)

fmt.Println(cl.dialmethod)
fmt.Println(cl.host)
fmt.Println(cl.port)
Output:

tcp
google.com
9999
Example (Unixsocket)
cl := NewClient()
cl.SetServer("/var/log")

fmt.Println(cl.dialmethod)
fmt.Println(cl.host)
Output:

unix
/var/log

func (*Client) Sphinxql

func (cl *Client) Sphinxql(cmd string) ([]Sqlresult, error)

Sphinxql sends a sphinxql request encapsulated into the API. The answer comes over the network in mysql native proto format, which is parsed by the SDK and represented as a usable structure (see the Sqlresult definition). The result also provides the Stringer interface, so it may be printed nicely without any postprocessing. A limitation of the command is that it is done in one session, as if you opened a connection via mysql, executed the command and disconnected. So, some information, like 'show meta' after 'call pq', will be lost in such a case (however, you can invoke CallPQ directly from the API), but other things like 'select...; show meta' in one line are still supported and work well.
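
For example (a sketch; "lj" stands for an existing index):

results, err := cl.Sphinxql("select * from lj limit 3; show meta")
if err != nil {
	fmt.Println(err.Error())
} else {
	for _, res := range results {
		fmt.Println(res) // Sqlresult is a Stringer, so it prints nicely as-is
	}
}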

func (*Client) Status

func (cl *Client) Status(global bool) (map[string]string, error)

Status queries searchd status, and returns an array of status variable name and value pairs.

`global` determines whether you take global status, or meta of the last query.

true: receive global daemon status
false: receive meta of the last executed query

Usage example:

status, err := cl.Status(false)
if err != nil {
	fmt.Println(err.Error())
} else {
	for key, line := range status {
		fmt.Printf("%v:\t%v\n", key, line)
	}
}

example output:

time:	0.000
keyword[0]:	query
docs[0]:	1235
hits[0]:	1474
total:	3
total_found:	3

func (*Client) UpdateAttributes

func (cl *Client) UpdateAttributes(index string, attrs []string, values map[DocID][]interface{},
	vtype EUpdateType, ignorenonexistent bool) (int, error)

UpdateAttributes instantly updates given attribute values in given documents. Returns number of actually updated documents (0 or more) on success, or -1 on failure with error.

`index` is a name of the index (or indexes) to be updated. It can be either a single index name or a list, like in Query(). Unlike Query(), wildcard is not allowed and all the indexes to update must be specified explicitly. The list of indexes can include distributed index names. Updates on distributed indexes will be pushed to all agents.

`attrs` is a slice with string attribute names, listing attributes that are updated.

`values` is a map with document IDs as keys and new attribute values, see below.

`vtype` type parameter, see EUpdateType description for values.

`ignorenonexistent` indicates that the update will silently ignore any warnings about trying to update a column which does not exist in the current index schema.

Usage example:

upd, err := cl.UpdateAttributes("test1", []string{"group_id"}, map[DocID][]interface{}{1:{456}}, UpdateInt, false)

Here we update document 1 in index test1, setting group_id to 456.

upd, err := cl.UpdateAttributes("products", []string{"price", "amount_in_stock"}, map[DocID][]interface{}{1001:{123,5}, 1002:{37,11}, 1003:{25,129}}, UpdateInt, false)

Here we update documents 1001, 1002 and 1003 in index products. For document 1001, the new price will be set to 123 and the new amount in stock to 5; for document 1002, the new price will be 37 and the new amount will be 11; etc.

func (*Client) Uvar

func (cl *Client) Uvar(name string, values []uint64) error

Uvar defines a remote user variable which later may be used for filtering. You can push literally megabytes of values and later just refer to the whole set by name.

`name` is the name of the variable, must start with @, like "@foo"

`values` is an array of the numbers you want to store in the variable. It is treated as a 'set', so dupes will be removed and the order will not be kept. Like: []uint64{7811237,7811235,7811235,7811233,7811236}
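
A sketch combining Uvar() with a later filter referring to the stored set (the index "lj" and the attribute group_id are placeholders):

if err := cl.Uvar("@ids", []uint64{7811237, 7811235, 7811233}); err == nil {
	q := NewSearch("hello", "lj", "")
	q.AddFilterUservar("group_id", "@ids", false)
	res, _ := cl.RunQuery(q)
	fmt.Println(res)
}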

type ColumnInfo

type ColumnInfo struct {
	Name string    // name of the attribute
	Type EAttrType // type of the attribute
}

ColumnInfo represents one attribute column in resultset schema

func (ColumnInfo) String

func (res ColumnInfo) String() string

Stringer interface for ColumnInfo type

type DocID

type DocID uint64

Document ID type

const DocidMax DocID = 0xffffffffffffffff

type EAttrType

type EAttrType uint32

EAttrType represents the known attribute types. See the comments on the constants for their concrete meaning. Values of this type are returned within the resultset schema; you don't need to use them yourself.

const (
	AttrNone      EAttrType = iota // not an attribute at all
	AttrInteger                    // unsigned 32-bit integer
	AttrTimestamp                  // this attr is a timestamp

	AttrBool   // this attr is a boolean bit field
	AttrFloat  // floating point number (IEEE 32-bit)
	AttrBigint // signed 64-bit integer
	AttrString // string (binary; in-memory)

	AttrPoly2d                            // vector of floats, 2D polygon (see POLY2D)
	AttrStringptr                         // string (binary, in-memory, stored as pointer to the zero-terminated string)
	AttrTokencount                        // field token count, 32-bit integer
	AttrJson                              // JSON subset; converted, packed, and stored as string
	AttrUint32set  EAttrType = 0x40000001 // MVA, set of unsigned 32-bit integers
	AttrInt64set   EAttrType = 0x40000002 // MVA, set of signed 64-bit integers
)
const (
	AttrMaparg      EAttrType = 1000 + iota
	AttrFactors               // packed search factors (binary, in-memory, pooled)
	AttrJsonField             // points to particular field in JSON column subset
	AttrFactorsJson           // packed search factors (binary, in-memory, pooled, provided to Client json encoded)
)

These types are runtime-only, used as intermediate types in the expression engine.

func (EAttrType) String

func (vl EAttrType) String() string

Stringer interface for EAttrType type

type EGroupBy

type EGroupBy uint32

EGroupBy selects search query grouping mode. It is used as a param when calling `SetGroupBy()` function.

GroupbyDay

GroupbyDay extracts year, month and day in YYYYMMDD format from timestamp.

GroupbyWeek

GroupbyWeek extracts the year and the year-day number of the first day of the week (counting from year start) in YYYYNNN format from a timestamp.

GroupbyMonth

GroupbyMonth extracts month in YYYYMM format from timestamp.

GroupbyYear

GroupbyYear extracts year in YYYY format from timestamp.

GroupbyAttr

GroupbyAttr uses attribute value itself for grouping.

GroupbyMultiple

GroupbyMultiple groups by multiple attribute values. Plain attributes and json fields are allowed; MVAs and full JSONs are not.

const (
	GroupbyDay   EGroupBy = iota // group by day
	GroupbyWeek                  // group by week
	GroupbyMonth                 // group by month
	GroupbyYear                  // group by year
	GroupbyAttr                  // group by attribute value

	GroupbyMultiple // group by on multiple attribute values
)

type EMatchMode

type EMatchMode uint32

EMatchMode selects the search query matching mode. So-called matching modes are a legacy feature that used to provide (very) limited query syntax and ranking support. Currently they are deprecated in favor of the full-text query language and the available built-in rankers. It is thus strongly recommended to use `MatchExtended` and proper query syntax rather than any other legacy mode. All those other modes are actually internally converted to extended syntax anyway. SphinxAPI still defaults to `MatchAll`, but that is for compatibility reasons only.

There are the following matching modes available:

MatchAll

MatchAll matches all query words.

MatchAny

MatchAny matches any of the query words.

MatchPhrase

MatchPhrase matches the query as a phrase, requiring a perfect match.

MatchBoolean

MatchBoolean matches the query as a boolean expression (see Boolean query syntax).

MatchExtended

MatchExtended2

MatchExtended, MatchExtended2 (alias) match the query as an expression in the Manticore internal query language (see Extended query syntax). This is the default matching mode if nothing else is specified.

MatchFullscan

MatchFullscan matches the query, forcibly using the “full scan” mode as below. NB: any query terms will be ignored, such that filters, filter-ranges and grouping will still be applied, but no text-matching will be done. MatchFullscan mode will be automatically activated in place of the specified matching mode when the query string is empty (i.e. its length is zero).

In full scan mode, all the indexed documents will be considered as matching. Such queries will still apply filters, sorting, and group by, but will not perform any full-text searching. This can be useful to unify full-text and non-full-text searching code, or to offload SQL server (there are cases when Manticore scans will perform better than analogous MySQL queries). An example of using the full scan mode might be to find posts in a forum. By selecting the forum’s user ID via SetFilter() but not actually providing any search text, Manticore will match every document (i.e. every post) where SetFilter() would match - in this case providing every post from that user. By default this will be ordered by relevancy, followed by Manticore document ID in ascending order (earliest first).

const (
	MatchAll       EMatchMode = iota // match all query words
	MatchAny                         // match any query word
	MatchPhrase                      // match this exact phrase
	MatchBoolean                     // match this boolean query
	MatchExtended                    // match this extended query
	MatchFullscan                    // match all document IDs w/o fulltext query, apply filters
	MatchExtended2                   // extended engine V2 (TEMPORARY, WILL BE REMOVED IN 0.9.8-RELEASE)

	MatchTotal
)

type ERankMode

type ERankMode uint32

ERankMode selects query relevance ranking mode. It is set via `SetRankingMode()` and `SetRankingExpression()` functions.

Manticore ships with a number of built-in rankers suited for different purposes. A number of them use two factors, phrase proximity (aka LCS) and BM25. Phrase proximity works on the keyword positions, while BM25 works on the keyword frequencies. Basically, the better the degree of the phrase match between the document body and the query, the higher the phrase proximity (it maxes out when the document contains the entire query as a verbatim quote). And BM25 is higher when the document contains more rare words. We'll save the detailed discussion for later.

Currently implemented rankers are:

RankProximityBm25

RankProximityBm25, the default ranking mode that uses and combines both phrase proximity and BM25 ranking.

RankBm25

RankBm25, statistical ranking mode which uses BM25 ranking only (similar to most other full-text engines). This mode is faster but may result in worse quality on queries which contain more than 1 keyword.

RankNone

RankNone, no ranking mode. This mode is obviously the fastest. A weight of 1 is assigned to all matches. This is sometimes called boolean searching that just matches the documents but does not rank them.

RankWordcount

RankWordcount, ranking by the keyword occurrences count. This ranker computes the per-field keyword occurrence counts, then multiplies them by field weights, and sums the resulting values.

RankProximity

RankProximity, returns raw phrase proximity value as a result. This mode is internally used to emulate MatchAll queries.

RankMatchany

RankMatchany, returns rank as it was computed in SPH_MATCH_ANY mode earlier, and is internally used to emulate MatchAny queries.

RankFieldmask

RankFieldmask, returns a 32-bit mask with N-th bit corresponding to N-th fulltext field, numbering from 0. The bit will only be set when the respective field has any keyword occurrences satisfying the query.

RankSph04

RankSph04 is generally based on the default SPH_RANK_PROXIMITY_BM25 ranker, but additionally boosts the matches when they occur in the very beginning or the very end of a text field. Thus, if a field equals the exact query, SPH04 should rank it higher than a field that contains the exact query but is not equal to it. (For instance, when the query is “Hyde Park”, a document entitled “Hyde Park” should be ranked higher than one entitled “Hyde Park, London” or “The Hyde Park Cafe”.)

RankExpr

RankExpr, lets you specify the ranking formula in run time. It exposes a number of internal text factors and lets you define how the final weight should be computed from those factors.

RankExport

RankExport, rank by BM25, but compute and export all user expression factors

RankPlugin

RankPlugin, rank by user-defined ranker provided as UDF function.

const (
	RankProximityBm25 ERankMode = iota // default mode, phrase proximity major factor and BM25 minor one (aka SPH03)
	RankBm25                           // statistical mode, BM25 ranking only (faster but worse quality)
	RankNone                           // no ranking, all matches get a weight of 1
	RankWordcount                      // simple word-count weighting, rank is a weighted sum of per-field keyword occurence counts
	RankProximity                      // phrase proximity (aka SPH01)
	RankMatchany                       // emulate old match-any weighting (aka SPH02)
	RankFieldmask                      // sets bits where there were matches
	RankSph04                          // codename SPH04, phrase proximity + bm25 + head/exact boost
	RankExpr                           // rank by user expression (eg. "sum(lcs*user_weight)*1000+bm25")
	RankExport                         // rank by BM25, but compute and export all user expression factors
	RankPlugin                         // user-defined ranker
	RankTotal
	RankDefault = RankProximityBm25
)

type ESearchdstatus

type ESearchdstatus uint16

ESearchdstatus describes the known return codes. The search command uses the same status codes, but 32-bit wide.

const (
	StatusOk      ESearchdstatus = iota // general success, command-specific reply follows
	StatusError                         // general failure, error message follows
	StatusRetry                         // temporary failure, error message follows, Client should retry later
	StatusWarning                       // general success, warning message and command-specific reply follow
)

func (ESearchdstatus) String

func (vl ESearchdstatus) String() string

Stringer interface for ESearchdstatus type

type ESortOrder

type ESortOrder uint32

ESortOrder selects search query sorting orders

There are the following result sorting modes available:

SortRelevance

SortRelevance sorts by relevance in descending order (best matches first).

SortAttrDesc

SortAttrDesc mode sorts by an attribute in descending order (bigger attribute values first).

SortAttrAsc

SortAttrAsc mode sorts by an attribute in ascending order (smaller attribute values first).

SortTimeSegments

SortTimeSegments sorts by time segments (last hour/day/week/month) in descending order, and then by relevance in descending order. Attribute values are split into so-called time segments, and then sorted by time segment first, and by relevance second.

The segments are calculated according to the current timestamp at the time when the search is performed, so the results would change over time. The segments are as follows:

last hour,

last day,

last week,

last month,

last 3 months,

everything else.

These segments are hardcoded, but it is trivial to change them if necessary.

This mode was added to support searching through blogs, news headlines, etc. When using time segments, recent records would be ranked higher because of segment, but within the same segment, more relevant records would be ranked higher - unlike sorting by just the timestamp attribute, which would not take relevance into account at all.

SortExtended

SortExtended sorts by SQL-like combination of columns in ASC/DESC order. You can specify an SQL-like sort expression with up to 5 attributes (including internal attributes), eg:

@relevance DESC, price ASC, @id DESC

Both internal attributes (that are computed by the engine on the fly) and user attributes that were configured for this index are allowed. Internal attribute names must start with magic @-symbol; user attribute names can be used as is. In the example above, @relevance and @id are internal attributes and price is user-specified.

Known internal attributes are:

@id (match ID)

@weight (match weight)

@rank (match weight)

@relevance (match weight)

@random (return results in random order)

@rank and @relevance are just additional aliases to @weight.

SortExpr

SortExpr sorts by an arithmetic expression.

`SortRelevance` ignores any additional parameters and always sorts matches by relevance rank. All other modes require an additional sorting clause, with the syntax depending on specific mode. SortAttrAsc, SortAttrDesc and SortTimeSegments modes require simply an attribute name. SortRelevance is equivalent to sorting by “@weight DESC, @id ASC” in extended sorting mode, SortAttrAsc is equivalent to “attribute ASC, @weight DESC, @id ASC”, and SortAttrDesc to “attribute DESC, @weight DESC, @id ASC” respectively.

const (
	SortRelevance    ESortOrder = iota // sort by document relevance desc, then by date
	SortAttrDesc                       // sort by document data desc, then by relevance desc
	SortAttrAsc                        // sort by document data asc, then by relevance desc
	SortTimeSegments                   // sort by time segments (hour/day/week/etc) desc, then by relevance desc
	SortExtended                       // sort by SQL-like expression (eg. "@relevance DESC, price ASC, @id DESC")
	SortExpr                           // sort by arithmetic expression in descending order (eg. "@id + max(@weight,1000)*boost + log(price)")
	SortTotal
)

type EUpdateType

type EUpdateType uint32

EUpdateType holds values for the `vtype` param of the UpdateAttributes() call, which determines the meaning of the `values` param of that function.

UpdateInt

This is the default value. The `values` hash holds document IDs as keys and plain arrays of new attribute values.

UpdateMva

Indicates that MVA attributes are being updated. In this case the `values` must be a hash with document IDs as keys and arrays of arrays of int values (the new MVA attribute values).

UpdateString

Indicates that string attributes are being updated. `values` must be a hash with document IDs as keys and arrays of strings as values.

UpdateJson

Works the same as `UpdateString`, but for JSON attribute updates.

const (
	UpdateInt EUpdateType = iota
	UpdateMva
	UpdateString
	UpdateJson
)

type ExcerptFlags

type ExcerptFlags uint32

ExcerptFlags is a bitmask for SnippetOptions.Flags. Different values have to be combined with the '+' or '|' operation from the following constants:

ExcerptFlagExactphrase

Whether to highlight exact query phrase matches only instead of individual keywords.

ExcerptFlagUseboundaries

Whether to additionally break passages by phrase boundary characters, as configured in index settings with phrase_boundary directive.

ExcerptFlagWeightorder

Whether to sort the extracted passages in order of relevance (decreasing weight), or in order of appearance in the document (increasing position).

ExcerptFlagQuery

Whether to handle 'words' as a query in extended syntax, or as a bag of words (default behavior). For instance, in query mode "(one two | three four)" will only highlight and include those occurrences one two or three four when the two words from each pair are adjacent to each other. In default mode, any single occurrence of one, two, three, or four would be highlighted.

ExcerptFlagForceAllWords

Ignores the snippet length limit until it includes all the keywords.

ExcerptFlagLoadFiles

Whether to handle 'docs' as data to extract snippets from (default behavior), or to treat it as file names, and load data from the specified files on the server side. Up to dist_threads worker threads per request will be created to parallelize the work when this flag is enabled. To parallelize snippets building between remote agents, configure the “dist_threads” param of searchd to a value greater than 1, and then invoke the snippets generation over a distributed index, which contains only one(!) local agent and several remotes. The “snippets_file_prefix” param of the remote daemons also comes into play: the final filename is calculated by concatenating the prefix with the given name.

ExcerptFlagAllowEmpty

Allows empty string to be returned as highlighting result when a snippet could not be generated (no keywords match, or no passages fit the limit). By default, the beginning of original text would be returned instead of an empty string.

ExcerptFlagEmitZones

Emits an HTML tag with an enclosing zone name before each passage.

ExcerptFlagFilesScattered

It works only with distributed snippets generation with remote agents. The source files for snippets could be distributed among different agents, and the main daemon will merge together all non-erroneous results. So, if one agent of the distributed index has ‘file1.txt’, another has ‘file2.txt’ and you call for the snippets with both these files, the daemon will merge results from the agents together, so you will get the snippets from both ‘file1.txt’ and ‘file2.txt’.

If load_files is also set, the request will return an error if any of the files is not available anywhere. Otherwise (if 'load_files' is not set) it will just return empty strings for all absent files. The master instance resets this flag when it distributes the snippets among agents. So, for agents the absence of a file is not a critical error, but for the master it is. If you want to be sure that all snippets are actually created, set both `load_files_scattered` and `load_files`. If the absence of some snippets caused by some agents is not critical for you, set just `load_files_scattered`, leaving `load_files` unset.

ExcerptFlagForcepassages

Whether to generate passages for the snippet even if the limits allow highlighting the whole text.


const (
	ExcerptFlagExactphrase ExcerptFlags

	ExcerptFlagUseboundaries
	ExcerptFlagWeightorder
	ExcerptFlagQuery
	ExcerptFlagForceAllWords
	ExcerptFlagLoadFiles
	ExcerptFlagAllowEmpty
	ExcerptFlagEmitZones
	ExcerptFlagFilesScattered
	ExcerptFlagForcepassages
)
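
For instance, the flags might be combined into SnippetOptions like this (a sketch; it assumes NewSnippetOptions() returns a pointer to a SnippetOptions with reasonable defaults, as described in BuildExcerpts):

opts := NewSnippetOptions()
opts.Flags |= ExcerptFlagWeightorder | ExcerptFlagAllowEmpty
snippets, err := cl.BuildExcerpts(docs, "lj", "test text", *opts)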

type JsonAnswer

type JsonAnswer struct {
	Endpoint string
	Answer   string
}

JsonAnswer encapsulates the answer to a Json command.

`Endpoint` - the endpoint to which the request was directed.

`Answer` - a string containing the answer. In contrast to a true HTTP connection, only string messages are given here, no numeric error codes.

type JsonOrStr

type JsonOrStr struct {
	IsJson bool   // true, if Val is JSON document; false if it is just a plain string
	Val    string // value (string or JSON document)
}

JsonOrStr is a typed string with an explicit flag telling whether it is 'just a string' or a json document. It may be used, say, to either escape plain strings when appending to a JSON structure, or to add the value 'as is', assuming it is already json. Such values come from the daemon as attribute values for PQ indexes.

func (JsonOrStr) String

func (vl JsonOrStr) String() string

Stringer interface for JsonOrStr type. Just appends the ' (json)' suffix if IsJson is true.

type Keyword

type Keyword struct {
	Tokenized  string // token from the query
	Normalized string // normalized token after all stemming/lemming
	Querypos   int    // position in the query
	Docs       int    // number of docs (from backend index)
	Hits       int    // number of hits (from backend index)
}

Keyword represents a keyword returned from BuildKeywords() call

func (Keyword) String

func (kw Keyword) String() string

Stringer interface for Keyword type

type Match

type Match struct {
	DocID  DocID         // key Document ID
	Weight int           // weight of the match
	Attrs  []interface{} // optional array of attributes, quantity and types depends from schema
}

Match represents one match (document) in result schema

func (Match) String

func (vl Match) String() (line string)

Stringer interface for Match type

type PqQuery

type PqQuery = struct {
	Flags   QueryDescFlags
	Query   string
	Tags    string
	Filters string
}

PqQuery describes one separate query info from resultset of CallPQ/CallPQBson

Flags determines the type of the Query, and also whether the other fields of the struct are filled or not.

Query, Tags, Filters - attributes saved with the query; all are optional.

type PqResponseFlags

type PqResponseFlags uint32

PqResponseFlags determines boolean flags coming in the SearchPqResponse result. These flags are unified into one bitfield used instead of a bunch of separate flags.

There are following bits available:

HasDocs

HasDocs indicates that each QueryDesc of the Queries result array has an array of documents in its Docs field. Otherwise this field is nil.

DumpQueries

DumpQueries indicates that each query contains additional info, like the query itself, tags and filters. Otherwise it has only the number - the QueryID - and nothing more.

HasDocids

HasDocids, coming paired with HasDocs, indicates that the array of documents in the Queries[].Docs field is an array of uint64 document ids, as provided in the documents of the original query. Otherwise it is an array of int32 order numbers, which may be shifted by the Shift param.

const (
	HasDocs PqResponseFlags = (1 << iota)
	DumpQueries
	HasDocids
)

type Pqflags

type Pqflags uint32

Pqflags determines boolean parameter flags for CallPQ options. These flags are unified into one bitfield used instead of a bunch of separate flags.

There are the following flags for CallPQ modes available:

NeedDocs

NeedDocs requires providing the numbers of matched documents. These are either order numbers from the set of provided documents, or DocIDs, if the documents are JSON and you pointed at the field which contains the DocID. (NOTE: json PQ calls are not yet implemented via the API; this will be done later.)

NeedQuery

NeedQuery requires returning not only the QueryID of the matched queries, but also other information about them. It may include the query itself, tags and filters.

Verbose

Verbose requires returning additional meta-information about matching and queries. It causes the daemon to fill the fields TmSetup, TmTotal, QueriesFailed, EarlyOutQueries and QueryDT of the SearchPqResponse structure.

SkipBadJson

SkipBadJson requires not failing on bad (ill-formed) jsons, but warning and continuing processing. This flag works only for bson queries and is useless for plain text (it may even cause a warning if provided there).

const (
	NeedDocs Pqflags = (1 << iota)
	NeedQuery

	Verbose
	SkipBadJson
)

type Qflags

type Qflags uint32

Qflags is a bitmask with query flags which is set by calling Search.SetQueryFlags(). Different values have to be combined with the '+' or '|' operation from the following constants:

QflagReverseScan

Controls the order in which a full-scan query processes the rows.

 0 direct scan
 1 reverse scan

QFlagSortKbuffer

Determines the sort method for resultset sorting. The result set is the same in both cases; picking one option or the other may just improve (or worsen!) performance.

0 priority queue
1 k-buffer (gives faster sorting for already pre-sorted data, e.g. index data sorted by id)

QflagMaxPredictedTime

Determines whether the query has a max_predicted_time option as an extra parameter

0 no predicted time provided
1 query contains predicted time metric

QflagSimplify

Switches on query boolean simplification to speed the query up. If set to 1, the daemon will simplify complex queries, or queries produced by different algos, to eliminate and optimize different parts of the query.

0 query will be calculated without transformations
1 query will be transformed and simplified.

The list of performed transformations is:

common NOT
 ((A !N) | (B !N)) -> ((A|B) !N)

common compound NOT
 ((A !(N C)) | (B !(N D))) -> (((A|B) !N) | (A !C) | (B !D)) // if cost(N) > cost(A) + cost(B)

common sub-term
 ((A (X | C)) | (B (X | D))) -> (((A|B) X) | (A C) | (B D)) // if cost(X) > cost(A) + cost(B)

common keywords
 (A | "A B"~N) -> A
 ("A B" | "A B C") -> "A B"
 ("A B"~N | "A B C"~N) -> ("A B"~N)

common PHRASE
 ("X A B" | "Y A B") -> (("X|Y") "A B")

common AND NOT factor
 ((A !X) | (A !Y) | (A !Z)) -> (A !(X Y Z))

common OR NOT
 ((A !(N | N1)) | (B !(N | N2))) -> (( (A !N1) | (B !N2) ) !N)

excess brackets
 ((A | B) | C) -> ( A | B | C )
 ((A B) C) -> ( A B C )

excess AND NOT
 ((A !N1) !N2) -> (A !(N1 | N2))

QflagPlainIdf

Determines how BM25 IDF will be calculated. Below, “N” is the collection size, and “n” is the number of matched documents.
 1 plain IDF = log(N/n), as per Sparck-Jones
 0 normalized IDF = log((N-n+1)/n), as per Robertson et al

QflagGlobalIdf

Determines whether to use global statistics (frequencies) from the global_idf file for IDF computations, rather than the local index statistics.

0 use local index statistics
1 use global_idf file (see https://docs.manticoresearch.com/latest/html/conf_options_reference/index_configuration_options.html#global-idf)

QflagNormalizedTfIdf

Determines whether to additionally divide the IDF value by the query word count, so that TF*IDF fits into the [0..1] range

0 don't divide IDF by query word count
1 divide IDF by query word count

Notes for QflagPlainIdf and QflagNormalizedTfIdf flags

The historically default IDF (Inverse Document Frequency) in Manticore is equivalent to QflagPlainIdf=0, QflagNormalizedTfIdf=1, and those normalizations may cause several undesired effects.

First, normalized idf (QflagPlainIdf=0) causes keyword penalization. For instance, if you search for [the | something] and [the] occurs in more than 50% of the documents, then documents with both keywords [the] and [something] will get less weight than documents with just one keyword [something]. Using QflagPlainIdf=1 avoids this. Plain IDF varies in [0, log(N)] range, and keywords are never penalized; while the normalized IDF varies in [-log(N), log(N)] range, and too frequent keywords are penalized.

Second, QflagNormalizedTfIdf=1 causes IDF drift over queries. Historically, we additionally divided IDF by query keyword count, so that the entire sum(tf*idf) over all keywords would still fit into [0,1] range. However, that means that queries [word1] and [word1 | nonmatchingword2] would assign different weights to the exactly same result set, because the IDFs for both “word1” and “nonmatchingword2” would be divided by 2. QflagNormalizedTfIdf=0 fixes that. Note that BM25, BM25A, BM25F() ranking factors will be scaled accordingly once you disable this normalization.

QflagLocalDf

Determines whether to automatically sum DFs over all the local parts of a distributed index, so that the IDF is consistent (and precise) over a locally sharded index.

0 don't sum local DFs
1 sum local DFs

QflagLowPriority

Determines the priority for executing the query

0 run the query in usual (normal) priority
1 run the query in idle priority

QflagFacet

Determines the slave role of the query in a multi-query facet

0 query is not a facet query, or is the main facet query
1 query is a dependent (slave) part of a facet multiquery

QflagFacetHead

Determines the head role of the query in a multi-query facet

0 query is not a facet query, or is slave of facet query
1 query is main (head) query of facet multiquery

QflagJsonQuery

Determines whether the query originated from the REST api and so must be parsed with JSON syntax

0 query is API query
1 query is JSON query
Example
fl := QflagJsonQuery
fmt.Println(fl)
Output:

2048
const (
	QflagReverseScan      Qflags = 1 << iota // direct or reverse full-scans
	QFlagSortKbuffer                         // pq or kbuffer for sorting
	QflagMaxPredictedTime                    // has or not max_predicted_time value
	QflagSimplify                            // apply or not boolean simplification
	QflagPlainIdf                            // plain or normalized idf
	QflagGlobalIdf                           // use or not global idf
	QflagNormalizedTfIdf                     // plain or normalized tf-idf
	QflagLocalDf                             // sum or not DFs over a locally sharded (distributed) index
	QflagLowPriority                         // run query in idle priority
	QflagFacet                               // query is part of facet batch query
	QflagFacetHead                           // query is main facet query
	QflagJsonQuery                           // query is JSON query (otherwise - API query)
)

type QueryDesc

type QueryDesc = struct {
	QueryID uint64
	Docs    interface{}
	Query   PqQuery
}

QueryDesc represents an element of the Queries array from SearchPqResponse and describes one returned stored query.

QueryID

QueryID is, namely, the Query ID. In the most minimal query it is the only returned field.

Docs

Docs is filled only if the flag HasDocs is set. It contains either an array of DocID (which are uint64), if the flag HasDocids is set, or an array of doc ordinals (which are int32), if the flag HasDocids is NOT set.

Query

Query is the query meta, in addition to QueryID. It is filled only if it was requested via the NeedQuery bit in the query options, and may contain the query string, tags and filters.

type QueryDescFlags

type QueryDescFlags uint32

QueryDescFlags is a bitfield describing the internals of the PqQuery struct. These flags are unified into one bitfield used instead of a bunch of separate flags.

There are following bits available:

QueryPresent

QueryPresent indicates that field Query is valid. Otherwise it is not touched ("" by default)

TagsPresent

TagsPresent indicates that field Tags is valid. Otherwise it is not touched ("" by default)

FiltersPresent

FiltersPresent indicates that field Filters is valid. Otherwise it is not touched ("" by default)

QueryIsQl

QueryIsQl indicates that the field Query (if present) is a query in sphinxql syntax. Otherwise it is a query in json syntax. A PQ index can store queries in both formats, and this flag in the resultset helps you distinguish them (both are text, but the syntax may be different).

const (
	QueryPresent QueryDescFlags = (1 << iota)
	TagsPresent
	FiltersPresent
	QueryIsQl
)

type QueryResult

type QueryResult struct {
	Error, Warning    string         // messages (if any)
	Status            ESearchdstatus // status code for current resultset
	Fields            []string       // fields of the schema
	Attrs             []ColumnInfo   // attributes of the schema
	Id64              bool           // if DocumentID is 64-bit (always true)
	Matches           []Match        // set of matches according to schema
	Total, TotalFound int            // num of matches and total num of matches found
	QueryTime         time.Duration  // query duration
	WordStats         []WordStat     // words statistic
}

QueryResult represents resultset from successful Query/RunQuery, or one of resultsets from RunQueries call.

func (QueryResult) String

func (res QueryResult) String() string

Stringer interface for QueryResult type

type Search

type Search struct {
	Offset       int32 // offset into resultset (0)
	Limit        int32 // count of resultset (20)
	MaxMatches   int32
	CutOff       int32
	RetryCount   int32
	MaxQueryTime time.Duration
	RetryDelay   time.Duration

	MatchMode EMatchMode // Matching mode

	FieldWeights map[string]int32 // bind per-field weights by name
	IndexWeights map[string]int32 // bind per-index weights by name
	IDMin        DocID            // set IDs range to match (from)
	IDMax        DocID            // set IDs range to match (to)

	Groupfunc     EGroupBy
	GroupBy       string
	GroupSort     string
	GroupDistinct string // count-distinct attribute for group-by queries
	SelectClause  string // select-list (attributes or expressions), SQL-like syntax

	Indexes string
	Comment string
	Query   string
	// contains filtered or unexported fields
}

Search represents one search query. Exported fields may be set directly. Unexported fields, which are bound by internal dependencies and constraints, are intended to be set via special methods.

func NewSearch

func NewSearch(query, index, comment string) Search

NewSearch constructs a default search which may then be customized. You may customize just 'Query' and maybe 'Indexes' of the default one, and it will work like a simple 'Query()' call.

func (*Search) AddFilter

func (q *Search) AddFilter(attribute string, values []int64, exclude bool)

AddFilter adds new integer values set filter.

On this call, additional new filter is added to the existing list of filters.

`attribute` must be a string with attribute name

`values` must be a plain slice containing integer values.

`exclude` controls whether to accept the matching documents (default mode, when `exclude` is false) or reject them.

Only those documents where `attribute` column value stored in the index matches any of the values from `values` slice will be matched (or rejected, if `exclude` is true).

func (*Search) AddFilterExpression

func (q *Search) AddFilterExpression(expression string, exclude bool)

AddFilterExpression adds new filter by expression.

On this call, additional new filter is added to the existing list of filters.

The only value, `expression`, must contain a filtering expression which returns bool.

The expression has SQL-like syntax and may refer to columns (usually json fields) by name, and may look like: 'j.price - 1 > 3 OR j.tag IS NOT null'. Documents are filtered by the expression evaluating to 'true', or (if `exclude` is set to true) to 'false'.

func (*Search) AddFilterFloatRange

func (q *Search) AddFilterFloatRange(attribute string, fmin, fmax float32, exclude bool)

AddFilterFloatRange adds new float range filter.

On this call, additional new filter is added to the existing list of filters.

`attribute` must be a string with attribute name.

`fmin` and `fmax` must be floats that define the acceptable attribute values range (including the boundaries).

`exclude` controls whether to accept the matching documents (default mode, when `exclude` is false) or reject them.

Only those documents where `attribute` column value stored in the index is between `fmin` and `fmax` (including values that are exactly equal to `fmin` or `fmax`) will be matched (or rejected, if `exclude` is true).

func (*Search) AddFilterNull

func (q *Search) AddFilterNull(attribute string, isnull bool)

AddFilterNull adds new IsNull filter.

On this call, additional new filter is added to the existing list of filters. Documents where `attribute` is null will match (if `isnull` is true), or not match (if `isnull` is false).

func (*Search) AddFilterRange

func (q *Search) AddFilterRange(attribute string, imin, imax int64, exclude bool)

AddFilterRange adds new integer range filter.

On this call, additional new filter is added to the existing list of filters.

`attribute` must be a string with attribute name.

`imin` and `imax` must be integers that define the acceptable attribute values range (including the boundaries).

`exclude` controls whether to accept the matching documents (default mode, when `exclude` is false) or reject them.

Only those documents where `attribute` column value stored in the index is between `imin` and `imax` (including values that are exactly equal to `imin` or `imax`) will be matched (or rejected, if `exclude` is true).

func (*Search) AddFilterString

func (q *Search) AddFilterString(attribute string, value string, exclude bool)

AddFilterString adds a new string value filter.

On this call, an additional filter is added to the existing list of filters.

`attribute` must be a string with the attribute name.

`value` must be a string.

`exclude` must be a boolean value; it controls whether to accept the matching documents (default mode, when `exclude` is false) or reject them.

Only those documents where the `attribute` column value stored in the index is equal to the string value from `value` will be matched (or rejected, if `exclude` is true).
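For example (attribute name and value are illustrative):

q.AddFilterString("color", "red", false)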

func (*Search) AddFilterStringList

func (q *Search) AddFilterStringList(attribute string, values []string, exclude bool)

AddFilterStringList adds a new string list filter.

On this call, an additional filter is added to the existing list of filters.

`attribute` must be a string with the attribute name.

`values` must be a slice of strings.

`exclude` must be a boolean value; it controls whether to accept the matching documents (default mode, when `exclude` is false) or reject them.

Only those documents where the `attribute` column value stored in the index is equal to one of the string values from `values` will be matched (or rejected, if `exclude` is true).
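For example (attribute name and values are illustrative):

q.AddFilterStringList("color", []string{"red", "green", "blue"}, false)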

func (*Search) AddFilterUservar

func (q *Search) AddFilterUservar(attribute string, uservar string, exclude bool)

AddFilterUservar adds a new uservar filter.

On this call, an additional filter is added to the existing list of filters.

`attribute` must be a string with the attribute name.

`uservar` must be the name of a user variable containing the list of filtering values, starting with @, as in "@var".

`exclude` must be a boolean value; it controls whether to accept the matching documents (default mode, when `exclude` is false) or reject them.

Only those documents where the `attribute` column value stored in the index is equal to one of the values stored in the `uservar` variable on the daemon side will be matched (or rejected, if `exclude` is true). Such a filter is intended to let you save a huge list of values once on the server and then refer to it by name. Saving the list can be done with a separate 'SetUservar()' call.
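For example, assuming a list of values has already been stored on the daemon side under the name @groups (e.g. by a SetUservar() call):

q.AddFilterUservar("group_id", "@groups", false)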

func (*Search) ChangeQueryFlags

func (q *Search) ChangeQueryFlags(flags Qflags, set bool)

ChangeQueryFlags changes (sets or resets) query flags by the mask `flags`.

func (*Search) ResetFilters

func (q *Search) ResetFilters()

ResetFilters clears all currently set search filters.

This call is normally only required when using multi-queries. You might want to set different filters for different queries in the batch. To do that, you may either create another Search request and fill it from scratch, or copy the existing (last) one and modify it. To change all the filters in the copy, call ResetFilters() and add new filters using the respective calls, as in the sketch below.
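A sketch of that pattern (the batch-running client call, RunQueries, is an assumption here; refer to the Client documentation):

q1 := NewSearch("hello", "myindex", "")
q1.AddFilter("group_id", []int64{1, 2}, false)
q2 := q1          // copy the previous query of the batch
q2.ResetFilters() // drop the inherited filters
q2.AddFilter("group_id", []int64{3}, false)
// both queries can now be sent in one batch, e.g. via RunQueries (assumed name)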

func (*Search) ResetGroupBy

func (q *Search) ResetGroupBy()

ResetGroupBy clears all current group-by settings, and disables group-by.

This call is normally only required when using multi-queries. You might want to set different group-by settings in the batch. To do that, you may either create another Search request and fill it from scratch, or copy the existing (last) one and modify it. In the latter case you can change individual group-by settings using SetGroupBy() and SetGroupDistinct() calls, but you cannot disable group-by using those calls. ResetGroupBy() fully resets the previous group-by settings and disables group-by mode in the current Search query.

func (*Search) ResetOuterSelect

func (q *Search) ResetOuterSelect()

ResetOuterSelect clears all outer select settings.

This call is normally only required when using multi-queries. You might want to set different outer select settings in the batch. To do that, you may either create another Search request and fill it from scratch, or copy the existing (last) one and modify it. In the latter case you can change individual outer select settings using SetOuterSelect() calls, but you cannot disable the outer statement by those calls. ResetOuterSelect() fully resets the previous outer select settings.

func (*Search) ResetQueryFlags

func (q *Search) ResetQueryFlags()

ResetQueryFlags resets the query flags of the Select query to the default value, and also resets the value set by a SetMaxPredictedTime() call.

This call is normally only required when using multi-queries. You might want to set different flags for the Select queries in the batch. To do that, you may either create another Search request and fill it from scratch, or copy the existing (last) one and modify it. In the latter case you can change individual or many flags using SetQueryFlags() and ChangeQueryFlags() calls. This call simply sets all the flags at once to the default value `QflagNormalizedTfIdf`, and also sets the predicted time to 0.

func (*Search) SetGeoAnchor

func (q *Search) SetGeoAnchor(attrlat, attrlong string, lat, long float32)

SetGeoAnchor sets the anchor point for geosphere distance (geodistance) calculations, and enables them.

`attrlat` and `attrlong` contain the names of latitude and longitude attributes, respectively.

`lat` and `long` specify anchor point latitude and longitude, in radians.

Once an anchor point is set, you can use the magic @geodist attribute name in your filters and/or sorting expressions. Manticore will compute the geosphere distance between the given anchor point and the point specified by the latitude and longitude attributes from each full-text match, and attach this value to the resulting match. The latitude and longitude values, both in SetGeoAnchor and in the index attribute data, are expected to be in radians. The result will be returned in meters, so a geodistance value of 1000.0 means 1 km. 1 mile is approximately 1609.344 meters.
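For example, to keep only matches within 10 km of a point given in radians (the attribute names are illustrative):

q := NewSearch("pizza", "places", "")
q.SetGeoAnchor("lat", "lon", 0.7105, -1.2917) // anchor point, in radians
q.AddFilterFloatRange("@geodist", 0, 10000, false) // distance in meters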

func (*Search) SetGroupBy

func (q *Search) SetGroupBy(attribute string, gfunc EGroupBy, groupsort ...string)

SetGroupBy sets the grouping attribute, function, and group sorting mode, and enables grouping.

`attribute` is a string that contains the group-by attribute name.

`gfunc` is a constant that chooses a function applied to the attribute value in order to compute the group-by key.

`groupsort` is an optional clause that controls how the groups will be sorted.

The grouping feature is very similar in nature to the GROUP BY clause in SQL. Results produced by this function call are going to be the same as produced by the following pseudo-code:

SELECT ... GROUP BY func(attribute) ORDER BY groupsort

Note that it's `groupsort` that affects the order of matches in the final result set. The sorting mode (see `SetSortMode()`) affects the ordering of matches within a group, i.e. which match will be selected as the best one from the group. So you can, for instance, order the groups by match count and select the most relevant match within each group at the same time.

Grouping on string attributes is supported, with respect to the current collation.
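For example, to group by a plain attribute value and order the groups by the number of matches in each (GroupbyAttr as the plain-attribute EGroupBy constant and the @count magic attribute are assumptions here):

q.SetGroupBy("group_id", GroupbyAttr, "@count desc")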

func (*Search) SetMaxPredictedTime

func (q *Search) SetMaxPredictedTime(predtime time.Duration)

SetMaxPredictedTime sets the max predicted time and the corresponding query flag.
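For example (time.Duration comes from the standard time package):

q.SetMaxPredictedTime(50 * time.Millisecond)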

func (*Search) SetOuterSelect

func (q *Search) SetOuterSelect(orderby string, offset, limit int32)

SetOuterSelect determines the outer select conditions for the Search query.

`orderby` specifies a clause with SQL-like syntax, as "foo ASC, bar DESC, baz", where the item names (`foo`, `bar`, `baz` in the example) are the names of columns originating from the internal query.

`offset` and `limit` have the same meaning as the fields Offset and Limit in the clause, but applied to the outer select.

Outer select currently has 2 usage cases:

1. We have a query with 2 ranking UDFs, one very fast and the other one slow, and we perform a full-text search with a big match result set. Without outer select the query would look like:

q := NewSearch("some common query terms", "index", "")
q.SelectClause = "id, slow_rank() as slow, fast_rank() as fast"
q.SetSortMode( SortExtended, "fast DESC, slow DESC" )
// q.Limit=20, q.MaxMatches=1000 - are default, so we don't set them explicitly

With subselects the query can be rewritten as:

q := NewSearch("some common query terms", "index", "")
q.SelectClause = "id, slow_rank() as slow, fast_rank() as fast"
q.SetSortMode( SortExtended, "fast DESC" )
q.Limit=100
q.SetOuterSelect("slow desc", 0, 20)

In the initial query the slow_rank() UDF is computed for the entire match result set. With subselects, only fast_rank() is computed for the entire match result set, while slow_rank() is only computed for a limited set.

2. The second case comes in handy for large result sets coming from a distributed index.

For this query:

q := NewSearch("some conditions", "my_dist_index", "")
q.Limit = 50000

If we have 20 nodes, each node can send back to the master up to 50K records, resulting in 20 x 50K = 1M records; however, as the master sends back only 50K (out of 1M), it might be good enough for us if the nodes send only their top 10K records. With outer select we can rewrite the query as:

q := NewSearch("some conditions", "my_dist_index", "")
q.Limit = 10000
q.SetOuterSelect("some_attr", 0, 50000)

In this case, the nodes receive only the inner query and execute it. This means the master will receive only 20x10K=200K records. The master will take all the received records, reorder them by the OUTER clause, and return the best 50K records. The outer select helps reduce the traffic between the master and the nodes, and also reduces the master's computation time (as it processes only 200K records instead of 1M).

func (*Search) SetQueryFlags

func (q *Search) SetQueryFlags(flags Qflags)

SetQueryFlags sets query flags. New flags are OR-ed into the existing value; previously set flags are not affected. Note that the default flags have the QflagNormalizedTfIdf bit set, so if you need to reset it, you have to explicitly invoke ChangeQueryFlags(QflagNormalizedTfIdf, false).
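For example, to clear the default QflagNormalizedTfIdf bit while leaving all other flags intact:

q.ChangeQueryFlags(QflagNormalizedTfIdf, false)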

func (*Search) SetRankingExpression

func (q *Search) SetRankingExpression(rankexpr string)

SetRankingExpression assigns a ranking expression, and also sets the ranking mode to RankExpr.

`rankexpr` provides the ranking formula, for example "sum(lcs*user_weight)*1000+bm25" - this is the same as RankProximityBm25, but written explicitly. Since using a ranking expression assumes the RankExpr ranker, it is also set by this function.
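For example, the explicit equivalent of RankProximityBm25:

q.SetRankingExpression("sum(lcs*user_weight)*1000+bm25")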

func (*Search) SetRankingMode

func (q *Search) SetRankingMode(ranker ERankMode)

SetRankingMode assigns the ranking mode and also adjusts MatchMode to MatchExtended2 (since otherwise rankers are useless).

func (*Search) SetSortMode

func (q *Search) SetSortMode(sort ESortOrder, sortby ...string)

SetSortMode sets the matches sorting mode.

`sort` determines the sorting mode.

`sortby` determines the attribute or expression used for sorting.

If `sortby` in the Search query is empty (it does not have to be set in this very call; it might have been set earlier!), then `sort` is forcibly set to SortRelevance.
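For example, sorting by an SQL-like clause (the @weight magic attribute, standing for the relevance weight, is an assumption here):

q.SetSortMode(SortExtended, "price DESC, @weight DESC")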

func (*Search) SetTokenFilter

func (q *Search) SetTokenFilter(library, name string, opts string)

SetTokenFilter sets up a UDF token filter.

`library` is the name of the plugin library, as "mylib.so".

`name` is the name of the token filtering function in the library, as "email_process".

`opts` is a string of parameters passed to the UDF filter, like "field=email;split=.io". The format of the options is determined by the UDF plugin.
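Putting the sample values above together:

q.SetTokenFilter("mylib.so", "email_process", "field=email;split=.io")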

type SearchPqOptions

type SearchPqOptions = struct {
	Flags   Pqflags
	IdAlias string
	Shift   int32
}

SearchPqOptions encapsulates the parameters to be passed to the CallPQ function.

Flags

Flags is an instance of Pqflags; the different bits are described there.

IdAlias

IdAlias determines the name of the field in the supplied JSON documents which contains the DocumentID. If the NeedDocs flag is set, this value will be used in the resultset to identify documents instead of just their plain ordinal numbers.

Shift

Shift is used when the daemon returns the ordinal numbers of the documents (i.e. when the NeedDocs flag is set, but no IdAlias is provided, or when the documents are just plain texts and can't contain such a field at all). Shift is then added to the number of every doc, moving the whole range. Say, if you provide 2 documents, they may be returned as numbers 1 and 2. But if you also give Shift=100, they will become 101 and 102. This may help if you distribute a big docset over several instances and want to keep the numbering. The daemon itself uses this value for the same purpose.

func NewSearchPqOptions

func NewSearchPqOptions() SearchPqOptions

NewSearchPqOptions creates an empty instance of search options. Prefer to use this function when you need options, since it may set the necessary defaults.
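For example (the field values are illustrative; the filled options are then passed to CallPQ):

opts := NewSearchPqOptions()
opts.IdAlias = "id" // documents carry their DocumentID in the "id" field
opts.Shift = 100    // returned ordinal doc numbers start from 101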

type SearchPqResponse

type SearchPqResponse = struct {
	Flags           PqResponseFlags
	TmTotal         time.Duration // total time spent matching the document(s)
	TmSetup         time.Duration // time spent on the initial setup of the matching process - parsing docs, setting options, etc.
	QueriesMatched  int           // how many stored queries match the document(s)
	QueriesFailed   int           // number of failed queries
	DocsMatched     int           // how many times the documents match the queries stored in the index
	TotalQueries    int           // how many queries are stored in the index at all
	OnlyTerms       int           // how many queries in the index have terms; the rest of the queries have extended query syntax
	EarlyOutQueries int           // number of queries that were not run through the full routine, but were quickly matched and rejected by filters or other conditions
	QueryDT         []int         // detailed times per each query
	Warnings        string
	Queries         []QueryDesc // the queries themselves; see the QueryDesc structure for details
}

SearchPqResponse represents the whole response to CallPQ and CallPQBson calls.

type SnippetOptions

type SnippetOptions struct {
	BeforeMatch,
	AfterMatch,
	ChunkSeparator,
	HtmlStripMode,
	PassageBoundary string
	Limit,
	LimitPassages,
	LimitWords,
	Around,
	StartPassageId int32
	Flags ExcerptFlags
}

SnippetOptions is used to tune snippet generation. All fields are exported and have the meanings described below.

BeforeMatch

A string to insert before a keyword match. A '%PASSAGE_ID%' macro can be used in this string. The first match of the macro is replaced with an incrementing passage number within the current snippet. Numbering starts at 1 by default but can be overridden with the StartPassageId option. In a multi-document call, '%PASSAGE_ID%' restarts at every given document.

AfterMatch

A string to insert after a keyword match. The %PASSAGE_ID% macro can be used in this string.

ChunkSeparator

A string to insert between snippet chunks (passages).

HtmlStripMode

HTML stripping mode setting. Allowed string values are `none` and `strip`, which forcibly skip or apply stripping regardless of index settings; `index`, which means that the index settings will be used; and `retain`, which retains HTML markup and protects it from highlighting. The retain mode can only be used when highlighting full documents and thus requires that no snippet size limits are set.

PassageBoundary

Ensures that passages do not cross a sentence, paragraph, or zone boundary (when used with an index that has the respective indexing settings enabled). Allowed values are `sentence`, `paragraph`, and `zone`.

Limit

Maximum snippet size, in runes (codepoints).

LimitPassages

Limits the maximum number of passages that can be included into the snippet.

LimitWords

Limits the maximum number of words that can be included into the snippet. Note the limit applies to any words, and not just the matched keywords to highlight. For example, if we are highlighting Mary and the passage Mary had a little lamb is selected, then it contributes 5 words to this limit, not just 1.

Around

How many words to pick around each matching keywords block.

StartPassageId

Specifies the starting value of the `%PASSAGE_ID%` macro (which gets detected and expanded in the BeforeMatch and AfterMatch strings).

Flags

Bitmask. The individual bits are described in the `type ExcerptFlags` constants.

func NewSnippetOptions

func NewSnippetOptions() *SnippetOptions

NewSnippetOptions creates a default SnippetOptions with the following defaults:

BeforeMatch: "<b>"
AfterMatch: "</b>"
ChunkSeparator: " ... "
HtmlStripMode: "index"
PassageBoundary: "none"
Limit: 256
Around: 5
StartPassageId: 1
//  Rest of the fields: 0 or "" (depending on the type)
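For example, customizing a few fields before building snippets (the snippet-building client call itself, e.g. BuildExcerpts, is assumed here):

opts := NewSnippetOptions()
opts.BeforeMatch = "<em>"
opts.AfterMatch = "</em>"
opts.LimitWords = 20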

type SqlMsg

type SqlMsg string

SqlMsg represents an answer from the mysql proto (error code and message).

func (SqlMsg) String

func (r SqlMsg) String() string

Stringer interface for the SqlMsg type. Provides a message like the one in the mysql CLI.

type SqlResultset

type SqlResultset [][]interface{}

SqlResultset is returned from Sphinxql calls and contains one or more mysql resultsets.

type SqlSchema

type SqlSchema []sqlfield

SqlSchema is the schema of a resultset from a mysql call.

type Sqlresult

type Sqlresult struct {
	Msg          SqlMsg
	Warnings     uint16
	ErrorCode    uint16
	RowsAffected int
	Schema       SqlSchema
	Rows         SqlResultset
}

Sqlresult is a mysql resultset with a table of messages.

func (Sqlresult) String

func (r Sqlresult) String() string

Stringer interface for the Sqlresult type. Provides data like that from the mysql CLI: a header with the schema, followed by rows of data.

The number of warnings and errors is also provided in the usual way.

type WordStat

type WordStat struct {
	Word       string
	Docs, Hits int
}

WordStat describes the statistics for one word in a QueryResult: the word, the number of docs, and the number of hits.

func (WordStat) String

func (vl WordStat) String() string

Stringer interface for the WordStat type.
