pandagg.search module¶
-
class
pandagg.search.
MultiSearch
(using: Optional[elasticsearch.client.Elasticsearch], index: Union[str, Tuple[str], List[str], None] = None)[source]¶ Bases:
pandagg.search.Request
Combine multiple
Search
objects into a single request.
add
(search: pandagg.search.Search) → MultiSearch[source]¶ Adds a new
Search
object to the request:
ms = MultiSearch(index='my-index')
ms = ms.add(Search(doc_type=Category).filter('term', category='python'))
ms = ms.add(Search(doc_type=Blog))
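For context, the Elasticsearch multi-search endpoint expects newline-delimited JSON in which each search contributes a header line followed by a body line. The sketch below builds such a payload from plain dicts. `build_msearch_payload` is a hypothetical helper written for illustration, not a pandagg function, and it says nothing about how MultiSearch serializes requests internally.

```python
import json
from typing import Any, Dict, List, Tuple


def build_msearch_payload(searches: List[Tuple[str, Dict[str, Any]]]) -> str:
    """Serialize (index, body) pairs as newline-delimited JSON (hypothetical helper)."""
    lines = []
    for index, body in searches:
        lines.append(json.dumps({"index": index}))  # header line: where to search
        lines.append(json.dumps(body))              # body line: the search itself
    # The payload must end with a newline character.
    return "\n".join(lines) + "\n"


payload = build_msearch_payload([
    ("my-index", {"query": {"term": {"category": "python"}}}),
    ("my-index", {"query": {"match_all": {}}}),
])
```

Each call to add would correspond to one header/body pair in such a payload.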
-
-
class
pandagg.search.
Request
(using: Optional[elasticsearch.client.Elasticsearch], index: Union[str, Tuple[str], List[str], None] = None)[source]¶ Bases:
object
-
index
(*index) → T[source]¶ Set the index for the search. If called empty it will remove all information.
Example:
s = Search()
s = s.index('twitter-2015.01.01', 'twitter-2015.01.02')
s = s.index(['twitter-2015.01.01', 'twitter-2015.01.02'])
-
params
(**kwargs) → T[source]¶ Specify query params to be used when executing the search. All the keyword arguments will override the current values. See https://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.search for all available parameters.
Example:
s = Search()
s = s.params(routing='user-1', preference='local')
-
using
(client: elasticsearch.client.Elasticsearch) → T[source]¶ Associate the search request with an elasticsearch client. A fresh copy will be returned with current instance remaining unchanged.
Parameters: client – an instance of elasticsearch.Elasticsearch
to use or an alias to look up in elasticsearch_dsl.connections
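The "fresh copy" behaviour described above can be pictured with a small stand-in class. This is a minimal sketch under stated assumptions: `RequestSketch`, `_client` and `_clone` are hypothetical names, and the real Request class may organise its state differently.

```python
import copy
from typing import Any, Dict, Optional


class RequestSketch:
    """Hypothetical stand-in showing copy-on-call chaining (not pandagg code)."""

    def __init__(self, client: Optional[Any] = None) -> None:
        self._client = client
        self._params: Dict[str, Any] = {}

    def _clone(self) -> "RequestSketch":
        new = copy.copy(self)             # shallow copy of the instance
        new._params = dict(self._params)  # give the copy its own params dict
        return new

    def using(self, client: Any) -> "RequestSketch":
        new = self._clone()
        new._client = client
        return new

    def params(self, **kwargs: Any) -> "RequestSketch":
        new = self._clone()
        new._params.update(kwargs)        # later values override earlier ones
        return new


base = RequestSketch()
derived = base.using("client-a").params(routing="user-1").params(routing="user-2")
```

Because every call returns a new object, `base` can be reused as a template for several derived requests.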
-
-
class
pandagg.search.
Search
(using: Optional[Elasticsearch] = None, index: Optional[Union[str, Tuple[str], List[str]]] = None, mappings: Optional[Union[MappingsDict, Mappings]] = None, nested_autocorrect: bool = False, repr_auto_execute: bool = False, document_class: DocumentMeta = None)[source]¶ Bases:
pandagg.utils.DSLMixin
,pandagg.search.Request
-
agg
(name: str, type_or_agg: Union[str, Dict[str, Dict[str, Any]], pandagg.node.aggs.abstract.AggClause, None] = None, insert_below: Optional[str] = None, at_root: bool = False, **body) → Search[source]¶ Insert provided agg clause in copy of initial Aggs.
Accept following syntaxes for type_or_agg argument:
string, with body provided in kwargs:
>>> Aggs().agg(name='some_agg', type_or_agg='terms', field='some_field')
python dict format:
>>> Aggs().agg(name='some_agg', type_or_agg={'terms': {'field': 'some_field'}})
AggClause instance:
>>> from pandagg.aggs import Terms
>>> Aggs().agg(name='some_agg', type_or_agg=Terms(field='some_field'))
Parameters: - name – inserted agg clause name
- type_or_agg – either agg type (str), or agg clause of dict format, or AggClause instance
- insert_below – name of aggregation below which provided aggs should be inserted
- at_root – if True, aggregation is inserted at root
- body – aggregation clause body when providing string type_or_agg (remaining kwargs)
Returns: copy of initial Aggs with provided agg inserted
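The three accepted syntaxes all describe the same clause. The sketch below reduces each of them to one dict form. `TermsSketch` and `normalize_agg` are hypothetical names used only for illustration, and the `to_dict` method on the stand-in class is an assumption, not a statement about the AggClause interface.

```python
from typing import Any, Dict, Union


class TermsSketch:
    """Hypothetical stand-in for an agg clause object such as Terms."""

    KEY = "terms"

    def __init__(self, **body: Any) -> None:
        self.body = body

    def to_dict(self) -> Dict[str, Dict[str, Any]]:
        return {self.KEY: self.body}


def normalize_agg(
    type_or_agg: Union[str, Dict[str, Dict[str, Any]], TermsSketch], **body: Any
) -> Dict[str, Dict[str, Any]]:
    """Reduce the three syntaxes to a single {agg_type: body} dict."""
    if isinstance(type_or_agg, str):
        return {type_or_agg: body}   # string type, body taken from kwargs
    if isinstance(type_or_agg, dict):
        return type_or_agg           # already in dict format
    return type_or_agg.to_dict()     # clause object


from_string = normalize_agg("terms", field="some_field")
from_dict = normalize_agg({"terms": {"field": "some_field"}})
from_clause = normalize_agg(TermsSketch(field="some_field"))
```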
-
aggs
(aggs: Union[Dict[str, Union[Dict[str, Dict[str, Any]], pandagg.node.aggs.abstract.AggClause]], Aggs], insert_below: Optional[str] = None, at_root: bool = False) → Search[source]¶ Insert provided aggs in copy of initial Aggs.
Accept following syntaxes for provided aggs:
python dict format:
>>> Aggs().aggs({'some_agg': {'terms': {'field': 'some_field'}}, 'other_agg': {'avg': {'field': 'age'}}})
Aggs instance:
>>> Aggs().aggs(Aggs({'some_agg': {'terms': {'field': 'some_field'}}, 'other_agg': {'avg': {'field': 'age'}}}))
dict with Agg clauses values:
>>> from pandagg.aggs import Terms, Avg
>>> Aggs().aggs({'some_agg': Terms(field='some_field'), 'other_agg': Avg(field='age')})
Parameters: - aggs – aggregations to insert into existing aggregation
- insert_below – name of aggregation below which provided aggs should be inserted
- at_root – if True, aggregation is inserted at root
Returns: copy of initial Aggs with provided aggs inserted
-
bool
(must: Union[Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, List[Union[Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause]], None] = None, should: Union[Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, List[Union[Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause]], None] = None, must_not: Union[Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, List[Union[Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause]], None] = None, filter: Union[Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, List[Union[Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause]], None] = None, insert_below: Optional[str] = None, on: Optional[str] = None, mode: typing_extensions.Literal['add', 'replace', 'replace_all'][add, replace, replace_all] = 'add', **body) → Search[source]¶ >>> Query().bool(must={"term": {"some_field": "yolo"}})
-
count
() → int[source]¶ Return the number of hits matching the query and filters. Note that only the actual number is returned.
-
exclude
(type_or_query: Union[str, Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, Query], insert_below: Optional[str] = None, on: Optional[str] = None, mode: typing_extensions.Literal['add', 'replace', 'replace_all'][add, replace, replace_all] = 'add', bool_body: Optional[Dict[str, Any]] = None, **body) → Search[source]¶ Insert the provided clause as a “must_not” wrapped in filter context.
-
execute
() → pandagg.response.SearchResponse[source]¶ Execute the search and return an instance of
SearchResponse
wrapping all the data.
-
filter
(type_or_query: Union[str, Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, Query], insert_below: Optional[str] = None, on: Optional[str] = None, mode: typing_extensions.Literal['add', 'replace', 'replace_all'][add, replace, replace_all] = 'add', bool_body: Optional[Dict[str, Any]] = None, **body) → Search[source]¶
-
classmethod
from_dict
(d: Dict[KT, VT]) → Search[source]¶ Construct a new Search instance from a raw dict containing the search body. Useful when migrating from raw dictionaries.
Example:
s = Search.from_dict({
    "query": {
        "bool": {
            "must": [...]
        }
    },
    "aggs": {...}
})
s = s.filter('term', published=True)
-
groupby
(name: str, type_or_agg: Union[str, Dict[str, Dict[str, Any]], pandagg.node.aggs.abstract.AggClause, None] = None, insert_below: Optional[str] = None, at_root: bool = False, **body) → Search[source]¶ Insert provided aggregation clause in copy of initial Aggs.
Given the initial aggregation:
A──> B
└──> C
If insert_below = 'A':
A──> new──> B
       └──> C
>>> Aggs().groupby('per_user_id', 'terms', field='user_id')
{"per_user_id":{"terms":{"field":"user_id"}}}
>>> Aggs().groupby('per_user_id', {'terms': {"field": "user_id"}})
{"per_user_id":{"terms":{"field":"user_id"}}}
>>> from pandagg.aggs import Terms
>>> Aggs().groupby('per_user_id', Terms(field="user_id"))
{"per_user_id":{"terms":{"field":"user_id"}}}
Return type: pandagg.aggs.Aggs
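The insert_below behaviour drawn above can be reproduced on a plain Elasticsearch-style aggregation dict. The following is a minimal sketch, assuming the parent sits at the top level of the dict. `groupby_below` is a hypothetical helper, not the Aggs implementation.

```python
import copy
from typing import Any, Dict


def groupby_below(
    aggs: Dict[str, Any], parent: str, name: str, clause: Dict[str, Any]
) -> Dict[str, Any]:
    """Insert `clause` under `parent` and move the parent's children below it."""
    result = copy.deepcopy(aggs)               # leave the input untouched
    parent_node = result[parent]               # only top-level parents, for brevity
    children = parent_node.pop("aggs", {})     # existing children (B and C)
    new_node = dict(clause)
    if children:
        new_node["aggs"] = children            # children now hang below the new node
    parent_node["aggs"] = {name: new_node}
    return result


initial = {
    "A": {
        "terms": {"field": "a"},
        "aggs": {"B": {"avg": {"field": "b"}}, "C": {"avg": {"field": "c"}}},
    }
}
grouped = groupby_below(initial, "A", "new", {"terms": {"field": "user_id"}})
```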
-
highlight
(*fields, **kwargs) → Search[source]¶ Request highlighting of some fields. All keyword arguments passed in will be used as parameters for all the fields in the
fields
parameter. Example:
Search().highlight('title', 'body', fragment_size=50)
will produce the equivalent of:
{
    "highlight": {
        "fields": {
            "body": {"fragment_size": 50},
            "title": {"fragment_size": 50}
        }
    }
}
If you want to have different options for different fields you can call
highlight
twice:
Search().highlight('title', fragment_size=50).highlight('body', fragment_size=100)
which will produce:
{
    "highlight": {
        "fields": {
            "body": {"fragment_size": 100},
            "title": {"fragment_size": 50}
        }
    }
}
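The way successive highlight calls accumulate per-field options can be sketched on a plain request body. `add_highlight` is a hypothetical function written for illustration. It mirrors the documented output, not the library's internal code.

```python
import copy
from typing import Any, Dict


def add_highlight(body: Dict[str, Any], *fields: str, **options: Any) -> Dict[str, Any]:
    """Return a copy of `body` with `options` applied to each listed field."""
    new_body = copy.deepcopy(body)
    target = new_body.setdefault("highlight", {}).setdefault("fields", {})
    for field in fields:
        target[field] = dict(options)   # every field gets its own copy of the options
    return new_body


same_options = add_highlight({}, "title", "body", fragment_size=50)
per_field = add_highlight(
    add_highlight({}, "title", fragment_size=50), "body", fragment_size=100
)
```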
-
highlight_options
(**kwargs) → Search[source]¶ Update the global highlighting options used for this request. For example:
s = Search()
s = s.highlight_options(order='score')
-
must
(type_or_query: Union[str, Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, Query], insert_below: Optional[str] = None, on: Optional[str] = None, mode: typing_extensions.Literal['add', 'replace', 'replace_all'][add, replace, replace_all] = 'add', bool_body: Optional[Dict[str, Any]] = None, **body) → Search[source]¶ Create copy of initial Query and insert provided clause under “bool” query “must”.
>>> Query().must('term', some_field=1)
>>> Query().must({'term': {'some_field': 1}})
>>> from pandagg.query import Term
>>> Query().must(Term(some_field=1))
Keyword Arguments:
- insert_below (str) – named query clause under which the inserted clauses should be placed.
- compound_param (str) – param under which inserted clause will be placed in compound query
- on (str) – named compound query clause on which the inserted compound clause should be merged.
- mode (str, one of ‘add’, ‘replace’, ‘replace_all’) – merging strategy when inserting clauses on an existing compound clause.
  - ‘add’ (default): adds new clauses keeping initial ones
  - ‘replace’: for each parameter (for instance in ‘bool’ case: ‘filter’, ‘must’, ‘must_not’, ‘should’), replace existing clauses under this parameter, by new ones only if declared in inserted compound query
  - ‘replace_all’: existing compound clause is completely replaced by the new one
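The three merging strategies can be illustrated on plain dicts that map bool parameters to lists of clauses. This is a sketch of the documented semantics only. `merge_bool` is a hypothetical helper and does not reproduce the library's merge code.

```python
import copy
from typing import Any, Dict, List

BoolBody = Dict[str, List[Dict[str, Any]]]


def merge_bool(existing: BoolBody, inserted: BoolBody, mode: str = "add") -> BoolBody:
    """Merge `inserted` bool parameters into `existing` according to `mode`."""
    if mode not in ("add", "replace", "replace_all"):
        raise ValueError("unknown mode: %r" % mode)
    if mode == "replace_all":
        return copy.deepcopy(inserted)          # existing clause is discarded entirely
    merged = copy.deepcopy(existing)
    for param, clauses in inserted.items():
        if mode == "replace":
            merged[param] = copy.deepcopy(clauses)  # only declared params are replaced
        else:  # "add"
            merged[param] = merged.get(param, []) + copy.deepcopy(clauses)
    return merged


existing = {"must": [{"term": {"a": 1}}], "filter": [{"term": {"b": 2}}]}
inserted = {"must": [{"term": {"c": 3}}]}
added = merge_bool(existing, inserted, "add")
replaced = merge_bool(existing, inserted, "replace")
replaced_all = merge_bool(existing, inserted, "replace_all")
```

Note that with ‘replace’ the ‘filter’ clauses survive because the inserted clause does not declare them, whereas ‘replace_all’ drops them.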
-
must_not
(type_or_query: Union[str, Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, Query], insert_below: Optional[str] = None, on: Optional[str] = None, mode: typing_extensions.Literal['add', 'replace', 'replace_all'][add, replace, replace_all] = 'add', bool_body: Optional[Dict[str, Any]] = None, **body) → Search[source]¶
-
post_filter
(type_or_query: Union[str, Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, Query], insert_below: Optional[str] = None, on: Optional[str] = None, mode: typing_extensions.Literal['add', 'replace', 'replace_all'][add, replace, replace_all] = 'add', compound_param: Optional[str] = None, **body) → Search[source]¶
-
query
(type_or_query: Union[str, Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, Query], insert_below: Optional[str] = None, on: Optional[str] = None, mode: typing_extensions.Literal['add', 'replace', 'replace_all'][add, replace, replace_all] = 'add', compound_param: Optional[str] = None, **body) → Search[source]¶ Insert provided clause in copy of initial Query.
>>> from pandagg.query import Query
>>> Query().query('term', some_field=23)
{'term': {'some_field': 23}}
>>> from pandagg.query import Term
>>> Query()\
...     .query({'term': {'some_field': 23}})\
...     .query(Term(other_field=24))
{'bool': {'must': [{'term': {'some_field': 23}}, {'term': {'other_field': 24}}]}}
Keyword Arguments:
- insert_below (str) – named query clause under which the inserted clauses should be placed.
- compound_param (str) – param under which inserted clause will be placed in compound query
- on (str) – named compound query clause on which the inserted compound clause should be merged.
- mode (str, one of ‘add’, ‘replace’, ‘replace_all’) – merging strategy when inserting clauses on an existing compound clause.
  - ‘add’ (default): adds new clauses keeping initial ones
  - ‘replace’: for each parameter (for instance in ‘bool’ case: ‘filter’, ‘must’, ‘must_not’, ‘should’), replace existing clauses under this parameter, by new ones only if declared in inserted compound query
  - ‘replace_all’: existing compound clause is completely replaced by the new one
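The doctest above shows that a second query call wraps both clauses in a “bool” query under “must”. A minimal sketch of that behaviour on plain dicts follows. `add_query` is a hypothetical helper, and it only covers the default case without insert_below, on or mode.

```python
import copy
from typing import Any, Dict, Optional

Clause = Dict[str, Any]


def add_query(current: Optional[Clause], clause: Clause) -> Clause:
    """Return a new query combining `current` and `clause` under bool/must."""
    if current is None:
        return copy.deepcopy(clause)                 # first clause is used as-is
    if set(current) == {"bool"}:
        combined = copy.deepcopy(current)            # extend an existing bool query
        combined["bool"].setdefault("must", []).append(copy.deepcopy(clause))
        return combined
    # Two leaf clauses: wrap both in a new bool query.
    return {"bool": {"must": [copy.deepcopy(current), copy.deepcopy(clause)]}}


first = add_query(None, {"term": {"some_field": 23}})
second = add_query(first, {"term": {"other_field": 24}})
third = add_query(second, {"term": {"third_field": 25}})
```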
-
scan
() → Iterator[pandagg.response.Hit][source]¶ Turn the search into a scan search and return a generator that will iterate over all the documents matching the query.
Use the params method to specify any additional arguments you wish to pass to the underlying scan helper from elasticsearch-py: https://elasticsearch-py.readthedocs.io/en/master/helpers.html#elasticsearch.helpers.scan
-
scan_composite_agg
(size: int) → Iterator[Dict[str, Any]][source]¶ Iterate over all buckets of the composite aggregation, yielding them one at a time.
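Elasticsearch composite aggregations are paginated: each response carries an after_key that is sent back as after in the next request. The generator below sketches that loop against an in-memory page source. `scan_composite`, `fake_fetch_page` and `DATA` are hypothetical names, and the real method issues search requests instead of slicing a list.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Bucket = Dict[str, Any]
Page = Dict[str, Any]

# Hypothetical stand-in for the buckets held by the cluster.
DATA: List[Bucket] = [{"key": {"user": u}, "doc_count": 1} for u in "abcde"]


def fake_fetch_page(size: int, after: Optional[Dict[str, Any]]) -> Page:
    """Return at most `size` buckets located after the `after` key."""
    start = 0
    if after is not None:
        start = [b["key"] for b in DATA].index(after) + 1
    buckets = DATA[start:start + size]
    page: Page = {"buckets": buckets}
    if buckets:
        page["after_key"] = buckets[-1]["key"]
    return page


def scan_composite(
    fetch_page: Callable[[int, Optional[Dict[str, Any]]], Page], size: int
) -> Iterator[Bucket]:
    """Yield every bucket, requesting pages of `size` buckets until exhausted."""
    after = None
    while True:
        page = fetch_page(size, after)
        yield from page["buckets"]
        after = page.get("after_key")
        if not page["buckets"] or after is None:
            return


all_buckets = list(scan_composite(fake_fetch_page, size=2))
```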
-
scan_composite_agg_at_once
(size: int) → pandagg.response.Aggregations[source]¶ Iterate over all buckets of the composite aggregation (converting Aggs into a composite agg if possible), and return all buckets at once in an Aggregations instance.
-
script_fields
(**kwargs) → Search[source]¶ Define script fields to be calculated on hits. See https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html for more details.
Example:
s = Search()
s = s.script_fields(times_two="doc['field'].value * 2")
s = s.script_fields(
    times_three={
        'script': {
            'inline': "doc['field'].value * params.n",
            'params': {'n': 3}
        }
    }
)
-
should
(type_or_query: Union[str, Dict[str, Dict[str, Any]], pandagg.node.query.abstract.QueryClause, Query], insert_below: Optional[str] = None, on: Optional[str] = None, mode: typing_extensions.Literal['add', 'replace', 'replace_all'][add, replace, replace_all] = 'add', bool_body: Optional[Dict[str, Any]] = None, **body) → Search[source]¶
-
sort
(*keys) → Search[source]¶ Add sorting information to the search request. If called without arguments it will remove all sort requirements. Otherwise it will replace them. Acceptable arguments are:
'some.field'
'-some.other.field'
{'different.field': {'any': 'dict'}}
so for example:
s = Search().sort(
    'category',
    '-title',
    {"price" : {"order" : "asc", "mode" : "avg"}}
)
will sort by category, title (in descending order) and price in ascending order using the avg mode.
The API returns a copy of the Search object and can thus be chained.
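The leading-dash shorthand can be expanded into the explicit form Elasticsearch accepts. The following sketch shows one way to do so. `normalize_sort` is a hypothetical helper, not the method's actual implementation.

```python
from typing import Any, Dict, List, Union

SortKey = Union[str, Dict[str, Any]]


def normalize_sort(*keys: SortKey) -> List[SortKey]:
    """Expand '-field' strings into {'field': {'order': 'desc'}}."""
    normalized: List[SortKey] = []
    for key in keys:
        if isinstance(key, str) and key.startswith("-"):
            normalized.append({key[1:]: {"order": "desc"}})
        else:
            normalized.append(key)   # plain field names and dicts pass through
    return normalized


sort_spec = normalize_sort(
    "category",
    "-title",
    {"price": {"order": "asc", "mode": "avg"}},
)
```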
-
source
(fields: Union[str, List[str], Dict[str, Any], None] = None, **kwargs) → Search[source]¶ Selectively control how the _source field is returned.
Parameters: fields – wildcard string, array of wildcards, or dictionary of includes and excludes
If fields is None, the entire document will be returned for each hit. If fields is a dictionary with keys of ‘includes’ and/or ‘excludes’ the fields will be either included or excluded appropriately.
Calling this multiple times with the same named parameter will override the previous values with the new ones.
Example:
s = Search()
s = s.source(includes=['obj1.*'], excludes=["*.description"])

s = Search()
s = s.source(includes=['obj1.*']).source(excludes=["*.description"])
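The override rule for repeated calls can be sketched as a dict update keyed by parameter name. `update_source` is a hypothetical function written for illustration, under the assumption that a positional fields value simply replaces whatever was set before.

```python
from typing import Any, Dict, List, Optional, Union

SourceSpec = Union[str, List[str], Dict[str, Any], None]


def update_source(current: SourceSpec, fields: SourceSpec = None, **kwargs: Any) -> SourceSpec:
    """Return the new _source setting after one call."""
    if fields is not None:
        return fields                         # assumption: explicit fields win outright
    merged = dict(current) if isinstance(current, dict) else {}
    merged.update(kwargs)                     # same-named parameters are overridden
    return merged


chained = update_source(update_source(None, includes=["obj1.*"]), excludes=["*.description"])
overridden = update_source({"includes": ["obj1.*"]}, includes=["obj2.*"])
```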
-
suggest
(name: str, text: str, **kwargs) → Search[source]¶ Add a suggestions request to the search.
Parameters: - name – name of the suggestion
- text – text to suggest on
All keyword arguments will be added to the suggestions body. For example:
s = Search()
s = s.suggest('suggestion-1', 'Elasticsearch', term={'field': 'body'})
-
to_dict
(count: bool = False, **kwargs) → pandagg.types.SearchDict[source]¶ Serialize the search into the dictionary that will be sent over as the request’s body.
Parameters: count – a flag to specify if we are interested in a body for count - no aggregations, no pagination bounds etc. All additional keyword arguments will be included into the dictionary.
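To illustrate what a count-oriented body might look like, the sketch below keeps only the query part of a full search body. `body_for_count` is a hypothetical helper. Exactly which keys the library drops when count is True is an assumption here, beyond the aggregations and pagination bounds named above.

```python
from typing import Any, Dict


def body_for_count(body: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the part of a search body that affects the hit count."""
    # Assumption: only the query matters for counting; aggregations,
    # pagination bounds and other keys are left out.
    return {"query": body["query"]} if "query" in body else {}


full_body = {
    "query": {"term": {"published": True}},
    "aggs": {"per_user": {"terms": {"field": "user_id"}}},
    "from": 10,
    "size": 5,
}
count_body = body_for_count(full_body)
```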
-