Search
The Search class is intended to perform requests, and mirrors the
Elasticsearch search API:
>>> from elasticsearch import Elasticsearch
>>> from pandagg.search import Search
>>>
>>> client = Elasticsearch(hosts=['localhost:9200'])
>>> search = Search(using=client, index='movies')\
...     .size(2)\
...     .groupby('decade', 'histogram', interval=10, field='year')\
...     .groupby('genres', size=3)\
...     .aggs('avg_rank', 'avg', field='rank')\
...     .agg('avg_nb_roles', 'avg', field='nb_roles')\
...     .filter('range', year={"gte": 1990})
>>> search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "year": {
              "gte": 1990
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "decade": {
      "histogram": {
        "field": "year",
        "interval": 10
      },
      "aggs": {
        "genres": {
          "terms": {
            "field": "genres",
            "size": 3
          },
          "aggs": {
            "avg_rank": {
              "avg": {
                "field": "rank"
              }
            },
            "avg_nb_roles": {
              "avg": {
                "field": "nb_roles"
              }
            }
          }
        }
      }
    }
  },
  "size": 2
}
It relies on:
- Query to build the query and post_filter parts of the request (see Query),
- Aggs to build aggregations (see Aggregation).
Note
All methods described below return a new Search instance and leave the
initial search request unchanged.
>>> from pandagg.search import Search
>>> initial_s = Search()
>>> enriched_s = initial_s.query('terms', genres=['Comedy', 'Short'])
>>> initial_s.to_dict()
{}
>>> enriched_s.to_dict()
{'query': {'terms': {'genres': ['Comedy', 'Short']}}}
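The behaviour described in the note can be pictured with a small standalone sketch. `MiniSearch` is a hypothetical stand-in written for illustration only; it is not pandagg's implementation and merely mimics the idea that each method copies the request before modifying it.

```python
import copy


class MiniSearch:
    """Hypothetical stand-in for Search (not pandagg's actual implementation)."""

    def __init__(self, body=None):
        # The request body is held as a plain dict, empty by default.
        self._body = body if body is not None else {}

    def _clone(self):
        # Deep copy so nested dicts and lists are never shared between instances.
        return MiniSearch(copy.deepcopy(self._body))

    def query(self, type_, **body):
        # Work on a copy and return it; `self` is left untouched.
        s = self._clone()
        s._body["query"] = {type_: body}
        return s

    def size(self, size):
        s = self._clone()
        s._body["size"] = size
        return s

    def to_dict(self):
        # Hand out a copy so callers cannot mutate the internal state.
        return copy.deepcopy(self._body)


initial_s = MiniSearch()
enriched_s = initial_s.query("terms", genres=["Comedy", "Short"]).size(2)

print(initial_s.to_dict())   # the initial request is still empty
print(enriched_s.to_dict())  # the chained calls only affected the new instance
```

Because every call returns a fresh object, a base request can be shared and specialised in several directions without one variant leaking into another.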
Query part
The query and post_filter parts of a Search instance are available
under the _query and _post_filter attributes, respectively.
>>> search._query.__class__
pandagg.tree.query.abstract.Query
>>> search._query.show()
<Query>
bool
└── filter
    └── range, field=year, gte=1990
To enrich the query of a search request, the methods are exactly the same as for a
Query instance.
>>> Search().must_not('range', year={'lt': 1980})
{
  "query": {
    "bool": {
      "must_not": [
        {
          "range": {
            "year": {
              "lt": 1980
            }
          }
        }
      ]
    }
  }
}
See section Query for more details.
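As a rough illustration of the request shapes shown above, the following standalone sketch appends clauses to a `bool` query using plain dictionaries. The helper `add_bool_clause` is hypothetical and not part of pandagg; it assumes the body is either empty or already holds a `bool` query, whereas the library handles more cases than this.

```python
import copy


def add_bool_clause(body, occurrence, clause):
    """Return a copy of `body` with `clause` appended under query.bool.<occurrence>.

    Hypothetical helper for illustration; assumes `body` is empty or
    already contains a bool query.
    """
    new_body = copy.deepcopy(body)
    bool_query = new_body.setdefault("query", {}).setdefault("bool", {})
    bool_query.setdefault(occurrence, []).append(clause)
    return new_body


# Same clauses as in the examples of this page.
body = add_bool_clause({}, "must_not", {"range": {"year": {"lt": 1980}}})
body = add_bool_clause(body, "filter", {"range": {"year": {"gte": 1990}}})
print(body)
```

Each occurrence type (`filter`, `must_not`, and so on) holds a list, so several clauses of the same type can coexist in one request.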
Aggregations part
The aggregations part of a Search
instance is available under the _aggs attribute.
>>> search._aggs.__class__
pandagg.tree.aggs.aggs.Aggs
>>> search._aggs.show()
<Aggregations>
decade <histogram, field="year", interval=10>
└── genres <terms, field="genres", size=3>
    ├── avg_nb_roles <avg, field="nb_roles">
    └── avg_rank <avg, field="rank">
To enrich the aggregations of a search request, the methods are exactly the same as for an
Aggs instance.
>>> Search()\
...     .groupby('decade', 'histogram', interval=10, field='year')\
...     .agg('avg_rank', 'avg', field='rank')
{
  "aggs": {
    "decade": {
      "histogram": {
        "field": "year",
        "interval": 10
      },
      "aggs": {
        "avg_rank": {
          "avg": {
            "field": "rank"
          }
        }
      }
    }
  }
}
See section Aggregation for more details.
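To make the nesting in the output above concrete, here is a standalone sketch that builds the same `aggs` body from plain dictionaries. The helper `nest_under` is hypothetical and not part of pandagg; it only mirrors what the rendered request suggests, namely that a grouping clause becomes the parent of the clauses that follow it.

```python
import json


def nest_under(parent_name, parent_body, children):
    """Return an `aggs` dict in which `children` sit below one parent aggregation.

    Hypothetical helper for illustration; not part of pandagg.
    """
    node = dict(parent_body)
    if children:
        node["aggs"] = children
    return {parent_name: node}


# Leaf metric aggregation, matching the avg_rank clause above.
metrics = {"avg_rank": {"avg": {"field": "rank"}}}

# Bucket aggregation that wraps the metric, matching the decade clause above.
aggs = nest_under("decade", {"histogram": {"field": "year", "interval": 10}}, metrics)

body = {"aggs": aggs}
print(json.dumps(body, indent=2))
```

Calling `nest_under` again with `aggs` as the children would add one more grouping level on top, which is the shape seen in the first example of this page.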
Other search request parameters
Other parameters, such as size, sources and limit, are documented in the Search
documentation; their usage is self-explanatory.
Request execution
To execute a search request, it must first be bound to an Elasticsearch client:
>>> from elasticsearch import Elasticsearch
>>> client = Elasticsearch(hosts=['localhost:9200'])
Either at instantiation:
>>> from pandagg.search import Search
>>> search = Search(using=client, index='movies')
Or with the using() method:
>>> from pandagg.search import Search
>>> search = Search()\
...     .using(client=client)\
...     .index('movies')
Executing a Search request with execute() returns a Response
instance (see more in Response).
>>> response = search.execute()
>>> response
<Response> took 58ms, success: True, total result >=10000, contains 2 hits
>>> response.__class__
pandagg.response.Response