pandagg.aggs module

class pandagg.aggs.Aggs(*args, **kwargs)

Bases: pandagg.tree._tree.Tree

Combination of aggregation clauses. This class provides handy methods to build an aggregation (see aggs() and groupby()), and is also used to parse aggregation responses into convenient formats.

Declaring a mapping is optional, but doing so validates the aggregation and automatically adds missing nested clauses.

All of the following syntaxes are equivalent:

From a dict:

>>> Aggs({"per_user":{"terms":{"field":"user"}}})

Using shortcut declaration: first argument is the aggregation type, other arguments are aggregation body parameters:

>>> Aggs('terms', name='per_user', field='user')

Using DSL class:

>>> from pandagg.aggs import Terms
>>> Aggs(Terms('per_user', field='user'))

The dict and DSL class syntaxes allow declaring aggregations made of multiple clauses:

>>> Aggs({"per_user":{"terms":{"field":"user"}, "aggs": {"avg_age": {"avg": {"field": "age"}}}}})

Which is similar to:

>>> from pandagg.aggs import Terms, Avg
>>> Terms('per_user', field='user', aggs=Avg('avg_age', field='age'))
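To make the equivalence between these syntaxes concrete, the sketch below builds the dict syntax by hand from shortcut-style arguments. It uses plain Python dicts only and does not call pandagg; `shortcut_to_dict` is a hypothetical helper written for this illustration, not part of the library.

```python
def shortcut_to_dict(agg_type, name, aggs=None, **body):
    """Build the Elasticsearch dict form of a single aggregation clause.

    Hypothetical helper (not part of pandagg): it mirrors the argument
    order of the shortcut declaration, type first, then name and body.
    """
    clause = {agg_type: dict(body)}
    if aggs:
        # Sub-aggregations live under the "aggs" key, next to the type key.
        clause["aggs"] = aggs
    return {name: clause}


# Single clause, same structure as the "From a dict" example.
per_user = shortcut_to_dict("terms", "per_user", field="user")

# Nested clauses, same structure as the multiple-clause dict example.
nested = shortcut_to_dict(
    "terms",
    "per_user",
    field="user",
    aggs=shortcut_to_dict("avg", "avg_age", field="age"),
)
```

Both results are ordinary dicts, so they can be compared directly with the dict literals shown above.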

Keyword Arguments:
	mapping (`dict` or `pandagg.tree.mapping.Mapping`) – Mapping of the requested index or indices. Providing it validates the aggregations and adds required nested clauses if they are missing.
	nested_autocorrect (`bool`) – If True, automatically add missing nested clauses to the aggregation; otherwise raise an error.
	remaining kwargs – Used as the aggregation body.

aggs(*args, **kwargs)

Arrange passed aggregations “horizontally”.

Given the initial aggregation:

A──> B
└──> C

If passing multiple aggregations with insert_below = ‘A’:

A──> B
└──> C
└──> new1
└──> new2

Note: the new aggregations are placed under the insert_below aggregation clause if it is provided; otherwise they are placed under the deepest linear bucket aggregation, provided there is no ambiguity:

OK:

A──> B ─> C ─> new

KO (ambiguous):

A──> B
└──> C

args accepts a single occurrence or a sequence of the following formats:

string (for terms agg concise declaration)
regular Elasticsearch dict syntax
AggNode instance (for instance Terms, Filters etc)

Keyword Arguments:
	insert_below (`string`) – Parent aggregation name under which these aggregations should be placed.
	at_root (`string`) – Insert aggregations at the root of the aggregation query.
	remaining kwargs – Used as the aggregation body.
Return type:	pandagg.aggs.Aggs
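The horizontal arrangement can be pictured on the raw Elasticsearch dict syntax. The following is a minimal sketch using plain dicts, not pandagg itself; `find_clause` and `insert_siblings` are hypothetical helpers, and the field names are made up for the example.

```python
import copy


def find_clause(aggs, name):
    """Return the clause dict named `name` in a nested aggs dict, or None."""
    for key, clause in aggs.items():
        if key == name:
            return clause
        found = find_clause(clause.get("aggs", {}), name)
        if found is not None:
            return found
    return None


def insert_siblings(aggs, new_aggs, insert_below):
    """Place every clause of `new_aggs` side by side under `insert_below`.

    Hypothetical helper (not part of pandagg); the input is left untouched.
    """
    result = copy.deepcopy(aggs)
    parent = find_clause(result, insert_below)
    if parent is None:
        raise KeyError(insert_below)
    # Existing children are kept; the new clauses become their siblings.
    parent.setdefault("aggs", {}).update(copy.deepcopy(new_aggs))
    return result


initial = {
    "A": {
        "terms": {"field": "a"},
        "aggs": {
            "B": {"terms": {"field": "b"}},
            "C": {"terms": {"field": "c"}},
        },
    }
}
new = {"new1": {"avg": {"field": "x"}}, "new2": {"max": {"field": "y"}}}
arranged = insert_siblings(initial, new, insert_below="A")
```

After the call, `arranged["A"]["aggs"]` holds B, C, new1 and new2 at the same level, which matches the diagram for insert_below = ‘A’.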

applied_nested_path_at_node(nid)

deepest_linear_bucket_agg: Return the deepest bucket aggregation node (pandagg.nodes.abstract.BucketAggNode) of this aggregation that has neither siblings nor an ancestor with siblings.
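A literal reading of this definition can be sketched on plain dicts: walk down from the root for as long as each level contains exactly one bucket clause. The code below is an illustration under assumptions, not the pandagg implementation; `BUCKET_TYPES` is a deliberately partial, hand-written list, and both helpers are hypothetical.

```python
# Assumption: a partial list of Elasticsearch bucket aggregation types.
BUCKET_TYPES = {
    "terms", "filter", "filters", "histogram",
    "date_histogram", "range", "nested",
}


def agg_type(clause):
    """Return the aggregation type key of a clause (any key but "aggs")."""
    return next(key for key in clause if key != "aggs")


def deepest_linear_bucket(aggs):
    """Name of the deepest bucket clause reachable without meeting siblings.

    Returns None if the root level is empty, holds several clauses, or
    starts with a non-bucket aggregation.
    """
    deepest = None
    level = aggs
    # Walk down while each level holds exactly one bucket clause.
    while len(level) == 1:
        name, clause = next(iter(level.items()))
        if agg_type(clause) not in BUCKET_TYPES:
            break
        deepest = name
        level = clause.get("aggs", {})
    return deepest


linear = {
    "A": {
        "terms": {"field": "a"},
        "aggs": {
            "B": {
                "terms": {"field": "b"},
                "aggs": {"C": {"terms": {"field": "c"}}},
            }
        },
    }
}
deepest = deepest_linear_bucket(linear)  # "C" for this linear chain
```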

groupby(*args, **kwargs)

Arrange passed aggregations in a vertical/nested manner, above or below another aggregation clause.

Given the initial aggregation:

A──> B
└──> C

If insert_below = ‘A’:

A──> new──> B
      └──> C

If insert_above = ‘B’:

A──> new──> B
└──> C
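The two diagrams above can be reproduced on the raw dict syntax. This is a minimal sketch with plain dicts rather than pandagg objects; `find_parent_level`, `group_below` and `group_above` are hypothetical helpers, and the field names are invented for the example.

```python
import copy


def find_parent_level(aggs, name):
    """Return the dict level that directly contains the clause `name`."""
    if name in aggs:
        return aggs
    for clause in aggs.values():
        found = find_parent_level(clause.get("aggs", {}), name)
        if found is not None:
            return found
    return None


def group_below(aggs, new_name, new_clause, insert_below):
    """Insert a clause under `insert_below`; it adopts all former children."""
    result = copy.deepcopy(aggs)
    level = find_parent_level(result, insert_below)
    if level is None:
        raise KeyError(insert_below)
    parent = level[insert_below]
    wrapped = copy.deepcopy(new_clause)
    children = parent.pop("aggs", None)
    if children:
        wrapped["aggs"] = children
    parent["aggs"] = {new_name: wrapped}
    return result


def group_above(aggs, new_name, new_clause, insert_above):
    """Insert a clause between `insert_above` and its former parent."""
    result = copy.deepcopy(aggs)
    level = find_parent_level(result, insert_above)
    if level is None:
        raise KeyError(insert_above)
    wrapped = copy.deepcopy(new_clause)
    # Only the targeted clause moves down; its siblings stay in place.
    wrapped["aggs"] = {insert_above: level.pop(insert_above)}
    level[new_name] = wrapped
    return result


initial = {
    "A": {
        "terms": {"field": "a"},
        "aggs": {
            "B": {"terms": {"field": "b"}},
            "C": {"terms": {"field": "c"}},
        },
    }
}
new_clause = {"terms": {"field": "n"}}
below = group_below(initial, "new", new_clause, insert_below="A")
above = group_above(initial, "new", new_clause, insert_above="B")
```

In `below`, the new clause sits between A and both B and C. In `above`, it wraps B only, and C remains a direct child of A.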

The by argument accepts a single occurrence or a sequence of the following formats:

string (for terms agg concise declaration)
regular Elasticsearch dict syntax
AggNode instance (for instance Terms, Filters etc)

If neither insert_below nor insert_above is provided, by will be placed between the deepest linear bucket aggregation (if there is no ambiguity) and its children:

A──> B      : OK generates     A──> B ─> C ─> by

A──> B      : KO, ambiguous, must specify either A, B or C
└──> C

All Aggs.__init__ syntaxes are accepted:

>>> Aggs()\
... .groupby('terms', name='per_user_id', field='user_id')
{"per_user_id":{"terms":{"field":"user_id"}}}

Passing a dict:

>>> Aggs().groupby({"terms_on_my_field":{"terms":{"field":"some_field"}}})
{"terms_on_my_field":{"terms":{"field":"some_field"}}}

Using DSL class:

>>> from pandagg.aggs import Terms
>>> Aggs().groupby(Terms('terms_on_my_field', field='some_field'))
{"terms_on_my_field":{"terms":{"field":"some_field"}}}

Shortcut syntax for a terms aggregation: creates a terms aggregation that uses the field as the aggregation name:

>>> Aggs().groupby('some_field')
{"some_field":{"terms":{"field":"some_field"}}}

Using an Aggs object:

>>> Aggs().groupby(Aggs('terms', name='per_user_id', field='user_id'))
{"per_user_id":{"terms":{"field":"user_id"}}}

Accepted declarations for multiple aggregations:

Keyword Arguments:
	insert_below (`string`) – Parent aggregation name under which these aggregations should be placed.
	insert_above (`string`) – Aggregation name above which these aggregations should be placed.
	at_root (`string`) – Insert aggregations at the root of the aggregation query.
	remaining kwargs – Used as the aggregation body.
Return type:	pandagg.aggs.Aggs

node_class: alias of pandagg.node.aggs.abstract.AggNode

show(*args, **kwargs)

Return tree structure in hierarchy style.

Parameters:

nid – Node identifier from which tree traversal will start. If None tree root will be used
filter_ – filter function applied to nodes. Nodes rejected by the filter, and their children, won’t be displayed
key – key used to order nodes of the same parent
reverse – reverse parameter applied when sorting nodes of the same level
line_type – display type choice
limit – int, truncate tree display to this number of lines
kwargs – kwargs params passed to node line_repr method
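How these parameters interact can be illustrated with a small standalone renderer. The sketch below is not the library's implementation: `render_tree` is a hypothetical function, the tree is a plain `{node: [children]}` mapping, and indentation stands in for the real line_type drawing styles.

```python
def render_tree(tree, nid, filter_=None, key=None, reverse=False, limit=None):
    """Render `tree` (a {node: [children]} mapping) as indented lines."""
    lines = []

    def walk(node, depth):
        if filter_ is not None and not filter_(node):
            return  # Rejected node: its whole subtree is skipped as well.
        lines.append("    " * depth + node)
        # Children of the same parent are ordered with key and reverse.
        for child in sorted(tree.get(node, []), key=key, reverse=reverse):
            walk(child, depth + 1)

    walk(nid, 0)
    if limit is not None:
        lines = lines[:limit]  # Truncate the display to `limit` lines.
    return "\n".join(lines)


tree = {"A": ["C", "B"], "B": ["D"]}
full = render_tree(tree, "A")
without_b = render_tree(tree, "A", filter_=lambda node: node != "B")
first_two = render_tree(tree, "A", limit=2)
```

Here `without_b` drops both B and its child D, which is the behaviour described for filter_ above.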

Return type: