mirror of https://github.com/django/django.git synced 2025-10-23 21:59:11 +00:00

Fixed #27849 -- Added filtering support to aggregates.

Tom
2017-04-22 16:44:51 +01:00
committed by Tim Graham
parent 489421b015
commit b78d100fa6
13 changed files with 290 additions and 55 deletions

View File

@@ -22,7 +22,7 @@ General-purpose aggregation functions
``ArrayAgg``
------------
.. class:: ArrayAgg(expression, distinct=False, **extra)
.. class:: ArrayAgg(expression, distinct=False, filter=None, **extra)
Returns a list of values, including nulls, concatenated into an array.
@@ -36,7 +36,7 @@ General-purpose aggregation functions
``BitAnd``
----------
.. class:: BitAnd(expression, **extra)
.. class:: BitAnd(expression, filter=None, **extra)
Returns an ``int`` of the bitwise ``AND`` of all non-null input values, or
``None`` if all values are null.
@@ -44,7 +44,7 @@ General-purpose aggregation functions
``BitOr``
---------
.. class:: BitOr(expression, **extra)
.. class:: BitOr(expression, filter=None, **extra)
Returns an ``int`` of the bitwise ``OR`` of all non-null input values, or
``None`` if all values are null.
@@ -52,7 +52,7 @@ General-purpose aggregation functions
``BoolAnd``
-----------
.. class:: BoolAnd(expression, **extra)
.. class:: BoolAnd(expression, filter=None, **extra)
Returns ``True`` if all input values are true, ``None`` if all values are
null or if there are no values, otherwise ``False``.
@@ -60,7 +60,7 @@ General-purpose aggregation functions
``BoolOr``
----------
.. class:: BoolOr(expression, **extra)
.. class:: BoolOr(expression, filter=None, **extra)
Returns ``True`` if at least one input value is true, ``None`` if all
values are null or if there are no values, otherwise ``False``.
@@ -68,7 +68,7 @@ General-purpose aggregation functions
``JSONBAgg``
------------
.. class:: JSONBAgg(expressions, **extra)
.. class:: JSONBAgg(expressions, filter=None, **extra)
.. versionadded:: 1.11
@@ -77,7 +77,7 @@ General-purpose aggregation functions
``StringAgg``
-------------
.. class:: StringAgg(expression, delimiter, distinct=False)
.. class:: StringAgg(expression, delimiter, distinct=False, filter=None)
Returns the input values concatenated into a string, separated by
the ``delimiter`` string.
@@ -105,7 +105,7 @@ field or an expression returning a numeric data. Both are required.
``Corr``
--------
.. class:: Corr(y, x)
.. class:: Corr(y, x, filter=None)
Returns the correlation coefficient as a ``float``, or ``None`` if there
aren't any matching rows.
@@ -113,7 +113,7 @@ field or an expression returning a numeric data. Both are required.
``CovarPop``
------------
.. class:: CovarPop(y, x, sample=False)
.. class:: CovarPop(y, x, sample=False, filter=None)
Returns the population covariance as a ``float``, or ``None`` if there
aren't any matching rows.
@@ -129,7 +129,7 @@ field or an expression returning a numeric data. Both are required.
``RegrAvgX``
------------
.. class:: RegrAvgX(y, x)
.. class:: RegrAvgX(y, x, filter=None)
Returns the average of the independent variable (``sum(x)/N``) as a
``float``, or ``None`` if there aren't any matching rows.
@@ -137,7 +137,7 @@ field or an expression returning a numeric data. Both are required.
``RegrAvgY``
------------
.. class:: RegrAvgY(y, x)
.. class:: RegrAvgY(y, x, filter=None)
Returns the average of the dependent variable (``sum(y)/N``) as a
``float``, or ``None`` if there aren't any matching rows.
@@ -145,7 +145,7 @@ field or an expression returning a numeric data. Both are required.
``RegrCount``
-------------
.. class:: RegrCount(y, x)
.. class:: RegrCount(y, x, filter=None)
Returns an ``int`` of the number of input rows in which both expressions
are not null.
@@ -153,7 +153,7 @@ field or an expression returning a numeric data. Both are required.
``RegrIntercept``
-----------------
.. class:: RegrIntercept(y, x)
.. class:: RegrIntercept(y, x, filter=None)
Returns the y-intercept of the least-squares-fit linear equation determined
by the ``(x, y)`` pairs as a ``float``, or ``None`` if there aren't any
@@ -162,7 +162,7 @@ field or an expression returning a numeric data. Both are required.
``RegrR2``
----------
.. class:: RegrR2(y, x)
.. class:: RegrR2(y, x, filter=None)
Returns the square of the correlation coefficient as a ``float``, or
``None`` if there aren't any matching rows.
@@ -170,7 +170,7 @@ field or an expression returning a numeric data. Both are required.
``RegrSlope``
-------------
.. class:: RegrSlope(y, x)
.. class:: RegrSlope(y, x, filter=None)
Returns the slope of the least-squares-fit linear equation determined
by the ``(x, y)`` pairs as a ``float``, or ``None`` if there aren't any
@@ -179,7 +179,7 @@ field or an expression returning a numeric data. Both are required.
``RegrSXX``
-----------
.. class:: RegrSXX(y, x)
.. class:: RegrSXX(y, x, filter=None)
Returns ``sum(x^2) - sum(x)^2/N`` ("sum of squares" of the independent
variable) as a ``float``, or ``None`` if there aren't any matching rows.
@@ -187,7 +187,7 @@ field or an expression returning a numeric data. Both are required.
``RegrSXY``
-----------
.. class:: RegrSXY(y, x)
.. class:: RegrSXY(y, x, filter=None)
Returns ``sum(x*y) - sum(x) * sum(y)/N`` ("sum of products" of independent
times dependent variable) as a ``float``, or ``None`` if there aren't any
@@ -196,7 +196,7 @@ field or an expression returning a numeric data. Both are required.
``RegrSYY``
-----------
.. class:: RegrSYY(y, x)
.. class:: RegrSYY(y, x, filter=None)
Returns ``sum(y^2) - sum(y)^2/N`` ("sum of squares" of the dependent
variable) as a ``float``, or ``None`` if there aren't any matching rows.
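The sum formulas quoted above for ``RegrSXX``, ``RegrSXY`` and ``RegrSYY`` can be checked by hand. The following is a minimal pure-Python sketch, not Django's or PostgreSQL's implementation: ``regr_sums`` and its ``row_filter`` parameter are hypothetical names, and the predicate only approximates what the new ``filter`` argument does on the database side. Pairs with a null on either side are skipped, as described for ``RegrCount``.

```python
def regr_sums(pairs, row_filter=None):
    """Return N, SXX, SXY and SYY for (x, y) pairs, or None if no rows match.

    Hypothetical helper: ``row_filter`` is a Python predicate standing in
    for the aggregate's ``filter`` argument.
    """
    # Only rows where both values are non-null take part in the aggregate.
    rows = [(x, y) for x, y in pairs if x is not None and y is not None]
    if row_filter is not None:
        rows = [(x, y) for x, y in rows if row_filter(x, y)]
    n = len(rows)
    if n == 0:
        return None
    sum_x = sum(x for x, _ in rows)
    sum_y = sum(y for _, y in rows)
    return {
        "count": n,
        # sum(x^2) - sum(x)^2/N
        "sxx": sum(x * x for x, _ in rows) - sum_x ** 2 / n,
        # sum(x*y) - sum(x) * sum(y)/N
        "sxy": sum(x * y for x, y in rows) - sum_x * sum_y / n,
        # sum(y^2) - sum(y)^2/N
        "syy": sum(y * y for _, y in rows) - sum_y ** 2 / n,
    }


# Made-up data lying on y = 2x, plus two rows that contain a null.
data = [(1, 2), (2, 4), (3, 6), (4, 8), (None, 5), (5, None)]
all_rows = regr_sums(data)
small_x = regr_sums(data, row_filter=lambda x, y: x <= 2)
no_rows = regr_sums(data, row_filter=lambda x, y: x > 100)
```

For the unfiltered data, ``sxy / sxx`` gives the slope of the fitted line (2 here), which is the quantity ``RegrSlope`` reports.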

View File

@@ -184,12 +184,14 @@ their registration dates. We can do this using a conditional expression and the
>>> Client.objects.values_list('name', 'account_type')
<QuerySet [('Jane Doe', 'G'), ('James Smith', 'R'), ('Jack Black', 'P')]>
.. _conditional-aggregation:
Conditional aggregation
-----------------------
What if we want to find out how many clients there are for each
``account_type``? We can nest conditional expression within
:ref:`aggregate functions <aggregation-functions>` to achieve this::
``account_type``? We can use the ``filter`` argument of :ref:`aggregate
functions <aggregation-functions>` to achieve this::
>>> # Create some more Clients first so we can have something to count
>>> Client.objects.create(
@@ -207,17 +209,30 @@ What if we want to find out how many clients there are for each
>>> # Get counts for each value of account_type
>>> from django.db.models import Count, IntegerField, Q, Sum
>>> Client.objects.aggregate(
... regular=Sum(
... Case(When(account_type=Client.REGULAR, then=1),
... output_field=IntegerField())
... ),
... gold=Sum(
... Case(When(account_type=Client.GOLD, then=1),
... output_field=IntegerField())
... ),
... platinum=Sum(
... Case(When(account_type=Client.PLATINUM, then=1),
... output_field=IntegerField())
... )
... regular=Count('pk', filter=Q(account_type=Client.REGULAR)),
... gold=Count('pk', filter=Q(account_type=Client.GOLD)),
... platinum=Count('pk', filter=Q(account_type=Client.PLATINUM)),
... )
{'regular': 2, 'gold': 1, 'platinum': 3}
This aggregate produces a query with the SQL 2003 ``FILTER WHERE`` syntax
on databases that support it:
.. code-block:: sql
SELECT count(id) FILTER (WHERE account_type=1) as regular,
count(id) FILTER (WHERE account_type=2) as gold,
count(id) FILTER (WHERE account_type=3) as platinum
FROM clients;
On other databases, this is emulated using a ``CASE`` expression:
.. code-block:: sql
SELECT count(CASE WHEN account_type=1 THEN id ELSE null END) as regular,
count(CASE WHEN account_type=2 THEN id ELSE null END) as gold,
count(CASE WHEN account_type=3 THEN id ELSE null END) as platinum
FROM clients;
The two SQL statements are functionally equivalent, but the more explicit
``FILTER`` may perform better.
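The equivalence of the two forms can be tried outside Django with the standard library's ``sqlite3`` module. This is only a sketch under assumptions: the ``clients`` table, its integer ``account_type`` codes and the six rows are made up to match the counts shown earlier, and the ``FILTER`` query is run only when the bundled SQLite is version 3.30 or later, the release that added ``FILTER`` for aggregate functions.

```python
import sqlite3

# Hypothetical schema and rows: two regular (1), one gold (2), three platinum (3).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id INTEGER PRIMARY KEY, account_type INTEGER)")
conn.executemany(
    "INSERT INTO clients (account_type) VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (3,)],
)

# Emulation with CASE: non-matching rows become NULL, which count() ignores.
case_sql = """
    SELECT count(CASE WHEN account_type = 1 THEN id ELSE NULL END) AS regular,
           count(CASE WHEN account_type = 2 THEN id ELSE NULL END) AS gold,
           count(CASE WHEN account_type = 3 THEN id ELSE NULL END) AS platinum
    FROM clients
"""
case_counts = conn.execute(case_sql).fetchone()

# The same counts written with the FILTER clause.
filter_sql = """
    SELECT count(id) FILTER (WHERE account_type = 1) AS regular,
           count(id) FILTER (WHERE account_type = 2) AS gold,
           count(id) FILTER (WHERE account_type = 3) AS platinum
    FROM clients
"""
filter_counts = None  # stays None when the SQLite library is too old
if sqlite3.sqlite_version_info >= (3, 30, 0):
    filter_counts = conn.execute(filter_sql).fetchone()
```

Where both queries run, they return the same row of counts. Whether one is faster depends on the database and is not measured here.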

View File

@@ -339,7 +339,7 @@ some complex computations::
The ``Aggregate`` API is as follows:
.. class:: Aggregate(expression, output_field=None, **extra)
.. class:: Aggregate(expression, output_field=None, filter=None, **extra)
.. attribute:: template
@@ -370,9 +370,17 @@ should define the desired ``output_field``. For example, adding an
``IntegerField()`` and a ``FloatField()`` together should probably have
``output_field=FloatField()`` defined.
The ``filter`` argument takes a :class:`Q object <django.db.models.Q>` that's
used to filter the rows that are aggregated. See :ref:`conditional-aggregation`
and :ref:`filtering-on-annotations` for example usage.
The ``**extra`` kwargs are ``key=value`` pairs that can be interpolated
into the ``template`` attribute.
.. versionchanged:: 2.0
The ``filter`` argument was added.
Creating your own Aggregate Functions
-------------------------------------

View File

@@ -3085,6 +3085,17 @@ of the return value
``output_field`` if all fields are of the same type. Otherwise, you
must provide the ``output_field`` yourself.
``filter``
~~~~~~~~~~
.. versionadded:: 2.0
An optional :class:`Q object <django.db.models.Q>` that's used to filter the
rows that are aggregated.
See :ref:`conditional-aggregation` and :ref:`filtering-on-annotations` for
example usage.
``**extra``
~~~~~~~~~~~
@@ -3094,7 +3105,7 @@ by the aggregate.
``Avg``
~~~~~~~
.. class:: Avg(expression, output_field=FloatField(), **extra)
.. class:: Avg(expression, output_field=FloatField(), filter=None, **extra)
Returns the mean value of the given expression, which must be numeric
unless you specify a different ``output_field``.
@@ -3106,7 +3117,7 @@ by the aggregate.
``Count``
~~~~~~~~~
.. class:: Count(expression, distinct=False, **extra)
.. class:: Count(expression, distinct=False, filter=None, **extra)
Returns the number of objects that are related through the provided
expression.
@@ -3125,7 +3136,7 @@ by the aggregate.
``Max``
~~~~~~~
.. class:: Max(expression, output_field=None, **extra)
.. class:: Max(expression, output_field=None, filter=None, **extra)
Returns the maximum value of the given expression.
@@ -3135,7 +3146,7 @@ by the aggregate.
``Min``
~~~~~~~
.. class:: Min(expression, output_field=None, **extra)
.. class:: Min(expression, output_field=None, filter=None, **extra)
Returns the minimum value of the given expression.
@@ -3145,7 +3156,7 @@ by the aggregate.
``StdDev``
~~~~~~~~~~
.. class:: StdDev(expression, sample=False, **extra)
.. class:: StdDev(expression, sample=False, filter=None, **extra)
Returns the standard deviation of the data in the provided expression.
@@ -3169,7 +3180,7 @@ by the aggregate.
``Sum``
~~~~~~~
.. class:: Sum(expression, output_field=None, **extra)
.. class:: Sum(expression, output_field=None, filter=None, **extra)
Computes the sum of all values of the given expression.
@@ -3179,7 +3190,7 @@ by the aggregate.
``Variance``
~~~~~~~~~~~~
.. class:: Variance(expression, sample=False, **extra)
.. class:: Variance(expression, sample=False, filter=None, **extra)
Returns the variance of the data in the provided expression.

View File

@@ -273,6 +273,10 @@ Models
parameters, if the backend supports this feature. Of Django's built-in
backends, only Oracle supports it.
* The new ``filter`` argument for built-in aggregates allows :ref:`adding
different conditionals <conditional-aggregation>` to multiple aggregations
over the same fields or relations.
Requests and Responses
~~~~~~~~~~~~~~~~~~~~~~

View File

@@ -84,6 +84,16 @@ In a hurry? Here's how to do common aggregate queries, assuming the models above
>>> pubs[0].num_books
73
# Each publisher, with separate counts of books rated above 5 and at or below 5
>>> from django.db.models import Q
>>> above_5 = Count('book', filter=Q(book__rating__gt=5))
>>> below_5 = Count('book', filter=Q(book__rating__lte=5))
>>> pubs = Publisher.objects.annotate(below_5=below_5).annotate(above_5=above_5)
>>> pubs[0].above_5
23
>>> pubs[0].below_5
12
# The top 5 publishers, in order by number of books.
>>> pubs = Publisher.objects.annotate(num_books=Count('book')).order_by('-num_books')[:5]
>>> pubs[0].num_books
@@ -324,6 +334,8 @@ title that starts with "Django" using the query::
>>> Book.objects.filter(name__startswith="Django").aggregate(Avg('price'))
.. _filtering-on-annotations:
Filtering on annotations
~~~~~~~~~~~~~~~~~~~~~~~~
@@ -339,6 +351,27 @@ you can issue the query::
This query generates an annotated result set, and then generates a filter
based upon that annotation.
If you need two annotations with two separate filters, you can use the
``filter`` argument with any aggregate. For example, to generate a list of
authors with a count of highly rated books::
>>> highly_rated = Count('books', filter=Q(books__rating__gte=7))
>>> Author.objects.annotate(num_books=Count('books'), highly_rated_books=highly_rated)
Each ``Author`` in the result set will have the ``num_books`` and
``highly_rated_books`` attributes.
.. admonition:: Choosing between ``filter`` and ``QuerySet.filter()``
Avoid using the ``filter`` argument with a single annotation or
aggregation. It's more efficient to use ``QuerySet.filter()`` to exclude
rows. The aggregation ``filter`` argument is only useful when using two or
more aggregations over the same relations with different conditionals.
.. versionchanged:: 2.0
The ``filter`` argument was added to aggregates.
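The distinction drawn in the admonition can be illustrated in plain SQL, again using the standard library's ``sqlite3`` module rather than Django. The ``book`` table and its ratings are made up for this sketch, and the conditional aggregate is written with ``CASE`` so that it runs on any SQLite version. A single condition fits in a ``WHERE`` clause, while a total and a conditional count computed over the same rows need a per-aggregate condition.

```python
import sqlite3

# Hypothetical data: five books, three of them rated 7 or higher.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, rating INTEGER)")
conn.executemany(
    "INSERT INTO book (rating) VALUES (?)",
    [(9,), (8,), (5,), (3,), (7,)],
)

# One aggregate, one condition: exclude rows up front with WHERE,
# which is the SQL counterpart of QuerySet.filter().
(where_count,) = conn.execute(
    "SELECT count(id) FROM book WHERE rating >= 7"
).fetchone()

# Two aggregates with different conditions: a single WHERE clause cannot
# serve both, so the condition moves inside the second aggregate.
num_books, highly_rated_books = conn.execute(
    "SELECT count(id), count(CASE WHEN rating >= 7 THEN id END) FROM book"
).fetchone()
```

Both approaches agree on the number of highly rated books. Only the second also keeps the unfiltered total in the same query.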
Order of ``annotate()`` and ``filter()`` clauses
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~