
[5.2.x] Fixed #36526 -- Doc'd QuerySet.bulk_update() memory usage when batching.

Thanks Simon Charette for the review.

Backport of 608d3ebc8889863d43be1090d634b9507fe4a85e from main.
Natalia 2025-08-28 17:19:20 -03:00
parent c05c5b80a6
commit 80b9c8f529


@@ -2510,6 +2510,21 @@ them, but it has a few caveats:
* If updating a large number of columns in a large number of rows, the SQL
  generated can be very large. Avoid this by specifying a suitable
  ``batch_size``.
* When updating a large number of objects, be aware that ``bulk_update()``
  prepares all of the ``WHEN`` clauses for every object across all batches
  before executing any queries. This can require more memory than expected. To
  reduce memory usage, you can use an approach like this::

    from itertools import islice

    batch_size = 100
    ids_iter = iter(range(1000))
    while ids := list(islice(ids_iter, batch_size)):
        batch = Entry.objects.filter(id__in=ids)
        for entry in batch:
            entry.headline = f"Updated headline {entry.pk}"
        Entry.objects.bulk_update(batch, ["headline"], batch_size=batch_size)

* Updating fields defined on multi-table inheritance ancestors will incur an
  extra query per ancestor.
* When an individual batch contains duplicates, only the first instance in that * When an individual batch contains duplicates, only the first instance in that
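
The batching pattern added in this diff keeps only one batch of objects in memory at a time, so the ``WHEN`` clauses are built per batch rather than for every object at once. As a standalone illustration, here is a minimal sketch that simulates the same loop with plain Python objects; ``FakeEntry`` and ``update_batch()`` are hypothetical stand-ins for the ``Entry`` model and ``Entry.objects.bulk_update()`` and are not part of Django::

    from itertools import islice


    class FakeEntry:
        """Stand-in for a Django model instance (hypothetical)."""

        def __init__(self, pk):
            self.pk = pk
            self.headline = ""


    def update_batch(batch, fields):
        # Hypothetical stand-in for Entry.objects.bulk_update(batch, fields).
        # In real code this would issue one UPDATE ... CASE WHEN ... query
        # covering only the objects in this batch.
        print(f"updating {len(batch)} rows, fields={fields}")


    batch_size = 100
    # iter() matters: islice() consumes an iterator, so each pass through the
    # loop advances it. Slicing a plain range() would restart at 0 every time
    # and the while loop would never terminate.
    ids_iter = iter(range(1000))

    while ids := list(islice(ids_iter, batch_size)):
        # Only batch_size objects exist at any moment, mirroring how the
        # documented approach bounds bulk_update()'s memory usage.
        batch = [FakeEntry(pk) for pk in ids]
        for entry in batch:
            entry.headline = f"Updated headline {entry.pk}"
        update_batch(batch, ["headline"])

Running this prints ten "updating 100 rows" lines, one per batch, showing that the walrus-operator loop stops cleanly once the iterator is exhausted.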