Mirror of https://github.com/django/django.git, synced 2025-10-24 14:16:09 +00:00
Fixed #24938 -- Added PostgreSQL trigram support.
Committed by Tim Graham
Parent: d7334b405f
Commit: 1962a96a30
@@ -2,6 +2,32 @@
 PostgreSQL specific lookups
 ===========================
 
+Trigram similarity
+==================
+
+.. fieldlookup:: trigram_similar
+
+.. versionadded:: 1.10
+
+The ``trigram_similar`` lookup allows you to perform trigram lookups,
+measuring the number of trigrams (three consecutive characters) shared, using a
+dedicated PostgreSQL extension. A trigram lookup is given an expression and
+returns results that have a similarity measurement greater than the current
+similarity threshold.
+
+To use it, add ``'django.contrib.postgres'`` to your :setting:`INSTALLED_APPS`
+and activate the `pg_trgm extension
+<http://www.postgresql.org/docs/current/interactive/pgtrgm.html>`_ on
+PostgreSQL. You can install the extension using the
+:class:`~django.contrib.postgres.operations.TrigramExtension` migration
+operation.
+
+The ``trigram_similar`` lookup can be used on
+:class:`~django.db.models.CharField` and :class:`~django.db.models.TextField`::
+
+    >>> City.objects.filter(name__trigram_similar="Middlesborough")
+    [<City: Middlesbrough>]
+
 ``Unaccent``
 ============
 
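Note on the hunk above: the score that ``trigram_similar`` compares against the threshold is computed by the ``pg_trgm`` extension, not by Django. The sketch below is a rough pure-Python approximation of that measure. It assumes pg_trgm's documented behaviour of lowercasing the input, splitting it into alphanumeric words, and padding each word with two leading spaces and one trailing space. The helper names ``trigrams`` and ``similarity`` are hypothetical and exist only for this illustration; PostgreSQL's own implementation may differ in details such as locale handling.

```python
import re


def trigrams(text):
    """Return the set of trigrams for ``text``, roughly following pg_trgm."""
    result = set()
    # Lowercase, then keep only runs of letters and digits as words.
    for word in re.findall(r"[^\W_]+", text.lower()):
        # Assumption: each word gets two leading spaces and one trailing space.
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            result.add(padded[i:i + 3])
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams in either string."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    if not union:
        return 0.0
    return len(ta & tb) / len(union)


score = similarity("Middlesborough", "Middlesbrough")
print(round(score, 3), score > 0.3)
```

Under these assumptions the misspelling shares 12 of 17 distinct trigrams with the stored name, which puts it well above 0.3, the default value of pg_trgm's similarity threshold.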
@@ -27,6 +27,16 @@ the ``django.contrib.postgres.operations`` module.
 which will install the ``hstore`` extension and also immediately set up the
 connection to interpret hstore data.
 
+``TrigramExtension``
+====================
+
+.. class:: TrigramExtension()
+
+.. versionadded:: 1.10
+
+A subclass of :class:`~django.contrib.postgres.operations.CreateExtension`
+that installs the ``pg_trgm`` extension.
+
 ``UnaccentExtension``
 =====================
 
@@ -189,3 +189,58 @@ if it were an annotated ``SearchVector``::
     [<Entry: Cheese on Toast recipes>, <Entry: Pizza recipes>]
 
 .. _PostgreSQL documentation: http://www.postgresql.org/docs/current/static/textsearch-features.html#TEXTSEARCH-UPDATE-TRIGGERS
+
+Trigram similarity
+==================
+
+Another approach to searching is trigram similarity. A trigram is a group of
+three consecutive characters. In addition to the :lookup:`trigram_similar`
+lookup, you can use a couple of other expressions.
+
+To use them, you need to activate the `pg_trgm extension
+<http://www.postgresql.org/docs/current/interactive/pgtrgm.html>`_ on
+PostgreSQL. You can install it using the
+:class:`~django.contrib.postgres.operations.TrigramExtension` migration
+operation.
+
+``TrigramSimilarity``
+---------------------
+
+.. class:: TrigramSimilarity(expression, string, **extra)
+
+.. versionadded:: 1.10
+
+Accepts a field name or expression, and a string or expression. Returns the
+trigram similarity between the two arguments.
+
+Usage example::
+
+    >>> from django.contrib.postgres.search import TrigramSimilarity
+    >>> Author.objects.create(name='Katy Stevens')
+    >>> Author.objects.create(name='Stephen Keats')
+    >>> test = 'Katie Stephens'
+    >>> Author.objects.annotate(
+    ...     similarity=TrigramSimilarity('name', test),
+    ... ).filter(similarity__gt=0.3).order_by('-similarity')
+    [<Author: Katy Stevens>, <Author: Stephen Keats>]
+
+``TrigramDistance``
+-------------------
+
+.. class:: TrigramDistance(expression, string, **extra)
+
+.. versionadded:: 1.10
+
+Accepts a field name or expression, and a string or expression. Returns the
+trigram distance between the two arguments.
+
+Usage example::
+
+    >>> from django.contrib.postgres.search import TrigramDistance
+    >>> Author.objects.create(name='Katy Stevens')
+    >>> Author.objects.create(name='Stephen Keats')
+    >>> test = 'Katie Stephens'
+    >>> Author.objects.annotate(
+    ...     distance=TrigramDistance('name', test),
+    ... ).filter(distance__lte=0.7).order_by('distance')
+    [<Author: Katy Stevens>, <Author: Stephen Keats>]
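Note on the two expressions documented above: in pg_trgm the distance is one minus the similarity, so the two usage examples rank the same rows in the same order. The following is a minimal in-memory sketch of that relationship, not Django's or PostgreSQL's implementation. The functions ``trigrams``, ``trigram_similarity`` and ``trigram_distance`` are hypothetical helpers written for this note, and the word-padding rule (two leading spaces, one trailing space) is an assumption based on pg_trgm's documented behaviour.

```python
import re


def trigrams(text):
    """Approximate pg_trgm: lowercase, split into words, pad, cut into 3-grams."""
    result = set()
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = "  " + word + " "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def trigram_similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def trigram_distance(a, b):
    """Complement of the similarity: 0.0 means identical trigram sets."""
    return 1.0 - trigram_similarity(a, b)


names = ["Stephen Keats", "Katy Stevens"]
test = "Katie Stephens"

# Mirrors .filter(distance__lte=0.7).order_by('distance') on a plain list.
ranked = sorted(
    (name for name in names if trigram_distance(name, test) <= 0.7),
    key=lambda name: trigram_distance(name, test),
)
print(ranked)
```

With this approximation, "Katy Stevens" scores a similarity of 0.40 and "Stephen Keats" roughly 0.38 against the test string, so both pass the filters used in the examples and "Katy Stevens" sorts first.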
@@ -33,6 +33,10 @@ search engine. You can search across multiple fields in your relational
 database, combine the searches with other lookups, use different language
 configurations and weightings, and rank the results by relevance.
 
+It also now includes trigram support, using the :lookup:`trigram_similar`
+lookup, and the :class:`~django.contrib.postgres.search.TrigramSimilarity` and
+:class:`~django.contrib.postgres.search.TrigramDistance` expressions.
+
 Minor features
 --------------
 
@@ -55,11 +55,12 @@ use :lookup:`unaccented comparison <unaccent>`::
 This shows another issue, where we are matching against a different spelling of
 the name. In this case we have an asymmetry though - a search for ``Helen``
 will pick up ``Helena`` or ``Hélène``, but not the reverse. Another option
-would be to use a trigram comparison, which compares sequences of letters.
+would be to use a :lookup:`trigram_similar` comparison, which compares
+sequences of letters.
 
 For example::
 
-    >>> Author.objects.filter(name__unaccent__lower__trigram='Hélène')
+    >>> Author.objects.filter(name__unaccent__lower__trigram_similar='Hélène')
     [<Author: Helen Mirren>, <Author: Hélène Joy>]
 
 Now we have a different problem - the longer name of "Helena Bonham Carter"
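Note on the chained lookup in the hunk above: the result can be reasoned about as "strip accents, lowercase, then compare trigrams". The sketch below imitates that chain in plain Python under several assumptions: ``unaccent`` here is a hypothetical stand-in built on Unicode decomposition rather than PostgreSQL's rule-based ``unaccent`` function, both sides of the comparison are normalised, and the trigram helpers only approximate pg_trgm (two leading spaces and one trailing space per word, threshold 0.3).

```python
import re
import unicodedata


def unaccent(text):
    """Strip combining accents; a stand-in for PostgreSQL's unaccent()."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def trigrams(text):
    """Approximate pg_trgm trigram extraction for a lowercased string."""
    result = set()
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = "  " + word + " "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams in either string."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


names = ["Helen Mirren", "Hélène Joy", "Helena Bonham Carter"]
query = unaccent("Hélène").lower()

# Keep the names whose normalised form clears the assumed 0.3 threshold.
matches = [
    name for name in names
    if similarity(unaccent(name).lower(), query) > 0.3
]
print(matches)
```

In this approximation the two shorter names clear the threshold while "Helena Bonham Carter" does not, because its extra words add many trigrams that the query does not share. That is consistent with the problem the last context line of the hunk starts to describe.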