django/docs/ref/forms/validation.txt

.. _ref-forms-validation:

Form and field validation
=========================

.. versionchanged:: 1.2

Form validation happens when the data is cleaned. If you want to customize
this process, there are various places you can change, each one serving a
different purpose. Three types of cleaning methods are run during form
processing. These are normally executed when you call the ``is_valid()``
method on a form. There are other things that can trigger cleaning and
validation (accessing the ``errors`` attribute or calling ``full_clean()``
directly), but normally they won't be needed.

In general, any cleaning method can raise ``ValidationError`` if there is a
problem with the data it is processing, passing the relevant error message to
the ``ValidationError`` constructor. If no ``ValidationError`` is raised, the
method should return the cleaned (normalized) data as a Python object.

If you detect multiple errors during a cleaning method and wish to signal all
of them to the form submitter, it is possible to pass a list of errors to the
``ValidationError`` constructor.

Most validation can be done using `validators`_ - simple helpers that can be
reused easily. There are two types of validators:

    * Simple validators are simple functions (or callables) that take a single
      argument and raises ``ValidationError`` on invalid input.  Simple
      validators are run inside the ``run_validators`` method that is called
      from ``Field.clean`` once the value is validated by the field's methods.

    * Complex validators are instances of ``ComplexValidator`` class and,
      unlike simple validators, can access not just one value, but all at once.
      These are perfectly suited for cross field validation (one of N fields
      must be supplied, this field must equal that etc.)


.. warning::

    Since complex validators must have access to all cleaned values, they must
    be run after individual fields have been cleaned. This means that these are run
    in ``Form.full_clean`` and not inside ``Field.clean`` with simple validators.


Validation of a Form is split into several steps, which can be customized or
overridden:

    * The ``to_python()`` method on a Field is the first step in every
      validation. It coerces the value to correct datatype and raises
      ``ValidationError`` if that is not possible. This method accepts the raw
      value from the widget and returns the converted value. For example, a
      FloatField will turn the data into a Python ``float`` or raise a
      ``ValidationError``.

    * Next step resides in ``validate()`` method. This is a method where all
      field-specific validation, that cannot be abstracted into a validator,
      should take place. It takes the value coerced to correct datatype and
      raises ``ValidationError`` on any error. This method does not return
      anything and shouldn't alter the value.

    * Simple validators are run in the ``run_validators`` method. This method
      aggregates all the errors from all validators run into a single
      ``ValidationError``.

    * The ``clean()`` method on a Field subclass. This is responsible for
      running ``to_python``, ``validate`` and ``run_validators`` in the correct
      order and propagate their errors. If, at any time, any of the methods
      raise ValidationError, the validation stops and that error is raised.
      This method returns the clean data, which is then inserted into the
      ``cleaned_data`` dictionary of the form.

    * The ``clean_<fieldname>()`` method in a form subclass -- where
      ``<fieldname>`` is replaced with the name of the form field attribute.
      This method does any cleaning that is specific to that particular
      attribute, unrelated to the type of field that it is. This method is not
      passed any parameters. You will need to look up the value of the field
      in ``self.cleaned_data`` and remember that it will be a Python object
      at this point, not the original string submitted in the form (it will be
      in ``cleaned_data`` because the general field ``clean()`` method, above,
      has already cleaned the data once).

      For example, if you wanted to validate that the contents of a
      ``CharField`` called ``serialnumber`` was unique,
      ``clean_serialnumber()`` would be the right place to do this. You don't
      need a specific field (it's just a ``CharField``), but you want a
      formfield-specific piece of validation and, possibly,
      cleaning/normalizing the data.

      Just like the general field ``clean()`` method, above, this method
      should return the cleaned data, regardless of whether it changed
      anything or not.

    * The Field's complex validators are run after all the fields have been
      cleaned and only on those fields that passed validation. Errors from
      individual validators are aggregated and all the errors are raised for a
      given field.

    * The Form subclass's ``clean()`` method. This method can perform
      any validation that requires access to multiple fields from the form at
      once. This is where you might put in things to check that if field ``A``
      is supplied, field ``B`` must contain a valid e-mail address and the
      like. The data that this method returns is the final ``cleaned_data``
      attribute for the form, so don't forget to return the full list of
      cleaned data if you override this method (by default, ``Form.clean()``
      just returns ``self.cleaned_data``).

      Note that any errors raised by your ``Form.clean()`` override will not
      be associated with any field in particular. They go into a special
      "field" (called ``__all__``), which you can access via the
      ``non_field_errors()`` method if you need to. If you want to attach
      errors to a specific field in the form, you will need to access the
      ``_errors`` attribute on the form, which is `described later`_.

These methods are run in the order given above, one field at a time.  That is,
for each field in the form (in the order they are declared in the form
definition), the ``Field.clean()`` method (or its override) is run, then
``clean_<fieldname>()``. Once those two methods are run for every
field, the complex validators are run for every field and, finally,
``Form.clean()`` method, or its override, is executed.

Examples of each of these methods are provided below.

As mentioned, any of these methods can raise a ``ValidationError``. For any
field, if the ``Field.clean()`` method raises a ``ValidationError``, any
field-specific cleaning method is not called. However, the cleaning methods
for all remaining fields are still executed.

The ``clean()`` method for the ``Form`` class or subclass is always run. If
that method raises a ``ValidationError``, ``cleaned_data`` will be an empty
dictionary.

The previous paragraph means that if you are overriding ``Form.clean()``, you
should iterate through ``self.cleaned_data.items()``, possibly considering the
``_errors`` dictionary attribute on the form as well. In this way, you will
already know which fields have passed their individual validation requirements.

.. _described later:

Form subclasses and modifying field errors
------------------------------------------

Sometimes, in a form's ``clean()`` method, you will want to add an error
message to a particular field in the form. This won't always be appropriate
and the more typical situation is to raise a ``ValidationError`` from
``Form.clean()``, which is turned into a form-wide error that is available
through the ``Form.non_field_errors()`` method.

When you really do need to attach the error to a particular field, you should
store (or amend) a key in the ``Form._errors`` attribute. This attribute is an
instance of a ``django.forms.util.ErrorDict`` class. Essentially, though, it's
just a dictionary. There is a key in the dictionary for each field in the form
that has an error. Each value in the dictionary is a
``django.forms.util.ErrorList`` instance, which is a list that knows how to
display itself in different ways. So you can treat ``_errors`` as a dictionary
mapping field names to lists.

If you want to add a new error to a particular field, you should check whether
the key already exists in ``self._errors`` or not. If not, create a new entry
for the given key, holding an empty ``ErrorList`` instance. In either case,
you can then append your error message to the list for the field name in
question and it will be displayed when the form is displayed.

There is an example of modifying ``self._errors`` in the following section.

.. admonition:: What's in a name?

    You may be wondering why is this attribute called ``_errors`` and not
    ``errors``. Normal Python practice is to prefix a name with an underscore
    if it's not for external usage. In this case, you are subclassing the
    ``Form`` class, so you are essentially writing new internals. In effect,
    you are given permission to access some of the internals of ``Form``.

    Of course, any code outside your form should never access ``_errors``
    directly. The data is available to external code through the ``errors``
    property, which populates ``_errors`` before returning it).

    Another reason is purely historical: the attribute has been called
    ``_errors`` since the early days of the forms module and changing it now
    (particularly since ``errors`` is used for the read-only property name)
    would be inconvenient for a number of reasons. You can use whichever
    explanation makes you feel more comfortable. The result is the same.

Using validation in practice
----------------------------

The previous sections explained how validation works in general for forms.
Since it can sometimes be easier to put things into place by seeing each
feature in use, here are a series of small examples that use each of the
previous features.

.. _validators:

Using validators
~~~~~~~~~~~~~~~~
.. versionadded:: 1.2

Django's form (and model) fields support use of simple utility functions and
classes known as validators. These can be added to fields on their declaration
as argument to ``__init__`` or defined on the Field class itself.

Simple validators can be used to validate values inside the field, let's have a
look at Django's ``EmailField``::

    class EmailField(CharField):
        default_error_messages = {
            'invalid': _(u'Enter a valid e-mail address.'),
        }
        default_validators = [validators.validate_email]

As you can see, ``EmailField`` is just a ``CharField`` with customized error
message and a validator that validates e-mail addresses. This can also be done
on field definition so::

    email = forms.EmailField()

is equivalent to::

    email = forms.CharField(validators=[validators.validate_email],
            error_messages={'invalid': _(u'Enter a valid e-mail address.')})


Form field default cleaning
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Let's firstly create a custom form field that validates its input is a string
containing comma-separated e-mail addresses. The full class looks like this::

    from django import forms
    from django.core.validators import validate_email

    class MultiEmailField(forms.Field):
        def to_python(self, value):
            "Normalize data to a list of strings."

            # return empty list on empty input
            if not value: return []

            return value.split(',')

        def validate(self, value):
            "Check if value consists only of valid emails."

            # check if value is given if the field is required
            super(MultiEmailField, self).validate(value)

            for email in value:
                validate_email(email)

Every form that uses this field will have these methods run before anything
else can be done with the field's data. This is cleaning that is specific to
this type of field, regardless of how it is subsequently used.

Let's create a simple ``ContactForm`` to demonstrate how you'd use this
field::

    class ContactForm(forms.Form):
        subject = forms.CharField(max_length=100)
        message = forms.CharField()
        sender = forms.EmailField()
        recipients = MultiEmailField()
        cc_myself = forms.BooleanField(required=False)

Simply use ``MultiEmailField`` like any other form field. When the
``is_valid()`` method is called on the form, the ``MultiEmailField.clean()``
method will be run as part of the cleaning process and it will, in turn, call
the custom ``to_python()`` and ``validate()`` methods.

Cleaning a specific field attribute
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Continuing on from the previous example, suppose that in our ``ContactForm``,
we want to make sure that the ``recipients`` field always contains the address
``"fred@example.com"``. This is validation that is specific to our form, so we
don't want to put it into the general ``MultiEmailField`` class. Instead, we
write a cleaning method that operates on the ``recipients`` field, like so::

    class ContactForm(forms.Form):
        # Everything as before.
        ...

        def clean_recipients(self):
            data = self.cleaned_data['recipients']
            if "fred@example.com" not in data:
                raise forms.ValidationError("You have forgotten about Fred!")

            # Always return the cleaned data, whether you have changed it or
            # not.
            return data

Cleaning and validating fields that depend on each other
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Suppose we add another requirement to our contact form: if the ``cc_myself``
field is ``True``, the ``subject`` must contain the word ``"help"``. We are
performing validation on more than one field at a time, so a
``ComplexValidator`` is a good start. The complex validators are run in the
form's ``clean()`` after the individual fields have been validated. This is the
main difference against simple validators. It is important to realize that,
even if defined in very similar way, simple and complex validators are run in
different places in the code. Simple validators on a field (single data point),
complex validator on a form (collection of fields).

By the time the field's complex validators are called, all the individual field
clean methods will have been run (the previous two sections), so the
validator's ``all_values`` argument will be populated with any data that has
survived so far. So you also need to remember to allow for the fact that the
fields you are wanting to validate might not have survived the initial
individual field checks.

Complex validator is run on the form, but reports it's error with the field. As
with simple validators, all complex validators for a given field are run if the
field has passed cleaning and their errors are aggregated and reported.

To create a complex validator, simply subclass ``ComplexValidator`` and supply
your validation logic in it's ``__call__()`` method, for example::

    class ValidateHelpInSubjectIfCCingMyself(ComplexValidator):
        def __call__(self, all_values={}, obj=None):
            cc_myself = self.get_value('cc_myself', all_values, obj)

            if cc_myself and "help" not in value:
                raise forms.ValidationError("Did not send for 'help' in "
                        "the subject despite CC'ing yourself.")


    class ContactForm(forms.Form):
        subject = forms.CharField(max_length=100,
            validators=[ValidateHelpInSubjectIfCCingMyself()])
        ...
        # Everything as before.


The second approach might involve assigning the error message to both fields
involved. To do this we will move the validation code to the form's ``clean()``
method, which is a convenient place for form-wide validation and has access to
the form instance and it's ``_errors`` property. Our new code (replacing the
previous sample) looks like this::

    from django.forms.util import ErrorList

    class ContactForm(forms.Form):
        # Everything as before.
        ...

        def clean(self):
            cleaned_data = self.cleaned_data
            cc_myself = cleaned_data.get("cc_myself")
            subject = cleaned_data.get("subject")

            if cc_myself and subject and "help" not in subject:
                # We know these are not in self._errors now (see discussion
                # below).
                msg = u"Must put 'help' in subject when cc'ing yourself."
                self._errors["cc_myself"] = ErrorList([msg])
                self._errors["subject"] = ErrorList([msg])

                # These fields are no longer valid. Remove them from the
                # cleaned data.
                del cleaned_data["cc_myself"]
                del cleaned_data["subject"]

            # Always return the full collection of cleaned data.
            return cleaned_data

As you can see, this approach requires a bit more effort, not withstanding the
extra design effort to create a sensible form display. The details are worth
noting, however. Firstly, earlier we mentioned that you might need to check if
the field name keys already exist in the ``_errors`` dictionary. In this case,
since we know the fields exist in ``self.cleaned_data``, they must have been
valid when cleaned as individual fields, so there will be no corresponding
entries in ``_errors``.

Secondly, once we have decided that the combined data in the two fields we are
considering aren't valid, we must remember to remove them from the
``cleaned_data``.

In fact, Django will currently completely wipe out the ``cleaned_data``
dictionary if there are any errors in the form. However, this behaviour may
change in the future, so it's not a bad idea to clean up after yourself in the
first place.