~11 min read • Updated Mar 15, 2026
Introduction
Django’s serialization framework provides a powerful mechanism for converting Django model instances into various data formats. These formats are typically text-based—such as JSON or XML—and are often used for data transfer, backups, or integration with external systems.
Serialization is the process of converting Django objects into a structured format. Deserialization reverses this process, turning serialized data back into Django model instances.
Serializing Data
The simplest way to serialize Django objects is by using the serialize() function:
from django.core import serializers
data = serializers.serialize("json", SomeModel.objects.all())
The first argument is the format (e.g., json, xml, yaml), and the second is a QuerySet or any iterable of model instances.
Using Serializer Objects Directly
You can also obtain a serializer class and use it manually:
JSONSerializer = serializers.get_serializer("json")
json_serializer = JSONSerializer()
json_serializer.serialize(queryset)
data = json_serializer.getvalue()
This approach is useful when writing directly to a file or a stream:
with open("file.json", "w") as out:
    json_serializer.serialize(SomeModel.objects.all(), stream=out)
If you request an unknown format, Django raises SerializerDoesNotExist.
Serializing a Subset of Fields
You can limit the fields included in the output:
data = serializers.serialize(
    "json",
    SomeModel.objects.all(),
    fields=["name", "size"],
)
The primary key is always included as pk, even if not listed in fields.
Note: Deserialization may fail if required fields are missing from the serialized data.
Serializing Inherited Models
Abstract Base Classes
If your model inherits from an abstract base class, serialization works normally—no extra steps required.
Multi‑Table Inheritance
For multi-table inheritance, Django only serializes fields defined on the child model. Example:
from django.db import models

class Place(models.Model):
    name = models.CharField(max_length=50)

class Restaurant(Place):
    serves_hot_dogs = models.BooleanField(default=False)
Serializing Restaurant alone will not include the name field:
serializers.serialize("json", Restaurant.objects.all())
To fully serialize the object, include both models:
all_objects = [*Restaurant.objects.all(), *Place.objects.all()]
data = serializers.serialize("json", all_objects)
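To see why both models are needed, here is a minimal standard-library sketch (no Django required) that merges the two kinds of records by primary key. It relies on the fact that, with multi-table inheritance, a child row shares its primary key with its parent row. The app label `myapp`, the helper name `merge_parent_and_child`, and the sample data are assumptions made up for illustration.

```python
import json

# Hypothetical output of serializing both Place and Restaurant objects.
serialized = json.dumps([
    {"model": "myapp.restaurant", "pk": 1, "fields": {"serves_hot_dogs": True}},
    {"model": "myapp.place", "pk": 1, "fields": {"name": "Joe's Diner"}},
])


def merge_parent_and_child(data, parent_label, child_label):
    """Combine parent and child field dicts that share the same pk."""
    parents, children = {}, {}
    for record in json.loads(data):
        if record["model"] == parent_label:
            parents[record["pk"]] = record["fields"]
        elif record["model"] == child_label:
            children[record["pk"]] = record["fields"]
    # Only pks present in the child table describe a full child object.
    return {pk: {**parents.get(pk, {}), **fields} for pk, fields in children.items()}


merged = merge_parent_and_child(serialized, "myapp.place", "myapp.restaurant")
```

Without the `myapp.place` record, the merged entry would contain only `serves_hot_dogs`, which mirrors what happens when Restaurant is serialized alone.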
Deserializing Data
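Deserialization takes the same format name plus a string or stream of serialized data:
for obj in serializers.deserialize("json", data):
    do_something_with(obj)
However, deserialize() yields DeserializedObject instances rather than plain model instances. Each one wraps the unsaved model instance, which is available as obj.object, so you can inspect it before deciding whether to save it.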
Saving Deserialized Objects
Call .save() on each DeserializedObject to persist it:
for deserialized in serializers.deserialize("json", data):
    deserialized.save()
If the pk in the serialized data is missing or null, saving creates a new database row.
Handling Unknown Fields
If the serialized data contains fields that do not exist on the model, Django raises DeserializationError unless you pass ignorenonexistent=True, in which case those fields are silently ignored:
serializers.deserialize("json", data, ignorenonexistent=True)
Serialization Formats
Django supports several formats:
| Format | Description |
|---|---|
| json | Standard JSON serialization |
| jsonl | JSON Lines format |
| xml | Simple XML dialect |
| yaml | Requires PyYAML |
XML Serialization Example
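A typical XML output looks like:
<?xml version="1.0" encoding="utf-8"?>
<django-objects version="1.0">
    <object pk="4b678b301dfd8a4e0dad910de3ae245b" model="sessions.session">
        <field type="DateTimeField" name="expire_date">2013-01-16T08:16:59.844560+00:00</field>
        <!-- ... -->
    </object>
</django-objects>
Relational Fields
Foreign keys are written as a field element whose text is the related object's primary key:
<object pk="27" model="auth.permission">
    <!-- ... -->
    <field to="contenttypes.contenttype" name="content_type" rel="ManyToOneRel">9</field>
    <!-- ... -->
</object>
Many‑to‑many fields are written as a field element containing one object element per related primary key:
<object pk="1" model="auth.user">
    <!-- ... -->
    <field to="auth.permission" name="user_permissions" rel="ManyToManyRel">
        <object pk="46"></object>
        <object pk="47"></object>
    </field>
</object>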
Control Characters
XML 1.0 does not allow certain control characters.
If they appear in your data, serialization raises ValueError.
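If you want to detect such characters before serializing, a minimal sketch could look like the following. It covers only the C0 control characters (everything below U+0020 except tab, newline, and carriage return); XML 1.0 also excludes a few other code points that this sketch ignores. The function names are hypothetical and not part of Django.

```python
import re

# Control characters rejected by XML 1.0 (tab, newline and carriage return are allowed).
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")


def find_invalid_xml_chars(text):
    """Return the control characters in text that XML 1.0 cannot represent."""
    return _INVALID_XML_CHARS.findall(text)


def strip_invalid_xml_chars(text):
    """Remove those characters; whether dropping them is acceptable is up to you."""
    return _INVALID_XML_CHARS.sub("", text)
```

Stripping is only one option. Depending on the data, rejecting the record or choosing the JSON format instead may be the safer choice.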
Conclusion
Django’s serialization framework is flexible, powerful, and essential for data exchange, backups, migrations, and integrations. By understanding how serialization works—especially with inherited models, relational fields, and deserialization—you can confidently manage structured data across your Django applications.
JSON Serialization in Django
JSON is one of the most widely used formats for data exchange. Django provides a clean and structured JSON output. Using the same example as before, JSON serialization looks like this:
[
    {
        "pk": "4b678b301dfd8a4e0dad910de3ae245b",
        "model": "sessions.session",
        "fields": {
            "expire_date": "2013-01-16T08:16:59.844Z"
        }
    }
]
In JSON:
- pk is the primary key
- model is the app label and model name
- fields contains all field names and values
Relational Fields
- ForeignKey → represented by the related object’s PK
- ManyToMany → represented as a list of PKs
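Because the output is plain JSON, it can be inspected with the standard library alone. The sketch below reads a single record and pulls out the relational values. The `tags` field and the specific PK values are assumptions invented for the example.

```python
import json

# Hypothetical serialized record: "author" is a ForeignKey, "tags" a ManyToManyField.
payload = """
[
  {"model": "store.book", "pk": 1,
   "fields": {"name": "Mostly Harmless", "author": 42, "tags": [3, 5]}}
]
"""

records = json.loads(payload)
book = records[0]

app_label, model_name = book["model"].split(".")
author_pk = book["fields"]["author"]  # ForeignKey: a single PK
tag_pks = book["fields"]["tags"]      # ManyToManyField: a list of PKs
```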
Handling Custom Data Types
If your model contains custom Python types, Django cannot serialize them automatically. You must define a custom JSON encoder:
from django.core.serializers.json import DjangoJSONEncoder
class LazyEncoder(DjangoJSONEncoder):
    def default(self, obj):
        if isinstance(obj, YourCustomType):
            return str(obj)
        return super().default(obj)
Then pass it to serialize():
from django.core.serializers import serialize
serialize("json", SomeModel.objects.all(), cls=LazyEncoder)
GeoDjango also provides a specialized GeoJSON serializer.
DjangoJSONEncoder
Django uses DjangoJSONEncoder for JSON serialization.
It extends Python’s JSONEncoder and supports additional types:
- datetime → ECMA‑262 timestamp
- date → YYYY‑MM‑DD
- time → HH:MM:ss.sss
- timedelta → ISO‑8601 duration
- Decimal, UUID, Promise → converted to strings
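To give a feel for how such an encoder works, here is a rough standard-library approximation of a few of these conversions. It is not Django's implementation: in particular it uses plain isoformat(), so datetime values would not be trimmed to the ECMA‑262 form that DjangoJSONEncoder produces. The class name `SketchEncoder` is hypothetical.

```python
import datetime
import decimal
import json
import uuid


class SketchEncoder(json.JSONEncoder):
    """Illustrative only: mimics a few conversions, not Django's actual code."""

    def default(self, obj):
        if isinstance(obj, (datetime.date, datetime.time)):
            # Covers datetime too, since datetime subclasses date.
            return obj.isoformat()
        if isinstance(obj, (decimal.Decimal, uuid.UUID)):
            return str(obj)
        # Anything else falls through to the standard TypeError.
        return super().default(obj)


encoded = json.dumps(
    {
        "day": datetime.date(2013, 1, 16),
        "price": decimal.Decimal("9.99"),
        "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    },
    cls=SketchEncoder,
)
```

The pattern is the same one LazyEncoder uses: handle the types you know in default() and defer to the parent class for everything else.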
JSONL Serialization
JSONL (JSON Lines) stores each object on a separate line:
{"pk": "...", "model": "sessions.session", "fields": {...}}
{"pk": "...", "model": "sessions.session", "fields": {...}}
{"pk": "...", "model": "sessions.session", "fields": {...}}
JSONL is ideal for large datasets because it can be processed line‑by‑line without loading the entire file into memory.
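The line-by-line property can be sketched with a small generator that decodes one record at a time. This uses only the standard library and is independent of Django's own jsonl deserializer. The helper name `iter_jsonl` and the sample PK values are assumptions.

```python
import io
import json


def iter_jsonl(stream):
    """Yield one decoded record per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# A small in-memory stand-in for a large file opened with open(...).
sample = io.StringIO(
    '{"pk": "a1", "model": "sessions.session", "fields": {}}\n'
    '{"pk": "b2", "model": "sessions.session", "fields": {}}\n'
)
pks = [record["pk"] for record in iter_jsonl(sample)]
```

Since the generator holds only the current line, memory use stays roughly constant regardless of file size.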
YAML Serialization
YAML is similar to JSON but more human‑readable. Django supports YAML serialization if PyYAML is installed.
- model: sessions.session
  pk: 4b678b301dfd8a4e0dad910de3ae245b
  fields:
    expire_date: 2013-01-16 08:16:59.844560+00:00
Relational fields are represented by PKs or lists of PKs.
Creating Custom Serialization Formats
Django allows you to define your own serialization formats. Here’s an example of implementing a custom CSV serializer and deserializer.
Custom CSV Serializer
# path/to/custom_csv_serializer.py
import csv

from django.apps import apps
from django.core import serializers
from django.core.serializers.base import DeserializationError


class Serializer(serializers.python.Serializer):
    def get_dump_object(self, obj):
        # Flatten the Python serializer's dict into "model,pk,field values...".
        dumped = super().get_dump_object(obj)
        row = [dumped["model"], str(dumped["pk"])]
        row += [str(value) for value in dumped["fields"].values()]
        return ",".join(row), dumped["model"]

    def end_object(self, obj):
        dumped_str, model = self.get_dump_object(obj)
        if self.first:
            # Write the header row once, before the first object.
            fields = [f.name for f in apps.get_model(model)._meta.fields]
            header = ",".join(fields)
            self.stream.write(f"model,{header}\n")
        self.stream.write(f"{dumped_str}\n")

    def getvalue(self):
        # Skip the Python serializer's getvalue(), which returns a list of
        # objects, and return the text written to the stream instead.
        return super(serializers.python.Serializer, self).getvalue()
Custom CSV Deserializer
class Deserializer(serializers.python.Deserializer):
    def __init__(self, stream_or_string, **options):
        # Accept bytes, str, or an iterable of lines.
        if isinstance(stream_or_string, bytes):
            stream_or_string = stream_or_string.decode()
        if isinstance(stream_or_string, str):
            stream_or_string = stream_or_string.splitlines()
        try:
            objects = csv.DictReader(stream_or_string)
        except Exception as exc:
            raise DeserializationError() from exc
        super().__init__(objects, **options)

    def _handle_object(self, obj):
        try:
            # Rebuild the "fields" dict expected by the Python deserializer.
            model_fields = apps.get_model(obj["model"])._meta.fields
            obj["fields"] = {
                field.name: obj[field.name]
                for field in model_fields
                if field.name in obj
            }
            yield from super()._handle_object(obj)
        except (GeneratorExit, DeserializationError):
            # Let these propagate unchanged instead of wrapping them again.
            raise
        except Exception as exc:
            raise DeserializationError(f"Error deserializing object: {exc}") from exc
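One caveat about the serializer above: it builds each row with ",".join(...), so a field value containing a comma, quote, or newline would shift the columns when read back. A possible remedy, sketched here with the standard library only, is to let csv.writer format each row. The helper name `to_csv_row` and the sample row are assumptions, not part of Django.

```python
import csv
import io


def to_csv_row(values):
    """Format one row with csv.writer so commas and quotes are escaped."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(values)
    return buffer.getvalue()


header = to_csv_row(["model", "id", "name"])
row = to_csv_row(["store.book", 1, "So Long, and Thanks for All the Fish"])

# csv.DictReader (used by the deserializer above) recovers the original value.
parsed = next(csv.DictReader(io.StringIO(header + row)))
```

In the serializer, the same idea would mean replacing the two ",".join(...) calls with a csv.writer bound to self.stream.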
Registering the Custom Format
SERIALIZATION_MODULES = {
    "csv": "path.to.custom_csv_serializer",
    "json": "django.core.serializers.json",
}
Conclusion
Django’s serialization framework is flexible and powerful, supporting JSON, JSONL, YAML, and fully custom formats. With DjangoJSONEncoder you can handle complex data types, and with custom serializers you can build formats tailored to your system’s needs. Whether you're exporting data, integrating with external services, or building ETL pipelines, Django provides all the tools you need.
What Are Natural Keys?
By default, Django serializes foreign keys and many‑to‑many relationships using primary key values. This works in most cases, but it becomes a problem for objects that Django generates automatically, such as ContentType or Permission instances, whose primary keys are hard to predict and may differ from one database to the next.
A natural key is a tuple of field values that uniquely identifies an object without relying on its primary key. Natural keys make serialized data more readable, portable, and stable across environments.
Warning: Never include automatically generated objects in fixtures. Their primary keys may differ between environments, causing fixture loading to fail.
Deserializing Natural Keys
To enable natural key deserialization, define a custom manager with a get_by_natural_key() method.
Example Models
class PersonManager(models.Manager):
    def get_by_natural_key(self, first_name, last_name):
        return self.get(first_name=first_name, last_name=last_name)

class Person(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)
    birthdate = models.DateField()

    objects = PersonManager()

    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=["first_name", "last_name"],
                name="unique_first_last_name",
            ),
        ]
Now a Book model referencing Person can use natural keys:
{
    "pk": 1,
    "model": "store.book",
    "fields": {
        "name": "Mostly Harmless",
        "author": ["Douglas", "Adams"]
    }
}
During deserialization, Django resolves ["Douglas", "Adams"] using get_by_natural_key().
Note: Fields used as natural keys must uniquely identify an object. Uniqueness does not have to be enforced at the database level, but it must be logically guaranteed.
Serialization Using Natural Keys
To serialize natural keys, define a natural_key() method on the model:
# Inside the Person model class:
def natural_key(self):
    return (self.first_name, self.last_name)
Then call serializers.serialize() with the appropriate flags:
serializers.serialize(
    "json",
    [book1, book2],
    indent=2,
    use_natural_foreign_keys=True,
    use_natural_primary_keys=True,
)
What the Flags Do
- use_natural_foreign_keys=True: references to models that define natural_key() are serialized as natural keys instead of primary keys.
- use_natural_primary_keys=True: the object's own primary key is omitted, since it can be recomputed from the natural key during deserialization.
This is useful when loading data into an existing database where primary keys may differ.
Tip: When using dumpdata, pass the --natural-foreign and --natural-primary flags to get the same behavior.
Natural Keys and Forward References
A forward reference occurs when an object references another object that has not yet been deserialized.
To handle this, use:
objs_with_deferred_fields = []

for obj in serializers.deserialize("json", data, handle_forward_references=True):
    obj.save()
    if obj.deferred_fields is not None:
        objs_with_deferred_fields.append(obj)

for obj in objs_with_deferred_fields:
    obj.save_deferred_fields()
The referencing ForeignKey must have null=True.
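The two-pass idea can be illustrated without Django. In the sketch below, plain dicts stand in for model instances, and a reference that cannot be resolved yet is left as None until every record has been stored, which is also why the real foreign key has to be nullable. The record layout and the `load_with_deferred` helper are assumptions for illustration, not how Django stores deferred fields internally.

```python
def load_with_deferred(records):
    """Two-pass load: store every record first, then resolve postponed references."""
    saved = {}     # natural key -> stored record
    deferred = []  # (stored record, referenced key) pairs not resolvable yet

    for record in records:
        stored = {"key": record["key"], "friend": None}
        target = record.get("friend")
        if target is not None:
            if target in saved:
                stored["friend"] = saved[target]
            else:
                deferred.append((stored, target))  # forward reference
        saved[record["key"]] = stored

    # Second pass: every record is now available for lookup.
    for stored, target in deferred:
        stored["friend"] = saved[target]
    return saved


people = load_with_deferred([
    {"key": "alice", "friend": "bob"},  # refers to a record that comes later
    {"key": "bob", "friend": None},
])
```

A reference to a key that never appears would raise KeyError in the second pass. A fuller version would need to decide how to report that.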
Dependencies During Serialization
Sometimes natural keys depend on other natural keys.
To ensure correct ordering, define dependencies on the natural_key() method.
Example
# Inside the Book model class:
def natural_key(self):
    return (self.name,) + self.author.natural_key()

natural_key.dependencies = ["example_app.person"]
This ensures that Person objects are serialized before Book objects.
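Conceptually, this is a topological sort over model dependencies. The sketch below shows the idea with Python's graphlib module (Python 3.9+). It is an illustration of the ordering principle only, not Django's internal algorithm, and the dependency map is written by hand to match the example above.

```python
from graphlib import TopologicalSorter

# Each model maps to the models its natural key depends on.
dependencies = {
    "example_app.book": {"example_app.person"},
    "example_app.person": set(),
}

# static_order() yields dependencies before the models that need them.
order = list(TopologicalSorter(dependencies).static_order())
```

With a circular dependency, TopologicalSorter raises CycleError, which is a reminder that natural key dependencies must not form a loop.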
Conclusion
Natural keys provide a powerful way to serialize Django objects without relying on primary keys.
By implementing natural_key() and get_by_natural_key(), you can create readable, portable, and stable serialized data.
With support for forward references and dependency ordering, Django makes natural key serialization robust even in complex data structures.
Written & researched by Dr. Shahin Siami