Harmful Defaults in Django

1. Relying on implicit SQL queries

Oh really, you’ve been normalizing your database tables all the way to 3rd normal form? You’re a true perfectionist.

Normalization is not only good for data integrity but also for crashing your database server.

A sample project for the code below can be found on GitHub.

Let’s look at some example models:

from django.db import models
from django.conf import settings

class CustomerAddress(models.Model):
    line_1 = models.TextField()
    line_2 = models.TextField()
    line_3 = models.TextField()

class Customer(models.Model):
    auth_user = models.OneToOneField(
        settings.AUTH_USER_MODEL, on_delete=models.PROTECT, related_name="as_customer"
    )
    address = models.ForeignKey(
        CustomerAddress, on_delete=models.PROTECT, related_name="customers"
    )

class Topping(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name

class Pizza(models.Model):
    SMALL = 0
    MEDIUM = 1
    LARGE = 2
    SIZE_CHOICES = (
        (SMALL, "Small"),
        (MEDIUM, "Medium"),
        (LARGE, "Large"),
    )
    name = models.CharField(max_length=155)
    size = models.PositiveSmallIntegerField(choices=SIZE_CHOICES)
    toppings = models.ManyToManyField(Topping, through="PizzaTopping")

    def __str__(self):
        return f"{self.get_size_display()} {self.name}"

class PizzaTopping(models.Model):
    topping = models.ForeignKey(
        Topping,
        on_delete=models.PROTECT,
    )
    pizza = models.ForeignKey(
        Pizza,
        on_delete=models.PROTECT,
    )
    extra = models.BooleanField(default=False)

class Order(models.Model):
    customer = models.ForeignKey(
        Customer, on_delete=models.PROTECT, related_name="orders"
    )
    pizza = models.ForeignKey(Pizza, on_delete=models.PROTECT, related_name="orders")
    date = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    def __str__(self):
        return f"Order by {self.customer.auth_user.username} on {self.date}"

Let’s build a view that shows all orders:

from django.core.paginator import Paginator
from django.shortcuts import render

from shop.models import Order

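def slow_orders(request):
    # Lazy QuerySet: only the Order table is targeted, no related tables are joined.
    all_orders = Order.objects.all()
    paginator = Paginator(all_orders, 10)  # 10 orders per page
    page = paginator.get_page(request.GET.get("page"))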

    return render(
        request, "orders.html", {"orders": page.object_list, "page_obj": page}
    )

And a template to render tabular data:

{% extends "base.html" %}

{% block content %}
    <h1>Orders</h1>
    <br>
    <br>
    <table>
        <tr>
            <th>Customer email</th>
            <th>Address line 1</th>
            <th>Address line 2</th>
            <th>Address line 3</th>
            <th>Pizza name</th>
            <th>Toppings</th>
            <th>Date</th>
            <th>Updated at</th>
        </tr>
        {% for order in orders %}
            <tr>
                <td>{{ order.customer.auth_user.email }}</td>
                <td>{{ order.customer.address.line_1 }}</td>
                <td>{{ order.customer.address.line_2 }}</td>
                <td>{{ order.customer.address.line_3 }}</td>
                <td>{{ order.pizza.name }}</td>
                <td>
                    {% for pizza_topping in order.pizza.pizzatopping_set.all %}
                        {{ pizza_topping.topping.name }}, <b>Extra</b>: {{ pizza_topping.extra }}<br>
                    {% endfor %}
                </td>
                <td>{{ order.date }}</td>
                <td>{{ order.updated_at }}</td>
            </tr>
        {% endfor %}
    </table>
    <br>
    <br>
    {% include "_pagination.html" %}
{% endblock %}

My question to you is:

How many SQL queries are required to render this template? One? Two?

Remember, we used Order.objects.all() to fetch the orders. What does this line of code do anyway?

Some naive developers would think that this code:

  1. Connects to the database.
  2. Fetches all records from the table that stores data for our Order objects.

But actually, this line of code doesn’t connect to the database at all. All it does is prepare a query to execute when we actually need the data. This is called a lazy QuerySet in Djangoese.
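If laziness feels abstract, here's a toy sketch of the idea in plain Python. It is not how Django implements QuerySets; LazyQuery and fetch_orders are made-up names, and the "query" is just a function that records that it was called.

```python
class LazyQuery:
    """Toy stand-in for a QuerySet: remembers what to fetch, fetches on first use."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable that performs the "query"
        self._cache = None    # filled in the first time we iterate

    def __iter__(self):
        if self._cache is None:
            self._cache = list(self._fetch())
        return iter(self._cache)


calls = []  # every "query" that actually ran


def fetch_orders():
    # Hypothetical query function: records the call and returns fake rows.
    calls.append("SELECT * FROM orders")
    return ["order 1", "order 2"]


orders = LazyQuery(fetch_orders)
calls_after_creation = len(calls)   # building the object runs nothing

rows = list(orders)                 # iterating triggers the fetch
calls_after_iteration = len(calls)

list(orders)                        # a second pass is served from the cache
calls_after_second_pass = len(calls)
```

Creating the object costs nothing; the work happens wherever the first iteration happens. In our view, that place is the template.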

The real work is taking place in our template:

  1. We iterate over the orders context object.
  2. We interpolate the attributes in the HTML.

Again, how many SQL queries? No fewer than 52 in this case.

Fifty-two queries!?

Whenever you access an attribute that holds data from a related table, a SQL query is performed to fetch that data. It sounds obvious, but it’s easy to miss because we never see the raw queries at work.

And that’s why lazy QuerySets are so dangerous. It’s a mistake that almost all newcomers make, especially if their SQL knowledge is limited.
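To see the pattern without Django in the way, here's a minimal sketch using Python's built-in sqlite3 module instead of the ORM. The two tables and the run helper are invented for this illustration; they only mimic the Order to Customer relation from the models above.

```python
import sqlite3

# In-memory database standing in for the Order -> Customer relation.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id)
    );
    """
)
for i in range(1, 11):
    conn.execute(
        "INSERT INTO customer (id, email) VALUES (?, ?)",
        (i, f"user{i}@example.com"),
    )
    conn.execute("INSERT INTO orders (id, customer_id) VALUES (?, ?)", (i, i))
conn.commit()

executed = []  # every statement that goes through run()


def run(sql, params=()):
    """Execute a query and record it so both approaches can be compared."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


def emails_one_by_one():
    """One query for the orders, then one more per order: the n+1 pattern."""
    emails = []
    for (customer_id,) in run("SELECT customer_id FROM orders ORDER BY id"):
        rows = run("SELECT email FROM customer WHERE id = ?", (customer_id,))
        emails.append(rows[0][0])
    return emails


def emails_joined():
    """The same data fetched with a single JOIN."""
    rows = run(
        "SELECT c.email FROM orders o "
        "JOIN customer c ON c.id = o.customer_id ORDER BY o.id"
    )
    return [email for (email,) in rows]


executed.clear()
slow_result = emails_one_by_one()
slow_queries = len(executed)

executed.clear()
fast_result = emails_joined()
fast_queries = len(executed)
```

Both functions return the same ten emails, but the first one needs eleven queries and the second needs one. The template above does the same thing as the first function, just across more relations.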

What needs to be done

Instead of preparing to fetch only the Orders, we need to carry out multi-table joins and gather all our data in as few queries as possible:

def fast_orders(request):
    all_orders = Order.objects.select_related(
        "customer",
        "customer__address",
        "customer__auth_user",
        "pizza",
    ).prefetch_related("pizza__pizzatopping_set", "pizza__pizzatopping_set__topping")
    paginator = Paginator(all_orders, 10)
    page = paginator.get_page(request.GET.get("page"))

    return render(
        request, "orders.html", {"orders": page.object_list, "page_obj": page}
    )

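This reduces the number of SQL queries down to 4:

  1. One for counting objects for use with Paginator.
  2. Another that fetches the orders and joins everything related via ForeignKey and OneToOneField (customer, address, auth user, pizza), thanks to select_related.
  3. Yet another that fetches the PizzaTopping rows for the pizzas on the page.
  4. And a last one that fetches the Toppings referenced by those PizzaTopping rows.

Queries 3 and 4 come from prefetch_related, which runs them separately and stitches the results together in Python instead of joining.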
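The prefetch approach can be sketched with sqlite3 as well. Again, this is not Django's code, only the general idea under simplified assumptions: the pizza and pizza_topping tables are made up, and the topping name is stored directly on the join table to keep the sketch short.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pizza (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pizza_topping (
        id INTEGER PRIMARY KEY,
        pizza_id INTEGER REFERENCES pizza(id),
        topping TEXT
    );
    INSERT INTO pizza (id, name) VALUES (1, 'Margherita'), (2, 'Hawaiian');
    INSERT INTO pizza_topping (pizza_id, topping) VALUES
        (1, 'Basil'), (2, 'Ham'), (2, 'Pineapple');
    """
)

# Query 1: the parent rows.
pizzas = conn.execute("SELECT id, name FROM pizza ORDER BY id").fetchall()

# Query 2: all child rows for those parents in one go, using IN (...).
ids = [pizza_id for pizza_id, _ in pizzas]
placeholders = ", ".join("?" for _ in ids)
rows = conn.execute(
    "SELECT pizza_id, topping FROM pizza_topping "
    f"WHERE pizza_id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()

# Stitch the children onto their parents in Python rather than with a JOIN.
toppings_by_pizza = defaultdict(list)
for pizza_id, topping in rows:
    toppings_by_pizza[pizza_id].append(topping)

menu = {name: toppings_by_pizza[pizza_id] for pizza_id, name in pizzas}
```

Two queries cover any number of pizzas, which is why the query count stays flat no matter how many orders are on the page.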

Explicit QuerySets are better than implicit ones

Instead of remembering to join tables using select_related and prefetch_related, I like to use a package called django-zen-queries to force querysets to be evaluated as soon as they’re encountered.

The package also allows you to disable evaluation of QuerySets in templates.

By forcing QuerySet evaluation early, you also avoid the COUNT query for Paginator, because the rows are already in memory and can simply be counted there. Now your number of SQL queries gets reduced to only 3!

Without the ability to execute queries in templates, every developer will be forced to sculpt optimized queries before even rendering the template. This is essential for avoiding the n+1 query problem.

I highly recommend you install this package and use it in your project, especially if your team consists of many junior developers.

2. Using the default User model

Django comes with a default User model that you can use for authentication. It sounds great in theory because you can quickly build user login and registration functionality.

But guess what. User data will be stored in a database table whose columns are defined by Django, not by you. This means that you don’t own your database. The Django Software Foundation does!

The User model is also tightly coupled with the whole framework. Once your app reaches a certain level of maturity, you’re pretty much done for if you didn’t use a custom user model early on. Pray that you don’t have to implement custom authentication that requires you to add fields to the model.

Sometimes, you can solve this problem by using a UserProfile model that’s related to the default User model via a OneToOneField. It serves its purpose in many trivial web apps.

However, authentication is something you need to keep flexible for the lifetime of your project. At the very least, extend AbstractUser:

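from django.contrib.auth.models import AbstractUser

class User(AbstractUser):
    pass

Then set AUTH_USER_MODEL to "myapp.User" in your settings, ideally before you run your first migration. It’s easy and will save you a ton of headaches down the road.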

3. Using automatic migration names

When you run makemigrations, Django will automatically name the migration file according to the changes made. Sometimes it’s intelligent enough to create meaningful names, but names often end up being gibberish like 0004_auto_20211124.py. By looking at this file name, nobody can know what this migration is doing to which model.

You should always name your migrations semantically. Team members looking at your directory need to know:

  1. Which model a particular migration is acting on.
  2. What it’s doing exactly.

For example, I could have a migration like 0005_post_reduce_title_length. By looking at this, you can deduce that I’m reducing the length of the title field on the Post model.
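You can set the name yourself with the --name option of makemigrations. To keep auto-generated names from sneaking back in, a small check like the one below could run in CI. This is only a sketch: AUTO_NAME and auto_named are names I made up, and the pattern assumes auto-generated names look like 0004_auto_20211124, optionally followed by a time suffix such as _1530.

```python
import re
from pathlib import Path

# Matches stems such as 0004_auto_20211124 or 0004_auto_20211124_1530.
AUTO_NAME = re.compile(r"^\d{4}_auto_\d{8}(_\d{4})?$")


def auto_named(filenames):
    """Return the migration files whose names carry no semantic description."""
    return [name for name in filenames if AUTO_NAME.match(Path(name).stem)]


flagged = auto_named(
    [
        "0001_initial.py",
        "0004_auto_20211124.py",
        "0005_post_reduce_title_length.py",
    ]
)
```

Here only 0004_auto_20211124.py ends up in flagged; the other two names already say what they do.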

This will be very useful if you decide to move models to different apps down the road because you’ll know exactly which files belong to which models.

As an extra tip, I suggest you use 1 migration per model to make your migrations less coupled to the app itself.

4. Relying on automatic database table names

When you migrate a model for the first time, Django will create a database table like yourapp_modelname.

For example, if you have an app shop with a model Order, you will get a database table named shop_order.
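The naming rule is simple enough to sketch in a couple of lines. This is an approximation written for illustration, not Django's actual code (which also truncates very long names); default_db_table is a made-up helper and billing is a hypothetical second app.

```python
def default_db_table(app_label: str, model_class_name: str) -> str:
    """Approximate the default: app label, underscore, lowercased class name."""
    return f"{app_label}_{model_class_name.lower()}"


shop_table = default_db_table("shop", "Order")      # the table Django created
moved_table = default_db_table("billing", "Order")  # what it expects after a move
```

The app label is baked into the table name, so the same model in a different app maps to a different table.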

Now think about this:

What would happen if you decide to move the Order model to a different app? I’m not even going to answer this question because I don’t want anything to do with this situation should it arise.

Instead of letting Django name your tables, name them yourself:

from django.db import models

class Order(models.Model):
    ...

    class Meta:
        db_table = "orders"

When naming your tables, you should use names that would make sense to someone who’s not familiar with Django. Maybe you’ll hire a data analyst down the road and that person won’t know about Django apps and models.

Here, orders is better than shop_order because someone looking at the tables would immediately know that this is for orders. shop_order, on the other hand, is in singular form, and you have to know that there’s an app named shop containing a model named Order that manages this table.

5. Using multiple tiny apps

The official Django documentation encourages developers to break out functionality into apps. Separation of concerns is essential and multiple apps are a great way to achieve that, right?

Yes and no.

First, Python already has a solid package system. Many developers will break down into apps simply because they want to reduce the size of the models.py or one of the other boilerplate files. Then you end up with projects that have a ton of tiny apps with tiny files. Whenever you need to make a change, you need to follow dependencies across a bunch of apps.

Instead, just break down your modules into packages. Instead of models.py, have a models package with model classes imported in __init__.py. It’s much easier to handle and you also get your migrations, urls, and views all in one place.
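Here's a small runnable sketch of that layout. It builds a throwaway package in a temporary directory and uses plain classes as stand-ins for Django models; shop_models_demo and the file names are invented for the demo. In a real app the package would simply be called models and live inside the app.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: one module per model, re-exported in __init__.py.
root = Path(tempfile.mkdtemp())
package = root / "shop_models_demo"
package.mkdir()
(package / "order.py").write_text("class Order:\n    pass\n")
(package / "pizza.py").write_text("class Pizza:\n    pass\n")
(package / "__init__.py").write_text(
    "from .order import Order\n"
    "from .pizza import Pizza\n"
    "\n"
    "__all__ = ['Order', 'Pizza']\n"
)

# Make the temporary directory importable and load the package.
sys.path.insert(0, str(root))
models = importlib.import_module("shop_models_demo")

exported = sorted(models.__all__)
order_module = models.Order.__module__
```

Callers keep importing from one place, while each class lives in its own small file.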

6. Dumping apps in the root directory

Unlike Ruby on Rails, Django doesn’t have any opinions about how you should structure your project. Ok fine, I have my own opinions anyway. But what about people brand new to the framework? Is there a sane default for them? Bah!

By default, the startapp command dumps apps in the root directory. That’s how many projects are laid out. It’s not really an issue if you followed my advice of using a single app for everything but still, what happens if you name an app robots and then install a package called django-robots? Turns out, both of the packages are called robots. Will you be able to install both of them in your INSTALLED_APPS? Or will Django choose only one? But, which one?

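I’ll let you find out on your own because I never run into this situation since I always place my apps inside a package named after my project. Take a look at this cookiecutter for an example.

By placing your apps inside a package, you solve the import conflict and can name your apps whatever you want. Keep in mind that Django also requires app labels to be unique, and the label defaults to the last part of the dotted path, so if two labels still collide you’ll need to set a distinct label in your app’s AppConfig.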

Django is unchained

As you can see, Django doesn’t have your best interest at heart. Instead of providing sane defaults, it punishes you for not being a master at the framework.

Now that you know what to look for, you’ll be able to ship higher quality apps using this powerful framework.

The more you work with Django, the better you’ll get and you’ll be able to keep your team happy and productive.

So go out there and start shipping!

2 Comments

  1. I think these are simply "gotchas" rather than reasons not to use django. We've probably all committed an n+1 query at some point, but that doesn't mean the language/framework we were using was bad.

    Meta point: I really enjoy articles that list the places noobies go wrong (for any language/framework, not exclusively django). E.g. an experienced dev I taught R to was mind-blown that vec[1] accesses the first element of a vector (not vec[0]).

    Long lists of these unexpected (albeit simple) things can quickly be compiled. It might only take 30 minutes for a noobie to read but probably ends up saving days of pain down the track.

  2. As someone who has used Django extensively over the last 10 years for countless projects, I think that these are exaggerated claims of “harmful” defaults, particularly examples 2 to 6.

    Furthermore, this:

    > As you can see, Django doesn’t have your best interest at heart. Instead of providing sane defaults, it punishes you for not being a master at the framework.

    Is just nonsense.

    Regarding 1. While this is certainly an issue, it’s an issue for anyone using any framework and a challenge of database-backed web applications everywhere. It’s also heavily documented by Django and one of the first things mentioned. Lazy querysets are actually very useful when you need to build queries over a number of steps. Explicit query sets can have their own subtle issues with performance.

    2. The default user model is fine for most users. It’s not a bad default and there is clear documentation on how to extend it. It’s designed to be extended!

    3. I rarely rename migrations. Maybe this is a good idea, but it’s hardly a harmful default.

    4. I almost never rename my tables. Again, maybe a sensible suggestion, not a harmful default.

    5. Strangely this seems to be the most un-Django suggestion that would have the largest effect on a project but hardly any detail is given here. I never do this and wouldn’t recommend anyone else to when starting off.

    6. I think project structure and CLI is one place where Django is weak but again this is hardly a harmful default. Use django-cookiecutter.

    Anyway I feel like the author is trying to manufacture controversy to help define their expertise (not to say they aren’t a good Django developer, I have no idea). Some of these are good suggestions and “top tips” but they’re certainly not instances of Django trying to harm you.
