ORM Flashcards
How do you find the query associated with a queryset?
You have a model called Event. For getting all records, you will write something like Event.objects.all(), then do str(queryset.query)
> > > queryset = Event.objects.all()
str(queryset.query)
SELECT “events_event”.”id”, “events_event”.”epic_id”,
“events_event”.”details”, “events_event”.”years_ago”
FROM “events_event”
or
> > > queryset = Event.objects.filter(years_ago__gt=5)
str(queryset.query)
SELECT “events_event”.”id”, “events_event”.”epic_id”, “events_event”.”details”,
“events_event”.”years_ago” FROM “events_event”
WHERE “events_event”.”years_ago” > 5
How do you use OR filter queries in Django ORM?
Say you want find all users with firstname starting with ‘R’ and last_name starting with ‘D’.
ORM query would look like
queryset = User.objects.filter( first_name\_\_startswith='R' ) | User.objects.filter( last_name\_\_startswith='D' ) queryset , , , , ]> The | represents the or logical operator
Alternatively, you can use the Q objects.
qs = User.objects.filter(Q(first_name__startswith=’R’)|Q(last_name__startswith=’D’))
The Q represents the or logical operator
How do you use AND filter queries in Django ORM?
The default way to combine multiple conditions in filter is AND, so you can just do.
queryset_1 = User.objects.filter( first_name\_\_startswith='R', last_name\_\_startswith='D' ) Alternatively, you can explicitly use the & operator on querysets.
queryset_2 = User.objects.filter( first_name\_\_startswith='R' ) & User.objects.filter( last_name\_\_startswith='D' ) For complete customisability, you can use the Q objects.
queryset_3 = User.objects.filter(
Q(first_name__startswith=’R’) &
Q(last_name__startswith=’D’)
)
queryset_1
, , ]>
How do you use NOT filter queries in Django ORM?
Method 1 using exclude
Method 2 using Q() method
> > > from django.db.models import Q
queryset = User.objects.filter(~Q(id__lt=5))
queryst
, , , , , ]>
How to do union of two querysets from same or different models?
The UNION operator is used to combine the result-set of two or more querysets. The querysets can be from the same or from different models. When the querysets are from different models, the fields and their datatypes should match.
>>> q1 = User.objects.filter(id\_\_gte=5) >>> q1 , , , , , ]> >>> q2 = User.objects.filter(id\_\_lte=9) >>> q2 , , , , , , , , ]> >>> q1.union(q2) , , , , , , , , , ]> >>> q2.union(q1) , , , , , , , , , ]> The union operation can be performed only with the querysets having same fields and the datatypes. Hence our last union operation encountered error. You can do a union on two models as long as they have same fields or same subset of fields.
Since Hero and Villain both have the name and gender, we can use values_list to limit the selected fields then do a union.
Hero.objects.all().values_list( "name", "gender" ).union( Villain.objects.all().values_list( "name", "gender" )) This would give you all Hero and Villain objects with their name and gender.
How to select some fields only in a queryset?
Django provides two ways to do this
values and values_list methods on queryset.
only_method
Say, we want to get first_name and last_name of all the users whose name starts with R. You do not want the fetch the other fields to reduce the work the DB has to do.
>>> User.objects.filter( first_name\_\_startswith='R' ).values('first_name', 'last_name') > queryset = User.objects.filter( first_name\_\_startswith='R' ).only("first_name", "last_name") str(queryset.query), gives us
SELECT “auth_user”.”id”, “auth_user”.”first_name”, “auth_user”.”last_name”
FROM “auth_user” WHERE “auth_user”.”first_name”::text LIKE R%
The only difference between only and values is only also fetches the id.
How to do a subquery expression in Django?
Django allows using SQL subqueries. Let’s start with something simple, We have a UserParent model which has OnetoOne relation with auth user. We will find all the UserParent which have a UserParent.
> > > from django.db.models import Subquery
users = User.objects.all()
UserParent.objects.filter(user_id__in=Subquery(users.values(‘id’)))
, , ]>
Now for something more complex. For each Category, we want to find the most benevolent Hero.
The models look something like this.
class Category(models.Model): name = models.CharField(max_length=100)
class Hero(models.Model): # ... name = models.CharField(max_length=100) category = models.ForeignKey(Category, on_delete=models.CASCADE)
benevolence_factor = models.PositiveSmallIntegerField( help_text="How benevolent this hero is?", default=50 ) You can find the most benevolent Hero like this
hero_qs = Hero.objects.filter( category=OuterRef("pk") ).order_by("-benevolence_factor") Category.objects.all().annotate( most_benevolent_hero=Subquery( hero_qs.values('name')[:1] ) ) If you look at the generated sql, you will see
SELECT “entities_category”.”id”,
“entities_category”.”name”,
(SELECT U0.”name”
FROM “entities_hero” U0
WHERE U0.”category_id” = (“entities_category”.”id”)
ORDER BY U0.”benevolence_factor” DESC
LIMIT 1) AS “most_benevolent_hero”
FROM “entities_category”
Let’s break down the queryset logic. The first part is
hero_qs = Hero.objects.filter(
category=OuterRef(“pk”)
).order_by(“-benevolence_factor”)
We are ordering the Hero object by benevolence_factor in DESC order, and using category=OuterRef(“pk”) to declare that we will be using it in a subquery.
Then we annotate with most_benevolent_hero=Subquery(hero_qs.values(‘name’)[:1]), to get use the subquery with a Category queryset. The hero_qs.values(‘name’)[:1] part picks up the first name from subquery.
How to filter a queryset with criteria based on comparing their field values
Django ORM makes it easy to filter based on fixed values. To get all User objects with first_name starting with ‘R’, you can do User.objects.filter(first_name__startswith=’R’).
What if you want to compare the first_name and last name? You can use the F object. Create some users first.
In [27]: User.objects.create_user(email=”shabda@example.com”, username=”shabda”, first_name=”Shabda”, last_name=”Raaj”)
Out[27]:
In [28]: User.objects.create_user(email=”guido@example.com”, username=”Guido”, first_name=”Guido”, last_name=”Guido”)
Out[28]:
Now you can find the users where first_name==last_name
In [29]: User.objects.filter(last_name=F(“first_name”))
Out[29]: ]>
F also works with calculated field using annotate. What if we wanted users whose first and last names have same letter?
You can set the first letter from a string using Substr(“first_name”, 1, 1), so we do.
In [41]: User.objects.create_user(email=”guido@example.com”, username=”Tim”, first_name=”Tim”, last_name=”Teters”)
Out[41]:
#…
In [46]: User.objects.annotate(first=Substr(“first_name”, 1, 1), last=Substr(“last_name”, 1, 1)).filter(first=F(“last”))
Out[46]: , ]>
How to filter FileField without any file?
A FileField or ImageField stores the path of the file or image. At the DB level they are same as a CharField.
So to find FileField without any file we can query as under.
no_files_objects = MyModel.objects.filter(
Q(file=’’)|Q(file=None)
How to perform join operations in django ORM?
A SQL Join statement is used to combine data or rows from two or more tables based on a common field between them. Join can be carried out in many ways. Some are shown below.
> > > a1 = Article.objects.select_related(‘reporter’) // Using select_related
a1
, , , , , ]>
print(a1.query)
SELECT “events_article”.”id”, “events_article”.”headline”, “events_article”.”pub_date”, “events_article”.”reporter_id”, “events_article”.”slug”, “auth_user”.”id”, “auth_user”.”password”, “auth_user”.”last_login”, “auth_user”.”is_superuser”, “auth_user”.”username”, “auth_user”.”first_name”, “auth_user”.”last_name”, “auth_user”.”email”, “auth_user”.”is_staff”, “auth_user”.”is_active”, “auth_user”.”date_joined” FROM “events_article” INNER JOIN “auth_user” ON (“events_article”.”reporter_id” = “auth_user”.”id”) ORDER BY “events_article”.”headline” ASC
a2 = Article.objects.filter(reporter__username=’John’)
a2
, , , , ]>
print(a2.query)
SELECT “events_article”.”id”, “events_article”.”headline”, “events_article”.”pub_date”, “events_article”.”reporter_id
How to find second largest record using Django ORM ?
You would across situations when you want to find second highest user depending on their age or salary.
Though the ORM gives the flexibility of finding first(), last() item from the queryset but not nth item. You can do it using the slice operator.
We can find Nth records from the query by using slice operator.
> > > user = User.objects.order_by(‘-last_login’)[1] // Second Highest record w.r.t ‘last_login’
user.first_name
‘Raghu’
user = User.objects.order_by(‘-last_login’)[2] // Third Highest record w.r.t ‘last_login’
user.first_name
‘Sohan’
User.objects.order_by(‘-last_login’)[2] only pulls up the required object from db using LIMIT … OFFSET. If you look at the generated sql, you would see something like this.
SELECT "auth_user"."id", "auth_user"."password", "auth_user"."last_login", "auth_user"."is_superuser", "auth_user"."username", "auth_user"."first_name", "auth_user"."last_name", "auth_user"."email", "auth_user"."is_staff", "auth_user"."is_active", "auth_user"."date_joined" FROM "auth_user" ORDER BY "auth_user"."last_login" DESC LIMIT 1 OFFSET 2
Find rows which have duplicate field values
Say you want all users whose first_name matches another user.
You can find duplicate records using the technique below.
> > > duplicates = User.objects.values(
‘first_name’
).annotate(name_count=Count(‘first_name’)).filter(name_count__gt=1)
duplicates
If you need to fill all the records, you can do
> > > records = User.objects.filter(first_name__in=[item[‘first_name’] for item in duplicates])
print([item.id for item in records])
[2, 11, 13]
How to find distinct field values from queryset?
You want to find users whose names have not been repeated. You can do this like this
distinct = User.objects.values(
‘first_name’
).annotate(
name_count=Count(‘first_name’)
).filter(name_count=1)
records = User.objects.filter(first_name__in=[item[‘first_name’] for item in distinct])
This is different from User.objects.distinct(“first_name”).all(), which will pull up the first record when it encounters a distinct first_name.
How to use Q objects for complex queries?
If you want to OR your conditions.
>>> from django.db.models import Q >>> queryset = User.objects.filter( Q(first_name\_\_startswith='R') | Q(last_name\_\_startswith='D') ) >>> queryset , , , , ]> If you want to AND your conditions.
> > > queryset = User.objects.filter(
Q(first_name__startswith=’R’) & Q(last_name__startswith=’D’)
)
queryset
, , ]>
If you want to find all users whose first_name starts with ‘R’, but not if the last_name has ‘Z’
> > > queryset = User.objects.filter(
Q(first_name__startswith=’R’) & ~Q(last_name__startswith=’Z’)
)
If you look at the generated query, you would see
SELECT "auth_user"."id", "auth_user"."password", "auth_user"."last_login", "auth_user"."is_superuser", "auth_user"."username", "auth_user"."first_name", "auth_user"."last_name", "auth_user"."email", "auth_user"."is_staff", "auth_user"."is_active", "auth_user"."date_joined" FROM "auth_user" WHERE ("auth_user"."first_name"::text LIKE R% AND NOT ("auth_user"."last_name"::text LIKE Z%)) You can combine the Q objects in more complex ways to generate complex queries.
How to group records in Django ORM?
Grouping of records in Django ORM can be done using aggregation functions like Max, Min, Avg, Sum. Django queries help to create, retrieve, update and delete objects. But sometimes we need to get aggregated values from the objects. We can get them by example shown below
> > > from django.db.models import Avg, Max, Min, Sum, Count
User.objects.all().aggregate(Avg(‘id’))
{‘id__avg’: 7.571428571428571}
User.objects.all().aggregate(Max(‘id’))
{‘id__max’: 15}
User.objects.all().aggregate(Min(‘id’))
{‘id__min’: 1}
User.objects.all().aggregate(Sum(‘id’))
{‘id__sum’: 106}
How to efficiently select a random object from a model?
Your category models is like this.
class Category(models.Model): name = models.CharField(max_length=100)
class Meta: verbose_name_plural = "Categories"
def \_\_str\_\_(self): return self.name You want to get a random Category. We will look at few alternate ways to do this.
The most straightforward way, you can order_by random and fetch the first record. It would look something like this.
def get_random(): return Category.objects.order_by("?").first() Note: order_by('?') queries may be expensive and slow, depending on the database backend you’re using. To test other methods, we need to insert one million records in Category table. Go to your db like with python manage.py dbshell and run this.
INSERT INTO entities_category (name) (SELECT Md5(Random() :: text) AS descr FROM generate_series(1, 1000000)); You don’t need to understand the full details of the sql above, it creates one million numbers and md5-s them to generate the name, then inserts it in the DB.
Now, instead of sorting the whole table, you can get the max id, generate a random number in range [1, max_id], and filter that. You are assuming that there have been no deletions.
In [1]: from django.db.models import Max
In [2]: from entities.models import Category
In [3]: import random
In [4]: def get_random2():
…: max_id = Category.objects.all().aggregate(max_id=Max(“id”))[‘max_id’]
…: pk = random.randint(1, max_id)
…: return Category.objects.get(pk=pk)
…:
In [5]: get_random2()
Out[5]:
In [6]: get_random2()
Out[6]:
If your models has deletions, you can slightly modify the functions, to loop until you get a valid Category.
In [8]: def get_random3():
…: max_id = Category.objects.all().aggregate(max_id=Max(“id”))[‘max_id’]
…: while True:
…: pk = random.randint(1, max_id)
…: category = Category.objects.filter(pk=pk).first()
…: if category:
…: return category
…:
In [9]: get_random3()
Out[9]:
In [10]: get_random3()
Out[10]:
Unless your model has a lot of deletions, the while True: loop return quickly. Lets use timeit to see the differences.
In [14]: timeit.timeit(get_random3, number=100)
Out[14]: 0.20055226399563253
In [15]: timeit.timeit(get_random, number=100)
Out[15]: 56.92513192095794
get_random3 is about 283 time faster than get_random. get_random is the most generic way, but the technique in get_random3 will work unless you change changed the default way Django generates the id - autoincrementing integers, or there have been too many deletions.