Authentication with Django from First Principles

Authentication is how an application verifies that a request comes from a particular user. It's something we're often used to being done for us. It's part of the furniture of whatever framework we are using to write our applications, and we often give no more thought to how it actually works than we do to how the HTTP requests themselves work. They just... do.

As with all conveniences, this is fine until it isn't. If something about authentication isn't working the way it should, then suddenly it matters what is being sent with requests, how that is processed, and why. It can be helpful to have a mental model in your head of how it all works so that you can fix things.

The best way to construct that mental model is to implement authentication from scratch and understand it from first principles, which is what we are going to do here. There is no one way to do authentication, but the flow outlined below - a mix of JWT-based access tokens with long-lived refresh tokens - is a widely accepted standard, especially for Django applications which primarily serve REST APIs.

Much of what we cover here would be handled by established libraries in real-life production applications. These are not instructions for how to actually implement authentication; they are a way to build a mental model of how this particular flavour of authentication works.

The `User` model

What are users? This could be an article in itself, but we are going to assume that a user is some human being who is specifically permitted to interact with the application and has some way of verifying they are that person. The related term 'client' is whatever software they are using to send HTTP requests to us - often a browser, but may be a script, a dedicated REST API client, or anything else which can send requests.

Users are represented in our app with a users table (where each row corresponds to some real person) and in our case, a corresponding Django User model. Each user needs, at minimum, some unique identifier that always refers to that user (a username), and some secret string which only they have which, when given, is treated as confirmation that it is indeed that person, and not someone pretending to be them.

For the purposes of working from first principles, we're going to be defining our own User model from scratch here. In real applications it is almost always wiser to inherit from Django's own AbstractUser and add the extra functionality you want on top of that. This way third-party libraries know how to interact with your user model, and you don't need to re-implement everything from scratch.

This is what our minimal user model looks like:

from django.db import models

class User(models.Model):
	username = models.SlugField(max_length=100, unique=True)
	password = models.CharField(max_length=255)

That's it. You can add more, and if your application is doing anything other than letting people log in and out, you almost certainly would need to add more, but this is all that is needed for our authentication flow. Each user is uniquely identified by the username, and people interacting with the application prove they are the person corresponding to this user by providing the password which only they know.

The password field does not store the password itself; it stores a 'hash' of it. Hashing is a process which maps any string to some fixed-length scrambling of it, in such a way that (1) the same input always produces the same output hash, (2) producing the hash is easy, and (3) it is (essentially) impossible to work out the original input string from the hash. By storing the hash, we can check that a given password is correct by hashing it and checking it matches what is in the database, but we avoid the security implications of storing the password itself. This way if the database is breached, an attacker only gets the hashes, not the passwords.

This is important because the user may have re-used the password in other places, and so a breach here would give an attacker access to their other accounts. This feature of passwords - their potential re-use - will be something we have to work around again and again in authentication, as we try to limit its exposure as much as we can. Not storing it as plain text in our database is merely the first and most obvious protection.
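To make those three properties concrete, here is a minimal sketch using only Python's standard library. It is an illustration rather than what Django does internally: the names hash_password and verify_password and the iteration count are assumptions made up for this example. It also adds a 'salt', a random value mixed into the hash so that two users with the same password end up with different hashes.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative value only, not a recommendation


def hash_password(password, salt=None):
    """Return (salt, hash) for a password using salted PBKDF2-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse battery staple")
print(len(stored))  # 32 bytes, whatever the length of the password
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))  # False
```

Only the salt and the hash would need to be stored; the password itself is never written anywhere.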

Creating users

We have a way of representing users, but unless we want admins to manually add new users, there needs to be a way for people who are not currently registered users to become so - i.e. 'sign up'. There needs to be some endpoint to which people can send their desired username and password, and which will create a user row in the database with it.

This needs to do the following:

Check that there isn't already a user with the desired username.
Check that the password meets our minimum criteria. Users can pick whatever they want, but we should impose some limits or we will end up with half our users picking 'password' as their password. We have to save the users from themselves to some extent.
Create a hash of the password, while we have it. This is what all future password checks will be made against.
Add the user row to the database, with the username and password hash. The password itself is not saved anywhere.

Precisely how this is done doesn't matter too much, but here is how we might do it with Django REST Framework (DRF):

# serializers.py
from rest_framework import serializers
from django.contrib.auth.password_validation import validate_password
from django.contrib.auth.hashers import make_password
from .models import User

class UserSerializer(serializers.ModelSerializer):

	class Meta:
		model = User
		fields = ["username", "password"]
		extra_kwargs = {"password": {"write_only": True}}

	def validate_password(self, value):
		validate_password(value)
		return value

	def create(self, validated_data):
		validated_data["password"] = make_password(validated_data["password"])
		return super().create(validated_data)


# views.py
from rest_framework import generics
from .models import User
from .serializers import UserSerializer

class UsersView(generics.CreateAPIView):
	queryset = User.objects.all()
	serializer_class = UserSerializer


# urls.py
from django.urls import path
from .views import UsersView

urlpatterns = [
	# ...
	path("users", UsersView.as_view()),
	# ...
]

A few things to note here. Checking the password is valid is done with Django's validate_password - this runs whatever password validators you have configured in settings.py, so the rules live in one place. Likewise, we use Django's make_password to construct the hash. This handles hashing, salting, hashing algorithm upgrades, and more. Even when implementing from first principles for instructive purposes, implementing this from scratch would be madness.
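As a rough idea of what those rules might look like, here is a settings.py fragment using three of the validators that ship with Django. The choice of validators and the minimum length of 10 are assumptions for illustration; pick whatever suits your application.

```python
# settings.py (fragment)
AUTH_PASSWORD_VALIDATORS = [
    {
        # Reject passwords shorter than a chosen length
        "NAME": "django.contrib.auth.password_validation.MinimumLengthValidator",
        "OPTIONS": {"min_length": 10},  # illustrative threshold
    },
    {
        # Reject passwords found in a bundled list of common passwords
        "NAME": "django.contrib.auth.password_validation.CommonPasswordValidator",
    },
    {
        # Reject passwords made up entirely of digits
        "NAME": "django.contrib.auth.password_validation.NumericPasswordValidator",
    },
]
```

With something like this in place, the validate_password call in the serializer rejects a password that fails any of the listed checks.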

So, our current auth flow looks like this:

Users send their desired username and password to /users.
We validate these and add the user to the database.

But how do you do things 'as' that user?

Access Tokens

Our Django app only sees HTTP requests. If we want it to know it's 'us' sending the request, we need to put something in this request - a header, a cookie, or a parameter - that lets it know who the user is. This isn't needed for our /users endpoint because the whole point is that it creates a new user. But many endpoints, sometimes almost all of them, will need some way for the request to indicate which user object it should be associated with. Some endpoints may want to reject a request that isn't associated with a user, or with a user with a particular attribute (such as being an admin) - or some may even reject any requests that aren't for one specific user.

We could just provide our username, which would indicate who we are, but for obvious reasons this isn't enough. We need to verify that an incoming request has come from the person associated with that username.

So then, we could send the username and password on every request. This is, after all, the point of the password. Our app could take the password from the request, hash it, check the hash matches, and associate the relevant user object with the request.

This is a terrible idea though. If the password has to be sent with every request, then all it takes is for one request to be intercepted, and now an attacker has the password. The client software also has to store the password in its own memory so that it can use it for every request without re-prompting the user, which is now another place it can be stolen from.

We have to store and send something with every request that identifies the user, that is unavoidable. But we can reduce the consequences of something bad happening by having the user send something else instead. A random string we generate for them, which is only valid for a short period of time, and which is not going to be in use anywhere else - an access token.

The way this works is that the user makes one request to a dedicated endpoint for generating access tokens, with their username and password, and after verifying these we send back an access token which, when we see again, we will know corresponds to that user. All future requests simply provide this token. Now the password is only sent once, and doesn't need to be stored by the client at all. If this access token is stolen, it can't be used anywhere else, and can only be used here for ten minutes or so.

This is precisely what JSON Web Tokens (JWTs) do. A JWT is essentially a small JSON object, containing a user ID and an expiry time, encoded as a base64url string. The payload is only encoded, not encrypted, so anyone holding the token can read it. Crucially though, when the API generates it, it appends a cryptographic signature to the end of this string so that only it can generate valid ones. An attacker can create a fake JWT if they know the user ID, but they won't be able to add a valid signature because they don't know the secret key the API uses to make and verify them. There's also no database lookup required to look for a saved form of the token - the token itself contains the user ID and the expiry date (nothing actually happens to the token after the expiry date of course, it's just that the API will treat it as invalid after that point).
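To see that there is nothing magic inside a JWT, here is a rough standard-library sketch of signing and checking an HS256-style token by hand. The function names (make_token, read_token and the helpers) are made up for this illustration, and it skips many checks a real JWT library performs, so treat it as a picture of the idea rather than something to deploy.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data):
    """Base64url-encode bytes without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text):
    """Decode base64url text, restoring the padding that was stripped."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(signing_input, secret):
    """HMAC-SHA256 over 'header.payload', keyed with the server's secret."""
    mac = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256)
    return b64url(mac.digest())


def make_token(user_id, secret, lifetime_seconds, now=None):
    now = int(time.time()) if now is None else now
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": str(user_id), "exp": now + lifetime_seconds}
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{sign(signing_input, secret)}"


def read_token(token, secret, now=None):
    """Return the claims if the signature matches and the token is unexpired, else None."""
    now = int(time.time()) if now is None else now
    try:
        header, payload, signature = token.split(".")
        expected = sign(f"{header}.{payload}", secret)
        if not hmac.compare_digest(signature.encode(), expected.encode()):
            return None
        claims = json.loads(b64url_decode(payload))
    except ValueError:
        return None
    return claims if claims["exp"] > now else None


token = make_token(42, "not-a-real-secret", lifetime_seconds=600, now=1_000)
print(read_token(token, "not-a-real-secret", now=1_100))  # {'sub': '42', 'exp': 1600}
print(read_token(token, "some-other-secret", now=1_100))  # None: signature does not match
print(read_token(token, "not-a-real-secret", now=2_000))  # None: expired
```

Note that the expiry check is nothing more than a comparison against the current time, made by whoever verifies the token.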

We can use the PyJWT library to generate and sign the token, and here we will construct the JSON objects ourselves. As with the decision to use an entirely custom user model, we are implementing this from scratch partly for demonstrative purposes. There are plenty of third-party libraries that will just handle this entire flow for you, and while they are great once you already understand how a JWT auth flow works, they can hide a lot of the details from you in a way that makes it hard to internalise a mental model of what is actually happening.

The following snippet will produce a JWT with a short expiration time, giving the user ID of the user it is for, and signed with the API's own secret key. As long as the secret key doesn't change, the app will always be able to confirm this was generated by itself, and isn't a forgery.

# tokens.py
import jwt
from datetime import datetime, timezone, timedelta
from django.conf import settings
from .models import User

def get_access_token(user):
	expires_in = settings.ACCESS_TOKEN_MINUTES
	now = datetime.now(timezone.utc)
	payload = {
		"sub": str(user.id),
		"iat": now,
		"exp": now + timedelta(minutes=expires_in),
	}
	return jwt.encode(payload, settings.SECRET_KEY, algorithm="HS256")

The expiry time is set some number of minutes into the future, based on a setting in settings.py. I typically set this to be somewhere between ten and thirty minutes.

The endpoint itself - let's say /authenticate - needs to take a username and password, check there is a user matching that username, check the password hashes correctly for that user, and then generate and return the access token. Here is an example view using DRF:

# serializers.py
from django.contrib.auth.hashers import check_password
from rest_framework.exceptions import AuthenticationFailed

class AuthenticateSerializer(serializers.Serializer):
	username = serializers.CharField()
	password = serializers.CharField()

	def validate(self, data):
		try:
			user = User.objects.get(username=data["username"])
		except User.DoesNotExist:
			raise AuthenticationFailed("Invalid credentials")
		if not check_password(data["password"], user.password):
			raise AuthenticationFailed("Invalid credentials")
		self.user = user
		return data


# views.py
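from rest_framework.response import Response
from rest_framework.views import APIView
from . import tokens
from .serializers import AuthenticateSerializer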

class AuthenticateView(APIView):

	def post(self, request):
		serializer = AuthenticateSerializer(data=request.data)
		serializer.is_valid(raise_exception=True)
		access_token = tokens.get_access_token(serializer.user)
		return Response({"access_token": access_token})



# urls.py
from .views import AuthenticateView

urlpatterns = [
	# ...
	path("authenticate", AuthenticateView.as_view()),
	# ...
]

We use Django's check_password function to do the hashing and checking against the user's password hash in the database. Our current auth flow looks like this:

Users send their desired username and password to /users.
We validate these and add the user to the database.
The user sends their new username and password to /authenticate.
We check this is a valid user, and the password hashes to the correct hash.
We generate an access token with a short expiration date and return it to the user.
The user's client discards the password, and stores the access token in its memory.

So how do we actually deal with these access tokens when we encounter them?

Checking Access Tokens

It is no use sending these access tokens on every request if our application ignores them. Suppose we want to have an endpoint /me, which uses the user serialization we used on the user creation endpoint, but which returns whatever user the incoming access token is for.

# urls.py
from .views import MeView

urlpatterns = [
	# ...
	path("me", MeView.as_view()),
	# ...
]


# views.py
class MeView(generics.RetrieveAPIView):
	serializer_class = UserSerializer

	def get_object(self):
		return # ???

The whole point of the access token is to reliably identify what user object in the database the request should be associated with. We need to get the access token from the request, verify it's a valid token, and get the user object from the ID in that access token, and add it to the request as a user attribute. The logic for that is something like:

import jwt
from django.conf import settings

# Get token from request
token = request.headers.get("Authorization", "")

# Try to decode the token
try:
	payload = jwt.decode(token, settings.SECRET_KEY, algorithms=["HS256"])
	
	# If that raised no errors, we can get the user
	user = User.objects.get(id=payload["sub"])

# But don't get user if expired
except jwt.ExpiredSignatureError:
	user = None

# And we can't get user if the token is invalid
except (KeyError, User.DoesNotExist, jwt.InvalidTokenError):
	user = None

But where to put this? We can put the logic in our view, which is where we currently need to access that user object, but we will likely need access to the user in lots of views. Instead, we should put this logic somewhere where it will be done to every incoming request. This is exactly what middleware is for - code which applies to every incoming request and outgoing response. Django has built-in middleware machinery, and DRF has its own middleware-like layers. We will be using the latter here, but it would work equally well in the former:

# authentication.py
import jwt
from rest_framework.authentication import BaseAuthentication
from rest_framework.exceptions import AuthenticationFailed
from django.conf import settings
from .models import User

class AccessTokenAuthentication(BaseAuthentication):

	def get_token(self, request):
		return request.headers.get("Authorization", "")

	def authenticate(self, request):
		if not (token := self.get_token(request)): return None
		try:
			payload = jwt.decode(token, settings.SECRET_KEY, algorithms=["HS256"])
			user = User.objects.get(id=payload["sub"])
			return (user, None)
		except jwt.ExpiredSignatureError:
			raise AuthenticationFailed("Access token has expired")
		except (KeyError, User.DoesNotExist, jwt.InvalidTokenError):
			raise AuthenticationFailed("Invalid access token")



# views.py
from .authentication import AccessTokenAuthentication

class MeView(generics.RetrieveAPIView):
	serializer_class = UserSerializer
	authentication_classes = [AccessTokenAuthentication]

	def get_object(self):
		return self.request.user

Here we define a DRF authentication class, which when applied to a view will automatically set request.user on the request, based on the access token.
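In this article the client sends the bare token as the value of the Authorization header. Many APIs instead expect the form 'Bearer <token>'. If you wanted to accept that form, get_token could strip the scheme first. The helper below is a hypothetical sketch of that parsing step, not part of the flow built here:

```python
def extract_bearer_token(header_value):
    """Return the token from a 'Bearer <token>' header value, or '' if absent."""
    scheme, _, token = header_value.strip().partition(" ")
    if scheme.lower() != "bearer":
        return ""
    return token.strip()


print(extract_bearer_token("Bearer abc.def.ghi"))  # abc.def.ghi
print(repr(extract_bearer_token("Basic dXNlcjpwdw==")))  # ''
print(repr(extract_bearer_token("")))  # ''
```

Returning an empty string for anything that is not a bearer token fits the authentication class above, which treats a falsy token as 'no credentials supplied'.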

If we want to also bounce back any requests which don't have a valid access token, we can use a DRF permission class:

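# permissions.py
from rest_framework.permissions import BasePermission
from .models import User

class IsAuthenticated(BasePermission):

	def has_permission(self, request, view):
		# If no authentication class returned a user, DRF by default sets
		# request.user to an AnonymousUser rather than None, so check the type
		return isinstance(request.user, User)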


# views.py
from .permissions import IsAuthenticated

class MeView(generics.RetrieveAPIView):
	serializer_class = UserSerializer
	authentication_classes = [AccessTokenAuthentication]
	permission_classes = [IsAuthenticated]
	
	def get_object(self):
		return self.request.user

This is a simple example of authorization - granting or denying access to a resource based on who is asking. This is separate from authentication, but is built on top of it.

So, we now have a system of users with unique usernames and secret passwords, where users can be created and have their password hashed, where the API will issue short-lived access tokens for clients to identify themselves as a particular user in requests without needing to use the password, and machinery for turning that access token into a user object on every subsequent request.

Our auth flow is now:

Users send their desired username and password to /users.
We validate these and add the user to the database.
The user sends their new username and password to /authenticate.
We check this is a valid user, and the password hashes to the correct hash.
We generate an access token with a short expiration date and return it to the user.
The user's client discards the password, and stores the access token in its memory.
The user sends requests to other endpoints with the access token in the Authorization header.
We verify this, get the correct user object from the database, and assign it to the request where it can be accessed by the views.

Refresh Tokens

The system that we already have works. The client asks the user for their username and password once, uses that to get an access token, stores the access token, and uses that in all subsequent requests.

...For about thirty minutes, and then the access token expires. How many websites have you used where you sign in with your credentials and then have to sign in again after thirty minutes? This may be fine for scripts, which may (or may not!) get everything they need to do done within thirty minutes, but for browser-based web apps, this is not a very good experience.

So what to do? Well we could increase the access token lifetime, but the whole point of them is to be short-lived, and even if you increased it to a few hours that's still not a very good user experience.

Or, instead of discarding the password once the user provides it, we could store it in memory, and then whenever the access token expires we get a new one by sending another request to /authenticate in the background - 'refreshing' the token. This is a seamless user experience - until the user refreshes the page and clears whatever is in memory. And also, we really don't want the password in memory, and we don't want to be sending it off in a request every thirty minutes.

But this overall pattern is sound - refreshing the access token in the background whenever it expires (or gets close to expiring). We just don't want to use the password to do it. We need some credential which we can store somewhere secure, which is powerful enough to let us get a new access token but no more, and which doesn't expire quickly.

This is what refresh tokens are for. A refresh token is something the user can use to get new access tokens. We generate it, so we don't need to worry that it's also a credential for some other service (as a password might be), and we store it in the database so we can invalidate it if we need to. It is a middle ground between the password itself, and access tokens.

And, crucially, the refresh token is stored (at least by browsers) as an 'HTTP-only cookie'. This means it can't be read with JavaScript: it lives in the browser's cookie storage, and because we restrict the cookie's path, the browser sends it only on requests to a specific /refresh endpoint.

The authentication flow now is therefore:

Users send their desired username and password to /users.
We validate these and add the user to the database.
The user sends their new username and password to /authenticate.
We check this is a valid user, and the password hashes to the correct hash.
We generate an access token and a refresh token and return these to the user - the latter as a 'set-cookie' instruction
The user's client discards the password, stores the access token in its memory, and stores the refresh token in a secure cookie storage.
The user sends requests to other endpoints with the access token in the Authorization header.
We verify this, get the correct user object from the database, and assign it to the request where it can be accessed by the views.
When the access token is about to run out, the user sends a request to /refresh with the cookie.
We hash this, find any refresh token in the database with this hash, get the associated user, and generate an access token for the user. We also generate a new refresh token to replace the old one.
The client replaces its access token with the new one.
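How does a client know its access token is 'about to run out'? One option is to read the expiry time straight out of the token, since the payload is only encoded. The sketch below assumes a Python client and a JWT with an 'exp' claim in seconds since the epoch; the function names and the 60-second margin are made up for illustration. The client cannot verify the signature (it doesn't have the secret), it only reads the claim.

```python
import base64
import json
import time


def seconds_until_expiry(access_token, now=None):
    """Read the 'exp' claim from a JWT payload without verifying the signature."""
    now = time.time() if now is None else now
    payload_b64 = access_token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims["exp"] - now


def should_refresh(access_token, margin_seconds=60, now=None):
    """True once the token is within `margin_seconds` of expiring."""
    return seconds_until_expiry(access_token, now) <= margin_seconds
```

A client could call should_refresh before each request and, when it returns True, send a request to /refresh first.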

Why do we replace the refresh token when it's used? Well, suppose the refresh token was stolen - we want to know so we can take action. If we delete and re-issue a refresh token every time it is used, and the attacker uses the refresh token, ours suddenly stops working despite not having expired, which alerts us. (Note that the attacker still has a valid refresh token; they have just alerted us to the fact they have it.) It also means that if we use our token before the attacker does, the stolen copy is invalidated.
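The effect of rotation is easier to see in a toy model. The sketch below replaces the database table with an in-memory dictionary; the class RefreshTokenStore and its methods are hypothetical names for this illustration only.

```python
import hashlib
import secrets


class RefreshTokenStore:
    """In-memory stand-in for the refresh token table: maps token hash -> user ID."""

    def __init__(self):
        self._rows = {}

    @staticmethod
    def _hash(value):
        return hashlib.sha256(value.encode()).hexdigest()

    def issue(self, user_id):
        value = secrets.token_hex(32)
        self._rows[self._hash(value)] = user_id  # only the hash is kept
        return value

    def rotate(self, value):
        """Consume a refresh token and return a replacement, or None if unknown."""
        user_id = self._rows.pop(self._hash(value), None)
        if user_id is None:
            return None
        return self.issue(user_id)


store = RefreshTokenStore()
ours = store.issue(user_id=1)
stolen = ours  # an attacker copies the cookie value

attackers_new = store.rotate(stolen)  # the attacker refreshes first
print(attackers_new is not None)  # True: the attacker now holds a fresh token
print(store.rotate(ours))  # None: our copy no longer works, which is the warning sign
```

Had we called rotate first, the roles would be reversed and the attacker's copy would be the one that stops working.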

Let's implement this - first we need to create a new database table for refresh tokens:

# models.py
from django.utils import timezone

class RefreshToken(models.Model):

	token_hash = models.CharField(max_length=255, db_index=True, unique=True)
	expires_at = models.DateTimeField()
	user = models.ForeignKey(User, on_delete=models.CASCADE)

	@property
	def is_expired(self):
		# timezone.now() is timezone-aware when USE_TZ is enabled
		return self.expires_at < timezone.now()

The actual value stored is the hash of the token. Next we need a /refresh endpoint which will send an access token if it receives a request with a valid refresh token. We also delete the refresh token and issue a new one.

# views.py
from rest_framework.exceptions import AuthenticationFailed
from .models import RefreshToken

class RefreshView(APIView):

	def get_refresh_token(self):
		refresh_token = self.request.COOKIES.get("refresh_token")
		if not refresh_token: raise AuthenticationFailed("No refresh token")
		try:
			rt = tokens.get_refresh_token_from_value(refresh_token)
		except RefreshToken.DoesNotExist:
			raise AuthenticationFailed("Invalid refresh token")
		if rt.is_expired: raise AuthenticationFailed("Refresh token has expired")
		return rt
	
	def handle_exception(self, exc): # Delete incoming cookie if invalid
		response = super().handle_exception(exc)
		if isinstance(exc, AuthenticationFailed):
			# The path must match the one the cookie was set with
			response.delete_cookie("refresh_token", path="/refresh")
		return response

	def post(self, request):
		original_refresh_token = self.get_refresh_token()
		access_token = tokens.get_access_token(original_refresh_token.user)
		new_refresh_token = tokens.get_refresh_token(original_refresh_token.user)
		original_refresh_token.delete()
		response = Response({"access_token": access_token})
		tokens.add_refresh_token_to_response(new_refresh_token, response)
		return response

  
# tokens.py
import hashlib
import secrets
from .models import RefreshToken

def get_refresh_token(user):
	random_string = secrets.token_hex(32)
	hash_ = _hash_token(random_string)
	expires_in = settings.REFRESH_TOKEN_DAYS
	expiry = datetime.now(timezone.utc) + timedelta(days=expires_in)
	RefreshToken.objects.create(user=user, token_hash=hash_, expires_at=expiry)
	return random_string

def add_refresh_token_to_response(refresh_token, response):
	max_age = settings.REFRESH_TOKEN_DAYS * 24 * 60 * 60
	response.set_cookie(
		"refresh_token",
		refresh_token,
		httponly=True,
		path="/refresh",
		secure=True,
		samesite="Strict",
		max_age=max_age,
	)

def get_refresh_token_from_value(value):
	token_hash = _hash_token(value)
	return RefreshToken.objects.get(token_hash=token_hash)

def _hash_token(token):
	return hashlib.sha256(token.encode()).hexdigest()


# urls.py
from .views import RefreshView

urlpatterns = [
	# ...
	path("refresh", RefreshView.as_view()),
	# ...
]

We also need to update /authenticate so that it also returns the initial refresh token:

# views.py
class AuthenticateView(APIView):
	# ... as before

	def post(self, request):
		serializer = AuthenticateSerializer(data=request.data)
		serializer.is_valid(raise_exception=True)
		access_token = tokens.get_access_token(serializer.user)
		# Get refresh token too
		refresh_token_value = tokens.get_refresh_token(serializer.user)
		response = Response({"access_token": access_token})
		tokens.add_refresh_token_to_response(refresh_token_value, response)
		return response

Revoking Tokens

We're almost there now. One loose end to clear up is that, as clients have no control over what HTTP-only cookies they send or store, there is no way for browser-based frontend web apps to 'sign out' - which in our terms means removing that refresh token from the local cookie store.

We need a way for clients to be able to clear their local refresh token and remove it from the database. We can just extend our /refresh endpoint to also take DELETE requests:

# views.py
class RefreshView(APIView):
	# ... handle_exception and post as before

	def get_refresh_token(self, ignore_errors=False):
		refresh_token = self.request.COOKIES.get("refresh_token")
		if not refresh_token and ignore_errors: return None
		if not refresh_token and not ignore_errors:
			raise AuthenticationFailed("No refresh token")
		try:
			rt = tokens.get_refresh_token_from_value(refresh_token)
		except RefreshToken.DoesNotExist:
			if ignore_errors: return None
			raise AuthenticationFailed("Invalid refresh token")
		if rt.is_expired and not ignore_errors:
			raise AuthenticationFailed("Refresh token has expired")
		return rt
	
	def delete(self, request):
		if (refresh_token := self.get_refresh_token(ignore_errors=True)):
			refresh_token.delete()
		response = Response(status=204)
		# The path must match the one the cookie was set with
		response.delete_cookie("refresh_token", path="/refresh")
		return response

The delete method is straightforward - delete the refresh token in the database, and send a 'delete the cookie' instruction in the response. We also modify the logic for getting the cookie so that it can just fail silently when needed - we don't need to return an error response if the incoming token is bad, we just delete and assure the user it's gone.

However, earlier we noted that one of the benefits of deleting an incoming refresh token when it is used is that this alerts the user when someone else uses their refresh token. But there is no point in alerting a user to the possibility that a refresh token has been compromised if they can't take any action in response. We need a way for them to say 'revoke all my refresh tokens - anyone needing a refresh token will need to supply the password to get a new one'.

We can add a /refresh/all endpoint for this, which takes a DELETE request and simply deletes all the user's refresh tokens. You don't need to send a refresh token to this endpoint - you identify yourself via access token like any other regular endpoint.

# views.py
class RefreshAllView(APIView):
	permission_classes = [IsAuthenticated]
	authentication_classes = [AccessTokenAuthentication]

	def delete(self, request):
		RefreshToken.objects.filter(user=request.user).delete()
		response = Response(status=204)
		# The path must match the one the cookie was set with
		response.delete_cookie("refresh_token", path="/refresh")
		return response


# urls.py
from .views import RefreshAllView

urlpatterns = [
	# ...
	path("refresh/all", RefreshAllView.as_view()),
	# ...
]

With these two token types then - access tokens and refresh tokens, one a short-lived self-describing JWT and the other a long-lived database-stored cookie credential - we can provide a way for users to authenticate themselves in a way that balances security and convenience well. The password never has to be stored, and most requests to us use unique, short-lived access tokens. Our final flow is:

Users send their desired username and password to /users.
We validate these and add the user to the database.
The user sends their new username and password to /authenticate.
We check this is a valid user, and the password hashes to the correct hash.
We generate an access token and a refresh token and return these to the user - the latter as a 'set-cookie' instruction
The user's client discards the password, stores the access token in its memory, and stores the refresh token in a secure cookie storage.
The user sends requests to other endpoints with the access token in the Authorization header.
We verify this, get the correct user object from the database, and assign it to the request where it can be accessed by the views.
When the access token is about to run out, the user sends a request to /refresh with the cookie.
We hash this, find any refresh token in the database with this hash, get the associated user, and generate an access token for the user. We also generate a new refresh token to replace the old one.
The client replaces its access token with the new one.
To remove the refresh token from cookie storage, the user sends a DELETE request to /refresh.
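We delete the refresh token from the database and send back a 'delete cookie' instruction - the client now has no refresh token, and no way to obtain new access tokens, unless the password is used again.
If the user suspects a refresh token has been stolen, they send a DELETE request to /refresh/all with their access token, and we delete every refresh token belonging to them.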

Possible Extensions

The above flow covers the basics, but there are lots of ways it can be extended to be more useful.

User CRUD

We added endpoints for creating and reading our user objects - but not updating or deleting. People will want to be able to edit their username, change their password, and delete their accounts, and may be frustrated if they can't do so. You can enable this by just extending MeView to also support PATCH and DELETE requests.

However, it can sometimes be more convenient to keep the user object in the database and set some is_deleted flag to True instead. Deleting a user will cascade a deletion to all objects linked by foreign key to it, which depending on what your application is doing, may not be desired. However there may also be legal restrictions on what you can retain of a user's data after they have deleted their account. For all these reasons, account deletion will be somewhat use-case specific, and outside the scope of what we are doing here.

Refresh token cleanup

Currently we delete refresh tokens when they come in, but if a user never uses a refresh token once it's issued, it never gets deleted from the database, and they just accumulate there. This is seldom a massive problem, but eventually it may make the table quite large if you have many users. A periodic task to delete them once expired will solve this, provided you have some kind of asynchronous worker system set up.
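The cleanup itself is a single query. Here is a minimal sketch, assuming the RefreshToken model from earlier; the function name delete_expired_refresh_tokens is made up, and it takes the manager as an argument so that it does not depend on how your periodic task is run.

```python
from datetime import datetime, timezone


def delete_expired_refresh_tokens(tokens, now=None):
    """Delete refresh tokens whose expiry has passed; return how many were removed.

    `tokens` is expected to behave like a Django manager or queryset,
    for example RefreshToken.objects.
    """
    now = now or datetime.now(timezone.utc)
    # QuerySet.delete() returns (total_deleted, per_model_counts)
    deleted_count, _ = tokens.filter(expires_at__lt=now).delete()
    return deleted_count
```

A management command or scheduled job could then call delete_expired_refresh_tokens(RefreshToken.objects) once a day or so.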

Rate limiting authentication

While you generally never want people to be sending a constant stream of requests to any endpoint, our /authenticate endpoint needs particular care and protection here, because there is currently nothing stopping somebody brute-forcing a user's password - sending request after request trying to guess it. I have omitted its handling from this discussion, as it generally requires caching to be set up, which again is outside the scope of this explanation of auth flows. But you really shouldn't deploy a production app without some kind of protection here.
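To give a feel for what such protection involves, here is a toy sliding-window limiter. It keeps its state in process memory, which is why a real deployment would want a shared cache instead; the class name AttemptLimiter and the default limits are assumptions for illustration.

```python
import time
from collections import defaultdict, deque


class AttemptLimiter:
    """Allow at most `max_attempts` per key within a sliding window of `window_seconds`."""

    def __init__(self, max_attempts=5, window_seconds=60):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # key -> timestamps of recent attempts

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        attempts = self._attempts[key]
        # Drop attempts that have fallen out of the window
        while attempts and attempts[0] <= now - self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True
```

The /authenticate view could call allow with the submitted username (or the client's IP address) as the key, and return an error response without checking the password when it returns False.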

User emails

Our user model currently has no email field, which is fine, but may cause issues. If a user forgets their password, what do they do? As long as you're fine with storing one extra piece of personal information, it is usually convenient to take an email address and have the user confirm it through a 'verify your email' flow.

This also lets you do password reset. People generally expect a way to 'get back in' to their account if they forget their password. The usual solution here is to assume that only the user has access to their email account, and to send a separate token there which will let them set a new password when used - in effect delegating some of our security to the (presumably) tight restrictions on the email account.

OAuth

Everything we've built so far assumes users will create accounts directly with us - providing a username and password that we store (hashed) in our database. But many users prefer to sign in with an existing account from Google, GitHub, or similar providers. This is what OAuth enables.

OAuth is a protocol that lets users grant our application limited access to their account on another service, without sharing their password with us. The precise details are well outside the scope of this article, but many APIs will either offer this as an alternative means of logging in, or even only offer OAuth login.

Though even in that case, OAuth doesn't replace our authentication system - it replaces the password verification step. Once we've confirmed the user's identity via Google (or whichever provider), we issue them the same JWTs and refresh tokens we've been using all along. From that point on, the flow is identical.

Closing Thoughts

There are lots of ways to do authentication. There are lots of ways to get it wrong - and scarily wrong, with severe security consequences for your users. The flow outlined here is close to the optimum balance of convenience and ease, without sacrificing security where it matters.

However you choose to do it, you should have a mental model of how it works, even if you only have to call upon it when something goes wrong. And while third party libraries are great for when you already understand what they do, if you treat them as magic boxes, eventually you will get stuck. Implementing everything from scratch is rarely the right call for production apps - but doing it at least once when you're starting out will teach you more than reading about it ever will.