This article is going to discuss several states of ORM instance when using SQLAlchemy.
Following is a brief diagram showing the relationships of states in SQLAlchemy:
---
title: SQLAlchemy States Explained
---
stateDiagram-v2
classDef transientStyle fill:transparent
classDef okStyle fill:#55ff5533
classDef warnStyle fill:#ff555533
direction LR
[*]-->Transient:::transientStyle
Transient --> Pending:::transientStyle: Session.add()
Pending --> Persist:::okStyle: Session.flush()
Pending --> Persist: Session.execute(select())
Pending --> Transient: Session.close()
Persist --> Dirty:::warnStyle: ORM.attr = ...
Dirty --> Persist: Session.flush()
Persist --> Detached:::transientStyle: Session.close()
Transient
This state actually means an ORM instance has been created, but not linked to any session. Therefore, we could consider this ORM instance actually just a pure Python object with no other special ORM features.
Pending
An Transient
object turns into Pending
once it's been added to a session using Session.add()
bob = User(name='bob') # transient
ss = Session()
ss.add(bob) # pending
Pending instances are tracked by SQLAlchemy, but not yet synced to database.
SQLAlchemy allows us to check all pending instances in a session:
>>> ss.new
IdentitySet([User(name='bob'), ...])
Persist
After Session.flush()
, all pending instances of this session becomes persist, which means the addition is synced to database.
Also notice:
- Session.commit() get called.
-
Session.execute() get called to perform
SELECT
oprations.
These operations above would trigger Session.flush()
automatically.
To check if an ORM instance is in the persist set of a session, we could do something like below:
>>> u: User
>>> ss: Session
>>> u in ss
True
Or we could also access the Identity Set of this session directly.
>>> ss.identity_set
...
Dirty
An instance turns from Persist
into Dirty
if some attributes of it has been changed.
user = Session.get(User, 1) # retrieve user with primary key == 1
user.name = 'New User Name' # attributes changed, become dirty
Like Pending
, Session allow us to check all dirty instances using Session.dirty
>>> ss: Session
>>> ss.dirty
Set([ORMInstance(), ...])
Based on the SQLAlchemy docs,
Session.dirty
is calculated and created on the fly everytime it got accessed.
Detached
Once Session.commit()
or Session.close()
being called, all ORM entities associated to this Session is expired and turns into Detached
state.
If we want to keep using these entities after commit or after session close, we could set Engine.expire_on_commit = False
engine = Engine(..., expire_on_comit=False)
Checkout SQLAlchemy - Opening & Closing a Session for more info.
Instance Caching
Concepts
SQLAlchemy has mechanisms to cache ORM entities in Session.identity_set
that currently in persist
state.
Identity sets distinguish instances using PRIMARY KEY
columns. If an ORM entity has been requested multiple times in a single session, no SQL operation will be emitted, the one in Identity Set will be reused, even if the select condition is different.
# first query, SQL emitted
user = ss.execute(
select(User).where(User.uid == 1)
)
# doesn’t work, User.addresses would not load
user = ss.execute(
select(User).where(User.uid == 1)
.options(selectinload(User.addresses))
)
Force Refresh Despite Cached
To force SQLAlchemy to refetch info from database despite the cached ORM entities, we need to use execution_options()
:
select(User)
.where(User.uid == 1)
.execution_options(populate_existing=True)
Note that by forcing re-populate the ORM entity, we will lose all changes in Pending
state that still unflushed into SQL databse.
Checkout SQLAlchemy - Populate Existing for more info.
No comment