This article is going to discuss several states of ORM instance when using SQLAlchemy.

Following is a brief diagram showing the relationships of states in SQLAlchemy:

---
title: SQLAlchemy States Explained
---
stateDiagram-v2
    classDef transientStyle fill:transparent
    classDef okStyle fill:#55ff5533
    classDef warnStyle fill:#ff555533
    direction LR
    [*]-->Transient:::transientStyle
    Transient --> Pending:::transientStyle: Session.add()
    Pending --> Persist:::okStyle: Session.flush()
    Pending --> Persist: Session.execute(select())
    Pending --> Transient: Session.close()
    Persist --> Dirty:::warnStyle: ORM.attr = ...
    Dirty --> Persist: Session.flush()
    Persist  --> Detached:::transientStyle: Session.close()

Transient

This state actually means an ORM instance has been created, but not linked to any session. Therefore, we could consider this ORM instance actually just a pure Python object with no other special ORM features.

Pending

An Transient object turns into Pending once it's been added to a session using Session.add()

bob = User(name='bob') # transient

ss = Session()
ss.add(bob) # pending

Pending instances are tracked by SQLAlchemy, but not yet synced to database.

SQLAlchemy allows us to check all pending instances in a session:

>>> ss.new
IdentitySet([User(name='bob'), ...])

Persist

After Session.flush(), all pending instances of this session becomes persist, which means the addition is synced to database.

Also notice:

  • Session.commit() get called.
  • Session.execute() get called to perform SELECT oprations.

These operations above would trigger Session.flush() automatically.

To check if an ORM instance is in the persist set of a session, we could do something like below:

>>> u: User
>>> ss: Session
>>> u in ss
True

Or we could also access the Identity Set of this session directly.

>>> ss.identity_set
...

Dirty

An instance turns from Persist into Dirty if some attributes of it has been changed.

user = Session.get(User, 1)  # retrieve user with primary key == 1
user.name = 'New User Name'  # attributes changed, become dirty

Like Pending, Session allow us to check all dirty instances using Session.dirty

>>> ss: Session
>>> ss.dirty
Set([ORMInstance(), ...])

Based on the SQLAlchemy docs, Session.dirty is calculated and created on the fly everytime it got accessed.

Detached

Once Session.commit() or Session.close() being called, all ORM entities associated to this Session is expired and turns into Detached state.

If we want to keep using these entities after commit or after session close, we could set Engine.expire_on_commit = False

engine = Engine(..., expire_on_comit=False)

Checkout SQLAlchemy - Opening & Closing a Session for more info.

Instance Caching

Concepts

SQLAlchemy has mechanisms to cache ORM entities in Session.identity_set that currently in persist state.

Identity sets distinguish instances using PRIMARY KEY columns. If an ORM entity has been requested multiple times in a single session, no SQL operation will be emitted, the one in Identity Set will be reused, even if the select condition is different.

# first query, SQL emitted
user = ss.execute(
  select(User).where(User.uid == 1)
)

# doesn’t work, User.addresses would not load
user = ss.execute(
  select(User).where(User.uid == 1)
  .options(selectinload(User.addresses))
)

Force Refresh Despite Cached

To force SQLAlchemy to refetch info from database despite the cached ORM entities, we need to use execution_options():

select(User)
.where(User.uid == 1)
.execution_options(populate_existing=True)

Note that by forcing re-populate the ORM entity, we will lose all changes in Pending state that still unflushed into SQL databse.

Checkout SQLAlchemy - Populate Existing for more info.

Refs