Adjacency List Relationships
In this example, we’ll work with a single mapped class called , representing a tree structure:
With this structure, a graph such as the following:
root --+---> child1
+---> child2 --+--> subchild1
| +--> subchild2
+---> child3
Would be represented with data such as:
id parent_id data
--- ------- ----
1 NULL root
2 1 child1
3 1 child2
4 3 subchild1
5 3 subchild2
6 1 child3
The relationship()
configuration here works in the same way as a “normal” one-to-many relationship, with the exception that the “direction”, i.e. whether the relationship is one-to-many or many-to-one, is assumed by default to be one-to-many. To establish the relationship as many-to-one, an extra directive is added known as , which is a Column
or collection of objects that indicate those which should be considered to be “remote”:
As always, both directions can be combined into a bidirectional relationship using the backref()
function:
class Node(Base):
__tablename__ = 'node'
id = Column(Integer, primary_key=True)
parent_id = Column(Integer, ForeignKey('node.id'))
data = Column(String(50))
children = relationship("Node",
backref=backref('parent', remote_side=[id])
)
There are several examples included with SQLAlchemy illustrating self-referential strategies; these include and XML Persistence.
A sub-category of the adjacency list relationship is the rare case where a particular column is present on both the “local” and “remote” side of the join condition. An example is the Folder
class below; using a composite primary key, the account_id
column refers to itself, to indicate sub folders which are within the same account as that of the parent; while refers to a specific folder within that account:
class Folder(Base):
__tablename__ = 'folder'
__table_args__ = (
['account_id', 'parent_id'],
['folder.account_id', 'folder.folder_id']),
)
account_id = Column(Integer, primary_key=True)
folder_id = Column(Integer, primary_key=True)
parent_id = Column(Integer)
name = Column(String)
parent_folder = relationship("Folder",
backref="child_folders",
remote_side=[account_id, folder_id]
)
Above, we pass account_id
into the list. relationship()
recognizes that the account_id
column here is on both sides, and aligns the “remote” column along with the folder_id
column, which it recognizes as uniquely present on the “remote” side.
However extra care is needed when attempting to join along the foreign key from one level of the tree to the next. In SQL, a join from a table to itself requires that at least one side of the expression be “aliased” so that it can be unambiguously referred to.
Recall from in the ORM tutorial that the aliased()
construct is normally used to provide an “alias” of an ORM entity. Joining from Node
to itself using this technique looks like:
from sqlalchemy.orm import aliased
nodealias = aliased(Node)
sqlsession.query(Node).filter(Node.data=='subchild1').\
join(Node.parent.of_type(nodealias)).\
filter(nodealias.data=="child2").\
all()
SELECT node.id AS node_id,
node.data AS node_data
FROM node JOIN node AS node_1
ON node.parent_id = node_1.id
WHERE node.data = ?
['subchild1', 'child2']
For an example of using to join across an arbitrarily long chain of self-referential nodes, see XML Persistence.
Eager loading of relationships occurs using joins or outerjoins from parent to child table during a normal query operation, such that the parent and its immediate child collection or reference can be populated from a single SQL statement, or a second statement for all immediate child collections. SQLAlchemy’s joined and subquery eager loading use aliased tables in all cases when joining to related items, so are compatible with self-referential joining. However, to use eager loading with a self-referential relationship, SQLAlchemy needs to be told how many levels deep it should join and/or query; otherwise the eager load will not take place at all. This depth setting is configured via relationships.join_depth
:
class Node(Base):
__tablename__ = 'node'
id = Column(Integer, primary_key=True)
parent_id = Column(Integer, ForeignKey('node.id'))
data = Column(String(50))
children = relationship("Node",
lazy="joined",
join_depth=2)
sqlsession.query(Node).all()
SELECT node_1.id AS node_1_id,
node_1.parent_id AS node_1_parent_id,
node_1.data AS node_1_data,
node_2.id AS node_2_id,
node_2.parent_id AS node_2_parent_id,
node_2.data AS node_2_data,
node.id AS node_id,
node.parent_id AS node_parent_id,
node.data AS node_data
FROM node
LEFT OUTER JOIN node AS node_2
ON node.id = node_2.parent_id
LEFT OUTER JOIN node AS node_1
ON node_2.id = node_1.parent_id