You write nested. Zef stores flat. The glue between.
In code, we think nested:
from dataclasses import dataclass
@dataclass
class User:
    name: str
    emails: list[str]

alice = User(name='Alice', emails=['a@x', 'a@y'])
In SQL, we must shred it flat: a list of emails can't live in a single column, so we make a second table and wire the two together with foreign keys.
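The shredding described above can be sketched with Python's built-in sqlite3 module. The table and column names (users, user_emails, user_id) are assumptions made up for illustration; they do not come from any real schema.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE user_emails (
        user_id INTEGER NOT NULL REFERENCES users(id),
        email   TEXT NOT NULL
    );
""")

# One nested object becomes one row in `users` plus one row per email.
cur = conn.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))
alice_id = cur.lastrowid
conn.executemany(
    "INSERT INTO user_emails (user_id, email) VALUES (?, ?)",
    [(alice_id, "a@x"), (alice_id, "a@y")],
)

# Reassembling the nested shape requires a JOIN across the foreign key.
rows = conn.execute(
    """
    SELECT u.name, e.email
    FROM users AS u
    JOIN user_emails AS e ON e.user_id = u.id
    ORDER BY e.email
    """
).fetchall()
print(rows)  # [('Alice', 'a@x'), ('Alice', 'a@y')]
```

The nested object is spread over two tables, and every read that wants the original shape has to stitch it back together.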
You write:
ET.User(name='Alice', emails_=['a@x', 'a@y'])
Zef stores:
Distinct nodes for the User, for "Alice", for each email. The User node is a bare identity; the edges carry the information.
Mental model rules:
field=value → (node) --[field]--> value-node

| | entity node | value node |
|---|---|---|
| role | identity anchor | data carrier |
| e.g. | ET.User, ET.Page | "Alice", 42, True |
| internal structure | none (bare identity) | the data |
| two equal ones are… | still distinct (different identity) | the same node (by value) |
This is the Wittgenstein-Tractatus model: "objects are simple", meaning they carry no intrinsic structure. Structure emerges from the edges between them.
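The entity-node versus value-node distinction can be illustrated in plain Python. This is only a toy model of the idea, not Zef's implementation; EntityNode, ValueNode, and ValueStore are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


class EntityNode:
    """Bare identity: a type tag and nothing else.

    No __eq__ override, so two entity nodes are equal only if they
    are the very same object.
    """

    def __init__(self, type_name):
        self.type_name = type_name


@dataclass(frozen=True)
class ValueNode:
    """Data carrier: the node is fully described by its value."""
    value: object


class ValueStore:
    """Interns value nodes so that equal values share one node."""

    def __init__(self):
        self._by_value = {}

    def node_for(self, value):
        # Key on (type, value) so that 1 and True stay separate nodes,
        # even though 1 == True in Python.
        key = (type(value), value)
        return self._by_value.setdefault(key, ValueNode(value))


store = ValueStore()

u1 = EntityNode("User")
u2 = EntityNode("User")
print(u1 == u2)   # False: same type, still distinct identities

a1 = store.node_for("Alice")
a2 = store.node_for("Alice")
print(a1 is a2)   # True: equal values resolve to the same node
```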
In Zef, what an entity has isn't stored inside it. It's stored on the edges leading out of it. An edge carries:
(source entity) ──[RT.relation_name]──▶ (target node)
RT is "relation type", the cousin of ET. The same rules apply: you invent relation types on the fly.
Physically, every field is a set of outgoing edges. It might have 0, 1, or 37 edges; the storage mechanism is the same.
user | Out[RT.nickname] | collect # [] no edges
user | Out[RT.name] | collect # ['Alice'] one edge
user | Out[RT.email] | collect # ['a@x', 'a@y'] two edges
Going from "one name" to "many names" isn't a schema change. The storage is already "many." You just start writing / reading the plural form.
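A toy edge store in plain Python shows why cardinality is not a schema concern under this model. EdgeStore and its methods are hypothetical, written only for this sketch. It keeps targets in insertion order for readable output, which is an assumption and says nothing about how Zef orders edges.

```python
from collections import defaultdict


class EdgeStore:
    """Every field is just a collection of (source, relation) -> targets."""

    def __init__(self):
        self._edges = defaultdict(list)

    def add(self, source, relation, target):
        self._edges[(source, relation)].append(target)

    def out(self, source, relation):
        # .get avoids creating an empty entry for relations never written.
        return list(self._edges.get((source, relation), []))


store = EdgeStore()
store.add("user-1", "name", "Alice")

print(store.out("user-1", "nickname"))  # []          no edges
print(store.out("user-1", "name"))      # ['Alice']   one edge

# Going from one name to many: just add another edge.
store.add("user-1", "name", "Ali")
print(store.out("user-1", "name"))      # ['Alice', 'Ali']
```

Nothing about the store changed between the single-name and the multi-name case; only the number of edges did.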
Contrast with SQL: ALTER TABLE users DROP COLUMN name; CREATE TABLE user_names (...);
plus a data migration plus ORM retooling.
A Graph is a value (like a list). You construct one from entity declarations:
g = Graph([
    ET.Person(
        '🍃-01abc...',
        name='Alice',
        lives_in=ET.City(
            '🍃-02def...',
            name='Berlin',
            population=3_600_000,
        ),
        friend_=[
            ET.Person('🍃-03ghi...', name='Bob'),
            ET.Person('🍃-04jkl...', name='Carol'),
        ],
    ),
])
# commit to the graph store (so we can query it later)
g.add_to_graph_store()
Zef normalizes this tree of declarations into distinct entities + edges. The "Berlin" city is a single node, even if 100 Persons live there. The same goes for "Alice" as a value: it is reused wherever it appears.
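The normalization step can be approximated in plain Python. This sketch assumes declarations are nested dicts with hypothetical "type" and "id" keys, and the normalize function is invented here. It models the idea only, not how Zef does it internally.

```python
def normalize(decl, entities, values, edges):
    """Flatten one nested declaration; return the entity's id."""
    eid = decl["id"]
    entities[eid] = decl["type"]  # same id declared twice -> same node
    for field, target in decl.items():
        if field in ("id", "type"):
            continue
        targets = target if isinstance(target, list) else [target]
        for t in targets:
            if isinstance(t, dict):
                # Nested entity: recurse, then link by identity.
                tid = normalize(t, entities, values, edges)
                edges.add((eid, field, ("entity", tid)))
            else:
                # Plain value: one shared node per distinct value.
                values.add(t)
                edges.add((eid, field, ("value", t)))
    return eid


berlin = {"type": "City", "id": "c1", "name": "Berlin"}
people = [
    {"type": "Person", "id": "p1", "name": "Alice", "lives_in": berlin},
    {"type": "Person", "id": "p2", "name": "Bob", "lives_in": berlin},
]

entities, values, edges = {}, set(), set()
for person in people:
    normalize(person, entities, values, edges)

print(len(entities))  # 3: two Persons and a single Berlin
print(len(values))    # 3: 'Alice', 'Bob', 'Berlin'
print(len(edges))     # 5: Berlin's name edge is stored once
```

Berlin is declared inside both Persons, yet it ends up as one entity with one name edge, and both lives_in edges point at it.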
# all Persons in the graph
people = g | all(ET.Person) | collect
# find Alice
alice = g | all(ET.Person) | filter(F.name == 'Alice') | first | collect
# her friends' names
names = alice | Fs.friend | map(F.name) | collect
# {'Bob', 'Carol'}
# her city
city_name = alice | F.lives_in | F.name | collect # 'Berlin'
A Graph is just another value, like a string or a dict. You can:
.zef file

| SQL | Zef |
|---|---|
| table | entity type (ET.Person) |
| row | entity node |
| column | relation type (RT.name) |
| cell | target value node |
| foreign key | edge |
| NULL | edge absent (empty set) |
| JOIN | follow the edge |
Lose: compile-time schema enforcement. There's no DDL.
Gain: evolution without migrations. Sparse, optional fields for free. Many-to-many without join tables. Graph traversals that read like English. Refinement types when you do want constraints.
When you pull an entity from a stored graph, you get a graph reference:
alice_ref = g | all(ET.Person) | first | collect
print(alice_ref) # ET.Person('🕸️-1-...')
The 🕸️- prefix means "graph-local reference." It points to a specific node in this specific graph. Chain ZefOps to traverse out from it.
g = Graph([
    ET.User('🍃-001a...', name='Alice', age=30,
            email_={'a@x.com', 'a@y.com'}),
    ET.User('🍃-002b...', name='Bob', age=25),
    ET.User('🍃-003c...', name='Carol', age=42,
            email_={'c@x.com'}),
])
g.add_to_graph_store()
# Which users have zero emails?
g | all(ET.User) | filter(Fs.email | length == 0) | map(F.name) | collect
# ['Bob']
# average age of users with emails
(g | all(ET.User)
   | filter(Fs.email | length > 0)
   | map(F.age) | apply({'sum': reduce(add), 'n': length})
   | apply(Z['sum'] / Z['n']) | collect)
# 36.0
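As a cross-check on the two expected results, the same questions can be asked of plain Python dicts. This deliberately does not use Zef; the `users` list below is a hand-written mirror of the graph above, with Bob's missing emails modelled as an empty set.

```python
users = [
    {"name": "Alice", "age": 30, "emails": {"a@x.com", "a@y.com"}},
    {"name": "Bob", "age": 25, "emails": set()},
    {"name": "Carol", "age": 42, "emails": {"c@x.com"}},
]

# Which users have zero emails?
no_email = [u["name"] for u in users if len(u["emails"]) == 0]
print(no_email)  # ['Bob']

# Average age of users with at least one email.
ages = [u["age"] for u in users if len(u["emails"]) > 0]
avg_age = sum(ages) / len(ages)
print(avg_age)   # 36.0
```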
For the graph above, draw the nodes and edges on a napkin. How many distinct value-nodes are there for "Alice", "Bob", "Carol"? (Hint: 3.) How many distinct email value-nodes? (Hint: 3.)
Next up: updating graphs (field= vs field_= vs this+…)