entities & identity

ET, UIDs, and the 🍃 prefix family.

an entity is a typed thing

alice = ET.Person(name='Alice', age=30)
type(alice)       # Entity_

ET.Person is an entity type. You invent them on the fly: ET.Car, ET.Invoice, ET.Tamagotchi. The first time Zef sees a new type name, it assigns it a compact binary encoding.

wait — I don't have to define a class?

Correct. There's no schema declaration, no field list, no constructor you have to implement. ET.X is just "a new kind of thing" — and Zef figures out the fields from the kwargs you pass.

This is in the spirit of the "zero-one-infinity rule", applied to schemas: don't impose arbitrary constraints up front. You can add refinement types later if you want validation.

the mental model

an entity is a name for a thing

An entity itself carries no data. Its fields are edges pointing to other values. Think of ET.Person(...) as a little empty box labeled "person" — the contents come from the outgoing edges.

ET.Person   ← the box (the identity)
│
├──[name]──▶ "Alice"
├──[age]──▶ 30
└──[loves]──▶ ET.Cat(name="Mittens")
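To make the box-and-edges picture concrete, here is a minimal sketch in plain Python. It does not use Zef at all: `Box` and `edges_of` are hypothetical names invented for this illustration, and Zef's actual storage may well look different. The point is only that the box holds a type label, while every field lives in a separate edge.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False: two boxes are only "the same" if they are the same object
class Box:
    """Hypothetical stand-in for an entity: a type label and nothing else."""
    type_name: str


def edges_of(box, fields):
    """Turn a dict of fields into (source, field_name, target) triples."""
    return [(box, name, target) for name, target in fields.items()]


person = Box("Person")
cat = Box("Cat")

# The data lives in the edges, not in the boxes.
edges = (
    edges_of(person, {"name": "Alice", "age": 30, "loves": cat})
    + edges_of(cat, {"name": "Mittens"})
)

for source, field_name, target in edges:
    print(source.type_name, f"--[{field_name}]-->", target)
```

Note that the target of an edge can be a plain value ("Alice", 30) or another box (the cat), which mirrors the diagram.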

the three flavors of identity

An entity can be identified at three levels, from not at all to globally unique:

Level 1 — no identity

alice_a = ET.Person(name='Alice')
alice_b = ET.Person(name='Alice')
# these are two distinct entities, even though they look the same

Bare entities are "value-like" — useful when you don't care about tracking them across operations.

Level 2 — local index (graph-scoped)

ET.Person(137)    # "the Person with local index 137 in this graph"

Integer indexes are graph-local and deterministic: the Nth entity of this type as counted from 0. Useful for scripting, test fixtures, bulk loads. But the number means nothing outside this one graph.
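A toy model shows why a local index only makes sense inside one graph. This is a plain-Python sketch, not Zef code: `ToyGraph`, `add`, and `get` are hypothetical names, and the only behavior assumed from the text above is "one counter per type, starting at 0".

```python
from collections import defaultdict


class ToyGraph:
    """Hypothetical graph that hands out per-type indexes, counted from 0."""

    def __init__(self):
        self._by_type = defaultdict(list)

    def add(self, type_name, **fields):
        """Store the fields and return the new entity's local index."""
        bucket = self._by_type[type_name]
        bucket.append(fields)
        return len(bucket) - 1

    def get(self, type_name, index):
        """Look up an entity by type and local index."""
        return self._by_type[type_name][index]


g1 = ToyGraph()
g1.add("Person", name="Alice")   # Person 0 in g1
g1.add("Person", name="Bob")     # Person 1 in g1

g2 = ToyGraph()
g2.add("Person", name="Carol")   # Person 0 in g2

# Same type, same index, different graphs: different entities.
print(g1.get("Person", 0))
print(g2.get("Person", 0))
```

Inserting in the same order always yields the same indexes, which is what makes them handy for fixtures. Nothing ties index 0 in one graph to index 0 in another.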

Level 3 — global UID (platonic)

ET.Person('🍃-97421467198ef6d64520', name='Alice')

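A UID is globally unique: the 🍃- prefix followed by 20 hex characters (80 bits). It is generated randomly, so no coordination between machines is needed. With that much randomness, a collision is vanishingly unlikely rather than strictly impossible.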

why UIDs are a "conscious choice"

Generating a UID calls rand(), which is impure. Every UID is an act of tagging a conceptual thing from the real world and pulling it into your data universe. That's a modeling decision — not something you want happening silently.

So in Zef, UID generation is explicit: either you pass one in, or you call generate_uid() or FX.Random(tp=UID) | run.
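For a feel of what "🍃 prefix + 20 hex characters" means in practice, here is a sketch using only the Python standard library. `make_leaf_uid` and `looks_like_leaf_uid` are hypothetical helpers written for this page. They are not Zef's `generate_uid()`, whose internals are not shown here, and the lowercase-hex assumption comes only from the example UID above.

```python
import re
import secrets

# Assumption: the leaf prefix, a dash, then exactly 20 lowercase hex characters.
LEAF_UID = re.compile(r"🍃-[0-9a-f]{20}")


def make_leaf_uid():
    """Hypothetical stand-in for a UID generator: 10 random bytes as 20 hex chars."""
    return "🍃-" + secrets.token_hex(10)


def looks_like_leaf_uid(text):
    """Check the shape of a string; says nothing about whether the UID exists anywhere."""
    return LEAF_UID.fullmatch(text) is not None


uid = make_leaf_uid()
print(uid)
print(looks_like_leaf_uid(uid))
print(looks_like_leaf_uid("🍃-97421467198ef6d64520"))
print(looks_like_leaf_uid("🍃-abc"))
```

`make_leaf_uid` takes no arguments yet returns a different value on every call. That is exactly the impurity described above, and the reason to keep the call visible in your code.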

the 🍃 prefix family

Zef uses different prefix emojis to signal different kinds of identity. Seeing the prefix, you know what kind of thing you're dealing with:

prefix	meaning	example
`🍃-`	platonic UID — a global identity	`🍃-97421467198ef6d64520`
`🧊-`	snapshot UID — a specific DB state at a point in time	`🧊-6dc2ec6a470d33ec919b-...`
`🕸️-`	graph-local ref — position inside one graph	`🕸️-1-...`
`🗿-`	content-hashed value — identifies by content	`🗿-abc...`

Yes, real Zef code has emojis in UIDs. You get used to it quickly. They're visual handles that make logs and error messages instantly categorizable.
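Because the prefix carries the category, sorting identifiers out of a log is a one-liner. The sketch below is plain Python and assumes only the four prefixes in the table. `kind_of` is a hypothetical helper, not part of Zef. It also strips the invisible emoji variation selector (U+FE0F), which some editors attach to 🕸️ and others drop.

```python
VS16 = "\ufe0f"  # invisible variation selector that may follow an emoji

_RAW_PREFIXES = {
    "🍃": "platonic UID",
    "🧊": "snapshot UID",
    "🕸️": "graph-local ref",
    "🗿": "content-hashed value",
}

# Normalize the keys so a lookup works with or without the variation selector.
PREFIX_KINDS = {emoji.replace(VS16, ""): kind for emoji, kind in _RAW_PREFIXES.items()}


def kind_of(identifier):
    """Return the identity kind signalled by the prefix, or None if unrecognized."""
    head, dash, _rest = identifier.replace(VS16, "").partition("-")
    if not dash:
        return None
    return PREFIX_KINDS.get(head)


for line in ["🍃-97421467198ef6d64520", "🧊-6dc2ec6a470d33ec919b-...", "not-an-id"]:
    print(line, "->", kind_of(line))
```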

how to make a UID

# manually, if you need a specific one
ET.Person('🍃-97421467198ef6d64520', name='Alice')

# generated at runtime
uid = generate_uid()                      # '🍃-...20 hex...'
ET.Person(uid, name='Alice')

# via the FX system (when you want it logged/replayable)
uid = FX.Random(tp=UID) | run
ET.Person(uid, name='Alice')

fields: single vs multi-value

The most important syntactic rule about entities:

single value: use the bare field name

ET.Person(name='Alice')

"name is one thing"

multi value: add a trailing _

ET.Person(likes_={'🍔', '🍺'})

"likes is a SET"

And for an ordered collection:

ET.Person(visited_=[
    ET.City(name='Berlin'),
    ET.City(name='Paris'),
    ET.City(name='Tokyo'),
])

Square brackets (list) = ordered. Curly braces (set) = unordered.

field_ is load-bearing

Dropping the trailing underscore switches semantics entirely. likes means "one like" (Zef will error if you give a set). likes_ means "the set of likes" (Zef expects a collection). Always add the underscore when the field can have more than one value.
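The rule is easy to model in a few lines of plain Python. This is a toy, not Zef's implementation: `split_fields` is a hypothetical function, and treating a list passed to a bare field as an error is an assumption (the text above only mentions sets).

```python
def split_fields(**kwargs):
    """Toy model of the trailing-underscore rule: returns (single, multi) dicts."""
    single, multi = {}, {}
    for key, value in kwargs.items():
        is_collection = isinstance(value, (list, set))
        if key.endswith("_"):
            # field_ means "many": a list (ordered) or a set (unordered)
            if not is_collection:
                raise TypeError(f"{key} expects a list or set, got {type(value).__name__}")
            multi[key[:-1]] = value
        else:
            # bare field means "exactly one value"
            if is_collection:
                raise TypeError(f"{key} holds one value; did you mean {key}_?")
            single[key] = value
    return single, multi


single, multi = split_fields(name="Alice", likes_={"🍔", "🍺"})
print(single)
print(multi)

try:
    split_fields(likes={"🍔", "🍺"})   # forgot the underscore
except TypeError as err:
    print("error:", err)
```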

a fuller example — a graph fragment

company = ET.Company(
    '🍃-01abcdef01234567890a',                # global identity
    name='Green Widgets Inc',
    founded=2020,
    ceo=ET.Person(
        '🍃-02cafebabe...000b',
        name='Alice',
    ),
    employees_=[                              # ordered list
        ET.Person(name='Bob',   role='Engineer'),
        ET.Person(name='Carol', role='Designer'),
    ],
    tags_={'b-corp', 'startup'},           # unordered set
)

entity ≈ "labeled dict with identity"

Reaching for an analogy? An entity is like a Python dict, plus:

a type label on the front (ET.Something)
an optional identity (UID or local index)
a rule about single vs multi-value fields
value semantics — bytes you can copy

When you persist it, it becomes graph nodes + edges. When you read it back, you get the same shape.

identity equality

a = ET.Person('🍃-abc...', name='Alice')
b = ET.Person('🍃-abc...', name='Alice')

a == b        # True — same UID, same fields

c = ET.Person('🍃-abc...', name='Bob')
a == c        # False — same UID, different fields
a.same_entity_as(c)   # True — same UID (identity match)

Equality considers both identity and fields. If you want "are these the same thing, regardless of their current state?", use same_entity_as or compare UIDs.
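If the two kinds of comparison feel slippery, a small plain-Python model may help. `ToyEntity` is a hypothetical class written for this page and covers only entities that carry a UID. The UID string is a made-up placeholder, and nothing here claims to be how Zef implements `==` or `same_entity_as`.

```python
class ToyEntity:
    """Hypothetical model: equality looks at everything, identity only at the UID."""

    def __init__(self, type_name, uid, **fields):
        self.type_name = type_name
        self.uid = uid
        self.fields = fields

    def __eq__(self, other):
        if not isinstance(other, ToyEntity):
            return NotImplemented
        # Full equality: same type, same UID, same field values.
        return (self.type_name, self.uid, self.fields) == (
            other.type_name, other.uid, other.fields)

    def same_entity_as(self, other):
        # Identity match: the UID alone decides.
        return self.uid == other.uid


UID = "🍃-00000000000000000abc"  # placeholder UID for illustration only

a = ToyEntity("Person", UID, name="Alice")
b = ToyEntity("Person", UID, name="Alice")
c = ToyEntity("Person", UID, name="Bob")

print(a == b)
print(a == c)
print(a.same_entity_as(c))
```

One way to read it: `==` asks "is this the same snapshot of the thing?", while `same_entity_as` asks "is this the same thing?".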

quick practice

Define an entity ET.Book with:

a UID of your choice
a title
multiple authors (as a list of ET.Person)
a set of tags

answer

ET.Book(
    '🍃-deadbeefdeadbeef1234',
    title='The Zef Zine',
    authors_=[
        ET.Person(name='Ada'),
        ET.Person(name='Bea'),
    ],
    tags_={'functional', 'python', 'zines'},
)

Next up: the critical F vs Fs distinction (one-or-many field access).