Spawn a whole process when a thread isn't enough.
| | actors | workers |
|---|---|---|
| unit | Tokio task (same process) | Python process (separate) |
| isolation | shared memory | OS-level separation |
| fault tolerance | crash = uncaught panic | crash = process exits, main survives |
| overhead | ~100 ns per message | ~milliseconds per message (IPC) |
| GIL | Python on the hot path → GIL-bound | each process has its own GIL |
| use for | concurrency, state machines | heavy CPU, untrusted code, crash containment |
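The fault-tolerance row is exactly the guarantee the OS gives any parent process, independent of Zef. A minimal illustration with Python's standard `multiprocessing` module:

```python
import multiprocessing as mp
import os

def crashy():
    # simulate a hard crash (e.g. a segfault in a C extension)
    os._exit(139)

if __name__ == '__main__':
    ctx = mp.get_context('fork')   # fork, so this also works when pasted into a script
    p = ctx.Process(target=crashy)
    p.start()
    p.join()
    # the child is gone, but the parent survives and can inspect the damage
    print(f'child exit code: {p.exitcode}; parent still running')
```

A thread dying like this would take the whole interpreter with it; a process dying leaves the parent with nothing worse than an exit code to look at.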
```python
worker = FX.StartWorkerProcess() | run

result = FX.SendCommandToWorkerProcess(
    process=worker,
    command=FX.Print(content='hello from worker!'),
) | run

FX.StopWorkerProcess(process=worker) | run
```
The command is an FX value. The worker receives it, runs it, and sends back the result, all over a binary protocol: no JSON, no serialization tax.
The worker is a full-fledged Python process that happens to have the Zef runtime loaded. Send it FX values; it runs them.
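The shape of that exchange (send a command value, get a result value back) can be sketched with a plain `multiprocessing` pipe. This is an illustration of the pattern only, not Zef's actual wire protocol; here a command is just a dict, dispatched by name:

```python
import multiprocessing as mp

def worker_loop(conn):
    # commands arrive as plain data; the worker interprets them and replies
    handlers = {'upper': str.upper, 'double': lambda x: x * 2}
    while True:
        cmd = conn.recv()
        if cmd is None:                      # shutdown sentinel
            break
        conn.send(handlers[cmd['op']](cmd['arg']))

if __name__ == '__main__':
    parent, child = mp.Pipe()
    ctx = mp.get_context('fork')             # fork keeps the sketch portable as a script
    p = ctx.Process(target=worker_loop, args=(child,))
    p.start()
    parent.send({'op': 'upper', 'arg': 'hello from worker!'})
    print(parent.recv())                     # HELLO FROM WORKER!
    parent.send(None)                        # tell the worker to stop
    p.join()
```

The point carries over: because the command is a value, not a closure, it can cross the process boundary cheaply and be interpreted on the other side.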
A segfault in the worker doesn't take down the main process. You get the crash info and can spawn a new worker.
```python
worker = FX.StartWorkerProcess() | run

try:
    result = FX.SendCommandToWorkerProcess(
        process=worker,
        command=potentially_dangerous_fx,
    ) | run
except WorkerCrash as e:
    print(f'worker died: {e}')
    worker = FX.StartWorkerProcess() | run  # respawn
You can run many workers at once, and because each has its own GIL, they execute truly in parallel.
```python
workers = [FX.StartWorkerProcess() | run for _ in range(4)]
chunks = split_data_into_chunks(dataset, 4)

results = [
    FX.SendCommandToWorkerProcess(
        process=w,
        command=process_chunk(chunk),  # a zef_function returning an FX
    ) | run
    for w, chunk in zip(workers, chunks)
]
```
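This fan-out has a direct stdlib analogue: `multiprocessing.Pool` gives each chunk its own process, and therefore its own GIL. The `process_chunk` below is a stand-in CPU-bound function, not the `zef_function` from the example above:

```python
import multiprocessing as mp

def process_chunk(chunk):
    # stand-in CPU-bound work: sum of squares over the chunk
    return sum(x * x for x in chunk)

if __name__ == '__main__':
    dataset = list(range(1000))
    chunks = [dataset[i::4] for i in range(4)]        # 4 interleaved chunks
    with mp.get_context('fork').Pool(4) as pool:
        results = pool.map(process_chunk, chunks)     # runs truly in parallel
    print(sum(results))                               # 332833500
```

Threads running the same function would serialize on the GIL; four processes use four cores.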
Run code from users (plugins, user scripts) in a separate process with a restricted FX whitelist. (This is a planned/early-stage feature; check current capabilities.)
```python
# how many workers are alive?
FX.ListWorkerProcesses() | run

# stop one
FX.StopWorkerProcess(process=worker) | run
```
The command can be any Zef value, typically an FX or a list of them:
```python
FX.SendCommandToWorkerProcess(
    process=worker,
    command=[
        FX.Print(content='start'),
        FX.HTTPRequest(url='https://api.example.com/big-job'),
        FX.Print(content='done'),
    ],
) | run
```
Use workers when you specifically need isolation. For everything else, actors are faster and simpler.
```python
@zef_function
def resize_image(img_bytes: Bytes) -> Bytes:
    from PIL import Image
    import io
    im = Image.open(io.BytesIO(bytes(img_bytes)))
    im.thumbnail((200, 200))
    buf = io.BytesIO()
    im.save(buf, 'JPEG')
    return buf.getvalue()
```
```python
# in the main process: spawn a worker pool
POOL = [FX.StartWorkerProcess() | run for _ in range(4)]

def resize_batch(imgs):
    return [
        FX.SendCommandToWorkerProcess(
            process=POOL[i % len(POOL)],
            command=resize_image(img),
        ) | run
        for i, img in enumerate(imgs)
    ]
```
CPU-bound image resizing, 4x parallel, each in its own process. If PIL crashes on a malformed JPEG, one worker dies and you restart it; the main process keeps serving other images.
Zef also supports workers on other machines via Remote Containers:
```python
FX.InstallPodmanRemote(ip_address='5.223.58.246') | run
FX.BuildZefImageRemote(ip_address='5.223.58.246', wheel='...whl') | run
FX.RunZefContainerRemote(ip_address='5.223.58.246', session=True) | run
```
Same programming model: send commands, receive results. Just happens over the network.
Because effects are data, the same FX.* value can be run locally,
in a worker, on another machine, or stored for later. The runtime picks
the transport. You write the effect once.
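That claim, that an effect is an inert value until some runtime executes it, can be sketched in a few lines of plain Python. The names here are illustrative, not Zef's API:

```python
from dataclasses import dataclass, asdict
import json

@dataclass(frozen=True)
class Print:
    # an effect is just a value describing what should happen
    content: str

def run_local(effect):
    print(effect.content)                 # one possible interpreter

def to_wire(effect):
    # because it's data, it can cross process or machine boundaries
    return json.dumps({'type': type(effect).__name__, **asdict(effect)})

fx = Print(content='hello')
run_local(fx)                             # execute here and now
packet = to_wire(fx)                      # or serialize it for a worker / remote machine
print(packet)                             # {"type": "Print", "content": "hello"}
```

The value never changes; only the interpreter does. That is what lets one `FX.*` value run locally, in a worker, or across the network.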
Use the lightest tier that gets the job done. Jump to a heavier tier only when you need what it specifically provides.
Which tier would you use for each of these?
Next up, the finale: an end-to-end LLM chat agent. →