@shykes
Created April 1, 2026 11:25
ARCHIVE: Workspace Artifacts Design (pre-Modules v2 split)

This document is archived. It was the original integrated design covering collections, addresses, verbs, plans, and provenance as one system. It has since been superseded by the split design docs in hack/designs/modules-v2/ on the modules-v2 branch of dagger/dagger.

See: https://github.com/dagger/dagger/tree/modules-v2/hack/designs/modules-v2


Workspace Artifacts

Status: Active design, initial prototype on workspace-artifacts

Builds on:

Supersedes:

  • docs/design/proposals/02-artifacts-checks.md
  • docs/design/proposals/03-checks-api.md
  • docs/design/proposals/03-ship.md
  • docs/design/workspace-artifacts.md

Tracking:

Summary

Artifacts make a module's dynamic project model directly targetable.

Modules already build runtime object graphs from the workspace. The missing piece is a clean way for users to point at the important objects in that graph and run standard actions on them.

The current design has five core pieces:

  • object graph: the published rooted graph exposed by modules
  • collections: keyed dynamic sets inside that graph
  • addresses: canonical typed addresses for collection-backed artifacts
  • verbs: standard actions planned over objects and artifacts
  • provenance: workspace-origin metadata used to filter and select relevant artifacts

This is broader than "filtered checks", but narrower than "every runtime object is an artifact".

Problem

Modules v2 can dynamically inspect the workspace, but they still cannot present that dynamic model back to the user.

Examples:

  1. A Go module can discover all Go modules in the workspace and define test, lint, and build functions for each one. Today the user can say "check", but not "check this module".
  2. A deploy module can know which service came from ./cmd/api, but the user cannot ask "ship what comes from this path" or "ship what changed in this commit".
  3. The artifact-centric object model inside Dagger is still hidden behind a tree of named functions. That prevents filtering, composition, and future features that want to operate on real objects rather than ad hoc function paths.

Current Snapshot

Settled

  • Artifact addressing no longer depends on multiple rooted constructors.
  • Collections are a separate first-class concept:
    • +items enumerates all items
    • +lookup resolves one item by key
    • +key marks the item's key field
  • Constructors should mean "construct a value/spec/handle", not "provision a real external resource".
  • Artifacts should come from collections:
    • top-level rooted singletons are object-graph roots, not artifacts
    • collection items are the main artifact shape
    • not every collection key becomes an artifact address
  • Verbs are decoupled from artifacts:
    • verbs can exist on addressed artifacts and on non-artifact object-graph nodes
    • verbs do not by themselves determine artifact status
  • Collections may implement verb handlers:
    • collection-level handlers pre-empt naive expansion to one item handler per item
  • Workspace provenance remains the core v1 filter mechanism.
  • dagger call should remain backward-compatible and procedural by default.

Leading Direction

  • Artifact should now be understood semantically as:
    • an addressed object surfaced by a collection somewhere in the object graph
  • The full object graph and the addressed artifact set are separate concepts:
    • the object graph is walked structurally from rooted objects
    • artifact addresses are rolled up only from collections whose key type has a wider, cross-cutting address namespace
  • Artifact identity and object traversal should stay conceptually separate:
    • an artifact address identifies a collection-backed object
    • graph traversal navigates rooted objects, fields, and collection items
  • Workspace provenance filtering stays separate from identity:
    • address = "which object?"
    • provenance filter = "which objects come from or depend on this path?"

Still Open

  • Whether and when to pursue multiple rooted constructors as a separate schema/rooting track.
  • Exact collection surface in the schema/runtime.
  • Exact CLI split between artifact targeting and lower-level object targeting.
  • How much of the deep object graph needs dedicated inspect/graph UX.
  • Whether general "origin" should grow beyond workspace provenance in v1.

Core Definitions

Object Graph

The object graph is the published runtime object model exposed by loaded modules.

It includes:

  • rooted objects
  • rooted collections
  • nested objects reachable from them
  • verbs on any of those objects

Traversal in the object graph follows published structure:

  • fields
  • collection hops

It does not auto-invoke arbitrary functions.

Collection

A collection is a keyed set of items.

Collections solve three separate problems:

  • enumerate all items
  • look up one item by key
  • optionally roll an item up into its canonical artifact address

The current leading shape is:

type NetlifySites {
  items: [NetlifySite!]! @items
  lookup(path: WorkspacePath!): NetlifySite! @lookup
}

type NetlifySite {
  path: WorkspacePath! @key
}

The collection itself may be rooted or nested in the object graph. The item type may still get a canonical artifact address derived from that collection, but only when the key type has a wider cross-cutting namespace such as WorkspacePath or HTTPAddress.

Collections may also implement verb handlers. When they do, they act as verb-planning boundaries:

  • item handler:
    • one handler for one item
  • collection handler:
    • one handler for the whole selected set
  • planner rule:
    • if a collection has a handler for verb V, prefer it over naively expanding V to every item

Artifact

An artifact is an addressed object surfaced by a collection in the object graph.

Every artifact is an object. Not every runtime object is an artifact.

This means:

  • top-level rooted singletons such as Netlify are not artifacts
  • collection items are the main artifact shape
  • not every collection item is an artifact
  • artifacts may live deep in the object graph if a nested collection gives them an address
  • verbs do not define artifact status

Address

Addresses identify artifacts.

The important semantic rule is:

  • the address scheme names the artifact type
  • the address value is the canonical value derived from a collection key whose type has a wider cross-cutting namespace
  • address syntax is still open

Illustrative examples:

  • netlify-site:/docs
  • a deployment collection keyed by local dep_123 does not automatically yield an artifact address
  • the same collection keyed by HTTPAddress such as https://netlify.com/v1/deployments/dep_123 could
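A minimal sketch of that roll-up rule, assuming the illustrative `scheme:value` form (address syntax is still open). The `artifact_address` function, the Python classes standing in for `WorkspacePath` and `HTTPAddress`, and the `netlify-deployment` scheme name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkspacePath:
    """Stand-in for a key type with a wider, cross-cutting namespace."""
    value: str


@dataclass(frozen=True)
class HTTPAddress:
    """Stand-in for another cross-cutting key type."""
    value: str


# Key types this sketch assumes are recognized as cross-cutting.
CROSS_CUTTING = (WorkspacePath, HTTPAddress)


def artifact_address(scheme: str, key: object) -> Optional[str]:
    """Roll a collection key up into an address, or return None for local keys.

    The `scheme:value` form is illustrative only.
    """
    if isinstance(key, CROSS_CUTTING):
        return f"{scheme}:{key.value}"
    return None  # Local keys stay valid for lookup but yield no address.


site_addr = artifact_address("netlify-site", WorkspacePath("/docs"))
local_addr = artifact_address("netlify-deployment", "dep_123")
wide_addr = artifact_address(
    "netlify-deployment",
    HTTPAddress("https://netlify.com/v1/deployments/dep_123"),
)
```

Only the two keys with a cross-cutting type produce an address; the local `dep_123` key yields `None` and remains reachable only through its collection.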

Verb

Verbs are high-level action kinds on objects, especially artifacts and collections.

The first verbs are:

  • check
  • generate
  • ship
  • up

A verb is not just "one function with an annotation".

A verb invocation computes a plan over one or more handler functions. See Verbs.

Provenance

Provenance is the workspace-origin metadata used to filter and select artifacts.

It answers:

  • which objects come from this workspace path?
  • which objects are affected by these changes?

It is not the same thing as identity:

  • address = "which object?"
  • provenance = "which objects are relevant to this path or change?"

Artifact Enumeration and Object Traversal

Artifact enumeration no longer depends on multiple rooted constructors.

At the object-graph layer, a single rooted module object is enough:

  • acquire the rooted module object
  • walk fields
  • expand collections via +items
  • project addressable artifacts out of the discovered graph

Default artifact enumeration no longer means "list every rooted singleton".

The current leading split is:

  • object-graph traversal:
    • walk rooted objects
    • follow fields
    • expand collections via +items
  • artifact enumeration:
    • project addressed collection items out of that walked graph
    • wherever the relevant collections appear

Ordinary functions are not automatic discovery edges. Collections are the special case:

  • +items participates in graph discovery
  • +lookup participates in artifact identity
  • ordinary methods remain just methods
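The split between traversal and enumeration might look like the sketch below. The `Obj` and `Coll` classes are a hypothetical in-memory model of the published graph, not an engine API, and the sketch assumes an acyclic graph.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """Hypothetical published object: named fields plus an optional address."""
    type_name: str
    published_fields: dict = field(default_factory=dict)  # name -> Obj or Coll
    address: str = ""  # set only for addressed collection items


@dataclass
class Coll:
    """Hypothetical collection; `items` plays the role of +items."""
    items: list = field(default_factory=list)


def walk(node):
    """Yield every object reachable via fields and collection hops.

    Ordinary methods are never invoked; only published structure is followed.
    """
    if isinstance(node, Coll):
        for item in node.items:
            yield from walk(item)
    else:
        yield node
        for child in node.published_fields.values():
            yield from walk(child)


def enumerate_artifacts(root):
    """Project addressed collection items out of the walked graph."""
    return [obj.address for obj in walk(root) if obj.address]


docs = Obj("NetlifySite", address="netlify-site:/docs")
blog = Obj("NetlifySite", address="netlify-site:/blog")
root = Obj(
    "Netlify",
    published_fields={"sites": Coll([docs, blog]), "config": Obj("Config")},
)

visited = [obj.type_name for obj in walk(root)]
artifacts = enumerate_artifacts(root)
```

The rooted `Netlify` object and its `Config` field are visited during traversal but do not appear in the artifact list, because neither carries a collection-derived address.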

Multiple Rooted Constructors

Multiple rooted constructors are now a separate architectural track, not a prerequisite for typed artifact addresses.

Today the schema effectively assumes:

  • one module
  • one main object
  • one rooted constructor

Artifact addressing can still proceed without changing that:

  • keep the current rooted module object
  • walk its object graph
  • surface addressed artifacts from collections anywhere in that graph

Multiple rooted constructors remain valuable for a different reason:

  • cleaner root schema projection
  • less dependence on one public main object
  • room for richer object-layer modeling beyond artifacts

If pursued, modules could expose multiple rooted entrypoints such as:

  • a rooted object such as Netlify
  • a rooted collection object such as NetlifySites
  • other rooted helper/config objects when they are intentionally part of the object graph

The important distinction is:

  • today, the reachable object graph can still hang off one rooted module object
  • extra constructors/rooting would define additional root entrypoints into that graph
  • collections define which objects become addressable artifacts

These are orthogonal:

  • collections do not replace multiple rooted constructors
  • multiple rooted constructors do not replace collections

They work together:

  • rooted constructors expose additional entrypoints into the object graph
  • collections inside that graph define the canonical addresses of their item artifacts

Constructor semantics should stay narrow:

  • they construct values, handles, config objects, or specs
  • they do not, by themselves, imply provisioning or remote creation

This section is about a possible future root-model cleanup, not a blocker for artifact identity.

Schema Change

The current DAGQL projection assumes one rooted constructor per module and flattens it onto Query.

Before:

type Query {
  container(): Container!
  http(url: String!): File!

  # module foo
  foo(...): Foo!
  loadFooByID(id: FooID!): Foo!
}

That is not enough once a module wants more than one rooted object type in the public schema.

The canonical projection must support multiple rooted types cleanly. One leading direction is a namespaced/object-wrapper root model:

type Query {
  objects: Objects!
}

type Objects {
  foo: FooRoots!
}

type FooRoots {
  foo: FooRoot!
  sites: FooSitesRoot!
}

type FooRoot {
  new(...): Foo!
  load(id: FooID!): Foo!
}

type FooSitesRoot {
  new(...): FooSites!
  load(id: FooSitesID!): FooSites!
}

The exact names are still open. What matters is the shape:

  • multiple rooted object types per module
  • no dependence on one flat Query field per module
  • room for rooted collections as peers to rooted object singletons

Compatibility Projection

Backward compatibility can be layered on top of the canonical rooted model.

For example, the engine could still project selected roots back onto flat Query for old clients:

type Query {
  foo(...): Foo!
  loadFooByID(id: FooID!): Foo!

  # optional compat bridge for additional rooted types
  newFooSites(...): FooSites!
  loadFooSitesByID(id: FooSitesID!): FooSites!
}

The exact compat bridge is still open, but the design direction is:

  • canonical schema first
  • compatibility projection second

This keeps the root model correct without forcing an immediate breaking change on all clients.

Collections

Collections are required for dynamic artifacts.

They are the mechanism for saying:

  • here is the full keyed set of objects of this kind
  • here is how to look up one by key

The current collection contract is:

  • +items: enumerate all items
  • +lookup: resolve one item by key
  • +key: mark the item's key field

Example:

type NetlifySites {
  items: [NetlifySite!]! @items
  lookup(path: WorkspacePath!): NetlifySite! @lookup
}

type NetlifySite {
  path: WorkspacePath! @key
}

Rules:

  • key uniqueness is enforced within the collection
  • the collection may itself be rooted
  • collection lookup always works within the collection
  • canonical artifact addressing only rolls up from recognized cross-cutting address types such as WorkspacePath
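As a rough illustration of the contract and the uniqueness rule, here is a sketch in plain Python. It models the key as a string rather than a real `WorkspacePath`, and the error types raised are assumptions of the sketch, not specified behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetlifySite:
    path: str  # plays the role of the +key field


class NetlifySites:
    """Sketch of the collection contract: enumerate items, look one up by key."""

    def __init__(self, sites):
        self._by_key = {}
        for site in sites:
            if site.path in self._by_key:
                # Key uniqueness is enforced within the collection.
                raise ValueError(f"duplicate key: {site.path}")
            self._by_key[site.path] = site

    def items(self):
        """+items: enumerate all items."""
        return list(self._by_key.values())

    def lookup(self, path):
        """+lookup: resolve one item by key; raises KeyError if absent."""
        return self._by_key[path]


sites = NetlifySites([NetlifySite("/docs"), NetlifySite("/blog")])
```

Neither `items` nor `lookup` creates anything; both only read the keyed set that was built up front.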

Collections are about lookup and enumeration, not provisioning.

If a module wants to create a new remote resource, that should be modeled explicitly, for example:

  • a method on the collection such as create(...)
  • or a separate spec/request object

It should not be implicit in constructor or lookup semantics.

Addresses and Traversal

The current leading direction is to separate artifact addressing from object traversal:

  • artifact address:
    • canonical, typed, collection-derived
    • only for collection keys whose type carries a wider cross-cutting namespace
  • object traversal:
    • structural navigation from rooted objects and collections

Design rules:

  • the scheme names a type
  • the part after : is the canonical address value for an artifact of that type
  • exact concrete artifact URI syntax is still open
  • graph traversal syntax is also still open
  • graph traversal follows:
    • fields
    • collection hops
  • graph traversal does not auto-invoke arbitrary functions

Illustrative examples:

  • artifact address:
    • netlify-site:/docs
  • structural graph traversal:
    • netlify-sites["./docs"].build
    • or an equivalent future syntax

What stays separate is provenance filtering:

  • --path ./docs means "match objects related to ./docs"
  • it does not mean "the object keyed by ./docs"

This distinction matters even when a collection key is itself a WorkspacePath.

It also means:

  • a local collection key such as dep_123 may be perfectly valid for lookup(...)
  • but it does not automatically become a top-level artifact address
  • a wider key such as https://netlify.com/v1/deployments/dep_123 could
  • those objects remain reachable through object traversal unless and until they use a recognized cross-cutting address type

Verbs

A verb is a high-level action kind that projects to a plan of function calls.

This is the important distinction:

  • a handler is one local function annotated for a verb
  • a verb invocation is the computed plan that runs zero or more handlers

So check(netlify-site:/docs) is not "call one @check method". It is:

  • find local check handlers
  • expand across related artifacts according to verb rules
  • order the resulting calls
  • execute the plan

Verb Handlers

A verb handler is a local method annotated for a verb, usually on an artifact or on a collection.

Examples:

  • GoModule.lint() annotated @check
  • GoModule.test() annotated @check
  • NetlifySite.deploy() annotated @ship

The mapping from artifact type to local handlers is statically known from the schema.

Verb Plans

A verb plan is the effective set of handler calls for one artifact and one verb kind.

The plan is derived from:

  • local verb handlers on the artifact
  • local verb handlers on collections that own or batch those artifacts
  • object-graph structure, especially references
  • verb-specific orchestration rules

This is distinct from runtime telemetry:

  • verb plans are part of the static/execution model
  • runtime call relations are the concrete function-call edges observed during execution

A useful consequence is that every artifact may have an effective plan for every verb kind:

  • the plan may be empty
  • the plan may be local only
  • the plan may expand through referenced artifacts

Verb Semantics

check

check(A) is the clearest recursive verb.

Current leading rule:

  • include local check handlers on A
  • recursively include check(B) for each artifact B referenced by A
  • if A references B, run check(B) before local check handlers on A
  • if a collection has a check handler, prefer it over expanding to one item-level check handler per item in that collection

This makes aggregate artifacts useful by default.

generate

generate(A) should stay conservative.

Current leading rule:

  • include local generate handlers on A
  • do not recursively generate through references by default
  • do not make generate an implicit prerequisite of other verbs

This avoids surprising workspace mutations.

ship

ship(A) should be stricter than check(A).

Current leading rule:

  • include local ship handlers on A
  • do not recursively ship every referenced artifact by default
  • usually require check(A) first unless explicitly skipped

Raw references are too broad to define automatic ship propagation on their own.

up

up(A) is closer to ship(A) than to check(A).

Current leading rule:

  • include local up handlers on A
  • do not recursively follow all references by default
  • likely require check(A) or equivalent readiness checks first
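The four leading rules can be summarized in one small planner sketch. The `RULES` table, the `plan` function, and the artifact and handler names are hypothetical; collection-level handlers and policy layers are left out, and the check gate on ship and up is treated as always required unless `skip_gates` is set.

```python
# Per-verb orchestration rules as described in this section (a simplification).
RULES = {
    "check":    {"recurse": True,  "requires": []},
    "generate": {"recurse": False, "requires": []},
    "ship":     {"recurse": False, "requires": ["check"]},
    "up":       {"recurse": False, "requires": ["check"]},
}


def plan(verb, artifact, references, handlers, skip_gates=False, _seen=None):
    """Return ordered handler calls for `verb` on `artifact`.

    references: artifact -> list of referenced artifacts
    handlers:   (artifact, verb) -> list of local handler names
    """
    seen = _seen if _seen is not None else set()
    if (verb, artifact) in seen:
        return []  # Already planned; also guards against reference cycles.
    seen.add((verb, artifact))
    rule = RULES[verb]
    calls = []
    if not skip_gates:
        for gate in rule["requires"]:
            # Gates such as check(A) run before the verb's own handlers.
            calls += plan(gate, artifact, references, handlers, skip_gates, seen)
    if rule["recurse"]:
        for ref in references.get(artifact, []):
            # Referenced artifacts are handled before local handlers.
            calls += plan(verb, ref, references, handlers, skip_gates, seen)
    calls += [f"{artifact}.{name}()" for name in handlers.get((artifact, verb), [])]
    return calls


references = {"site": ["theme"]}
handlers = {
    ("site", "check"): ["lint"],
    ("theme", "check"): ["test"],
    ("site", "ship"): ["deploy"],
    ("theme", "ship"): ["publish"],
    ("site", "generate"): ["build_docs"],
}

ship_calls = plan("ship", "site", references, handlers)
ship_unchecked = plan("ship", "site", references, handlers, skip_gates=True)
generate_calls = plan("generate", "site", references, handlers)
```

Shipping `site` runs the recursive check first and then only the local `deploy` handler; the referenced `theme` is checked but never shipped.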

Verb Policy

Workspace or user policy may add gates and ordering on top of core verb semantics.

Examples:

  • require check before ship
  • require explicit confirmation or target selection for production ship
  • default ship target to preview rather than prod

Policy should refine orchestration, not redefine the core meaning of a verb.

Workspace Access

Verb methods must not accept Workspace arguments.

Allowed:

  • constructors and discovery helpers may accept Workspace
  • non-verb helper methods may accept Workspace
  • an object may store Workspace in a field

Forbidden:

  • verb methods such as check, generate, ship, up taking Workspace

This forces an explicit tradeoff:

  • precise artifact: materialize Directory / File inputs early
  • dynamic artifact: store Workspace, and have provenance rooted at /
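One way such a rule could be enforced is a signature check at module load time. The sketch below is an assumption-heavy illustration in plain Python: the `verb` decorator, the `Workspace` stand-in, and `workspace_violations` are hypothetical and do not correspond to any Dagger SDK API.

```python
import inspect


class Workspace:
    """Hypothetical stand-in for the Workspace type."""


def verb(kind):
    """Hypothetical decorator marking a method as a handler for `kind`."""
    def mark(fn):
        fn.verb_kind = kind
        return fn
    return mark


def workspace_violations(cls):
    """Return names of verb handlers on `cls` that take a Workspace argument."""
    bad = []
    for name, fn in inspect.getmembers(cls, inspect.isfunction):
        if not hasattr(fn, "verb_kind"):
            continue  # Constructors and non-verb helpers may accept Workspace.
        params = inspect.signature(fn).parameters.values()
        if any(p.annotation is Workspace for p in params):
            bad.append(name)
    return bad


class Site:
    def __init__(self, ws: Workspace):  # allowed: constructor
        self.ws = ws                     # allowed: stored field

    @verb("check")
    def lint(self):                      # allowed: no Workspace argument
        return "ok"

    @verb("ship")
    def deploy(self, ws: Workspace):     # forbidden: verb method takes Workspace
        return "shipped"


violations = workspace_violations(Site)
```

Only `deploy` is reported; the constructor is exempt even though it takes the same argument type.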

Shipping in CI

Artifacts and verbs make ship targetable, but they do not by themselves settle how shipping should work in CI.

The main tensions are:

  • environment specificity:
    • the same artifact may ship to preview, staging, or prod depending on context
    • PR workflows should skew toward preview/dev, not production
  • dependency policy:
    • check can recurse over references
    • ship likely needs stricter and sometimes explicit dependencies
  • workflow shape:
    • some teams will want a custom declarative workflow that composes generate, check, ship, and approvals
    • it is still open whether that belongs in schema, workspace config, or external CI
  • safety and policy:
    • manual approval, secret availability, branch/event gating, and protected environments all affect what ship should do

The current design intent is:

  • keep core ship semantics narrow at the artifact layer
  • let policy/workflow layers add target selection, approvals, and extra dependencies
  • avoid baking one CI workflow model into the artifact foundation too early

This document deliberately plants the flag here without fully solving:

  • how to express explicit ship dependencies
  • how CI context selects ship targets
  • how far ship should imply check, generate, or other gates
  • whether custom declarative workflows become a first-class Dagger concept

CLI Model

The default CLI should stay small:

  • find artifacts
  • filter by provenance
  • run verbs

The main commands are:

  • dagger artifact list
  • dagger artifact inspect
  • dagger check
  • dagger generate
  • dagger ship
  • dagger up

Default behavior should center collection-backed artifacts.

Rooted object-graph singletons should remain reachable, but as a more explicit/power-user path:

  • inspect/debug flows
  • lower-level object targeting

dagger call should remain backward-compatible and procedural by default.

If artifact/object targeting is added to dagger call, it should be additive rather than a silent reinterpretation of existing call syntax.

The key UX rule is:

  • address syntax means exact artifact identity
  • --path means provenance filtering

These must not blur together, even when a collection key is itself a workspace path.
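The difference between the two selection modes can be shown with a toy artifact set. Everything here is hypothetical: the `ARTIFACTS` table, the two selector functions, and the prefix-based matching are assumptions made for illustration, not the CLI's behavior.

```python
from pathlib import PurePosixPath

# Hypothetical artifact set: address -> workspace paths recorded as provenance.
ARTIFACTS = {
    "netlify-site:/docs": ["docs/intro.md", "shared/theme.css"],
    "netlify-site:/blog": ["blog/post.md", "shared/theme.css"],
}


def select_by_address(address):
    """Exact identity: at most one artifact, regardless of provenance."""
    return [address] if address in ARTIFACTS else []


def select_by_path(path):
    """Provenance filter: every artifact with a recorded path at or under `path`."""
    root = PurePosixPath(path)
    return [
        address
        for address, sources in ARTIFACTS.items()
        if any(
            root == PurePosixPath(src) or root in PurePosixPath(src).parents
            for src in sources
        )
    ]


exact = select_by_address("netlify-site:/docs")
related = select_by_path("./shared")
docs_related = select_by_path("./docs")
```

An address selects exactly one artifact, while a path filter such as `./shared` can select several artifacts whose keys have nothing to do with that path.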

Provenance

Workspace provenance is the major v1 feature.

Provenance is best understood as artifact metadata, not as a graph edge.

That means:

  • provenance is stored on artifacts
  • path/diff/git matching is computed at query time
  • provenance-based matches are not first-class permanent relations in the object graph

Examples of query-time provenance predicates:

  • "matches path ./docs"
  • "matches diff HEAD~1..HEAD"
  • "overlaps this changed path set"
  • "is entirely contained within this path set"
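These predicates could be evaluated against stored provenance roughly as follows. The exact matching semantics are not specified in this document; the sketch assumes prefix-style path matching, and all function names are hypothetical.

```python
from pathlib import PurePosixPath


def under(path, root):
    """True if `path` equals `root` or lies beneath it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def matches_path(provenance, query):
    """True if any recorded path lies under the query path, or contains it."""
    return any(under(src, query) or under(query, src) for src in provenance)


def overlaps(provenance, changed):
    """True if any changed path touches the recorded provenance."""
    return any(matches_path(provenance, c) for c in changed)


def contained_in(provenance, path_set):
    """True if every recorded path lies under some path in the set."""
    return all(any(under(src, root) for root in path_set) for src in provenance)


docs_provenance = ["docs/intro.md", "docs/sidebar.json"]
```

For example, `matches_path(docs_provenance, "./docs")` holds because both recorded files sit under `docs`, while `contained_in(docs_provenance, ["docs/intro.md"])` does not because `docs/sidebar.json` falls outside the set. Nothing is stored as a relation; each predicate is computed when the query runs.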

Source of Provenance

V1 provenance comes from workspace API reads such as:

  • workspace.directory(...)
  • workspace.file(...)

This is intentionally narrow:

  • not general lineage
  • not git metadata itself
  • not module ownership

Git- and path-based selection are derived by evaluating predicates against stored provenance; they are not themselves provenance records.

Provenance Union

Provenance unions across fields and composed values.

In general:

provenance({foo: A, bar: B}) = union(provenance(A), provenance(B))
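A minimal sketch of the union rule, assuming leaf values carry a frozenset of workspace paths and composite values are dicts, lists, or tuples. This data model is an assumption of the sketch, not the engine's representation.

```python
def provenance(value):
    """Union the provenance of a value and everything composed into it.

    Leaves are frozensets of workspace paths; composites are dicts, lists,
    or tuples of other values.
    """
    if isinstance(value, frozenset):
        return value
    if isinstance(value, dict):
        parts = list(value.values())
    elif isinstance(value, (list, tuple)):
        parts = list(value)
    else:
        raise TypeError(f"unsupported value: {type(value).__name__}")
    # Union across all composed parts; an empty composite has no provenance.
    return frozenset().union(*(provenance(part) for part in parts))


a = frozenset({"docs/intro.md"})
b = frozenset({"docs/sidebar.json", "package.json"})
combined = provenance({"foo": a, "bar": [b, a]})
```

Because the result is a set union, a path read through several fields is recorded once.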

Root-Path Provenance

Storing Workspace is allowed, but it taints the object with root-path provenance at /.

That means:

  • it matches all path and git filters
  • it loses precise source filtering
  • it should also be treated as workspace-sensitive for caching

Caching Semantics

Workspace-sensitive calls already have special cache behavior today when Workspace is injected as a function argument.

The same semantic taint should extend to stored Workspace fields:

  • if a function takes Workspace, it is workspace-sensitive
  • if a function operates on an object that stores Workspace, it is also workspace-sensitive

This taint should affect both:

  • artifact filtering semantics (root-path provenance at /)
  • downstream cache sensitivity

The exact cache-key mechanism is an implementation detail, but the semantic rule should be the same in both cases.
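The shared taint rule might be expressed as below. The classes and helper functions are hypothetical, and the sketch only inspects direct fields of an object, which is a simplification.

```python
class Workspace:
    """Hypothetical stand-in for the Workspace type."""


def stores_workspace(obj):
    """True if any direct field of `obj` holds a Workspace (nested fields ignored)."""
    return any(isinstance(v, Workspace) for v in vars(obj).values())


def is_workspace_sensitive(receiver, args):
    """Same taint rule for injected arguments and for stored fields."""
    return any(isinstance(a, Workspace) for a in args) or stores_workspace(receiver)


def effective_provenance(obj, recorded):
    """A stored Workspace collapses provenance to the root path."""
    return {"/"} if stores_workspace(obj) else set(recorded)


class PreciseChart:
    def __init__(self, chart_dir):
        self.chart_dir = chart_dir  # materialized early; Workspace is not kept


class DynamicSite:
    def __init__(self, ws):
        self.ws = ws  # Workspace kept for later reads


ws = Workspace()
chart = PreciseChart("charts/app")
site = DynamicSite(ws)
```

A call on `chart` is workspace-sensitive only when a Workspace is passed in, whereas every call on `site` is sensitive and its provenance is `/`.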

Artifacts are not a second invalidation system. Provenance exists for UX and orchestration, not to replace the engine's existing content-addressed execution model.

Examples

Precise Artifact

helm-dev already follows the desired precise pattern:

  • constructor receives Workspace
  • it materializes ws.Directory(chartPath) into a field
  • artifact verbs operate on that field

This yields precise provenance and precise filtering.

Dynamic Artifact

Current workspace-API dogfooding also shows the coarse pattern:

  • markdownlint is a good candidate for a MarkdownFiles collection:
    • MarkdownFile.check() can lint one file
    • MarkdownFiles.check() can lint the selected set in one runner
  • netlify discovers Site objects from the workspace, but verbs reopen workspace state later
  • docusaurus discovers Site objects and also performs just-in-time runtime workspace tracing

These are valid patterns, but they effectively root provenance at / unless and until they materialize more precise inputs.

Non-Goals

V1 artifacts do not attempt to solve everything implied by the broader composition vision.

In particular, this document does not define:

  • a general built-artifact namespace
  • full artifact-to-artifact composition across produced outputs
  • sequencing and dependency rules between verbs
  • runtime-discovered provenance as a first-class model

Those are important directions, but they should not blur the v1 artifact foundation.

Open Questions

  1. Should runtime-discovered provenance become a second layer later? docusaurus suggests a real use case, but it is not a good foundation for v1 filtering semantics.
  2. How far should the artifact model extend beyond source-backed workspace objects to built outputs such as containers, packages, and services?
  3. What is the best concrete syntax for structural object traversal?
  4. Should the address vocabulary eventually be generalized to other cross-cutting typed address families beyond WorkspacePath?

Transition

Implementation should proceed in this order:

  1. Add collection semantics (+items, +lookup, +key) to the rooted object model.
  2. Add collection-level verb handlers and planner rules that let them pre-empt naive item expansion.
  3. Add workspace provenance on Directory / File.
  4. Add Workspace-taint semantics for stored Workspace fields.
  5. Build artifact enumeration and filtering over collection-backed artifacts whose keys use recognized cross-cutting address types, especially path/git filtering.
  6. Rebase workspace.checks() / workspace.generators() and CLI commands on that collection-centered model.

Optional parallel track:

  • pursue multiple rooted constructors and cleaner root schema projection independently of the artifact-addressing work

This keeps the current UX working while replacing the old special check/generator tree with a more object-native, collection-centered model underneath.

Tracking

Active themes, checkpoints, and the running discussion log live in workspace-artifacts-tracking.md.
