ARCHIVE: Workspace Artifacts Design (Pre-Modules v2)

This document is archived. It was the original integrated design covering collections, addresses, verbs, plans, and provenance as one system. It has since been superseded by the split design docs in hack/designs/modules-v2/ on the modules-v2 branch of dagger/dagger.

See: https://github.com/dagger/dagger/tree/modules-v2/hack/designs/modules-v2

Workspace Artifacts

Status: Active design, initial prototype on `workspace-artifacts`

Builds on:

Supersedes:

docs/design/proposals/02-artifacts-checks.md
docs/design/proposals/03-checks-api.md
docs/design/proposals/03-ship.md
docs/design/workspace-artifacts.md

Tracking:

Workspace Artifacts Tracking

Summary

Artifacts make a module's dynamic project model directly targetable.

Modules already build runtime object graphs from the workspace. The missing piece is a clean way for users to point at the important objects in that graph and run standard actions on them.

The current design has five core pieces:

object graph: the published rooted graph exposed by modules
collections: keyed dynamic sets inside that graph
addresses: canonical typed addresses for collection-backed artifacts
verbs: standard actions planned over objects and artifacts
provenance: workspace-origin metadata used to filter and select relevant artifacts

This is broader than "filtered checks", but narrower than "every runtime object is an artifact".

Table of Contents

Problem
Current Snapshot
Core Definitions
Artifact Enumeration and Object Traversal
Multiple Rooted Constructors
Collections
Addresses and Traversal
Verbs
Shipping in CI
CLI Model
Provenance
Caching Semantics
Examples
Non-Goals
Open Questions
Transition
Tracking

Problem

Modules v2 can dynamically inspect the workspace, but they still cannot present that dynamic model back to the user.

Examples:

A Go module can discover all Go modules in the workspace and define test, lint, and build verbs for each one. Today the user can say "check", but not "check this module".
A deploy module can know which service came from ./cmd/api, but the user cannot ask "ship what comes from this path" or "ship what changed in this commit".
The artifact-centric object model inside Dagger is still hidden behind a tree of named functions. That prevents filtering, composition, and future features that want to operate on real objects rather than ad hoc function paths.

Current Snapshot

Settled

Artifact addressing no longer depends on multiple rooted constructors.
Collections are a separate first-class concept:
- +items enumerates all items
- +lookup resolves one item by key
- +key marks the item's key field
Constructors should mean "construct a value/spec/handle", not "provision a real external resource".
Artifacts should come from collections:
- top-level rooted singletons are object-graph roots, not artifacts
- collection items are the main artifact shape
- not every collection key becomes an artifact address
Verbs are decoupled from artifacts:
- verbs can exist on addressed artifacts and on non-artifact object-graph nodes
- verbs do not by themselves determine artifact status
Collections may implement verb handlers:
- collection-level handlers pre-empt naive expansion to one item handler per item
Workspace provenance remains the core v1 filter mechanism.
dagger call should remain backward-compatible and procedural by default.

Leading Direction

Artifact should now be understood semantically as:
- an addressed object surfaced by a collection somewhere in the object graph
The full object graph and the addressed artifact set are separate concepts:
- the object graph is walked structurally from rooted objects
- artifact addresses are rolled up only from collections whose key type has a wider, cross-cutting address namespace
Artifact identity and object traversal should stay conceptually separate:
- an artifact address identifies a collection-backed object
- graph traversal navigates rooted objects, fields, and collection items
Workspace provenance filtering stays separate from identity:
- address = "which object?"
- provenance filter = "which objects come from or depend on this path?"

Still Open

Whether and when to pursue multiple rooted constructors as a separate schema/rooting track.
Exact collection surface in the schema/runtime.
Exact CLI split between artifact targeting and lower-level object targeting.
How much of the deep object graph needs dedicated inspect/graph UX.
Whether general "origin" should grow beyond workspace provenance in v1.

Core Definitions

Object Graph

The object graph is the published runtime object model exposed by loaded modules.

It includes:

rooted objects
rooted collections
nested objects reachable from them
verbs on any of those objects

Traversal in the object graph follows published structure:

fields
collection hops

It does not auto-invoke arbitrary functions.

Collection

A collection is a keyed set of items.

Collections solve three separate problems:

enumerate all items
look up one item by key
optionally roll an item up into its canonical artifact address

The current leading shape is:

type NetlifySites {
  items: [NetlifySite!]! @items
  lookup(path: WorkspacePath!): NetlifySite! @lookup
}

type NetlifySite {
  path: WorkspacePath! @key
}

The collection itself may be rooted or nested in the object graph. The item type may still get a canonical artifact address derived from that collection, but only when the key type has a wider cross-cutting namespace such as WorkspacePath or HTTPAddress.

Collections may also implement verb handlers. When they do, they act as verb-planning boundaries:

item handler:
- one handler for one item
collection handler:
- one handler for the whole selected set
planner rule:
- if a collection has a handler for verb V, prefer it over naively expanding V to every item
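The planner rule above can be sketched as a small function. This is only an illustration under assumptions: `plan_verb`, its arguments, and the tuple shape of a planned call are hypothetical names, not part of any Dagger API.

```python
def plan_verb(verb, collection_handlers, item_handlers, selected_keys):
    """Return the handler calls for `verb` over the selected items.

    collection_handlers: set of verbs the collection handles itself
    item_handlers: set of verbs each item handles
    (Both shapes are assumptions made for this sketch.)
    """
    if verb in collection_handlers:
        # Collection handler wins: one call for the whole selected set.
        return [("collection", verb, tuple(selected_keys))]
    if verb in item_handlers:
        # Naive expansion: one call per selected item.
        return [("item", verb, (key,)) for key in selected_keys]
    return []  # no handler at either level: the plan is empty
```

With a collection-level `check` handler, two selected items produce one planned call; without it, the same selection expands to two item-level calls.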

Artifact

An artifact is an addressed object surfaced by a collection in the object graph.

Every artifact is an object. Not every runtime object is an artifact.

This means:

top-level rooted singletons such as Netlify are not artifacts
collection items are the main artifact shape
not every collection item is an artifact
artifacts may live deep in the object graph if a nested collection gives them an address
verbs do not define artifact status

Address

Addresses identify artifacts.

The important semantic rule is:

the address names the artifact type
the address value is the canonical value derived from a collection key whose type has a wider cross-cutting namespace
address syntax is still open

Illustrative example:

netlify-site:/docs
a deployment collection keyed by local dep_123 does not automatically yield an artifact address
the same collection keyed by HTTPAddress such as https://netlify.com/v1/deployments/dep_123 could

Verb

Verbs are high-level action kinds on objects, especially artifacts and collections.

The first verbs are:

check
generate
ship
up

A verb is not just "one function with an annotation".

A verb invocation computes a plan over one or more handler functions. See Verbs.

Provenance

Provenance is the workspace-origin metadata used to filter and select artifacts.

It answers:

which objects come from this workspace path?
which objects are affected by these changes?

It is not the same thing as identity:

address = "which object?"
provenance = "which objects are relevant to this path or change?"

Artifact Enumeration and Object Traversal

Artifact enumeration no longer depends on multiple rooted constructors.

At the object-graph layer, a single rooted module object is enough:

acquire the rooted module object
walk fields
expand collections via +items
project addressable artifacts out of the discovered graph

Default artifact enumeration no longer means "list every rooted singleton".

The current leading split is:

object-graph traversal:
- walk rooted objects
- follow fields
- expand collections via +items
artifact enumeration:
- project addressed collection items out of that walked graph
- wherever the relevant collections appear

Ordinary functions are not automatic discovery edges. Collections are the special case:

+items participates in graph discovery
+lookup participates in artifact identity
ordinary methods remain just methods
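A minimal sketch of that split, assuming an in-memory stand-in for the object graph. `Obj`, `Collection`, and `CROSS_CUTTING_KEY_TYPES` are hypothetical names for illustration; the real schema and runtime surface are still open.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed set of key types that carry a wider, cross-cutting namespace.
CROSS_CUTTING_KEY_TYPES = {"WorkspacePath", "HTTPAddress"}


@dataclass
class Obj:
    type_name: str
    key: Optional[str] = None
    fields: dict = field(default_factory=dict)  # name -> Obj or Collection


@dataclass
class Collection:
    key_type: str
    items: list  # what +items would return


def walk(node, out):
    """Follow fields and collection items; never call ordinary methods."""
    if isinstance(node, Collection):
        for item in node.items:
            if node.key_type in CROSS_CUTTING_KEY_TYPES:
                out.append((item.type_name, item.key))  # addressed artifact
            walk(item, out)
        return
    for child in node.fields.values():
        walk(child, out)


def enumerate_artifacts(root):
    out = []
    walk(root, out)
    return out


deployments = Collection("String", [Obj("Deployment", key="dep_123")])
site = Obj("NetlifySite", key="/docs", fields={"deployments": deployments})
root = Obj("Netlify", fields={"sites": Collection("WorkspacePath", [site])})
artifacts = enumerate_artifacts(root)
```

Here the rooted `Netlify` object and the locally keyed `Deployment` are both walked, but only the `NetlifySite` item is projected as an artifact.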

Multiple Rooted Constructors

Multiple rooted constructors are now a separate architectural track, not a prerequisite for typed artifact addresses.

Today the schema effectively assumes:

one module
one main object
one rooted constructor

Artifact addressing can still proceed without changing that:

keep the current rooted module object
walk its object graph
surface addressed artifacts from collections anywhere in that graph

Multiple rooted constructors remain valuable for a different reason:

cleaner root schema projection
less dependence on one public main object
room for richer object-layer modeling beyond artifacts

If pursued, modules could expose multiple rooted entrypoints such as:

a rooted object such as Netlify
a rooted collection object such as NetlifySites
other rooted helper/config objects when they are intentionally part of the object graph

The important distinction is:

today, the reachable object graph can still hang off one rooted module object
extra constructors/rooting would define additional root entrypoints into that graph
collections define which objects become addressable artifacts

These are orthogonal:

collections do not replace multiple rooted constructors
multiple rooted constructors do not replace collections

They work together:

rooted constructors expose additional entrypoints into the object graph
collections inside that graph define the canonical addresses of their item artifacts

Constructor semantics should stay narrow:

they construct values, handles, config objects, or specs
they do not, by themselves, imply provisioning or remote creation

This section is about a possible future root-model cleanup, not a blocker for artifact identity.

Schema Change

The current DAGQL projection assumes one rooted constructor per module and flattens it onto Query.

Before:

type Query {
  container(): Container!
  http(url: String!): File!

  # module foo
  foo(...): Foo!
  loadFooByID(id: FooID!): Foo!
}

That is not enough once a module wants more than one rooted object type in the public schema.

The canonical projection must support multiple rooted types cleanly. One leading direction is a namespaced/object-wrapper root model:

type Query {
  objects: Objects!
}

type Objects {
  foo: FooRoots!
}

type FooRoots {
  foo: FooRoot!
  sites: FooSitesRoot!
}

type FooRoot {
  new(...): Foo!
  load(id: FooID!): Foo!
}

type FooSitesRoot {
  new(...): FooSites!
  load(id: FooSitesID!): FooSites!
}

The exact names are still open. What matters is the shape:

multiple rooted object types per module
no dependence on one flat Query field per module
room for rooted collections as peers to rooted object singletons

Compatibility Projection

Backward compatibility can be layered on top of the canonical rooted model.

For example, the engine could still project selected roots back onto flat Query for old clients:

type Query {
  foo(...): Foo!
  loadFooByID(id: FooID!): Foo!

  # optional compat bridge for additional rooted types
  newFooSites(...): FooSites!
  loadFooSitesByID(id: FooSitesID!): FooSites!
}

The exact compat bridge is still open, but the design direction is:

canonical schema first
compatibility projection second

This keeps the root model correct without forcing an immediate breaking change on all clients.

Collections

Collections are required for dynamic artifacts.

They are the mechanism for saying:

here is the full keyed set of objects of this kind
here is how to look up one by key

The current collection contract is:

+items: enumerate all items
+lookup: resolve one item by key
+key: mark the item's key field

Example:

type NetlifySites {
  items: [NetlifySite!]! @items
  lookup(path: WorkspacePath!): NetlifySite! @lookup
}

type NetlifySite {
  path: WorkspacePath! @key
}

Rules:

key uniqueness is enforced within the collection
the collection may itself be rooted
collection lookup always works within the collection
canonical artifact addressing only rolls up from recognized cross-cutting address types such as WorkspacePath
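The contract and the uniqueness rule can be illustrated with an in-memory stand-in. This is a sketch, not the module SDK surface: the Python classes below are hypothetical and only mirror the roles of +items, +lookup, and +key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetlifySite:
    path: str  # plays the role of the +key field


class NetlifySites:
    """Hypothetical in-memory stand-in for a collection object."""

    def __init__(self, sites):
        self._by_key = {}
        for site in sites:
            # Key uniqueness is enforced within the collection.
            if site.path in self._by_key:
                raise ValueError(f"duplicate key: {site.path}")
            self._by_key[site.path] = site

    def items(self):
        # +items: enumerate all items.
        return list(self._by_key.values())

    def lookup(self, path):
        # +lookup: resolve exactly one item by key.
        return self._by_key[path]


sites = NetlifySites([NetlifySite("/docs"), NetlifySite("/blog")])
```

Neither `items` nor `lookup` creates anything; both only read the set the collection already holds.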

Collections are about lookup and enumeration, not provisioning.

If a module wants to create a new remote resource, that should be modeled explicitly, for example:

a method on the collection such as create(...)
or a separate spec/request object

It should not be implicit in constructor or lookup semantics.

Addresses and Traversal

The current leading direction is to separate artifact addressing from object traversal:

artifact address:
- canonical, typed, collection-derived
- only for collection keys whose type carries a wider cross-cutting namespace
object traversal:
- structural navigation from rooted objects and collections

Design rules:

the scheme names a type
the part after : is the canonical address value for an artifact of that type
exact concrete artifact URI syntax is still open
graph traversal syntax is also still open
graph traversal follows:
- fields
- collection hops
graph traversal does not auto-invoke arbitrary functions

Illustrative examples:

artifact address:
- netlify-site:/docs
structural graph traversal:
- netlify-sites["./docs"].build
- or an equivalent future syntax

What stays separate is provenance filtering:

--path ./docs means "match objects related to ./docs"
it does not mean "the object keyed by ./docs"

This distinction matters even when a collection key is itself a WorkspacePath.

It also means:

a local collection key such as dep_123 may be perfectly valid for lookup(...)
but it does not automatically become a top-level artifact address
a wider key such as https://netlify.com/v1/deployments/dep_123 could
those objects remain reachable through object traversal unless and until they use a recognized cross-cutting address type
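The roll-up rule can be sketched as follows. Since address syntax is still open, the `scheme:value` form, the `netlify-deployment` scheme name, the set of recognized key types, and the path normalization step are all assumptions made for illustration.

```python
import posixpath

# Assumed set of key types that carry a wider, cross-cutting namespace.
CROSS_CUTTING_KEY_TYPES = {"WorkspacePath", "HTTPAddress"}


def artifact_address(scheme, key_type, key):
    """Return an illustrative `scheme:value` address, or None."""
    if key_type not in CROSS_CUTTING_KEY_TYPES:
        # Local keys stay valid for lookup and traversal only.
        return None
    if key_type == "WorkspacePath":
        # Assumed canonical form: a normalized path rooted at the workspace.
        key = posixpath.normpath(posixpath.join("/", key))
    return f"{scheme}:{key}"


docs_site = artifact_address("netlify-site", "WorkspacePath", "./docs")
local_dep = artifact_address("netlify-deployment", "String", "dep_123")
```

In this sketch `docs_site` is `netlify-site:/docs`, while `local_dep` is `None`: the deployment keyed by `dep_123` stays reachable through its collection but gets no top-level address.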

Verbs

A verb is a high-level action kind that projects to a plan of function calls.

This is the important distinction:

a handler is one local function annotated for a verb
a verb invocation is the computed plan that runs zero or more handlers

So check(netlify-site:/docs) is not "call one @check method". It is:

find local check handlers
expand across related artifacts according to verb rules
order the resulting calls
execute the plan

Verb Handlers

A verb handler is a local artifact method annotated for a verb.

Examples:

GoModule.lint() annotated @check
GoModule.test() annotated @check
NetlifySite.deploy() annotated @ship

The mapping from artifact type to local handlers is statically known from the schema.

Verb Plans

A verb plan is the effective set of handler calls for one artifact and one verb kind.

The plan is derived from:

local verb handlers on the artifact
local verb handlers on collections that own or batch those artifacts
object-graph structure, especially references
verb-specific orchestration rules

This is distinct from runtime telemetry:

verb plans are part of the static/execution model
runtime call relations are the concrete function-call edges observed during execution

A useful consequence is that every artifact may have an effective plan for every verb kind:

the plan may be empty
the plan may be local only
the plan may expand through referenced artifacts

Verb Semantics

`check`

check(A) is the clearest recursive verb.

Current leading rule:

include local check handlers on A
recursively include check(B) for each artifact B referenced by A
if A references B, run check(B) before local check handlers on A
if a collection has a check handler, prefer it over expanding to one item-level check handler per item in that collection

This makes aggregate artifacts useful by default.
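A minimal sketch of the recursive rule, leaving out the collection-handler preference. The dictionary shapes are hypothetical, and planning each artifact at most once (which also guards against reference cycles) is an assumption the text above does not state.

```python
def check_plan(artifact, refs, handlers, _seen=None):
    """Ordered list of (artifact, handler) calls for check(artifact).

    refs: artifact -> list of referenced artifacts
    handlers: artifact -> list of local check handler names
    """
    seen = set() if _seen is None else _seen
    if artifact in seen:
        return []  # already planned (assumption: plan each artifact once)
    seen.add(artifact)
    plan = []
    for ref in refs.get(artifact, []):
        # check(B) runs before the local check handlers on A.
        plan += check_plan(ref, refs, handlers, seen)
    plan += [(artifact, h) for h in handlers.get(artifact, [])]
    return plan


refs = {"app": ["lib", "docs"], "docs": ["lib"]}
handlers = {"app": ["test"], "lib": ["lint", "test"], "docs": []}
plan = check_plan("app", refs, handlers)
```

For this input the plan runs `lib`'s two handlers first, contributes nothing for `docs` (it has no local handlers and `lib` is already planned), and ends with `app`'s own handler.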

`generate`

generate(A) should stay conservative.

Current leading rule:

include local generate handlers on A
do not recursively generate through references by default
do not make generate an implicit prerequisite of other verbs

This avoids surprising workspace mutations.

`ship`

ship(A) should be stricter than check(A).

Current leading rule:

include local ship handlers on A
do not recursively ship every referenced artifact by default
usually require check(A) first unless explicitly skipped

Raw references are too broad to define automatic ship propagation on their own.
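One way to read the ship rule is as a gate placed in front of local handlers. In this sketch `ship_plan`, the `skip_check` flag, and the handler names are hypothetical; `check_plan` stands for the already-computed plan of check(A), however that is derived.

```python
def ship_plan(artifact, ship_handlers, check_plan, skip_check=False):
    """Calls for ship(artifact): local handlers only, gated on check.

    ship_handlers: artifact -> list of local ship handler names
    check_plan: list of (artifact, handler) calls making up check(artifact)
    """
    # The check gate comes first unless it is explicitly skipped.
    plan = [] if skip_check else [("check", a, h) for a, h in check_plan]
    # Referenced artifacts are not shipped by default.
    plan += [("ship", artifact, h) for h in ship_handlers.get(artifact, [])]
    return plan


site = "netlify-site:/docs"
gated = ship_plan(site, {site: ["deploy"]}, [(site, "build")])
ungated = ship_plan(site, {site: ["deploy"]}, [(site, "build")], skip_check=True)
```

The gate is expressed as ordering inside one plan here; a policy layer could equally enforce it as a separate precondition.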

`up`

up(A) is closer to ship(A) than to check(A).

Current leading rule:

include local up handlers on A
do not recursively follow all references by default
likely require check(A) or equivalent readiness checks first

Verb Policy

Workspace or user policy may add gates and ordering on top of core verb semantics.

Examples:

require check before ship
require explicit confirmation or target selection for production ship
default ship target to preview rather than prod

Policy should refine orchestration, not redefine the core meaning of a verb.

Workspace Access

Verb methods must not accept Workspace arguments.

Allowed:

constructors and discovery helpers may accept Workspace
non-verb helper methods may accept Workspace
an object may store Workspace in a field

Forbidden:

verb methods such as check, generate, ship, up taking Workspace

This forces an explicit tradeoff:

precise artifact: materialize Directory / File inputs early
dynamic artifact: store Workspace in a field and accept root-path provenance at /
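The Workspace Access rule is mechanical enough to lint for. The sketch below assumes a simplified method description of name, verb annotation, and argument type names; that tuple shape and the `workspace_violations` helper are hypothetical.

```python
VERBS = {"check", "generate", "ship", "up"}


def workspace_violations(methods):
    """methods: list of (name, verb or None, [argument type names]).

    Returns the names of verb methods that take a Workspace argument.
    """
    return [
        name
        for name, verb, arg_types in methods
        if verb in VERBS and "Workspace" in arg_types
    ]


methods = [
    ("lint", "check", ["Workspace"]),    # forbidden: verb method
    ("discover", None, ["Workspace"]),   # allowed: non-verb helper
    ("deploy", "ship", ["String"]),      # allowed: no Workspace argument
]
violations = workspace_violations(methods)
```

Only `lint` is flagged; the discovery helper keeps its Workspace argument because it is not a verb method.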

Shipping in CI

Artifacts and verbs make ship targetable, but they do not by themselves settle how shipping should work in CI.

The main tensions are:

environment specificity:
- the same artifact may ship to preview, staging, or prod depending on context
- PR workflows should skew toward preview/dev, not production
dependency policy:
- check can recurse over references
- ship likely needs stricter and sometimes explicit dependencies
workflow shape:
- some teams will want a custom declarative workflow that composes generate, check, ship, and approvals
- it is still open whether that belongs in schema, workspace config, or external CI
safety and policy:
- manual approval, secret availability, branch/event gating, and protected environments all affect what ship should do

The current design intent is:

keep core ship semantics narrow at the artifact layer
let policy/workflow layers add target selection, approvals, and extra dependencies
avoid baking one CI workflow model into the artifact foundation too early

This document deliberately plants the flag here without fully solving:

how to express explicit ship dependencies
how CI context selects ship targets
how far ship should imply check, generate, or other gates
whether custom declarative workflows become a first-class Dagger concept

CLI Model

The default CLI should stay small:

find artifacts
filter by provenance
run verbs

The main commands are:

dagger artifact list
dagger artifact inspect
dagger check
dagger generate
dagger ship
dagger up

Default behavior should center collection-backed artifacts.

Rooted object-graph singletons should remain reachable, but as a more explicit/power-user path:

inspect/debug flows
lower-level object targeting

dagger call should remain backward-compatible and procedural by default.

If artifact/object targeting is added to dagger call, it should be additive rather than a silent reinterpretation of existing call syntax.

The key UX rule is:

address syntax means exact artifact identity
--path means provenance filtering

These must not blur together, even when an artifact's collection key is itself a workspace path.

Provenance

Workspace provenance is the major v1 feature.

Provenance is best understood as artifact metadata, not as a graph edge.

That means:

provenance is stored on artifacts
path/diff/git matching is computed at query time
provenance-based matches are not first-class permanent relations in the object graph

Examples of query-time provenance predicates:

"matches path ./docs"
"matches diff HEAD~1..HEAD"
"overlaps this changed path set"
"is entirely contained within this path set"
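These predicates can be sketched over a set of recorded workspace paths. The matching semantics below (a path matches when either side covers the other) are an assumption for illustration, as are all function names.

```python
import posixpath


def covers(prefix, path):
    """True if `path` equals `prefix` or lies beneath it."""
    prefix, path = posixpath.normpath(prefix), posixpath.normpath(path)
    return prefix == "/" or path == prefix or path.startswith(prefix + "/")


def matches_path(provenance, query):
    # Related if a recorded read covers the query path or lies within it.
    return any(covers(p, query) or covers(query, p) for p in provenance)


def overlaps(provenance, changed):
    # True if any changed path is related to any recorded read.
    return any(matches_path(provenance, c) for c in changed)


def contained_within(provenance, path_set):
    # Every recorded read must fall under some path in the set.
    return all(any(covers(s, p) for s in path_set) for p in provenance)


docs = {"/docs"}
```

A diff predicate such as `HEAD~1..HEAD` would first be resolved to a changed path set and then evaluated with `overlaps`; nothing about the match is stored back onto the artifact.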

Source of Provenance

V1 provenance comes from workspace API reads such as:

workspace.directory(...)
workspace.file(...)

This is intentionally narrow:

not general lineage
not git metadata itself
not module ownership

Git- and path-based selection are derived by evaluating predicates against stored provenance; they are not themselves provenance records.

Provenance Union

Provenance unions across fields and composed values.

In general:

provenance({foo: A, bar: B}) = union(provenance(A), provenance(B))
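A minimal sketch of the union rule, assuming objects are modeled as dicts, composed values as lists or tuples, and leaves that carry provenance as frozensets of recorded paths. That encoding is hypothetical and chosen only to keep the example short.

```python
def provenance(value):
    """Union of workspace paths recorded anywhere inside `value`."""
    if isinstance(value, dict):  # an object with fields
        return frozenset().union(*(provenance(v) for v in value.values()))
    if isinstance(value, (list, tuple)):  # a composed value
        return frozenset().union(*(provenance(v) for v in value))
    if isinstance(value, frozenset):  # a leaf carrying its recorded reads
        return value
    return frozenset()  # plain scalars carry no provenance


A = frozenset({"/docs"})
B = frozenset({"/cmd/api", "/go.mod"})
combined = provenance({"foo": A, "bar": B})
```

Fields without provenance contribute nothing, so adding a plain string or number to an object does not widen its provenance.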

Root-Path Provenance

Storing Workspace is allowed, but it taints the object with root-path provenance at /.

That means:

it matches all path and git filters
it loses precise source filtering
it should also be treated as workspace-sensitive for caching

Caching Semantics

Workspace-sensitive calls already have special cache behavior today when Workspace is injected as a function argument.

The same semantic taint should extend to stored Workspace fields:

if a function takes Workspace, it is workspace-sensitive
if a function operates on an object that stores Workspace, it is also workspace-sensitive

This taint should affect both:

artifact filtering semantics (root-path provenance at /)
downstream cache sensitivity

The exact cache-key mechanism is an implementation detail, but the semantic rule should be the same in both cases.
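Both effects of the taint can be written down as two small rules. This sketch says nothing about the cache-key mechanism itself; the function names and the use of type-name lists are assumptions.

```python
def is_workspace_sensitive(arg_types, receiver_field_types):
    """True if a call sees Workspace via an argument or a stored field."""
    return "Workspace" in arg_types or "Workspace" in receiver_field_types


def effective_provenance(recorded_paths, field_types):
    """Provenance used for filtering, after applying the Workspace taint."""
    if "Workspace" in field_types:
        # Storing Workspace collapses provenance to the root path.
        return frozenset({"/"})
    return frozenset(recorded_paths)


precise = effective_provenance({"/charts/app"}, ["Directory"])
tainted = effective_provenance({"/charts/app"}, ["Directory", "Workspace"])
```

An object holding only a materialized Directory keeps its precise paths, while the same object with a stored Workspace field is filtered as if it read `/`.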

Artifacts are not a second invalidation system. Provenance exists for UX and orchestration, not to replace the engine's existing content-addressed execution model.

Examples

Precise Artifact

helm-dev already follows the desired precise pattern:

constructor receives Workspace
it materializes ws.Directory(chartPath) into a field
artifact verbs operate on that field

This yields precise provenance and precise filtering.

Dynamic Artifact

Current workspace-API dogfooding also shows the coarse pattern:

markdownlint is a good candidate for a MarkdownFiles collection:
- MarkdownFile.check() can lint one file
- MarkdownFiles.check() can lint the selected set in one runner
netlify discovers Site objects from the workspace, but verbs reopen workspace state later
docusaurus discovers Site objects and also performs just-in-time runtime workspace tracing

These are valid patterns, but they effectively root provenance at / unless and until they materialize more precise inputs.

Non-Goals

V1 artifacts do not attempt to solve everything implied by the broader composition vision.

In particular, this document does not define:

a general built-artifact namespace
full artifact-to-artifact composition across produced outputs
sequencing and dependency rules between verbs
runtime-discovered provenance as a first-class model

Those are important directions, but they should not blur the v1 artifact foundation.

Open Questions

Should runtime-discovered provenance become a second layer later? docusaurus suggests a real use case, but it is not a good foundation for v1 filtering semantics.
How far should the artifact model extend beyond source-backed workspace objects to built outputs such as containers, packages, and services?
What is the best concrete syntax for structural object traversal?
Should the address vocabulary eventually be generalized to other cross-cutting typed address families beyond WorkspacePath?

Transition

Implementation should proceed in this order:

Add collection semantics (+items, +lookup, +key) to the rooted object model.
Add collection-level verb handlers and planner rules that let them pre-empt naive item expansion.
Add workspace provenance on Directory / File.
Add Workspace-taint semantics for stored Workspace fields.
Build artifact enumeration and filtering over collection-backed artifacts whose keys use recognized cross-cutting address types, especially path/git filtering.
Rebase workspace.checks() / workspace.generators() and CLI commands on that collection-centered model.

Optional parallel track:

pursue multiple rooted constructors and cleaner root schema projection independently of the artifact-addressing work

This keeps the current UX working while replacing the old special check/generator tree with a more object-native, collection-centered model underneath.

Tracking

Active themes, checkpoints, and the running discussion log live in workspace-artifacts-tracking.md.

shykes/workspace-artifacts-gist.md
