ProcessHub: GitHub, but for bioprocesses

Core premise: each Process is a repo-like object with its own history, forks, releases, CI pipeline, and machine-readable specification.

Platform stance: no inherent restriction on what’s stored — attributes (like safety_class, hazard_rating, biosafety_level) are part of the data model, not a platform hard limit.

1) Core concepts

Users / Orgs
As on GitHub: profiles, contribution graphs, following, and org-owned repos.
Processes (the "repos")
A self-contained project describing what can be made, from what, and how.
- Contains: README.md, graph.yaml, nodes/, edges/, unitops.lock, provenance.yaml, metadata.yaml, LICENSE.
- Has version history, releases, tags.
- Metadata includes attributes like:
  - process_type: synthesis, purification, analysis
  - safety_class: unrestricted / controlled / hazardous
  - hazard_rating: numeric scale (0–5)
  - biosafety_level: BSL-1, BSL-2, etc.
  - execution_ready: true/false
UnitOp Registry
Shared, versioned definitions of reusable Unit Operations (e.g., RP-HPLC abstract, solvent extraction).
Processes can import UnitOps at a specific version.
Forks, PRs, and Reviews
Same mechanics as GitHub — propose edits, merge after review.
Pipelines (CI)
Validates schemas, checks graph connectivity, enforces org-defined policies (could block certain hazard ratings in public orgs, for example).

2) Process structure

/
├─ README.md
├─ graph.yaml                # DAG of nodes/edges
├─ nodes/                    # materials, products, hosts, etc.
├─ edges/                    # transformations & separations
├─ unitops.lock              # pinned UnitOps
├─ provenance.yaml           # references, contributors, lineage
├─ metadata.yaml             # attributes like safety_class, hazard_rating
└─ LICENSE
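
Since graph.yaml is described as a DAG of nodes and edges, a loader would presumably want to confirm that the file really is one. The sketch below is only an illustration. The layout it assumes (a `nodes` list of ids and an `edges` list of `source`/`target` pairs) and the name `check_graph` are hypothetical, since the schema of graph.yaml is not defined here.

```python
from collections import defaultdict, deque

def check_graph(graph):
    """Return a list of problems found in a graph dict (assumed layout)."""
    nodes = set(graph["nodes"])
    problems = []
    connected = set()
    successors = defaultdict(list)
    indegree = {n: 0 for n in nodes}

    for edge in graph["edges"]:
        src, dst = edge["source"], edge["target"]
        for end in (src, dst):
            if end not in nodes:
                problems.append(f"edge references unknown node: {end}")
        if src in nodes and dst in nodes:
            successors[src].append(dst)
            indegree[dst] += 1
            connected.update((src, dst))

    # Nodes that no valid edge touches are reported as dangling.
    for n in sorted(nodes - connected):
        problems.append(f"dangling node: {n}")

    # Kahn's algorithm: if some nodes can never be removed, there is a cycle.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        n = queue.popleft()
        visited += 1
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    if visited != len(nodes):
        problems.append("graph contains a cycle")
    return problems
```

An empty list means the graph passed both checks under these assumptions.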

metadata.yaml example

id: process.c15_0.enrichment
label: Odd-chain fatty acid enrichment
process_type: purification
safety_class: controlled # controlled, unrestricted, hazardous
hazard_rating: 2 # scale 0–5
biosafety_level: BSL-1
execution_ready: false
tags: [lipid, fatty_acid, chromatography]
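
A schema check over these fields could look roughly like the following. It assumes the file has already been parsed into a plain dict, takes the allowed values from the lists and comments in this document, and assumes biosafety levels run from BSL-1 to BSL-4. The function name `validate_metadata` is hypothetical.

```python
import re

# Allowed values as listed in this document.
PROCESS_TYPES = {"synthesis", "purification", "analysis"}
SAFETY_CLASSES = {"unrestricted", "controlled", "hazardous"}

def validate_metadata(meta):
    """Return a list of error strings; an empty list means the dict passed."""
    errors = []
    if meta.get("process_type") not in PROCESS_TYPES:
        errors.append("process_type must be one of " + ", ".join(sorted(PROCESS_TYPES)))
    if meta.get("safety_class") not in SAFETY_CLASSES:
        errors.append("safety_class must be one of " + ", ".join(sorted(SAFETY_CLASSES)))
    rating = meta.get("hazard_rating")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(rating, int) or isinstance(rating, bool) or not 0 <= rating <= 5:
        errors.append("hazard_rating must be an integer from 0 to 5")
    # Assumption: levels follow the BSL-1 .. BSL-4 naming.
    if not re.fullmatch(r"BSL-[1-4]", str(meta.get("biosafety_level", ""))):
        errors.append("biosafety_level must look like BSL-1 .. BSL-4")
    if not isinstance(meta.get("execution_ready"), bool):
        errors.append("execution_ready must be true or false")
    return errors
```

Returning a list of messages rather than raising on the first failure lets a CI run report every problem in one pass.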

3) Example UnitOp (registry entry)

id: rp_hplc
version: "2.1.0"
label: Reverse-phase HPLC
inputs: [material:any_liquid_sample]
outputs: [material:fraction_collection]
parameters:
  - name: column_type
    type: enum
    values: ["C8", "C18", "polymer_reversed"]
  - name: detection_mode
    type: enum
    values: ["UV", "MS", "ELSD", "none"]
attributes:
  hazard_rating: 1
  safety_class: unrestricted
license: "Apache-2.0"
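
To illustrate how a Process might use such an entry, the sketch below checks chosen parameter values against the enum declarations above. The registry entry is reproduced as a Python dict for self-containment. `check_parameters` and the shape of the `chosen` mapping are assumptions, not part of the registry format.

```python
# The rp_hplc entry from above, reduced to the fields this check needs.
RP_HPLC = {
    "id": "rp_hplc",
    "version": "2.1.0",
    "parameters": [
        {"name": "column_type", "type": "enum",
         "values": ["C8", "C18", "polymer_reversed"]},
        {"name": "detection_mode", "type": "enum",
         "values": ["UV", "MS", "ELSD", "none"]},
    ],
}

def check_parameters(unitop, chosen):
    """Compare chosen parameter values against the UnitOp's declarations."""
    declared = {p["name"]: p for p in unitop["parameters"]}
    errors = []
    for name, value in chosen.items():
        spec = declared.get(name)
        if spec is None:
            errors.append(f"unknown parameter: {name}")
        elif spec["type"] == "enum" and value not in spec["values"]:
            errors.append(f"{name}={value!r} not in {spec['values']}")
    return errors

# Hypothetical usage: a Process picks a column and a detector.
print(check_parameters(RP_HPLC, {"column_type": "C18", "detection_mode": "UV"}))
```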

4) CI / Policy enforcement

Core validators (always on):
- Schema checks (graph, node, edge, UnitOp)
- Graph integrity (no dangling nodes)
- Provenance completeness
Org-defined policies (optional):
- Allow/block certain safety_class values in repos
- Auto-flag processes above certain hazard_rating
- Require biosafety_level to be declared

This means the platform itself does not prohibit content; it gives communities and orgs the knobs to set their own thresholds.
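
One way to picture these knobs is a small per-org policy object evaluated against a Process's metadata. Everything named here (`OrgPolicy`, its fields, `evaluate`) is hypothetical, and treating a missing biosafety_level as blocking is an assumption about how "require" would behave.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

@dataclass
class OrgPolicy:
    """Hypothetical per-org settings mirroring the three optional policies."""
    blocked_safety_classes: Set[str] = field(default_factory=set)
    flag_hazard_above: Optional[int] = None
    require_biosafety_level: bool = False

def evaluate(policy, meta):
    """Return (blocked, flagged, reasons) for one Process's metadata dict."""
    blocked = False
    flagged = False
    reasons = []
    safety_class = meta.get("safety_class")
    if safety_class in policy.blocked_safety_classes:
        blocked = True
        reasons.append(f"safety_class {safety_class} is blocked")
    # Assumption: an undeclared level blocks the merge rather than only warning.
    if policy.require_biosafety_level and not meta.get("biosafety_level"):
        blocked = True
        reasons.append("biosafety_level is not declared")
    rating = meta.get("hazard_rating", 0)
    if policy.flag_hazard_above is not None and rating > policy.flag_hazard_above:
        flagged = True
        reasons.append(f"hazard_rating {rating} exceeds {policy.flag_hazard_above}")
    return blocked, flagged, reasons

# A public org that blocks hazardous processes and flags ratings above 1.
public_org = OrgPolicy({"hazardous"}, flag_hazard_above=1, require_biosafety_level=True)
print(evaluate(public_org, {"safety_class": "controlled",
                            "hazard_rating": 2,
                            "biosafety_level": "BSL-1"}))
```

A default `OrgPolicy()` blocks and flags nothing, which matches the idea that restrictions are opt-in per org.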

5) Discovery & search

Filter by:
- Target product / intermediate
- UnitOps used
- Safety attributes (safety_class=unrestricted)
- Hazard rating range
- Host organism
- Graph patterns
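
A few of these filters can be sketched as a plain function over in-memory records. The record shape (metadata fields plus a `unitops` list), the function name `search`, and the second catalog entry are all made up for illustration. A real index would likely sit behind a query service.

```python
def search(processes, safety_class=None, hazard_range=None, unitop=None):
    """Return the processes that match every filter that was supplied."""
    results = []
    for p in processes:
        if safety_class is not None and p.get("safety_class") != safety_class:
            continue
        if hazard_range is not None:
            low, high = hazard_range
            rating = p.get("hazard_rating")
            # Records without a rating never match a range filter.
            if rating is None or not low <= rating <= high:
                continue
        if unitop is not None and unitop not in p.get("unitops", []):
            continue
        results.append(p)
    return results

# Hypothetical catalog: the first id comes from the metadata example,
# the second is invented for this sketch.
catalog = [
    {"id": "process.c15_0.enrichment", "safety_class": "controlled",
     "hazard_rating": 2, "unitops": ["rp_hplc"]},
    {"id": "process.example.demo", "safety_class": "unrestricted",
     "hazard_rating": 0, "unitops": []},
]
print([p["id"] for p in search(catalog, hazard_range=(0, 2), unitop="rp_hplc")])
```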

6) Execution adapters

Public Processes remain descriptive specs.
Orgs can attach private adapters mapping specs to LIMS or lab robots.
execution_ready: true marks processes that have been run successfully in at least one environment, with a pointer to the private adapter used.

7) Why this model works

Keeps safety and hazard as data, not a global constraint.
Makes the platform useful for research, industry, and education without hardcoding exclusions.
Allows federated communities to run their own governance models.
