ProcessHub: GitHub, but for bioprocesses

Core premise: each Process is a repo-like object with its own history, forks, releases, CI pipeline, and machine-readable specification.

Platform stance: the platform places no inherent restriction on what is stored. Attributes such as safety_class, hazard_rating, and biosafety_level are part of the data model, not hard limits imposed by the platform.


1) Core concepts

  • Users / Orgs
    Just like GitHub, with profile, contribution graph, following, org repos.

  • Processes (the "repos")
    A self-contained project describing what can be made, from what, and how.

    • Contains: README.md, graph.yaml, nodes/, edges/, unitops.lock, provenance.yaml, metadata.yaml, LICENSE.
    • Has version history, releases, tags.
    • Metadata includes attributes like:
      • process_type: synthesis, purification, analysis
      • safety_class: unrestricted / controlled / hazardous
      • hazard_rating: numeric scale
      • biosafety_level: BSL-1, BSL-2, etc.
      • execution_ready: true/false
  • UnitOp Registry
    Shared, versioned definitions of reusable Unit Operations (e.g., an abstract RP-HPLC step, solvent extraction).
    Processes can import UnitOps at a specific version.

  • Forks, PRs, and Reviews
    Same mechanics as GitHub: propose edits, then merge after review.

  • Pipelines (CI)
    Validates schemas, checks graph connectivity, and enforces org-defined policies (for example, a public org could block processes above a given hazard rating).


2) Process structure


/
├─ README.md
├─ graph.yaml # DAG of nodes/edges
├─ nodes/ # materials, products, hosts, etc.
├─ edges/ # transformations & separations
├─ unitops.lock # pinned UnitOps
├─ provenance.yaml # references, contributors, lineage
├─ metadata.yaml # attributes like safety_class, hazard_rating
└─ LICENSE
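
The layout above describes graph.yaml as a DAG of nodes and edges, but does not specify its schema. As a minimal sketch, assuming the graph has already been loaded into a list of node ids and a list of (source, target) pairs (both hypothetical representations), a graph-integrity check could look like this:

```python
from collections import deque


def check_graph(nodes, edges):
    """Return a list of problems found in a process graph.

    nodes: iterable of node ids (assumed representation).
    edges: iterable of (source, target) pairs (assumed representation).
    """
    node_set = set(nodes)
    edges = list(edges)
    problems = []

    # Every edge endpoint must refer to a declared node.
    for src, dst in edges:
        for endpoint in (src, dst):
            if endpoint not in node_set:
                problems.append(
                    f"edge ({src} -> {dst}) references unknown node {endpoint!r}"
                )

    valid_edges = [(s, d) for s, d in edges if s in node_set and d in node_set]

    # Dangling nodes: declared but not touched by any valid edge.
    connected = {n for edge in valid_edges for n in edge}
    for node in sorted(node_set - connected):
        problems.append(f"dangling node {node!r}")

    # Kahn's algorithm: if some nodes can never reach in-degree 0,
    # the graph contains a cycle and is not a DAG.
    indegree = {n: 0 for n in node_set}
    successors = {n: [] for n in node_set}
    for s, d in valid_edges:
        successors[s].append(d)
        indegree[d] += 1
    queue = deque(n for n, deg in indegree.items() if deg == 0)
    visited = 0
    while queue:
        current = queue.popleft()
        visited += 1
        for nxt in successors[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if visited != len(node_set):
        problems.append("graph contains a cycle, so it is not a DAG")

    return problems
```

An empty result means the graph passed all three checks; a real validator would presumably also check node and edge types against the schema.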

metadata.yaml example

id: process.c15_0.enrichment
label: Odd-chain fatty acid enrichment
process_type: purification
safety_class: controlled # controlled, unrestricted, hazardous
hazard_rating: 2 # scale 0–5
biosafety_level: BSL-1
execution_ready: false
tags: [lipid, fatty_acid, chromatography]
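
A schema check for this file might be sketched as follows. The allowed values come from the attribute list in section 1 and the comments in the example; the function name, the BSL-1 to BSL-4 range, and the choice to treat biosafety_level as optional are assumptions made for illustration, not part of the specification.

```python
import re

PROCESS_TYPES = {"synthesis", "purification", "analysis"}
SAFETY_CLASSES = {"unrestricted", "controlled", "hazardous"}
# Assumption: levels follow the conventional BSL-1..BSL-4 scale.
BSL_PATTERN = re.compile(r"BSL-[1-4]")


def validate_metadata(meta):
    """Return a list of human-readable errors for a parsed metadata.yaml dict."""
    errors = []

    if meta.get("process_type") not in PROCESS_TYPES:
        errors.append(f"process_type must be one of {sorted(PROCESS_TYPES)}")

    if meta.get("safety_class") not in SAFETY_CLASSES:
        errors.append(f"safety_class must be one of {sorted(SAFETY_CLASSES)}")

    rating = meta.get("hazard_rating")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(rating, bool) or not isinstance(rating, int) or not 0 <= rating <= 5:
        errors.append("hazard_rating must be an integer from 0 to 5")

    # Assumption: biosafety_level is optional unless an org policy requires it.
    bsl = meta.get("biosafety_level")
    if bsl is not None and not (isinstance(bsl, str) and BSL_PATTERN.fullmatch(bsl)):
        errors.append("biosafety_level must look like 'BSL-1'")

    if not isinstance(meta.get("execution_ready"), bool):
        errors.append("execution_ready must be true or false")

    return errors


example = {
    "id": "process.c15_0.enrichment",
    "label": "Odd-chain fatty acid enrichment",
    "process_type": "purification",
    "safety_class": "controlled",
    "hazard_rating": 2,
    "biosafety_level": "BSL-1",
    "execution_ready": False,
    "tags": ["lipid", "fatty_acid", "chromatography"],
}
print(validate_metadata(example))  # prints []
```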

3) Example UnitOp (registry entry)

id: rp_hplc
version: "2.1.0"
label: Reverse-phase HPLC
inputs: [material:any_liquid_sample]
outputs: [material:fraction_collection]
parameters:
  - name: column_type
    type: enum
    values: ["C8", "C18", "polymer_reversed"]
  - name: detection_mode
    type: enum
    values: ["UV", "MS", "ELSD", "none"]
attributes:
  hazard_rating: 1
  safety_class: unrestricted
license: "Apache-2.0"
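
To illustrate how a Process might be checked against a registry entry like this one, here is a rough sketch that validates supplied parameter values against the declared enums. The function name is hypothetical, and treating every declared parameter as required is an assumption; the registry entry is reproduced as a plain dict to keep the example self-contained.

```python
RP_HPLC = {
    "id": "rp_hplc",
    "version": "2.1.0",
    "parameters": [
        {"name": "column_type", "type": "enum",
         "values": ["C8", "C18", "polymer_reversed"]},
        {"name": "detection_mode", "type": "enum",
         "values": ["UV", "MS", "ELSD", "none"]},
    ],
}


def check_parameters(unitop, supplied):
    """Compare supplied parameter values with a UnitOp's declarations."""
    errors = []
    declared = {p["name"]: p for p in unitop["parameters"]}

    # Reject parameters the UnitOp does not declare.
    for name in supplied:
        if name not in declared:
            errors.append(f"unknown parameter {name!r}")

    # Assumption: every declared parameter must be supplied.
    for name, spec in declared.items():
        if name not in supplied:
            errors.append(f"missing parameter {name!r}")
        elif spec["type"] == "enum" and supplied[name] not in spec["values"]:
            errors.append(
                f"parameter {name!r} must be one of {spec['values']}, "
                f"got {supplied[name]!r}"
            )
    return errors


print(check_parameters(RP_HPLC, {"column_type": "C18", "detection_mode": "UV"}))
# prints []
```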

4) CI / Policy enforcement

  • Core validators (always on):

    • Schema checks (graph, node, edge, UnitOp)
    • Graph integrity (no dangling nodes)
    • Provenance completeness
  • Org-defined policies (optional):

    • Allow/block certain safety_class values in repos
    • Auto-flag processes above certain hazard_rating
    • Require biosafety_level to be declared

In other words, the platform itself does not prohibit content; it gives communities and orgs the controls to set their own thresholds.
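
The three optional policies listed above could be modeled roughly as follows. This is a sketch under stated assumptions: OrgPolicy, its field names, and evaluate_policy are hypothetical, and the split between blocking errors and non-blocking flags is one possible reading of "allow/block" versus "auto-flag".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OrgPolicy:
    """Hypothetical per-org policy settings; all are off by default."""
    blocked_safety_classes: frozenset = frozenset()
    flag_hazard_rating_above: Optional[int] = None
    require_biosafety_level: bool = False


def evaluate_policy(policy, meta):
    """Apply an org policy to a process's metadata dict."""
    errors, flags = [], []

    # Blocked safety classes fail the pipeline.
    safety_class = meta.get("safety_class")
    if safety_class in policy.blocked_safety_classes:
        errors.append(f"safety_class {safety_class!r} is blocked in this org")

    # A missing biosafety_level fails only if the org requires one.
    if policy.require_biosafety_level and not meta.get("biosafety_level"):
        errors.append("biosafety_level must be declared")

    # High hazard ratings are flagged for review but do not fail the pipeline.
    threshold = policy.flag_hazard_rating_above
    rating = meta.get("hazard_rating")
    if threshold is not None and rating is not None and rating > threshold:
        flags.append(f"hazard_rating {rating} exceeds {threshold}")

    return {"passed": not errors, "errors": errors, "flags": flags}
```

With the default OrgPolicy() nothing is blocked or flagged, which mirrors the platform's stance that restrictions are opt-in per org.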


5) Search and discovery

Filter by:

  • Target product / intermediate
  • UnitOps used
  • Safety attributes (e.g., safety_class=unrestricted)
  • Hazard rating range
  • Host organism
  • Graph patterns
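
Several of these filters can be sketched as a simple predicate over indexed process records. The record shape here (a metadata dict extended with hypothetical "unitops" and "products" keys) and the function names are assumptions; host-organism and graph-pattern matching are left out because the source does not describe how they are indexed.

```python
def matches(process, safety_class=None, hazard_range=None, unitops=(), product=None):
    """Return True if a process record satisfies every supplied filter."""
    if safety_class is not None and process.get("safety_class") != safety_class:
        return False

    if hazard_range is not None:
        low, high = hazard_range
        rating = process.get("hazard_rating")
        if rating is None or not low <= rating <= high:
            return False

    # Every requested UnitOp must be used by the process.
    if not set(unitops) <= set(process.get("unitops", ())):
        return False

    if product is not None and product not in process.get("products", ()):
        return False

    return True


def search(processes, **filters):
    """Return the process records that pass all filters."""
    return [p for p in processes if matches(p, **filters)]


catalog = [
    {"id": "process.c15_0.enrichment", "safety_class": "controlled",
     "hazard_rating": 2, "unitops": ["rp_hplc"], "products": ["c15_0"]},
    {"id": "process.demo", "safety_class": "unrestricted",
     "hazard_rating": 0, "unitops": [], "products": []},
]
hits = search(catalog, hazard_range=(0, 2), unitops=["rp_hplc"])
print([p["id"] for p in hits])  # prints ['process.c15_0.enrichment']
```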

6) Execution adapters

  • Public Processes remain descriptive specs.
  • Orgs can attach private adapters mapping specs to LIMS or lab robots.
  • execution_ready: true marks a process that has been run successfully in at least one environment, and points to the private adapter that ran it.
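
One way to picture the adapter boundary is a small interface that a private integration implements. Everything named here (ExecutionAdapter, RecordingAdapter, mark_execution_ready, and the step strings) is hypothetical; the recording adapter is only a stand-in for a real LIMS or robot integration, which the source does not describe.

```python
from typing import Protocol


class ExecutionAdapter(Protocol):
    """Hypothetical interface a private adapter would implement."""

    def run(self, process_id: str, steps: list) -> bool:
        """Execute the steps and report whether the run succeeded."""
        ...


class RecordingAdapter:
    """Stand-in adapter that records steps instead of driving equipment."""

    def __init__(self):
        self.log = []

    def run(self, process_id, steps):
        for step in steps:
            self.log.append(f"{process_id}: {step}")
        return True


def mark_execution_ready(meta, adapter, steps):
    """Return a copy of meta with execution_ready set from the run result."""
    succeeded = adapter.run(meta["id"], steps)
    return {**meta, "execution_ready": succeeded}


adapter = RecordingAdapter()
meta = {"id": "process.c15_0.enrichment", "execution_ready": False}
updated = mark_execution_ready(meta, adapter, ["rp_hplc@2.1.0"])
print(updated["execution_ready"])  # prints True
```

Keeping the adapter behind an interface means the public spec never needs to know which system ran it, which fits the idea that public Processes remain descriptive.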

7) Why this model works

  • Keeps safety and hazard as data, not a global constraint.
  • Makes the platform useful for research, industry, and education without hardcoding exclusions.
  • Allows federated communities to run their own governance models.