Distribution Pipeline Methodology
Note: First draft — to be expanded with concrete examples (e.g. slope persistence, ATR).
Overview:
This article describes how studies in QLIR can be treated as a functional pipeline that builds and transforms distributions step-by-step.
Each stage in the pipeline produces a new distribution that can be inspected, compared, or passed forward as input to the next transformation.
🧩 1. Concept
A distribution pipeline is a deterministic sequence of transformations:
- D_0: the global baseline distribution.
- T_i: a conditioning or transformation function (add/remove condition, resample, re-weight).
- D_i: the resulting conditional distribution at stage i, so that D_i = T_i(D_{i-1}).
Each step has a purpose — to clarify a hypothesis, reduce variance, isolate a context, or expose a deformation in likelihood space.
🖼️ Placeholder: Functional Pipeline Diagram
Visual: boxes labeled Global → Condition A → Condition B → Condition C, each with small distribution sketch beneath.
🧮 2. Pseudo-Code Representation
A conceptual pseudocode block makes this concrete:
```python
# Baseline: build global distribution for slope persistence
D0 = get_distribution(feature="slope_persistence")

# Step 1: add volatility condition
D1 = condition(D0, where="ATR_percentile > 0.8")

# Step 2: add RSI condition
D2 = condition(D1, where="RSI > 70")

# Step 3: remove temporal condition (broadens slice)
D3 = remove_condition(D2, name="session")

# Evaluate shape differences
compare_distributions([D0, D1, D2, D3])
```
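The pseudocode above can be made concrete under some assumptions. `get_distribution`, `condition`, and `remove_condition` are not real APIs here; the sketch below represents each stage as an explicit, named set of filter expressions over a synthetic pandas DataFrame (all column names are hypothetical), which also illustrates why removing a condition by name is auditable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic stand-in for a study dataset; all column names are assumptions.
BASE = pd.DataFrame({
    "slope_persistence": rng.normal(0.0, 0.02, 5000),
    "ATR_percentile": rng.uniform(0.0, 1.0, 5000),
    "RSI": rng.uniform(0.0, 100.0, 5000),
})

def materialize(conditions):
    """Apply a named set of filter expressions to the baseline; return the slice."""
    out = BASE
    for expr in conditions.values():
        out = out.query(expr)
    return out["slope_persistence"]

# Each stage is just the baseline plus an explicit, auditable condition set.
D0 = {}
D1 = {**D0, "volatility": "ATR_percentile > 0.8"}
D2 = {**D1, "momentum": "RSI > 70"}
# Removing a named condition broadens the slice again.
D3 = {k: v for k, v in D2.items() if k != "volatility"}

for name, conds in [("D0", D0), ("D1", D1), ("D2", D2), ("D3", D3)]:
    s = materialize(conds)
    print(f"{name}: n={len(s)} median={s.median():+.4f} sigma={s.std():.4f}")
```

Because each stage is a plain dictionary of named conditions, any distribution in the chain can be reproduced exactly from the baseline, which is the auditability property discussed below.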
Each stage D_i can expose:
- Shape: symmetry, skew, kurtosis.
- Location: median shift.
- Spread: σ changes.
- Mass redistribution: probability moving between regions.
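The four shape properties listed above can be computed directly from an empirical sample. A minimal sketch using `scipy.stats` on synthetic data (the two samples are illustrative, not from a real study):

```python
import numpy as np
from scipy.stats import kurtosis, skew

rng = np.random.default_rng(1)
d0 = rng.normal(0.0, 0.02, 10_000)      # symmetric baseline stage
d1 = rng.lognormal(-4.0, 0.5, 10_000)   # right-skewed conditioned stage

def shape_report(d):
    return {
        "median": float(np.median(d)),  # location
        "sigma": float(np.std(d)),      # spread
        "skew": float(skew(d)),         # asymmetry
        "kurtosis": float(kurtosis(d)), # excess kurtosis (0 for a Gaussian)
    }

print(shape_report(d0))
print(shape_report(d1))
```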
🖼️ Placeholder: Stepwise Comparison Plot Visual: overlaid histograms showing deformation at each pipeline step.
🧠 3. Why Pipelines Matter
- They formalize reasoning by successive conditioning.
- Each step is explicit — no hidden filters or accidental data leakage.
- They allow auditability (you can reproduce any distribution).
- They provide explainability for decision engines (“this prediction derives from these conditions applied to this baseline”).
🧭 4. Evaluating Output at Each Stage
At each node D_i:

| Metric | Purpose |
|---|---|
| median(D_i) | central tendency |
| σ(D_i) | overall dispersion |
| skew(D_i) | asymmetry |
| kurtosis(D_i) | tail heaviness |
| Δμ = median(D_i) − median(D_{i-1}) | shift from previous stage |
| Δσ = σ(D_i) / σ(D_{i-1}) | relative volatility |
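The stage-to-stage deltas in the table can be sketched as a small diagnostic over an ordered list of stage samples (the three samples below are synthetic placeholders):

```python
import numpy as np

def stage_deltas(stages):
    """Δμ = median shift and Δσ = dispersion ratio between successive stages."""
    rows = []
    for prev, cur in zip(stages, stages[1:]):
        rows.append({
            "delta_mu": float(np.median(cur) - np.median(prev)),
            "delta_sigma": float(np.std(cur) / np.std(prev)),
        })
    return rows

rng = np.random.default_rng(2)
# Hypothetical pipeline: each conditioning step shifts the median up and tightens spread.
D = [rng.normal(m, s, 5000) for m, s in [(0.0, 0.02), (0.004, 0.015), (0.007, 0.012)]]
for row in stage_deltas(D):
    print(row)
```

A Δσ below 1 indicates the condition narrowed the distribution; above 1, that it broadened it.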
🖼️ Placeholder: Metric Table Example Visual: small table showing how median and σ evolve through pipeline steps.
🔄 5. Non-Normal and Asymmetric Shapes
Distributions in markets are rarely Gaussian. The pipeline method doesn’t assume normality — instead it stores full percentile arrays or empirical CDFs for each step.
Example representation:
```json
{
  "percentiles": {
    "0.01": -0.045,
    "0.05": -0.021,
    "0.25": -0.010,
    "0.50": 0.002,
    "0.75": 0.014,
    "0.95": 0.035,
    "0.99": 0.058
  }
}
```
This structure allows comparisons like “after applying Condition B, the 95th percentile moved +0.012 relative to baseline.”
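Building and comparing such percentile records requires no distributional assumption. A sketch using `numpy.quantile` on synthetic baseline and conditioned samples (both hypothetical):

```python
import json
import numpy as np

PCTS = [0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99]

def percentile_record(sample):
    """Store a distribution as an empirical percentile array (no normality assumed)."""
    qs = np.quantile(sample, PCTS)
    return {"percentiles": {f"{p:.2f}": round(float(q), 4) for p, q in zip(PCTS, qs)}}

rng = np.random.default_rng(3)
baseline = rng.normal(0.0, 0.02, 20_000)
conditioned = rng.normal(0.005, 0.025, 20_000)  # hypothetical post-condition slice

rec_a, rec_b = percentile_record(baseline), percentile_record(conditioned)
shift_95 = rec_b["percentiles"]["0.95"] - rec_a["percentiles"]["0.95"]
print(json.dumps(rec_b, indent=2))
print(f"95th percentile moved {shift_95:+.4f} relative to baseline")
```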
🖼️ Placeholder: Non-Normal Distribution Sketch Visual: skewed density curve highlighting percentiles.
⚙️ 6. Linking Back to Conditionality
- Conditionality defines what each transformation means.
- The pipeline defines how those transformations are chained and evaluated.
Together they form the backbone of regime-aware analytics:
“Start global → apply specific conditions → observe deformation → interpret deviation.”
🖼️ Placeholder: Combined Overview Diagram Visual: flow showing integration of the “Conditional vs. Non-Conditional” concept with the functional pipeline.
🔗 Related Pages
Summary
A distribution pipeline is a stepwise process for generating, conditioning, and evaluating empirical distributions. Each transformation narrows or broadens the statistical landscape, revealing how context alters likelihoods. By chaining these transformations, we can trace how real-time market conditions project onto historical probability space.