
Distribution Pipeline Methodology

Note: First draft — to be expanded with concrete examples (e.g. slope persistence, ATR).

Overview:
This article describes how studies in QLIR can be treated as a functional pipeline that builds and transforms distributions step-by-step.
Each stage in the pipeline produces a new distribution that can be inspected, compared, or passed forward as input to the next transformation.


🧩 1. Concept

A distribution pipeline is a deterministic sequence of transformations:

\[ D_0 \xrightarrow{f_1} D_1 \xrightarrow{f_2} D_2 \xrightarrow{f_3} \cdots \xrightarrow{f_n} D_n \]
  • \( D_0 \): the global baseline distribution.
  • \( f_i \): a conditioning or transformation function (add/remove a condition, resample, re-weight).
  • \( D_i \): the resulting conditional distribution at stage \( i \).

Each step has a purpose — to clarify a hypothesis, reduce variance, isolate a context, or expose a deformation in likelihood space.

🖼️ Placeholder: Functional Pipeline Diagram
Visual: boxes labeled Global → Condition A → Condition B → Condition C, each with small distribution sketch beneath.
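In code terms, the chain above is just left-to-right function application: fold each \( f_i \) over the previous stage and keep every intermediate result. A minimal sketch — the list-of-samples representation and the filter lambdas are illustrative assumptions, not QLIR internals:

```python
def run_pipeline(d0, transforms):
    """Apply each f_i in order, keeping every intermediate stage D_i."""
    stages = [d0]
    for f in transforms:
        stages.append(f(stages[-1]))
    return stages  # [D0, D1, ..., Dn]

# Toy stages: each D_i is a list of samples, each f_i a narrowing filter.
stages = run_pipeline(
    [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0],
    [
        lambda d: [x for x in d if x > -1.0],  # f1: trim the deep left tail
        lambda d: [x for x in d if x < 1.5],   # f2: trim the right tail
    ],
)
print(stages[0], stages[1], stages[2])
```

Because every intermediate stage is retained, any \( D_i \) can be inspected or compared later — the determinism of the chain is what makes each stage reproducible.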


🧮 2. Pseudo-Code Representation

A conceptual pseudocode block makes this concrete:

# Baseline: build global distribution for slope persistence
D0 = get_distribution(feature="slope_persistence")

# Step 1: add volatility condition
D1 = condition(D0, where="ATR_percentile > 0.8")

# Step 2: add RSI condition
D2 = condition(D1, where="RSI > 70")

# Step 3: remove temporal condition (broadens slice)
D3 = remove_condition(D2, name="session")

# Evaluate shape differences
compare_distributions([D0, D1, D2, D3])
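The primitives above can be made runnable in plain Python. Everything here is an illustrative assumption, not QLIR's actual API: the row data is invented, and named lambda predicates stand in for the string `where` mini-language. Note that each stage keeps a reference to the full row set, so `remove_condition` can genuinely broaden the slice by re-filtering from scratch:

```python
def get_distribution(all_rows, feature):
    """D0: the feature observed over the full row set, no conditions yet."""
    return {"feature": feature, "all_rows": all_rows, "conditions": {}}

def condition(dist, name, predicate):
    """Add a named condition (narrows the slice); returns a new stage."""
    return dict(dist, conditions=dict(dist["conditions"], **{name: predicate}))

def remove_condition(dist, name):
    """Drop a named condition; re-filters from the full row set (broadens)."""
    conds = {k: v for k, v in dist["conditions"].items() if k != name}
    return dict(dist, conditions=conds)

def values(dist):
    """Materialize the stage: feature values of rows passing all conditions."""
    return [r[dist["feature"]] for r in dist["all_rows"]
            if all(p(r) for p in dist["conditions"].values())]

rows = [
    {"slope_persistence": 0.8, "ATR_percentile": 0.9,  "RSI": 75},
    {"slope_persistence": 0.2, "ATR_percentile": 0.5,  "RSI": 40},
    {"slope_persistence": 0.6, "ATR_percentile": 0.85, "RSI": 65},
]
D0 = get_distribution(rows, "slope_persistence")
D1 = condition(D0, "vol", lambda r: r["ATR_percentile"] > 0.8)
D2 = condition(D1, "rsi", lambda r: r["RSI"] > 70)
D3 = remove_condition(D2, "rsi")
print(values(D0), values(D2), values(D3))
```

Since stages are plain values rather than mutated state, no filter is ever hidden: the `conditions` dict of any \( D_i \) is a complete, auditable record of how that slice was produced.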

Each \( D_i \) can expose:

  • Shape: symmetry, skew, kurtosis.
  • Location: median shift.
  • Spread: σ changes.
  • Mass redistribution: probability moving between regions.
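The first three of these can be computed from raw samples with the standard library alone. A hedged sketch using population moments (the helper name is an assumption; `scipy.stats` provides `skew` and `kurtosis` with more options):

```python
import statistics

def shape_summary(xs):
    """Location, spread, and standardized 3rd/4th moments of a sample."""
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs)  # population standard deviation
    n = len(xs)
    return {
        "median": statistics.median(xs),
        "sigma": sd,
        "skew": sum((x - mu) ** 3 for x in xs) / (n * sd ** 3),
        "kurtosis": sum((x - mu) ** 4 for x in xs) / (n * sd ** 4),  # 3.0 for a Gaussian
    }

print(shape_summary([-2.0, -1.0, 0.0, 1.0, 2.0]))  # symmetric sample: skew = 0
```

Mass redistribution is better read off the percentile arrays described in section 5, since a single moment cannot localize where probability moved.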

🖼️ Placeholder: Stepwise Comparison Plot Visual: overlaid histograms showing deformation at each pipeline step.


🧠 3. Why Pipelines Matter

  • They formalize reasoning by successive conditioning.
  • Each step is explicit — no hidden filters or accidental data leakage.
  • They allow auditability (you can reproduce any distribution).
  • They provide explainability for decision engines (“this prediction derives from these conditions applied to this baseline”).

🧭 4. Evaluating Output at Each Stage

At each node \( D_i \):

| Metric | Purpose |
| --- | --- |
| median(D_i) | central tendency |
| σ(D_i) | overall dispersion |
| skew(D_i) | asymmetry |
| kurtosis(D_i) | tail heaviness |
| Δμ = median(D_i) − median(D_{i-1}) | shift from previous stage |
| Δσ = σ(D_i) / σ(D_{i-1}) | relative volatility |
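The two delta rows translate directly into code. A minimal sketch, assuming each stage is available as a list of samples (the helper name and representation are illustrative, not QLIR's):

```python
import statistics

def stage_deltas(stages):
    """Δμ and Δσ of each stage relative to the previous one."""
    return [
        {
            "delta_mu": statistics.median(cur) - statistics.median(prev),
            "sigma_ratio": statistics.pstdev(cur) / statistics.pstdev(prev),
        }
        for prev, cur in zip(stages, stages[1:])
    ]

# D1 keeps only the upper part of D0: median shifts up, dispersion shrinks.
print(stage_deltas([[0, 1, 2, 3, 4], [2, 3, 4]]))
```

A sigma_ratio below 1 means the condition narrowed the distribution; a delta_mu away from zero means it relocated the central tendency — both are direct measures of the "deformation" each pipeline step introduces.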

🖼️ Placeholder: Metric Table Example Visual: small table showing how median and σ evolve through pipeline steps.


🔄 5. Non-Normal and Asymmetric Shapes

Distributions in markets are rarely Gaussian. The pipeline method doesn’t assume normality — instead it stores full percentile arrays or empirical CDFs for each step.

Example representation:

{
  "percentiles": {
    "0.01": -0.045,
    "0.05": -0.021,
    "0.25": -0.010,
    "0.50": 0.002,
    "0.75": 0.014,
    "0.95": 0.035,
    "0.99": 0.058
  }
}

This structure allows comparisons like “after applying Condition B, the 95th percentile moved +0.012 relative to baseline.”
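One way to build and compare such percentile profiles with only the standard library — the helper names are assumptions, the keys reuse the JSON structure above, and `statistics.quantiles` (exclusive method) stands in for whatever quantile estimator QLIR actually uses:

```python
import statistics

def percentile_profile(xs, qs=(0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99)):
    """Empirical percentiles keyed as in the JSON structure above."""
    cuts = statistics.quantiles(xs, n=100)  # 99 cut points: 1st..99th pct
    return {f"{q:.2f}": cuts[round(q * 100) - 1] for q in qs}

def percentile_shift(baseline, conditioned):
    """Signed movement of each percentile after conditioning."""
    return {q: conditioned[q] - baseline[q] for q in baseline}

base = percentile_profile(list(range(101)))
cond = percentile_profile([x + 10 for x in range(101)])
print(percentile_shift(base, cond))  # pure location shift: every percentile moves together
```

A pure location shift moves all percentiles by the same amount; asymmetric deformation shows up as unequal movement across the keys, e.g. the 95th percentile moving while the median stays put.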

🖼️ Placeholder: Non-Normal Distribution Sketch Visual: skewed density curve highlighting percentiles.


⚙️ 6. Linking Back to Conditionality

  • Conditionality defines what each transformation means.
  • The pipeline defines how those transformations are chained and evaluated.

Together they form the backbone of regime-aware analytics:

“Start global → apply specific conditions → observe deformation → interpret deviation.”

🖼️ Placeholder: Combined Overview Diagram Visual: flow showing integration of the “Conditional vs. Non-Conditional” concept with the functional pipeline.



Summary

A distribution pipeline is a stepwise process for generating, conditioning, and evaluating empirical distributions. Each transformation narrows or broadens the statistical landscape, revealing how context alters likelihoods. By chaining these transformations, we can trace how real-time market conditions project onto historical probability space.