Cohort Differentiation (Backward Analysis): Practical Implementation
Summary
Backward analysis begins where forward testing ends.
Rather than applying new filters to change performance, we hold the outcome labels fixed and examine what the world looked like when a trading condition succeeded versus when it failed.
By comparing these contexts, we can discover hidden dependencies, identify environment-sensitive behavior, and build a more robust understanding of why a rule works.
1. Building Event Cohorts
Start from a labeled event table — the output of a forward study:
event_id | ts_start | cond_persisted | atr_z | bb_pos | vwap_dist | htf_slope | volume_z | ...
Where:
- `cond_persisted = 1` → the rule held (success cohort)
- `cond_persisted = 0` → the rule failed (failure cohort)
These two groups define your analytical universe. The key point is that the overall success/failure ratio (e.g., 70/30) stays fixed: backward analysis does not modify the sample; it investigates its composition.
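A minimal sketch of the cohort split, assuming the labeled events sit in a pandas DataFrame loaded from a file (the file name is illustrative):

```python
import pandas as pd

# Load the labeled event table produced by the forward study
# (file name is illustrative).
df = pd.read_csv("events_labeled.csv", parse_dates=["ts_start"])

# Split into the two cohorts; the outcome labels stay fixed from here on.
success = df[df["cond_persisted"] == 1]
failure = df[df["cond_persisted"] == 0]

# The overall ratio is a property of the sample, not something we tune.
print(f"persistence rate: {df['cond_persisted'].mean():.1%} "
      f"({len(success)} successes vs {len(failure)} failures)")
```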
2. Summary Statistics
The first layer of insight comes from simple groupwise comparisons:
| feature | mean_true | mean_false | diff | pct_change | note |
|---|---|---|---|---|---|
| atr_z | 0.8 | 1.4 | -0.6 | -43% | Failures cluster in high-vol regimes |
| bb_pos | 0.25 | -0.1 | +0.35 | +140% | Successes start near upper band |
| vwap_dist | -0.02 | -0.08 | +0.06 | +75% | Distance below VWAP correlates w/ success |
This table format can be generated automatically and expanded with metrics like median, variance, or even correlation-to-outcome.
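One way to generate that table automatically, sketched with pandas and reusing the `df` from step 1:

```python
# Contextual features to compare (names follow the schema above).
features = ["atr_z", "bb_pos", "vwap_dist", "htf_slope", "volume_z"]

# Groupwise means per cohort, then transpose so features are rows.
# Assumes both labels (0 and 1) are present; groupby sorts them 0, 1.
stats = df.groupby("cond_persisted")[features].mean().T
stats.columns = ["mean_false", "mean_true"]
stats["diff"] = stats["mean_true"] - stats["mean_false"]
# Percentage change relative to the failure-cohort mean; using abs()
# keeps the sign interpretable when the baseline is negative.
stats["pct_change"] = stats["diff"] / stats["mean_false"].abs()

print(stats.sort_values("diff", key=abs, ascending=False))
```

Swapping `.mean()` for `.agg(["mean", "median", "var"])` extends the same pipeline to the additional metrics mentioned above.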
3. Distributional Differences
Averages often hide the story. Plot and quantify the full shape of each feature distribution between cohorts:
- Violin plots → show overlap and skewness.
- Histograms / KDEs → show whether one cohort dominates a subrange.
- Kolmogorov–Smirnov statistics or Wasserstein distances → give a numeric measure of how far apart the two distributions are.
For example:
“The ATR_z distribution for failed events shifts right by 0.7 standard deviations — a strong indicator that extreme volatility undermines persistence.”
These metrics help distinguish magnitude effects from shape effects.
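A sketch of the numeric side with SciPy, reusing the `features` list and the cohorts from the earlier steps:

```python
from scipy.stats import ks_2samp, wasserstein_distance

# Quantify the distribution shift per feature between the two cohorts.
for feat in features:
    a = success[feat].dropna()
    b = failure[feat].dropna()
    ks = ks_2samp(a, b)              # statistic in [0, 1], with a p-value
    wd = wasserstein_distance(a, b)  # expressed in the feature's own units
    print(f"{feat:>10}: KS={ks.statistic:.2f} (p={ks.pvalue:.3f}), "
          f"Wasserstein={wd:.3f}")
```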
4. Conditional-on-Conditional Slices
Once you see which features differ, test interactions by applying secondary conditions:
“What happens when ATR_z < 1 and HTF slope > 0?”
This lets you observe whether the earlier relationships persist under new contexts, e.g.:
- Success rate 83% in low-vol + aligned-slope regions.
- Success rate 40% in high-vol + misaligned regions.
Each slice adds dimensional depth — helping you identify stability zones where your rule behaves predictably.
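A sketch of one such slice comparison; the thresholds are illustrative, not recommendations:

```python
# Define two regime slices and compare success rates inside each.
slices = {
    "low-vol + aligned slope":     df.query("atr_z < 1 and htf_slope > 0"),
    "high-vol + misaligned slope": df.query("atr_z >= 1 and htf_slope <= 0"),
}

for name, subset in slices.items():
    rate = subset["cond_persisted"].mean()
    print(f"{name}: {rate:.0%} success across {len(subset)} events")
```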
5. Visualizing Cohort Separation
Visualization is where backward analysis becomes intuitive. You can use two primary techniques:
(a) Pairwise Scatter Plots
Plot pairs of contextual features and color by cond_persisted:
import seaborn as sns
sns.pairplot(df, vars=["atr_z", "vwap_dist", "bb_pos"], hue="cond_persisted")
If the clusters separate visually, that feature pair carries discriminative power.
(b) Dimensionality Reduction (PCA)
When there are many contextual variables, use PCA to reveal geometry:
- PCA rotates the data to find axes of maximum variance.
- Each point represents an event in reduced space (PC1 vs. PC2).
- If true/false events cluster separately, your contextual variables encode meaningful structure.
Interpretation guideline:
| PCA Outcome | Interpretation |
|---|---|
| Clear label separation | Context features explain persistence behavior |
| Mixed overlap | Rule performance is context-independent, or the relationship is non-linear and invisible to a linear projection |
| Low explained variance | Context features are noisy or redundant |
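A minimal PCA sketch with scikit-learn and matplotlib, assuming the same DataFrame and feature list as above:

```python
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Drop incomplete rows so features and labels stay aligned.
sub = df[features + ["cond_persisted"]].dropna()

# Standardize first: PCA is sensitive to feature scale.
X = StandardScaler().fit_transform(sub[features])
pca = PCA(n_components=2)
coords = pca.fit_transform(X)

plt.scatter(coords[:, 0], coords[:, 1], c=sub["cond_persisted"],
            cmap="coolwarm", s=12, alpha=0.6)
plt.xlabel("PC1")
plt.ylabel("PC2")
plt.title(f"Explained variance: {pca.explained_variance_ratio_.sum():.0%}")
plt.show()
```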
6. Quantifying Separation
Beyond visuals, compute numeric separability:
- KS / Wasserstein distances for each feature.
- Feature correlation with outcome (`corr(feature, cond_persisted)`).
- Simple logistic regression or random forest to rank feature importances.
This isn’t about predictive modeling — it’s diagnostic. You’re identifying which features consistently differ across outcomes.
Example summary:
| feature | KS distance | corr | importance (logit β) | note |
|---|---|---|---|---|
| atr_z | 0.42 | -0.31 | -0.48 | volatility inversely related to success |
| htf_slope | 0.33 | +0.22 | +0.31 | alignment improves persistence |
| volume_z | 0.05 | +0.02 | +0.01 | negligible difference |
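A sketch that assembles such a summary, combining the KS statistics from step 3 with outcome correlations and standardized logistic-regression coefficients (scikit-learn):

```python
import pandas as pd
from scipy.stats import ks_2samp
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

sub = df[features + ["cond_persisted"]].dropna()
X = StandardScaler().fit_transform(sub[features])  # standardized betas are comparable
y = sub["cond_persisted"]

logit = LogisticRegression().fit(X, y)

summary = pd.DataFrame({
    "ks": [ks_2samp(success[f].dropna(), failure[f].dropna()).statistic
           for f in features],
    "corr": [sub[f].corr(y) for f in features],
    "logit_beta": logit.coef_[0],
}, index=features)

print(summary.sort_values("ks", ascending=False))
```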
7. From Diagnosis to Refinement
Backward analysis is most valuable when its findings feed forward into new hypotheses. Once you identify consistent differentiators:
- Promote them to candidate filters in your forward tests (example: `ATR_z < 1.5`, `HTF_slope > 0`).
- Recalculate the persistence rate under these refined subsets.
- Repeat until cohort differences shrink, a sign the system is context-stable.
This is the virtuous loop in action: diagnose → refine → re-test → re-diagnose.
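A compact sketch of one loop iteration; the filter thresholds carry over from the earlier example and remain assumptions to be re-tested:

```python
# Baseline persistence vs. persistence under the candidate filter.
baseline = df["cond_persisted"].mean()
refined = df.query("atr_z < 1.5 and htf_slope > 0")["cond_persisted"].mean()

print(f"baseline persistence: {baseline:.1%}")
print(f"refined persistence:  {refined:.1%}")
# If cohort differences shrink under the refined subset, the rule is
# approaching context-stability; otherwise, return to diagnosis.
```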
8. Implementation Notes
Recommended column and naming conventions for backward analysis:
| Column | Meaning |
|---|---|
| event_id | Unique event identifier |
| ts_start | Timestamp of the event trigger |
| cond_persisted | 1 if the condition held, else 0 |
| feature_* | Any contextual variable at event time |
| diff__feature | Optional computed diff between cohorts |
| ks__feature | Kolmogorov–Smirnov distance statistic |
Having a consistent schema allows all your studies to reuse the same cohort comparison pipeline.
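A small guard function can enforce that schema at the top of the pipeline; `check_event_schema` is a hypothetical helper, and the `feature_` prefix follows the convention in the table above:

```python
REQUIRED = {"event_id", "ts_start", "cond_persisted"}

def check_event_schema(df):
    """Validate an event table against the shared schema before any
    cohort comparison runs (hypothetical helper)."""
    missing = REQUIRED - set(df.columns)
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    if not any(col.startswith("feature_") for col in df.columns):
        raise ValueError("no feature_* columns found")
    return df
```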
Key Takeaway
Backward (diagnostic) analysis doesn't ask how to act; it asks why something worked. By holding outcomes fixed and examining contextual differences, you uncover the environmental texture of your strategy: the regimes, volatility states, and alignment factors that quietly determine success or failure.
This deeper understanding is what fuels the next forward iteration — and ultimately transforms reactive testing into deliberate, context-aware design.