Methodology
How we move from raw news signals to causal inference about geopolitical risk and financial markets.
Data Pipeline
CausalAlpha operates a continuous pipeline that transforms unstructured news into quantified risk signals:
- Ingest — Continuously scrape messages from real-time news sources at regular intervals.
- Classify — Each message is scored against a multi-category keyword taxonomy spanning 5 standalone risk domains: Conflict, Political, Energy, Financial, and Trade.
- Normalize — Raw keyword counts are converted to share of daily messages, then smoothed with a 7-day rolling average. This controls for variation in message volume across days.
- Integrate: Daily risk indicators are merged with market data (Brent crude, Gold, VIX). The Brent and Gold price series are first-differenced to ensure stationarity; VIX is not differenced.
- Analyze — The full dataset is analyzed for causal structure using the PC Algorithm and Structural VAR.
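The Classify, Normalize, and Integrate steps can be sketched in a few lines. This is a minimal illustration, not the production pipeline: the keyword lists in `TAXONOMY` are hypothetical placeholders (the real taxonomy is not listed here), a message is assumed to count toward a category if it contains at least one of that category's keywords, and the rolling mean is assumed to be trailing.

```python
import re
from collections import Counter

# Hypothetical, heavily abbreviated taxonomy (assumption: the real lists differ).
TAXONOMY = {
    "Conflict": {"airstrike", "troops", "ceasefire"},
    "Political": {"election", "coup", "parliament"},
    "Energy": {"pipeline", "opec", "refinery"},
    "Financial": {"default", "inflation", "bailout"},
    "Trade": {"tariff", "sanctions", "embargo"},
}

def classify(message):
    """Return the set of risk categories with at least one keyword in the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return {cat for cat, keywords in TAXONOMY.items() if words & keywords}

def daily_shares(messages_by_day):
    """Map each day to {category: share of that day's messages matching it}."""
    shares = {}
    for day, messages in messages_by_day.items():
        hits = Counter(cat for m in messages for cat in classify(m))
        n = len(messages)
        shares[day] = {cat: (hits[cat] / n if n else 0.0) for cat in TAXONOMY}
    return shares

def rolling_mean(values, window=7):
    """Trailing rolling mean; early entries average over the days available so far."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def first_difference(prices):
    """Day-over-day change in a price series (one element shorter than the input)."""
    return [b - a for a, b in zip(prices, prices[1:])]

# Toy usage with made-up messages and prices.
messages_by_day = {
    "2024-01-01": ["New tariff announced on steel", "Election results disputed", "Weather update"],
    "2024-01-02": ["Troops move toward border", "Tariff retaliation expected"],
}
shares = daily_shares(messages_by_day)
trade_series = [shares[day]["Trade"] for day in sorted(shares)]
print(rolling_mean(trade_series))
print(first_difference([80.0, 81.5, 79.0]))
```

Dividing by the day's message count is what makes a quiet news day comparable to a busy one; the rolling mean then damps single-day spikes.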
Risk Indicators and Market Data
Five standalone risk categories, tracked as normalized 7-day rolling shares, shown alongside Brent crude oil and the VIX volatility index.
Top: 5 risk categories (7-day rolling share of daily messages) | Middle: Brent Crude USD/barrel | Bottom: VIX
Causal Structure Discovery (PC Algorithm)
Rather than assuming relationships between variables, we discover them using the PC Algorithm — a constraint-based causal inference method that tests conditional independencies to identify directed edges.
Applied to 12 months of daily data (Fisher Z test, alpha = 0.10) with 8 variables (5 normalized risk indicators, VIX, and first-differenced Brent Oil and Gold), the algorithm recovers a partially directed graph: edges whose direction the data can identify are oriented, and the remaining edges are left undirected.
PC Algorithm (Fisher Z, alpha=0.10) | 7-day rolling share | Price series first-differenced
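The building block of the PC Algorithm is the conditional independence test. The sketch below shows one common form of the Fisher Z test, based on partial correlation; it is an illustration under the assumption of roughly Gaussian data, not the implementation used for the graph above. The function name `fisher_z_test` and the toy chain X → Z → Y are invented for the example, and the skeleton search and edge-orientation rules of the full algorithm are not shown.

```python
import math
import numpy as np

def fisher_z_test(data, i, j, cond=(), alpha=0.10):
    """Test whether columns i and j are independent given the columns in cond.

    Returns (p_value, independent), where independent is True when the
    null hypothesis of zero partial correlation is not rejected at alpha.
    """
    idx = [i, j, *cond]
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.inv(corr)  # precision matrix of the selected columns
    # Partial correlation of i and j given cond, read off the precision matrix.
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(min(r, 0.999999), -0.999999)  # keep the log below finite
    n = data.shape[0]
    z = 0.5 * math.log((1 + r) / (1 - r))  # Fisher Z transform
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    p_value = math.erfc(stat / math.sqrt(2))  # two-sided normal p-value
    return p_value, p_value > alpha

# Toy chain X -> Z -> Y: X and Y are correlated, but only through Z.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
z = x + rng.normal(size=n)
y = z + rng.normal(size=n)
data = np.column_stack([x, y, z])

print(fisher_z_test(data, 0, 1))             # X vs Y, no conditioning
print(fisher_z_test(data, 0, 1, cond=(2,)))  # X vs Y given Z
```

In the PC Algorithm, an edge between two variables is removed as soon as some conditioning set makes them test as independent. A looser alpha such as 0.10 rejects independence more readily, so it keeps more edges than the conventional 0.05.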
Key Findings
- Political instability drives conflict coverage — directed edge from Political to Conflict, suggesting political crises precede conflict reporting.
- Financial stress drives conflict coverage — directed edge from Financial to Conflict, indicating economic turmoil amplifies conflict narratives.
- Trade tensions cascade to financial stress and volatility — Trade has directed edges to both Financial and VIX.
- Multiple pathways to VIX: Political, Financial, and Trade all have directed edges to VIX, consistent with geopolitical risk being a driver of market volatility.
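- Energy links are unoriented: Energy has undirected edges with Financial and Political. An undirected edge means the two variables are dependent but the data cannot determine the direction; mutual feedback is one possible explanation, not an established result.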
- Gold and Brent Oil are largely disconnected from the risk indicators: their first-differenced price changes receive few direct causal links from the risk categories, consistent with commodity-specific drivers.
Market Responses to Geopolitical Shocks (SVAR)
The Structural VAR model quantifies how geopolitical shocks propagate to markets over a 20-day horizon. The focused view below shows only the responses that matter most: how VIX, Brent Oil, and Gold respond to shocks in each of the 5 risk categories.
Market responses to geopolitical risk shocks — SVAR (Cholesky, 95% MC bands, 1000 replications)
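To make the Cholesky identification concrete, the sketch below fits a VAR by least squares and computes orthogonalized impulse responses. It is a simplified illustration: the lag order of 1 is an assumption (the lag length is not stated here), the Monte Carlo confidence bands are omitted, and `fit_var1` and `cholesky_irf` are hypothetical helper names. The column order of the input array plays the role of the Cholesky ordering.

```python
import numpy as np

def fit_var1(y):
    """Least-squares fit of y_t = c + A @ y_{t-1} + u_t.

    y has shape (T, k). Returns (A, sigma), where sigma is the
    residual covariance matrix.
    """
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    Y = y[1:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)  # shape (k + 1, k)
    resid = Y - X @ B
    sigma = resid.T @ resid / (len(Y) - X.shape[1])
    return B[1:].T, sigma

def cholesky_irf(A, sigma, horizon=20):
    """Orthogonalized impulse responses for a VAR(1).

    irf[h, i, j] is the response of variable i, h steps after a
    one-standard-deviation structural shock to variable j.
    """
    P = np.linalg.cholesky(sigma)  # lower triangular: earlier variables move first
    k = A.shape[0]
    irf = np.empty((horizon + 1, k, k))
    power = np.eye(k)  # A**h, starting at h = 0
    for h in range(horizon + 1):
        irf[h] = power @ P
        power = A @ power
    return irf

# Toy usage on simulated two-variable data (coefficients are made up).
rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.0], [0.2, 0.3]])
y = np.zeros((1000, 2))
for t in range(1, len(y)):
    y[t] = A_true @ y[t - 1] + rng.normal(size=2)

A_hat, sigma_hat = fit_var1(y)
irf = cholesky_irf(A_hat, sigma_hat, horizon=20)
print(irf[0])  # impact responses: zero above the diagonal by construction
```

Because the impact matrix is lower triangular, a variable placed earlier in the ordering cannot respond on impact to shocks in variables placed later. That restriction comes from the chosen ordering rather than from the data.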
Notable Patterns
- Trade shocks increase VIX — a trade tension shock causes a sustained increase in market volatility, peaking around day 10.
- Political shocks have wide uncertainty — the confidence bands for political shocks are among the widest, reflecting unpredictable market responses to political instability.
- Gold responds negatively to conflict and political shocks — counterintuitively, the first-differenced Gold price declines after geopolitical shocks, possibly reflecting risk-off selling across all assets.
- Financial shocks drive Brent Oil — financial stress creates a positive initial response in oil prices, potentially through supply-concern channels.
Full IRF grid — all 8 variables × 8 shocks
Forecast Error Variance Decomposition
FEVD answers the question: what fraction of each variable's movements can be attributed to shocks from other variables? This is the core of risk attribution.
FEVD — Ordering: VIX → Conflict → Political → Energy → Financial → Trade → Brent → Gold
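The decomposition follows directly from the orthogonalized impulse responses: at each horizon, a shock's contribution is its cumulative squared response, divided by the total across all shocks. The sketch below illustrates that arithmetic on a made-up two-variable system. The function name `fevd` and the coefficient values are assumptions for the example, not estimates from the data.

```python
import numpy as np

def fevd(irf):
    """Forecast error variance decomposition from orthogonalized IRFs.

    irf[h, i, j] is the response of variable i at step h to shock j.
    Returns shares[h, i, j]: the fraction of variable i's (h + 1)-step
    forecast error variance attributable to shock j.
    """
    cumulative = np.cumsum(irf ** 2, axis=0)
    total = cumulative.sum(axis=2, keepdims=True)
    return cumulative / total

# Toy VAR(1) with made-up coefficients; variable 0 is ordered first.
A = np.array([[0.5, 0.0], [0.2, 0.3]])
P = np.linalg.cholesky(np.array([[1.0, 0.5], [0.5, 2.0]]))
irf = np.stack([np.linalg.matrix_power(A, h) @ P for h in range(21)])

shares = fevd(irf)
print(shares[0])   # one step ahead
print(shares[20])  # 21 steps ahead
```

Each row of `shares[h]` sums to 1. With a Cholesky ordering, the first variable's one-step-ahead variance is attributed entirely to its own shock, so short-horizon shares for early-ordered variables partly reflect the ordering.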
Risk Attribution
- Conflict: receives substantial variance from Political and Financial shocks at longer horizons, consistent with the directed edges into Conflict found by the PC Algorithm.
- Trade — one of the most externally driven variables, with significant contributions from Financial and Conflict shocks over time.
- Brent Oil — increasingly influenced by Trade and Financial shocks at the 15-20 day horizon, suggesting geopolitical risk does filter through to energy prices with a delay.
- VIX — while mostly self-explained at short horizons, geopolitical shocks (especially Trade) explain a growing fraction at longer horizons.
- Gold: predominantly explained by its own shocks, with modest contributions from other variables.
Limitations and Disclaimers
- Text mining is probabilistic — keyword matching captures signal but also noise. False positives and negatives exist.
- Causal inference assumes stationarity — relationships discovered over the last 12 months may not hold during unprecedented events.
- Cholesky ordering matters — the SVAR results depend on the assumed causal ordering of variables. Different orderings may yield different impulse responses.
- First-differencing removes level information — price changes (not levels) are used for stationarity, which may miss long-run relationships.
- This is not investment advice. CausalAlpha is a research tool. Do not make financial decisions based solely on this data.