Methodology

How we move from raw news signals to causal inference about geopolitical risk and financial markets.

Data Pipeline

CausalAlpha operates a continuous pipeline that transforms unstructured news into quantified risk signals:

  1. Ingest: Scrape messages from real-time news sources at regular intervals.
  2. Classify: Each message is scored against a multi-category keyword taxonomy spanning 5 standalone risk domains: Conflict, Political, Energy, Financial, and Trade.
  3. Normalize: Raw keyword counts are converted to each category's share of daily messages, then smoothed with a 7-day rolling average. This controls for variation in message volume across days.
  4. Integrate: Daily risk indicators are merged with market data (Brent crude, Gold, VIX). The Brent and Gold price series are first-differenced to address non-stationarity.
  5. Analyze: The full dataset is analyzed for causal structure using the PC Algorithm and a Structural VAR.
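
The Classify, Normalize, and Integrate steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the production pipeline: the two-category `TAXONOMY`, the function names, and the choice to count a message once per category it matches are all assumptions made for the example, and early days use a shorter window until 7 observations are available.

```python
import re
from collections import Counter

# Hypothetical mini-taxonomy for illustration; the real keyword lists are not shown here.
TAXONOMY = {
    "Conflict": {"missile", "airstrike"},
    "Trade": {"tariff", "sanctions"},
}

def classify(message):
    """Return the categories whose keywords appear in the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return {cat for cat, keywords in TAXONOMY.items() if words & keywords}

def daily_shares(messages_by_day):
    """For each day, the share of that day's messages matching each category."""
    shares = []
    for messages in messages_by_day:
        hits = Counter(cat for m in messages for cat in classify(m))
        total = len(messages)
        shares.append({cat: (hits[cat] / total if total else 0.0) for cat in TAXONOMY})
    return shares

def rolling_mean(values, window=7):
    """Trailing mean over the most recent `window` values (shorter at the start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def first_difference(prices):
    """Day-over-day price changes; the result is one element shorter than the input."""
    return [b - a for a, b in zip(prices, prices[1:])]

# Toy input: two days of messages and three days of made-up prices.
days = [
    ["Missile strike reported", "Markets calm"],
    ["New tariff announced", "Tariff talks stall", "Weather update", "Sports recap"],
]
conflict_share = [d["Conflict"] for d in daily_shares(days)]
smoothed_conflict = rolling_mean(conflict_share)
brent_changes = first_difference([80.0, 82.5, 81.0])
```

Dividing by the number of messages per day is what makes a quiet news day comparable to a busy one; the rolling mean then damps day-to-day noise at the cost of some lag.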

Risk Indicators and Market Data

Five standalone risk categories are tracked as normalized 7-day rolling shares and plotted alongside Brent crude oil and the VIX volatility index.

Three-panel chart: normalized geopolitical risk indicators (Conflict, Political, Energy, Financial, Trade), Brent crude oil price, and VIX over 12 months

Top: 5 risk categories (7-day rolling share of daily messages) | Middle: Brent Crude USD/barrel | Bottom: VIX

Causal Structure Discovery (PC Algorithm)

Rather than assuming relationships between variables, we discover them using the PC Algorithm, a constraint-based causal discovery method that uses conditional independence tests to remove edges between variables and then orients the remaining edges where the data allow.

Applied to 12 months of daily data (Fisher Z test, alpha = 0.10) with 8 variables (5 normalized risk indicators, VIX, and first-differenced Brent Oil and Gold), the algorithm recovers a partially oriented Directed Acyclic Graph (DAG), in which some edges are left undirected:

Causal DAG with bipartite layout: geopolitical risk indicators (Conflict, Political, Energy, Financial, Trade) on the left, market variables (VIX, Brent Oil, Gold) on the right, with directed and undirected edges

PC Algorithm (Fisher Z, alpha=0.10) | 7-day rolling share | Price series first-differenced
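
To make the conditional independence test concrete, here is a minimal sketch of a Fisher Z test in standard-library Python. It handles only a single conditioning variable and does not implement the PC Algorithm's search over conditioning sets or its edge-orientation rules. The function names and the toy data are assumptions for illustration: Z drives both X and Y, so X and Y are correlated on their own but independent once Z is conditioned on.

```python
import math
from statistics import NormalDist

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of a single z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: correlation is zero, given k conditioning variables."""
    stat = math.atanh(r) * math.sqrt(n - k - 3)
    return 2 * (1 - NormalDist().cdf(abs(stat)))

# Deterministic toy data: z is a common driver, e1 and e2 are mutually orthogonal noise.
z = [1, 1, 1, 1, -1, -1, -1, -1] * 50
e1 = [1, 1, -1, -1, 1, 1, -1, -1] * 50
e2 = [1, -1, 1, -1, 1, -1, 1, -1] * 50
x = [a + b for a, b in zip(z, e1)]
y = [a + b for a, b in zip(z, e2)]
n = len(z)

ALPHA = 0.10  # significance level quoted above
p_marginal = fisher_z_pvalue(pearson(x, y), n, k=0)
p_given_z = fisher_z_pvalue(partial_corr(x, y, z), n, k=1)
keep_edge_unconditionally = p_marginal < ALPHA   # dependence detected
keep_edge_given_z = p_given_z < ALPHA            # dependence vanishes given z
```

In this toy case the X to Y edge survives the unconditional test but would be removed once Z is in the conditioning set, which is the kind of pruning a constraint-based method relies on.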

Key Findings

  • Political instability drives conflict coverage: a directed edge runs from Political to Conflict, suggesting political crises precede conflict reporting.
  • Financial stress drives conflict coverage: a directed edge runs from Financial to Conflict, indicating economic turmoil amplifies conflict narratives.
  • Trade tensions cascade to financial stress and volatility: Trade has directed edges to both Financial and VIX.
  • Multiple pathways to VIX: Political, Financial, and Trade all have directed edges to VIX, consistent with geopolitical risk acting as a driver of market volatility.
  • Energy links are unoriented: Energy has undirected edges with Financial and Political, meaning the tests could not determine a direction. This may reflect mutual feedback, but it is not direct evidence of it.
  • Gold and Brent Oil are largely disconnected from the risk indicators: their first-differenced prices show few direct causal links from risk indicators, consistent with commodity-specific drivers.

Market Responses to Geopolitical Shocks (SVAR)

The Structural VAR model quantifies how geopolitical shocks propagate to markets over a 20-day horizon. The focused view below shows only the responses that matter most: how VIX, Brent Oil, and Gold respond to shocks in each of the 5 risk categories.

Focused impulse response functions: market responses (VIX, Brent Oil, Gold) to geopolitical risk shocks (Conflict, Political, Energy, Financial, Trade) over 20 days with 95% Monte Carlo confidence bands

Market responses to geopolitical risk shocks — SVAR (Cholesky, 95% MC bands, 1000 replications)
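
The mechanics of a Cholesky-identified impulse response can be illustrated with a two-variable VAR(1) in standard-library Python. This is a sketch under stated assumptions: the coefficient matrix `A` and residual covariance `SIGMA` are made-up numbers, not estimates from the dataset, the helper names are hypothetical, and the Monte Carlo confidence bands are not reproduced.

```python
import math

def cholesky_2x2(s):
    """Lower-triangular P with P * P^T = s, for a 2x2 positive-definite matrix."""
    p11 = math.sqrt(s[0][0])
    p21 = s[1][0] / p11
    p22 = math.sqrt(s[1][1] - p21 ** 2)
    return [[p11, 0.0], [p21, p22]]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def orthogonal_irf(A, sigma, horizon):
    """Responses of y_t = A y_{t-1} + u_t to one-standard-deviation orthogonal shocks.

    Entry [h][i][j] is the response of variable i, h steps after a shock to variable j.
    """
    P = cholesky_2x2(sigma)
    power = [[1.0, 0.0], [0.0, 1.0]]  # A^0
    responses = []
    for _ in range(horizon + 1):
        responses.append(matmul(power, P))
        power = matmul(A, power)
    return responses

# Made-up parameters; variable 0 is ordered first, variable 1 second.
A = [[0.5, 0.0], [0.2, 0.5]]
SIGMA = [[1.0, 0.5], [0.5, 1.25]]
irf = orthogonal_irf(A, SIGMA, horizon=20)

impact_of_shock1_on_var0 = irf[0][0][1]  # zero by construction of the ordering
response_of_var1_day1 = irf[1][1][0]     # variable 1, one day after a shock to variable 0
```

The zero in the upper-right corner of the impact matrix is the identifying restriction: the first-ordered variable cannot react on the same day to a shock in a later-ordered variable. Reordering the variables moves that zero and can change the responses.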

Notable Patterns

  • Trade shocks increase VIX: a trade tension shock is followed by a sustained increase in market volatility, peaking around day 10.
  • Political shocks have wide uncertainty: the confidence bands for political shocks are among the widest, reflecting unpredictable market responses to political instability.
  • Gold responds negatively to conflict and political shocks: counterintuitively, Gold price changes turn negative after these shocks, possibly reflecting risk-off selling across all assets.
  • Financial shocks move Brent Oil: financial stress produces a positive initial response in oil price changes, potentially through supply-concern channels.

Full 8x8 impulse response function grid showing all variable responses to all shocks

Full IRF grid — all 8 variables × 8 shocks

Forecast Error Variance Decomposition

FEVD answers the question: what fraction of each variable's forecast error variance can be attributed to shocks in each variable, including its own? This is the core of risk attribution.

Forecast Error Variance Decomposition showing what fraction of each variable's variance is explained by shocks from the 8 variables over a 20-day horizon

FEVD — Ordering: VIX → Conflict → Political → Energy → Financial → Trade → Brent → Gold
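
As a rough illustration of how such a decomposition is computed, the sketch below derives variance shares from a list of orthogonalized impulse response matrices in standard-library Python. The `toy_irfs` values are made-up numbers for a two-variable system, not results from the dataset, and the function name is hypothetical.

```python
def fevd(irfs, horizon):
    """Forecast error variance shares at the given horizon.

    irfs[h][i][j] is the orthogonalized response of variable i, h steps after a
    shock to variable j. The returned shares[i][j] is the fraction of variable
    i's horizon-step forecast error variance attributable to shock j.
    """
    n = len(irfs[0])
    contrib = [[sum(irfs[h][i][j] ** 2 for h in range(horizon))
                for j in range(n)] for i in range(n)]
    return [[c / sum(row) for c in row] for row in contrib]

# Made-up orthogonalized responses for steps 0 and 1 of a two-variable system.
toy_irfs = [
    [[1.0, 0.0], [0.5, 1.0]],
    [[0.5, 0.0], [0.45, 0.5]],
]

shares_1_step = fevd(toy_irfs, horizon=1)
shares_2_step = fevd(toy_irfs, horizon=2)
```

Each row of the result sums to one. Because the inputs are Cholesky-orthogonalized responses, the shares inherit the same dependence on variable ordering as the impulse responses.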

Risk Attribution

  • Conflict: receives substantial variance from Political and Financial shocks at longer horizons, consistent with the DAG's directed edges.
  • Trade: one of the most externally driven variables, with significant contributions from Financial and Conflict shocks over time.
  • Brent Oil: increasingly influenced by Trade and Financial shocks at the 15-20 day horizon, suggesting geopolitical risk filters through to energy prices with a delay.
  • VIX: mostly self-explained at short horizons, but geopolitical shocks (especially Trade) explain a growing fraction at longer horizons.
  • Gold: predominantly explained by its own shocks, with modest contributions from other variables.

Limitations and Disclaimers

  • Text mining is probabilistic — keyword matching captures signal but also noise. False positives and negatives exist.
  • Causal inference assumes stationarity — relationships discovered over the last 12 months may not hold during unprecedented events.
  • Cholesky ordering matters — the SVAR results depend on the assumed causal ordering of variables. Different orderings may yield different impulse responses.
  • First-differencing removes level information — price changes (not levels) are used for stationarity, which may miss long-run relationships.
  • This is not investment advice. CausalAlpha is a research tool. Do not make financial decisions based solely on this data.
