Testing SqueezeMetrics GEX and DIX Indices

I previously looked at @SqueezeMetrics’ GEX and DIX indices to see if they had any causal impact on future returns in the S&P 500 index, but my analysis had a fatal data error in it. In this post I revisit that analysis without the error and with more detail!

It’s been nearly a month, I think it’s time to finish my unfinished business.

(Note: I accidentally created this as a thread rather than a post… though there don’t seem to be serious consequences.)

The Data

GEX Index

According to its author, the GEX index is a measure of “gamma exposure”. Gamma is one of the many parameters used in option pricing, which we usually call the “greeks” even though not all of them are actually Greek letters.

To understand gamma it is useful to first understand delta. The delta represents the rate of change of the value of an option with respect to a change in the stock[1] it is an option on. It’s a sort of “velocity of value”. Delta answers the question: for every dollar the stock moves, how many dollars does the option move?

If delta is the velocity of the value, gamma is the acceleration. It tells us how much the delta of an option would move for a $1 move in the underlying. This is a second-order derivative, and that’s part of what makes options exciting (and dangerous)!

For the sake of interpreting GEX, you can think of gamma as the sensitivity of an option to changes in the underlying it represents.
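To make the velocity and acceleration analogy concrete, here is a minimal sketch that bumps the stock price and measures how the option value responds. It assumes a Black-Scholes European call, which is my choice for illustration and not something the GEX methodology prescribes. The function names and the example inputs ($100 stock, 20% volatility, 30 days to expiry) are hypothetical.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price(spot, strike, vol, years, rate=0.0):
    """Black-Scholes price of a European call (an assumed pricing model)."""
    sd = vol * math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / sd
    d2 = d1 - sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


def delta(spot, strike, vol, years, bump=0.01):
    """First derivative of price with respect to spot (central difference)."""
    up = call_price(spot + bump, strike, vol, years)
    down = call_price(spot - bump, strike, vol, years)
    return (up - down) / (2.0 * bump)


def gamma(spot, strike, vol, years, bump=0.01):
    """Second derivative of price with respect to spot (central difference)."""
    up = call_price(spot + bump, strike, vol, years)
    mid = call_price(spot, strike, vol, years)
    down = call_price(spot - bump, strike, vol, years)
    return (up - 2.0 * mid + down) / bump ** 2


# Hypothetical at-the-money call: $100 stock, 20% volatility, 30 days to expiry.
spot, strike, vol, years = 100.0, 100.0, 0.20, 30 / 365
d_now = delta(spot, strike, vol, years)
g_now = gamma(spot, strike, vol, years)
d_after = delta(spot + 1.0, strike, vol, years)

print(f"delta at $100: {d_now:.3f}")
print(f"gamma at $100: {g_now:.3f}")
print(f"delta at $101: {d_after:.3f} (change {d_after - d_now:+.3f})")
```

The change in delta after a $1 move should land close to the gamma figure, which is the "acceleration" reading described above.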

The GEX index is supposed to represent an imbalance between market maker call and put option exposure, though I don’t honestly follow that explanation. A high number is supposed to act as a brake on market prices, and a low number should be an accelerator.

DIX Index

The DIX index is a measure of dark pool short volume. Dark pools are exchanges that retail investors do not have access to and are by their nature more opaque. There is no order book that you can see, so a large investor can execute large trades without getting screwed by traders and high frequency algos watching the book.

The author of the index describes it as a measure of market sentiment from the point of view of dark liquidity.

Discussion

I had a bit of back and forth with @SqueezeMetrics, and I’d like to incorporate their feedback here.

Takeaways:

  1. I needed to fix my data pipeline, which I have done

  2. They recommend looking at the DIX on a one-month timeframe

  3. They recommend normalizing the DIX value using realized volatility, 1-month ATM IV, or VIX

  4. Also, normalize the SPX returns by what the VIX “expected”

I’m happy to do the normalization suggested in #3 since it’s recommended by the author of the index, though otherwise I would not have. If I want to ask if this index value is causally related to SPX returns, I wouldn’t want to go dividing it by things first.

I will not be doing #4. The VIX is not a forecast of forward volatility, and we cannot trade that ratio. If we can’t trade the ratio, then it’s a pointless hypothesis in my opinion. If the DIX only helps us trade an untradable ratio, then it’s not something I am going to keep on my radar. Maybe it predicts the ratio well. Meh?

When it comes time to implement a trading strategy with a signal that has some demonstrated value, then and only then I might get weird with it, operationally.

Causality

I’ll be testing these causal statements:

  1. A high GEX/RVOL value today is bearish for forward 1-month returns in the S&P 500 index

  2. A low GEX/RVOL value today is bullish for forward 1-month returns in the S&P 500 index

  3. A high DIX/RVOL value today is bullish for forward 1-month returns in the S&P 500 index

  4. A high GEX value today is bearish for forward 1-month returns in the S&P 500 index

  5. A low GEX value today is bullish for forward 1-month returns in the S&P 500 index

  6. A high DIX value today is bullish for forward 1-month returns in the S&P 500 index

#3 and #6 are a bit strange considering the DIX is supposed to measure “short volume”, but this is explained in their paper[2].

High and low are determined based on the mean absolute difference of the series. High is anything above the mean value of the series plus the mean absolute difference. Simple.
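A rough sketch of that rule follows. Two things here are my assumptions: I read “mean absolute difference” as the mean absolute deviation around the series mean, and I define “low” symmetrically as anything below the mean minus that deviation. The function name and the sample values are hypothetical.

```python
from statistics import fmean


def label_regimes(series):
    """Label each value 'high', 'low', or 'neutral' relative to mean +/- dispersion."""
    mean = fmean(series)
    # Assumption: dispersion is the mean absolute deviation around the mean.
    dispersion = fmean([abs(x - mean) for x in series])
    high_cut = mean + dispersion
    low_cut = mean - dispersion  # Assumption: "low" mirrors the "high" rule.
    labels = []
    for x in series:
        if x > high_cut:
            labels.append("high")
        elif x < low_cut:
            labels.append("low")
        else:
            labels.append("neutral")
    return labels, high_cut, low_cut


# Hypothetical index values, not real GEX or DIX data.
values = [1.0, 2.0, 3.0, 4.0, 10.0]
labels, high_cut, low_cut = label_regimes(values)
print(labels, high_cut, low_cut)
```

If the intended measure is the pairwise mean absolute difference instead, only the `dispersion` line needs to change.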

RVOL here will be the realized volatility in the S&P 500 index for the prior 30 trading days for the simple reason that my VIX data is old and my S&P 500 data is not.
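Here is one way the RVOL normalization and the forward 1-month return could be computed, as a sketch under stated assumptions: realized volatility is the annualized standard deviation of daily log returns over the last 30 trading days (including today’s return), a year has 252 trading days, and “1 month forward” means 21 trading days. None of those conventions are spelled out above, and the function names are hypothetical.

```python
import math
from statistics import stdev


def realized_vol(closes, window=30, periods_per_year=252):
    """Annualized stdev of the last `window` daily log returns; None until enough history."""
    log_rets = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    out = [None] * len(closes)
    for i in range(window, len(closes)):
        # log_rets[i - window:i] are the returns realized on days i-window+1 .. i.
        out[i] = stdev(log_rets[i - window:i]) * math.sqrt(periods_per_year)
    return out


def normalize_by_rvol(index_values, rvol):
    """Divide the index by realized vol; None where vol is missing or zero."""
    return [x / v if v else None for x, v in zip(index_values, rvol)]


def forward_return(closes, horizon=21):
    """Simple return from today's close to the close `horizon` trading days ahead."""
    n = len(closes)
    return [closes[i + horizon] / closes[i] - 1.0 if i + horizon < n else None
            for i in range(n)]


# Synthetic prices and index values for illustration only.
closes = [100.0 + math.sin(i) for i in range(60)]
index_values = [1.0 + 0.1 * i for i in range(60)]

rvol = realized_vol(closes)
normalized = normalize_by_rvol(index_values, rvol)
fwd = forward_return(closes)
print(rvol[30], normalized[30], fwd[0])
```

Both the volatility window and the forward horizon leave gaps at the ends of the series, so only the dates where both values exist can be tested.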

Results

That’s a lot to test, so I’m going to be pretty brief with my results. My Bayesian estimation difference-of-means chart[3] is my “big guns”, so that’s what I’ll stick to.
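For readers who want to try something similar, here is a stand-in sketch. It is not the model behind my charts: it uses a Bayesian bootstrap, which reweights the observed returns with random Dirichlet weights to get posterior draws of each group’s mean, then reports a credible interval for the difference. The function names, the 95% interval, and the simulated returns are all hypothetical.

```python
import random
from statistics import fmean


def bayes_bootstrap_mean(values, rng):
    """One posterior draw of the mean using Dirichlet(1, ..., 1) weights."""
    weights = [rng.gammavariate(1.0, 1.0) for _ in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


def diff_of_means_draws(signal_returns, other_returns, draws=2000, seed=0):
    """Sorted posterior draws of mean(signal) - mean(other)."""
    rng = random.Random(seed)
    return sorted(
        bayes_bootstrap_mean(signal_returns, rng) - bayes_bootstrap_mean(other_returns, rng)
        for _ in range(draws)
    )


def credible_interval(sorted_draws, mass=0.95):
    """Equal-tailed interval containing roughly `mass` of the draws."""
    n = len(sorted_draws)
    lo_idx = int(n * (1.0 - mass) / 2.0)
    hi_idx = int(n * (1.0 + mass) / 2.0) - 1
    return sorted_draws[lo_idx], sorted_draws[hi_idx]


# Simulated forward returns, NOT real SPX data: "signal" days vs. all other days.
data_rng = random.Random(42)
signal_returns = [data_rng.gauss(0.02, 0.04) for _ in range(100)]
other_returns = [data_rng.gauss(0.00, 0.04) for _ in range(400)]

posterior = diff_of_means_draws(signal_returns, other_returns)
lo, hi = credible_interval(posterior)
print(f"difference of means: {lo:+.3%} to {hi:+.3%}, center {fmean(posterior):+.3%}")
```

An interval that straddles zero, as several of the results below do, means the data leave real doubt about the sign of the effect even when the center is clearly positive or negative.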

✅ Statement 1 - High GEX/RVOL ⇒ Bearish

Verdict: True. This does skew negative. A high GEX/RVOL is more bearish on average than a non-high GEX/RVOL. Average difference is probably between -2.5% and +0.6%, midpoint is -1%.

✅ Statement 2 - Low GEX/RVOL ⇒ Bullish

Verdict: True. This skews positive. Average difference is probably between -0.5% and +5%, midpoint is +2.1%. This is overall a stronger signal than Statement 1.

❌ Statement 3 - High DIX/RVOL ⇒ Bullish

Verdict: Meh. You could argue that it’s true from the above, but it looks like the real effect is probably zero. More data would narrow the credible interval enough to be sure, but if it were a stronger effect it would be clear here.

❌ Statement 4 - High GEX ⇒ Bearish

Verdict: False. Maybe SqueezeMetrics knows what they’re talking about with their suggested normalization.

✅ Statement 5 - Low GEX ⇒ Bullish

Verdict: True. Interesting, a clear skew positive. Average difference ranges between -0.6% and +4.7%, midpoint +2.1%. Very similar to the normalized result. It’s curious we didn’t see this in Statement 4.

✅ Statement 6 - High DIX ⇒ Bullish

Verdict: True. Strong skew positive here. Average difference is between -0.1% and +3.9% between high DIX regimes and all other times.

Conclusions

My initial conclusions were wrong, as you’d expect given my data error. Overall it looks like SqueezeMetrics knows what they are talking about and their intuition about their indexes is supported by these findings.

One concern I have is with GEX/RVOL, here’s a chart to illustrate:

The orange/brown horizontal line is the mean + mean absolute difference of the series, which is how I’m determining “high”. There’s a lot of clustering above that line recently. My test dataset may not be well diversified across time as a result. Some kind of detrending could help, but the series is pretty flat until recently so it’s not clear to me how that should be done.
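A quick way to quantify that clustering is to count what share of each calendar year’s observations land in the “high” bucket. This is only a diagnostic sketch; the function name and the sample dates and flags are hypothetical.

```python
from collections import Counter
from datetime import date


def high_share_by_year(dates, is_high):
    """Fraction of observations flagged 'high' in each calendar year."""
    totals = Counter(d.year for d in dates)
    highs = Counter(d.year for d, flag in zip(dates, is_high) if flag)
    return {year: highs[year] / totals[year] for year in sorted(totals)}


# Hypothetical observations, not real GEX/RVOL data.
dates = [date(2019, 3, 1), date(2019, 9, 2), date(2020, 3, 2),
         date(2020, 9, 1), date(2021, 3, 1), date(2021, 9, 1)]
is_high = [False, False, False, True, True, True]

shares = high_share_by_year(dates, is_high)
print(shares)
```

If most of the “high” days come from one or two recent years, the high-versus-rest comparison is partly a comparison of those years against everything else.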

I’m still open to being wrong again, let me know if you think I am and I will celebrate the death of another weakness in my process!

[1] I write “stock” here to attempt to be more clear to a general audience. More generally, we say “underlying”, since it could be a bond, a futures contract, etc.

[2] There is also a scary chart crime on page 6: a regression through an extremely weakly correlated cloud. These kinds of regressions give an illusion of relationship where it is nowhere near as clear as the line would suggest.

[3] I always talk about differences in distributions but only compare the means, what gives? Well, this is for trading. If your expectation of profit isn’t different, it’s not that interesting. Besides, for fat-tailed distributions I’m not even sure the higher moments exist.