For an intro into variance...check out these posts:
Problem with variance is calculating it in the traditional sense. Its costly to compute across a time series. It can be quite a drag on your simulation engine's performance. The way to reduce the cost is to calculate the running variance. And that's when you get into quite a briar patch - loss of precision and overflow issues. See John D. Cook's post covering the variance briar patch:
And a few more posts by John covering different variance formulas and their outcomes:
- Comparing three methods of computing standard deviation
- Theoretical explanation for numerical results
- Reduced the precision loss issue as much as possible;
- Allowed an easy way to window the running variance;
- Allowed an easy way to memoize the call.
So, let's start with the formula for the Power Sum Average (\(PSA\)):
\( PSA = PSA_{yesterday} + ( ( (x_{today} * x_{today}) - x_{yesterday} ) ) / n) \)
Where:
- \(x\) = value in your time series
- \(n\) = number of values you've analyzed so far
Once you have the \(PSA\) and \(SMA\); you can tackle the Running Population Variance (\(Var\) ):
\(Population Var = (PSA_{today} * n - n * SMA_{today} * SMA_{today}) / n \)
Now, one problem with all these formulas - they don't cover how to window the running variance. Windowing the variance gives you the ability to view the 20 period running variance at bar 150. All the formulas I've mentioned above only give you the running cumulative variance. Deriving the running windowed variance is just a matter of using the same SMA I've posted about before and adjusting the Power Sum Average to the following:
\( PSA = PSA_{yesterday} + (((x_{today} * x_{today}) - (x_{yesterday} * x_{yesterday}) / n) \)
Where:
- \(x\) = value in your time series
- \(n\) = the period
\(Sample Var = (PSA_{today} * n - n * SMA_{today} * SMA_{today}) / (n - 1) \)
Okay, on to the code.
Code for the Power Sum Average:
def powersumavg(bar, series, period, pval=None): """ Returns the power sum average based on the blog post from Subliminal Messages. Use the power sum average to help derive the running variance. sources: http://subluminal.wordpress.com/2008/07/31/running-standard-deviations/ Keyword arguments: bar -- current index or location of the value in the series series -- list or tuple of data to average period -- number of values to include in average pval -- previous powersumavg (n - 1) of the series. """ if period < 1: raise ValueError("period must be 1 or greater") if bar < 0: bar = 0 if pval == None: if bar > 0: raise ValueError("pval of None invalid when bar > 0") pval = 0.0 newamt = float(series[bar]) if bar < period: result = pval + (newamt * newamt - pval) / (bar + 1.0) else: oldamt = float(series[bar - period]) result = pval + (((newamt * newamt) - (oldamt * oldamt)) / period) return resultCode for the Running Windowed Variance:
def running_var(bar, series, period, asma, apowsumavg): """ Returns the running variance based on a given time period. sources: http://subluminal.wordpress.com/2008/07/31/running-standard-deviations/ Keyword arguments: bar -- current index or location of the value in the series series -- list or tuple of data to average asma -- current average of the given period apowsumavg -- current powersumavg of the given period """ if period < 1: raise ValueError("period must be 1 or greater") if bar <= 0: return 0.0 if asma == None: raise ValueError("asma of None invalid when bar > 0") if apowsumavg == None: raise ValueError("powsumavg of None invalid when bar > 0") windowsize = bar + 1.0 if windowsize >= period: windowsize = period return (apowsumavg * windowsize - windowsize * asma * asma) / windowsizeExample call and results:
list_of_values = [3, 5, 8, 10, 4, 8, 12, 15, 11, 9] prev_powersumavg = None prev_sma = None prev_sma = None period = 3 for bar, price in enumerate(list_of_values): new_sma = running_sma(bar, list_of_values, period, prev_sma) new_powersumavg = powersumavg(bar, list_of_values, period, prev_powersumavg) new_var = running_var(bar, list_of_values, period, new_sma, new_powersumavg) msg = "SMA=%.4f, PSA=%.4f, Var=%.4f" % (new_sma, new_powersumavg, new_var) print "bar %i: %s" % (bar, msg) prev_sma = new_sma prev_powersumavg = new_powersumavg ---------------------------------------------------------- Results of call: bar 0: SMA=3.0000, PSA=9.0000, Var=0.0000 bar 1: SMA=4.0000, PSA=17.0000, Var=1.0000 bar 2: SMA=5.3333, PSA=32.6667, Var=4.2222 bar 3: SMA=7.6667, PSA=63.0000, Var=4.2222 bar 4: SMA=7.3333, PSA=60.0000, Var=6.2222 bar 5: SMA=7.3333, PSA=60.0000, Var=6.2222 bar 6: SMA=8.0000, PSA=74.6667, Var=10.6667 bar 7: SMA=11.6667, PSA=144.3333, Var=8.2222 bar 8: SMA=12.6667, PSA=163.3333, Var=2.8889 bar 9: SMA=11.6667, PSA=142.3333, Var=6.2222Of course, as I said in the beginning of this post, just take the square root of this Running Windowed Variance to obtain the Standard Deviation.
Later Trades,
MT