TaylorTree

Thursday, November 25, 2010

Running Variance

Variance - kinda the bread and butter for analysis work on a time series. Doesn't get much respect though. But, take the square root of the variance and you get the almighty standard deviation. Today, though, let's give variance its due...
For an intro into variance...check out these posts:

The problem with variance is calculating it in the traditional sense. It's costly to compute across a time series, and it can be quite a drag on your simulation engine's performance. The way to reduce the cost is to calculate the running variance. And that's when you get into quite a briar patch - loss of precision and overflow issues. See John D. Cook's post covering the variance briar patch:

Accurately computing running variance

And a few more posts by John covering different variance formulas and their outcomes:

John does great work and I learn a lot from his posts. But, I was still having problems finding a variance formula that fit my needs:

Reduced the precision loss issue as much as possible;
Allowed an easy way to window the running variance;
Allowed an easy way to memoize the call.

Thankfully, I found a post by Subluminal Messages covering his very cool Running Standard Deviations formula. The code doesn't work as is - it needs correcting on a few items - but you can get the gist of the formula just fine. The formula uses a running average of the squared values (the power sum average) versus Welford's approach of using the running sum of squared differences from the mean. Which makes it a bit easier to memoize. Not sure if it's as good at solving the precision loss and overflow issues as Welford's is...but so far I haven't found any issues with it.
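For comparison, here's a minimal sketch of the Welford-style cumulative update mentioned above. It's the standard textbook form, not code taken from either of the posts referenced here, and the name welford_variance is just for illustration. Notice it only gives you the cumulative variance - there's no window, and each step depends on the running mean.

```python
def welford_variance(values):
    """Cumulative population variance via Welford's update (illustrative name)."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared differences from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # uses the mean before and after the update
    if count == 0:
        raise ValueError("values must not be empty")
    return m2 / count


print("%.4f" % welford_variance([3, 5, 8, 10, 4, 8, 12, 15, 11, 9]))
```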

So, let's start with the formula for the Power Sum Average ($PSA$):

$ PSA_{today} = PSA_{yesterday} + ((x_{today} * x_{today}) - PSA_{yesterday}) / n $

Where:

$x$ = value in your time series
$n$ = number of values you've analyzed so far

You also need the Simple Moving Average, which you can find in one of my previous posts here.
Once you have the $PSA$ and $SMA$, you can tackle the Running Population Variance ($Var$):

$Population Var = (PSA_{today} * n - n * SMA_{today} * SMA_{today}) / n $
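A quick sketch of why this works, starting from the textbook definition of population variance. Here $\mu$ is shorthand for $SMA_{today}$ and the $x_i$ are the $n$ values in play; both are my notation, not Subluminal's:

$ Var = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2 $
$ = \frac{1}{n}\sum_{i=1}^{n}x_i^2 - 2\mu\left(\frac{1}{n}\sum_{i=1}^{n}x_i\right) + \mu^2 $
$ = \frac{1}{n}\sum_{i=1}^{n}x_i^2 - \mu^2 $
$ = PSA_{today} - SMA_{today} * SMA_{today} $

The $n$ in the numerator and denominator of the formula above cancel, so it is the same thing as the average of the squares minus the square of the average.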

Now, one problem with all these formulas - they don't cover how to window the running variance. Windowing the variance gives you the ability to view the 20 period running variance at bar 150. All the formulas I've mentioned above only give you the running cumulative variance. Deriving the running windowed variance is just a matter of using the same SMA I've posted about before and adjusting the Power Sum Average to the following:

$ PSA_{today} = PSA_{yesterday} + ((x_{today} * x_{today}) - (x_{today - n} * x_{today - n})) / n $

Where:

$x$ = value in your time series
$n$ = the period

[Update] If you want the sample Variance you just need to adjust the Var formula to the following:

$Sample Var = (PSA_{today} * n - n * SMA_{today} * SMA_{today}) / (n - 1) $
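As a rough sketch, the sample version could be wrapped up like this. The function name sample_var_from_psa and the choice to return 0.0 when there are fewer than two values are my assumptions for illustration (the second avoids dividing by zero):

```python
def sample_var_from_psa(n, sma, psa):
    """Sample variance from the window size, SMA, and power sum average.

    Hypothetical helper: returns 0.0 when n < 2, since the sample
    variance is undefined for a single value.
    """
    if n < 2:
        return 0.0
    return (psa * n - n * sma * sma) / (n - 1.0)


# Values 3, 5, 8: SMA = 16/3 and PSA = (9 + 25 + 64)/3 = 98/3
print("%.4f" % sample_var_from_psa(3, 16 / 3.0, 98 / 3.0))
```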

Okay, on to the code.

Code for the Power Sum Average:

def powersumavg(bar, series, period, pval=None):
    """
    Returns the power sum average based on the blog post from
    Subluminal Messages.  Use the power sum average to help derive the running
    variance.
    sources: http://subluminal.wordpress.com/2008/07/31/running-standard-deviations/
 
    Keyword arguments:
    bar     --  current index or location of the value in the series
    series  --  list or tuple of data to average
    period  -- number of values to include in average
    pval    --  previous powersumavg (n - 1) of the series.
    """
 
    if period < 1:
        raise ValueError("period must be 1 or greater")
 
    if bar < 0:
        bar = 0
 
    if pval is None:
        if bar > 0:
            raise ValueError("pval of None invalid when bar > 0")
            
        pval = 0.0
    
    newamt = float(series[bar])
 
    if bar < period:
        result = pval + (newamt * newamt - pval) / (bar + 1.0)
 
    else:
        oldamt = float(series[bar - period])
        result = pval + (((newamt * newamt) - (oldamt * oldamt)) / period)
 
    return result

Code for the Running Windowed Variance:

def running_var(bar, series, period, asma, apowsumavg):
    """
    Returns the running variance based on a given time period.
    sources: http://subluminal.wordpress.com/2008/07/31/running-standard-deviations/

    Keyword arguments:
    bar     --  current index or location of the value in the series
    series  --  list or tuple of data to average
    period  --  number of values to include in the window
    asma    --  current average of the given period
    apowsumavg -- current powersumavg of the given period
    """
    if period < 1:
        raise ValueError("period must be 1 or greater")

    if bar <= 0:
        return 0.0

    if asma is None:
        raise ValueError("asma of None invalid when bar > 0")

    if apowsumavg is None:
        raise ValueError("powsumavg of None invalid when bar > 0")

    windowsize = bar + 1.0
    if windowsize >= period:
        windowsize = period

    return (apowsumavg * windowsize - windowsize * asma * asma) / windowsize

Example call and results:

list_of_values = [3, 5, 8, 10, 4, 8, 12, 15, 11, 9]
prev_powersumavg = None
prev_sma = None
period = 3
for bar, price in enumerate(list_of_values):
    new_sma = running_sma(bar, list_of_values, period, prev_sma)
    new_powersumavg = powersumavg(bar, list_of_values, period, prev_powersumavg)
    new_var = running_var(bar, list_of_values, period, new_sma, new_powersumavg)

    msg = "SMA=%.4f, PSA=%.4f, Var=%.4f" % (new_sma, new_powersumavg, new_var)
    print("bar %i: %s" % (bar, msg))

    prev_sma = new_sma
    prev_powersumavg = new_powersumavg

----------------------------------------------------------
Results of call:
bar 0: SMA=3.0000, PSA=9.0000, Var=0.0000
bar 1: SMA=4.0000, PSA=17.0000, Var=1.0000
bar 2: SMA=5.3333, PSA=32.6667, Var=4.2222
bar 3: SMA=7.6667, PSA=63.0000, Var=4.2222
bar 4: SMA=7.3333, PSA=60.0000, Var=6.2222
bar 5: SMA=7.3333, PSA=60.0000, Var=6.2222
bar 6: SMA=8.0000, PSA=74.6667, Var=10.6667
bar 7: SMA=11.6667, PSA=144.3333, Var=8.2222
bar 8: SMA=12.6667, PSA=163.3333, Var=2.8889
bar 9: SMA=11.6667, PSA=142.3333, Var=6.2222
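If you want to double-check the numbers, here's a small sketch that recomputes each windowed population variance from scratch, the slow way. The name naive_window_var is just for illustration. Its output should line up with the Var column above:

```python
def naive_window_var(series, bar, period):
    """Population variance of the last `period` values ending at `bar`,
    recomputed from scratch on every call (illustrative helper)."""
    window = series[max(0, bar - period + 1):bar + 1]
    mean = sum(window) / float(len(window))
    return sum((x - mean) ** 2 for x in window) / len(window)


values = [3, 5, 8, 10, 4, 8, 12, 15, 11, 9]
for bar in range(len(values)):
    print("bar %i: Var=%.4f" % (bar, naive_window_var(values, bar, 3)))
```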

Of course, as I said in the beginning of this post, just take the square root of this Running Windowed Variance to obtain the Standard Deviation.
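One practical wrinkle worth guarding against: the variance here is a difference of two numbers (the power sum average minus the squared SMA), so with floats it can land a hair below zero on a flat window, and math.sqrt raises ValueError on negative input. A minimal sketch, with stddev_from_var being a made-up helper name:

```python
import math


def stddev_from_var(variance):
    """Square root of a variance, clamping tiny negative rounding noise to zero."""
    return math.sqrt(max(variance, 0.0))


print("%.4f" % stddev_from_var(4.2222))
```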

Later Trades,

MT

Saturday, November 06, 2010

Portfolio Performance for October 2010

Back-to-back months of these kinds of numbers make an old trader like me nervous. When's the trap door going to fall?

No doubt the current market environment is to the system's liking. One thing I need to explore is the system's position sizing algo. I position size based on the volatility of the stock over x days. But lately the volatility of the selected stocks has been so small that it is seriously underestimating the true risk of the position. So, I need to perform some studies on how to handle volatility shrinkage during boom times like these.

On to the charts...

As you can see, we are officially out of the almost 3 year drawdown (knocking on wood as I type this).

What's ahead for TaylorTree? Spending what free time I have on preparing for another Missouri winter. Don't believe this Texas boy will ever get used to the cold. Also, working on the continued upgrade of the database and record structures of the simulation engine. Tests so far have proved the new structures are much faster and memory efficient...but have yet to test on the type of data demands the simulation engine handles - 10GB+.

Later Trades,

MT

Saturday, October 09, 2010

Portfolio Performance for September 2010

September was a great month for the portfolio. Adding another positive month to the new system's belt. So far, 3 out of 4 positive months for the new system. Marking August as the only negative month. This reminds me, I need to code some reports to track this type of information. Add it to the long list of things to do. On to the charts:

As you can see, the portfolio dropped 6% in August followed by a 21% pop in September.

After a great month, it's easy to pat yourself on the back. If the years have taught me anything, it's that September was just another month. I haven't done anything to garner a great month or a bad month. Perhaps this quote from Wall Street 2 says it best:

Jacob: "How's your day going?"
Lewis: "I told you...good day I'm okay, bad day...I'm okay. Stop bugging me on my feelings, they're irrelevant."

Checking out the Drawdown chart above, we're inching ever closer to getting back to even. Hard to imagine we've been living in drawdown for almost 3 years.
That's it from the trading turret where I'm watching the colorful leaves fall. Looking forward to a weekend of grilling, finishing up some remodel work, and enjoying a few seasonal beers of the Fall.

Later Trades,

MT

Saturday, September 11, 2010

Running Sum

We've covered Running SMAs and EMAs...let's dig into Running Sums, often called Running Totals. Formula as follows:
$Sum_{today} = Sum_{yesterday} + (price_{today} - price_{today - period})$

Where $ price_{today - period} $ represents the price that is dropping off the slice you are summing. For example:

Take a list of numbers = 20, 40, 60, 80, 100, 120.
The formula for the 3-bar running sum would be:

bar 1: 20
bar 2: 20 + 40 = 60
bar 3: 20 + 40 + 60 = 120
bar 4: 40 + 60 + 80 = 180

Or we can apply our formula from above as $ Sum_{today} = 120 + (80 - 20) $

bar 5: 60 + 80 + 100 = 240

Or use formula of $ Sum_{today} = 180 + (100 - 40) $

bar 6: 80 + 100 + 120 = 300

Or use formula of $ Sum_{today} = 240 + (120 - 60) $

Coding in Python we get:

def running_sum(bar, series, period, pval=None):
    """
    Returns the running sum of values in a list or tuple - avoids summing
    entire series on each call.

    Keyword arguments:
    bar     --  current index or location of the value in the series
    series  --  list or tuple of data to sum
    period  -- number of values to include in sum
    pval    --  previous sum (n - 1) of the series.
    """
    if period < 1:
        raise ValueError("period must be 1 or greater")

    if bar <= 0:
        return series[0]

    if bar < period:
        return pval + series[bar]
    
    return pval + (series[bar] - series[bar - period])

Example call and results:

list_of_values = [20, 40, 60, 80, 100, 120]
prevsum = list_of_values[0]   #first sum is the first value in the series.

for bar, price in enumerate(list_of_values):
    newsum = running_sum(bar, list_of_values, 3, pval=prevsum)
    print("bar %i: %i" % (bar, newsum))
    prevsum = newsum

----------------------------------------------------------
bar 0: 20
bar 1: 60
bar 2: 120
bar 3: 180
bar 4: 240
bar 5: 300

Sunday, August 08, 2010

Portfolio Performance for July 2010

One thing I've noticed in trading systems over the years is the observer effect. Typically, when a thoroughly backtested system fails...the trader will dismiss the system as being too optimized.

You never hear about the observer effect with real-time trading of the system. That is probably the most difficult aspect of developing systems. Just the mere fact of participating in the price pattern you've discovered changes the price pattern. Despite how small a fish you may be in the market.

All we can hope for as system traders is finding an edge that is large enough to accommodate the increased order flow. So, when we jump in and reduce the edge...there is still enough leftover for us to be profitable. I guess, that is why I've always traded very long-term systems. And probably why I still lack confidence in this new short-term mean-reversion system.

Moving on...

July was an excellent month for the market. The portfolio was trounced. But, still finished the month with another positive number. This marks the second month trading the new system.

Adding a new chart to the reporting: Investment Levels. This reflects the amount of capital that TaylorTree has invested at the end of the month. As you can see, when the portfolio is less than 100% invested in the market and the market has a month like this one...it is extremely difficult to beat it. Cash drag kills you when the market turns around.

TaylorTree Investment Levels as of 07/2010

The picture above of the alligator is from our stay at The Retreat at Artesian Lakes located just outside Cleveland, TX. They had several cabins overlooking the lakes. Step out on your front porch and this 6ft alligator would come swimming up - day or night.

Later Trades,

MT

Saturday, August 07, 2010

New Books

Received 2 books today:

The Art of Capacity Planning by John Allspaw

The House Advantage by Jeffrey Ma

I first heard about John Allspaw from his excellent slides, Ops Meta-Metrics. When I found out he had a book covering capacity planning...well, had to buy it.

Jeffrey Ma was the basis for the main character in the fun book, Bringing Down the House. I'm a stats junkie...so any book covering stats and business is a must-read for yours truly.

MT

Sunday, August 01, 2010

Exponential Moving Average (EMA)

Now that we've tackled Running Simple Moving Averages (SMA)...let's move on to Exponential Moving Averages (EMA). You may wonder why we're not covering Running Exponential Moving Averages? The default formula for EMA is the running method - so we're already covered.

Check out the posts below to understand the background on Exponential Moving Averages (EMA) and their calculation.

Be careful using EMAs, or any of these running-type indicators, in your backtesting, since all of them require a starting value. If that starting value changes - your signals change. This can happen if you switch between price quote providers that have different amounts of history available. It should not be a big deal, but it's something to be aware of.

Let's begin. We need to calculate our smoothing factor for the time series. Typical use in technical analysis is:
$ \alpha = 2.0 / (periods + 1.0) $

We can use any value between 0 & 1 for the smoothing factor. Closer to one is less smooth and places greater weight on the more recent values. Use a value of 1 and you get the most recent value back. Closer to zero is more smooth and places greater weight on the older values.

Now, the formula for an EMA given our smoothing factor:
$ EMA_{today} = EMA_{yesterday} + \alpha(price_{today} - EMA_{yesterday}) $

Coding in Python we get:

def ema(bar, series, period, prevma, smoothing=None):
    '''Returns the Exponential Moving Average of a series.

    Keyword arguments:
    bar         -- current index or location of the series
    series      -- series of values to be averaged
    period      -- number of values in the series to average
    prevma      -- previous exponential moving average
    smoothing   -- smoothing factor to use in the series.
        valid values: between 0 & 1.
        default: None - which then uses formula = 2.0 / (period + 1.0)
        closer to 1 gives greater weight to recent values - less smooth
        closer to 0 gives greater weight to older values -- more smooth
    '''
    if period < 1:
        raise ValueError("period must be 1 or greater")

    if smoothing is not None:
        if (smoothing < 0) or (smoothing > 1.0):
            raise ValueError("smoothing must be between 0 and 1")

    else:
        smoothing = 2.0 / (period + 1.0)

    if bar <= 0:
        return series[0]

    elif bar < period:
        return cumulative_sma(bar, series, prevma)

    return prevma + smoothing * (series[bar] - prevma)


def cumulative_sma(bar, series, prevma):
    """
    Returns the cumulative or unweighted simple moving average.
    Avoids averaging the entire series on each call.

    Keyword arguments:
    bar     --  current index or location of the value in the series
    series  --  list or tuple of data to average
    prevma  --  previous average (n - 1) of the series.
    """

    if bar <= 0:
        return series[0]

    else:
        return prevma + ((series[bar] - prevma) / (bar + 1.0))

Example call and results using the typical smoothing factor of 2 / (period + 1):

prices = [32.47, 32.70, 32.77, 33.11, 33.25, 33.23, 33.23, 33.0, 33.04, 33.21]
period = 5   #number of bars to average
prevsma = prevema = prices[0]   #1st day nothing to average

for bar, close in enumerate(prices):
    currentema = ema(bar, prices, period, prevema, smoothing=None)

    #running_sma defined in simple moving average blog post
    currentsma = running_sma(bar, prices, period, prevsma)

    print("Day %02d Value=%.2f %i-day SMA=%f and EMA=%f" % (bar + 1, close, period, currentsma, currentema))
    prevema = currentema
    prevsma = currentsma

----------------------------------------------------------
Results of call:

Day 01 Value=32.47 5-day SMA=32.470000 and EMA=32.470000
Day 02 Value=32.70 5-day SMA=32.585000 and EMA=32.585000
Day 03 Value=32.77 5-day SMA=32.646667 and EMA=32.646667
Day 04 Value=33.11 5-day SMA=32.762500 and EMA=32.762500
Day 05 Value=33.25 5-day SMA=32.860000 and EMA=32.860000
Day 06 Value=33.23 5-day SMA=33.012000 and EMA=32.983333
Day 07 Value=33.23 5-day SMA=33.118000 and EMA=33.065556
Day 08 Value=33.00 5-day SMA=33.164000 and EMA=33.043704
Day 09 Value=33.04 5-day SMA=33.150000 and EMA=33.042469
Day 10 Value=33.21 5-day SMA=33.142000 and EMA=33.098313
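Back to the earlier warning about starting values. Here's a small sketch of how much the seed matters. The ema_series helper and the 30.00 seed are made up for illustration, and to keep it short it applies the plain EMA formula from the first bar rather than the cumulative SMA warm-up used in ema above:

```python
def ema_series(prices, alpha, seed):
    """EMA of prices starting from an explicit seed value (hypothetical helper)."""
    result = []
    prev = seed
    for price in prices:
        prev = prev + alpha * (price - prev)
        result.append(prev)
    return result


prices = [32.47, 32.70, 32.77, 33.11, 33.25, 33.23, 33.23, 33.0, 33.04, 33.21]
alpha = 2.0 / (5 + 1.0)
a = ema_series(prices, alpha, seed=32.47)
b = ema_series(prices, alpha, seed=30.00)
for day, (x, y) in enumerate(zip(a, b), 1):
    print("Day %02d: seed 32.47 -> %.4f, seed 30.00 -> %.4f, gap %.4f" % (day, x, y, x - y))
```

The gap between the two series shrinks by a factor of (1 - alpha) on every bar, so with a 5-period smoothing factor it is under 2% of its original size after 10 bars. The closer your signals sit to the start of the data, the more the seed matters.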

Sunday, July 11, 2010

Portfolio Performance for June 2010

"Time flies like an arrow; fruit flies like a banana" -- Groucho Marx

My, how the months have zoomed by. As you can see from the site, I moved from Blogger to Wordpress. Haven't had the time to do much with the old posts. Most likely I will leave them as is and start from a clean slate. The most important posts anyway are of the portfolio performance and this post will take care of what's happening there.

You can see in Figure 1 (VAMI) that the portfolio is doing well...breaking free from the market. The main reason for this is a new trading system was added on June 1st. This trading system is a short-term mean-reversion system. It is doing as expected but taking some time to get used to. I've spent the better part of 10 years trading a long-term trend-following system. So, moving to a short-term mean-reversion system is going to take some adjustments on my part.

Again, all the charts show the portfolio breaking away from the market. June was a nasty month for the S&P 500...while the portfolio held its own. I feel I still have some work to do on the new system such as the proper risk allocation and determining a better weight for the systems in the portfolio. Also, the new system still has quite a bit of beta that I'd like to scale down. But, overall, a nice start.

On a personal note, I'm back from my yearly pilgrimage to Texas to see family & friends. Trip was good but volatile like the market.

We had to evacuate due to Hurricane Alex in the Gulf - almost lost our car.

Picture above is the day prior to the effects of Hurricane Alex.

And our transmission acted up on our stop at Broken Bow, OK - I'm guessing due to all the hill climbing.

I couldn't get over how clear the water was and the beautiful pine-lined hills surrounding the lake.

We had to cut our losses short in both of the places we would have liked to spend more time. Managing risk vs reward...being a portfolio manager is a 24x7 job.

Later Trades,

MT

Sunday, June 20, 2010

Moving to Wordpress - change your feeds

I've moved to Wordpress away from Blogger. Please change your RSS subscription feeds for this site to the following:

http://taylortree.com/feed/

And please check to see if your bookmark to the main page is working correctly. Should be: http://www.taylortree.com/

I'm not going to migrate all the Blogger posts to Wordpress. I've changed so much over the years...time for a fresh start.

Thanks,

MT

Running Simple Moving Average (SMA)

When building a platform to test trading ideas...one of the big issues to deal with is all the indicators that require a spin through the price series in order to calculate. For example, in order to calculate the 200 day simple moving average (SMA) of closing prices for Google today, you would have to loop back over the previous 200 - 1 days, sum those closing prices along with today's, and divide by 200.

When you are backtesting an idea you often need to start from day 1 of a stock's trading history and loop forward to the most current day. In essence, pretending each day is the current day at that point in time. Thus, you are looping back 200 - 1 data points for each day in the series. This isn't such a big deal with a stock such as Google whose trading history is rather limited (since 2004). But, take a stock like IBM with a more extensive trading history and your code is going to bog down with each call to the SMA indicator. Throw 20,000 securities into your backtest and the looping adds up.
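The brute-force approach described above might look something like this sketch (naive_sma is an illustrative name, not part of the simulation engine). Every call re-sums the whole window, so a backtest over N bars does roughly N * period additions per security:

```python
def naive_sma(series, bar, period):
    """SMA at `bar`, recomputed from scratch by summing the whole window."""
    start = max(0, bar - period + 1)
    window = series[start:bar + 1]
    return sum(window) / float(len(window))


prices = [10, 15, 25, 18, 13, 16]
for bar in range(len(prices)):
    print("Today's 3-day SMA = %.4f" % naive_sma(prices, bar, 3))
```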

Therefore, running calculations are the preferred method in order to spin just once through the data points. So, in order to calculate the running simple moving average for closing prices you apply the following formula:
$SMA_{today} = SMA_{yesterday} + ((Price_{today} - Price_{today - n}) /n)$
Where

$n$ = number of values included in your rolling computational window.

Straightforward and avoids the loop. Here's the sample Python code for the Running SMA:

def cumulative_sma(bar, series, prevma):
    """
    Returns the cumulative or unweighted simple moving average.
    Avoids sum of series per call.

    Keyword arguments:
    bar     --  current index or location of the value in the series
    series  --  list or tuple of data to average
    prevma  --  previous average (n - 1) of the series.
    """
   
    if bar <= 0:
        return series[0]

    return prevma + ((series[bar] - prevma) / (bar + 1.0))

def running_sma(bar, series, period, prevma):
    """
    Returns the running simple moving average - avoids sum of series per call.

    Keyword arguments:
    bar     --  current index or location of the value in the series
    series  --  list or tuple of data to average
    period  --  number of values to include in average
    prevma  --  previous simple moving average (n - 1) of the series
    """

    if period < 1:
        raise ValueError("period must be 1 or greater")

    if bar <= 0:
        return series[0]

    elif bar < period:
        return cumulative_sma(bar, series, prevma)

    return prevma + ((series[bar] - series[bar - period]) / float(period))

And the example call and results:

prices = [10, 15, 25, 18, 13, 16]
prevsma = prices[0]   #1st day nothing to average so return itself.
for bar, close in enumerate(prices):
    currentsma = running_sma(bar, prices, 3, prevsma)
    print("Today's 3-day SMA = %.4f" % currentsma)
    prevsma = currentsma

------- Results ----------------
Today's 3-day SMA = 10.0000
Today's 3-day SMA = 12.5000
Today's 3-day SMA = 16.6667
Today's 3-day SMA = 19.3333
Today's 3-day SMA = 18.6667
Today's 3-day SMA = 15.6667