Steady State Vector of Markov Chains
Numpy Strategies 0.1.3
Unlucky Sprint 13. Scary, especially if you suffer from Triskaidekaphobia. The task for this sprint:
Determine the steady state vector of our Markov chain model.
This task was part of the story:
I want to know how to bet in the long term.
A very trivial task especially with NumPy. Far into the distant future or in theory infinity, the state of our Markov chain system will not change anymore. This is also called a steady state. The stochasic matrix A we calculated last time applied to the steady state, will yield the same state x. Using mathematical notation:
Ax = x
Another way to look at this is as the eigenvector for eigenvalue 1.
1. Smoothing
Last time we counted the number of occurrences for each state transition in our model. This gave us a bit skewed results, because in our sample a Flat or no change situation never occurred. In real life we could have a day that the close price does not change, although unlikely for liquid stock markets. One way to deal with 0 occurrences is to apply additive smoothing. The idea is to add a certain constant to the number of occurrences we find, getting rid of zeroes. Of course, we will have to normalize the probabilites afterwards. Two lines have to change due to the smoothing process. First, change this line
N = len(start_indices)
to
N = len(start_indices) + k * NDIM
where k is an integer smoothing constant and NDIM is the number of states, in our case 3. Second, update the line
SM[i][j] = occurrences/float(N)
to
SM[i][j] = (occurrences + k)/float(N)
The stochastic matrix for k=1 becomes
[[ 0.47826087 0.00869565 0.51304348]
[ 0.33333333 0.33333333 0.33333333]
[ 0.41134752 0.0070922 0.58156028]]
2. Eigenvalues and eigenvectors
To get the eigenvalues and eigenvectors we will need the numpy linalg module and the eig function.
eig_out = numpy.linalg.eig(SM)
print eig_out
The eig function returns an array containing the eigenvalues and an array containing the eigenvectors.
(array([ 1. , 0.06739662, 0.32575786]), array([[ 0.57735027, 0.7670805 , 0.0082198 ],
[ 0.57735027, -0.19564812, -0.99986103],
[ 0.57735027, -0.61099044, 0.01450345]]))
3. Selecting the eigenvector for eigenvalue 1
Currently we are only interested in the eigenvector for eigenvalue 1. In reality the eigenvalue might not be exactly 1, so we should build in a margin for error. We can find the index for eigenvalue between 0.9 and 1.1 as follows
idx_vec = numpy.where(numpy.abs(eig_out[0] - 1) < 0.1)
print "Index eigenvalue 1", idx_vec
We get index 0 as expected
Index eigenvalue 1 (array([0]),)
Select the steady state vector
x = eig_out[1][:,idx_vec].flatten()
print "Steady state vector", x
The steady state vector
Steady state vector [ 0.57735027 0.57735027 0.57735027]
4. Check
The check is simple – just multiply the stochastic matrix and the steady state vector we got.
print "Check", numpy.dot(SM, x)
Seems like the result is correct. I will leave it up to you to decide whether it is meaningful or not. Task resolved.
Check [ 0.57735027 0.57735027 0.57735027]
Here is the code in its entirety with imports and all.
#!/usr/bin/env python
from matplotlib.finance import quotes_historical_yahoo
from datetime import date
import numpy
import sys
today = date.today()
start = (today.year - 1, today.month, today.day)
quotes = quotes_historical_yahoo(sys.argv[1], start, today)
close = [q[4] for q in quotes]
states = numpy.sign(numpy.diff(close))
NDIM = 3
SM = numpy.zeros((NDIM, NDIM))
signs = [-1, 0, 1]
k = int(sys.argv[2])
for i in xrange(len(signs)):
#we start the transition from the state with the specified sign
start_indices = numpy.where(states[:-1] == signs[i])[0]
N = len(start_indices) + k * NDIM
# skip since there are no transitions possible
if N == 0:
continue
#find the values of states at the end positions
end_values = states[start_indices + 1]
for j in xrange(len(signs)):
# number of occurrences of this transition
occurrences = len(end_values[end_values == signs[j]])
SM[i][j] = (occurrences + k)/float(N)
print SM
eig_out = numpy.linalg.eig(SM)
print eig_out
idx_vec = numpy.where(numpy.abs(eig_out[0] - 1) < 0.1)
print "Index eigenvalue 1", idx_vec
x = eig_out[1][:,idx_vec].flatten()
print "Steady state vector", x
print "Check", numpy.dot(SM, x)
Have a go
Let's put the Product Owner hat on. Possible improvement stories I can think of:
Fine tune the smoothing constant.
Introduce more states.
Use different parameters, for instance, option price or volume.
Try out different stocks, stock indices or a combination of stocks.
Extend to different time periods instead of days.
Make some unit tests.
I am looking forward to hearing about your improvements.
If you liked this post and are interested in NumPy check out NumPy Beginner's Guide by yours truly. I would like to thank Mike Driscoll for his book review on the Mouse vs Python blog and on Amazon.com I will be back next week with your Christmas present, which will remain a surprise for now.


