# Aligning Time Series data and calculating CAPM beta.

57 views (last 30 days)
Ms. Mat on 26 Dec 2012
I have time series data(daily) for 10 years for 500 firms. I would like to calculate monthly beta. I have the data on an excel sheet with column company code name, date, closing price. for the 100 firms. There are lots of gaps in the data and data is in 1 single excel sheet.
This is wat I did;
1. I pivoted the data. (date as column headers and company codes as rows). (Not sure if pivoting was absolutely required)
2. Created a FOR loop, looping through each company
for i=2:500
3. Used datevec to get the Y and M.
[Y, M] = datevec(datesCol);
4. Pulled out the closing price for the company (one column in the pivot)
coPriceCol = (pvt_dta(i,2:end))';
5. Created a FOR loop, looping through each year
for yr=2001:2010
6. Created a FOR loop, looping through each month
for mo=1:12
7. Pull out data for the particular year and month
stk_data_monthly = coPriceCol(M==mo & Y==yr);
mkt_dta_monthly = mktPriceCol(M==mo & Y==yr);
8. Check if stock data and market data for the month is completely empty
if(isnan(stk_data_monthly))
elseif(isnan(mkt_dta_monthly))
9. Now before I work with this I need only data for which dates are aligned between the stock data and market data. I knew time series could help, but I am a newbie, had no clue, so decided to work it out with what I already know. This is what I did,
nor = ~(isnan(stk_data_monthly) | isnan(mkt_dta_monthly));
stk_data_monthly = stk_data_monthly(nor);
mkt_dta_monthly = mkt_dta_monthly(nor);
10. I then calculated returns as follows,
stk_returns = (stk_data_monthly(2:end,:) - stk_data_monthly(1:end-1,:)) ./ stk_data_monthly(1:end-1,:) ;
mkt_returns = (mkt_dta_monthly(2:end,:) - mkt_dta_monthly(1:end-1,:)) ./ mkt_dta_monthly(1:end-1,:) ;
11. Then regressed the data,
mkt_returns = [ones(size(mkt_returns,1), 1) mkt_returns];
regRes = (inv(mkt_returns'*mkt_returns)) * (mkt_returns'*stk_returns);
Now, I am sure there is a better way to go about this. While I don't have a major problem with the time it takes to run, (except reading the excel file), I would like to know how can this be done differently and of course better and more efficiently.
1. how to eliminate the FOR loops and
2. if there is a smarter way to pull out data pertaining to a particular month and year and
3. if the aligning of dates can be done some way other than using the OR+NOT operation.
P.S: I am at the beginner's level. So this code might sound atrocious.

Laura Proctor on 26 Dec 2012
You're doing great for a beginner. You're using logical indexing in your code which is awesome.
I'm assuming that you are getting expected results with the code you have so far?
You don't need to pivot the data, but it might make it easier to work with. Did you use the import tool to import the data or did you do it programmatically? Could you show the code for importing the data?
You can probably remove the outer FOR loop that goes through each company completely and so for step 4:
coPrice = (pvt_data(2:500,2:end))';
I think that you might be able to do something that removes the for loops for the year and month using accumarray, but I'll need to think about that one for a bit.
I'm not sure what you are checking for step #8. If you want to check to see if there are all NaNs in the column of data, you can use all:
if all(isnan(coPrice))
But, I'm not sure that you even need to have this IF/ELSEIF statement in your code... I don't see how you are using it in the code that you have shown.
You may wish to use another variable name other than nor in step #9 since nor is a MATLAB function name.
For returns, you can use the diff function:
stk_returns = diff(stk_data_monthly) ./ stk_data_monthly(1:end-1,:) ;
And, you can regress the data using the backslash operator which is better practice than using inv:
regRes = (mkt_returns'*mkt_returns)\(mkt_returns'*stk_returns);
In fact, you may find that it isn't necessary to precondition both sides by mkt_returns'.
regRes = mkt_returns\stk_returns;
Ms. Mat on 26 Dec 2012
Thanks You Laura.
I am able to get desired results.
[dta_num , dta_txt , dta_raw] = xlsread('C:\.....);
clear dta_raw;
dta_num(:,2) = datenum(dta_txt(2:end,2),'dd/mm/yyyy');
pvt_dta = Pivot(dta_num(:,[2,1,3]));
I am not using step-8 subsequently, it is just for informative purpose, so that I can collect the data to make the results better. It is just to give me an idea how many betas I have calculated over all.
I don't follow your last statement where you have mentioned that it is not necessary to precondition both side by mkt_returns.
I read that Time Series Object ignores Nan as missing data. If i regress 2 time series data, will it take care of aligning the dates instead of me performing the NOR operation ?
Also by using datevec and using it to index data(step 3 and step 7), I am relying on the order of the data. Is this ok ? Will time series allow me to get the year and month in some better way ?