function f = calccod(x,y,dim,wantgain,wantmeansub) <x>,<y> are matrices with the same dimensions <dim> (optional) is the dimension of interest. default to 2 if <x> is a (horizontal) vector and to 1 if not. special case is 0 which means to calculate globally. <wantgain> (optional) is 0 means normal 1 means allow a gain to be applied to each case of <x> to minimize the squared error with respect to <y>. in this case, there cannot be any NaNs in <x> or <y>. 2 is like 1 except that gains are restricted to be non-negative. so, if the gain that minimizes the squared error is negative, we simply set the gain to be applied to be 0. default: 0. <wantmeansub> (optional) is 0 means do not subtract any mean. this makes it such that the variance quantification is relative to 0. 1 means subtract the mean of each case of <y> from both <x> and <y> before performing the calculation. this makes it such that the variance quantification is relative to the mean of each case of <y>. note that <wantgain> occurs before <wantmeansub>. default: 1. calculate the coefficient of determination (R^2) indicating the percent variance in <y> that is explained by <x>. this is achieved by calculating 100*(1 - sum((y-x).^2) / sum(y.^2)). note that by default, we subtract the mean of each case of <y> from both <x> and <y> before proceeding with the calculation. the quantity is at most 100 but can be 0 or negative (unbounded). note that this metric is sensitive to DC and scale and is not symmetric (i.e. if you swap <x> and <y>, you may obtain different results). it is therefore fundamentally different than Pearson's correlation coefficient (see calccorrelation.m). NaNs are handled gracefully (a NaN causes that data point to be ignored). if there are no valid data points (i.e. all data points are ignored because of NaNs), we return NaN for that case. note some weird cases: calccod([],[]) is [] history: 2013/08/18 - fix pernicious case where <x> is all zeros and <wantgain> is 1 or 2. 2010/11/28 - add <wantgain>==2 case 2010/11/23 - changed the output range to percentages. thus, the range is (-Inf,100]. also, we removed the <wantr> input since it was dumb. example: x = randn(1,100); calccod(x,x+0.1*randn(1,100))
0001 function f = calccod(x,y,dim,wantgain,wantmeansub) 0002 0003 % function f = calccod(x,y,dim,wantgain,wantmeansub) 0004 % 0005 % <x>,<y> are matrices with the same dimensions 0006 % <dim> (optional) is the dimension of interest. 0007 % default to 2 if <x> is a (horizontal) vector and to 1 if not. 0008 % special case is 0 which means to calculate globally. 0009 % <wantgain> (optional) is 0010 % 0 means normal 0011 % 1 means allow a gain to be applied to each case of <x> 0012 % to minimize the squared error with respect to <y>. 0013 % in this case, there cannot be any NaNs in <x> or <y>. 0014 % 2 is like 1 except that gains are restricted to be non-negative. 0015 % so, if the gain that minimizes the squared error is negative, 0016 % we simply set the gain to be applied to be 0. 0017 % default: 0. 0018 % <wantmeansub> (optional) is 0019 % 0 means do not subtract any mean. this makes it such that 0020 % the variance quantification is relative to 0. 0021 % 1 means subtract the mean of each case of <y> from both 0022 % <x> and <y> before performing the calculation. this makes 0023 % it such that the variance quantification 0024 % is relative to the mean of each case of <y>. 0025 % note that <wantgain> occurs before <wantmeansub>. 0026 % default: 1. 0027 % 0028 % calculate the coefficient of determination (R^2) indicating 0029 % the percent variance in <y> that is explained by <x>. this is achieved 0030 % by calculating 100*(1 - sum((y-x).^2) / sum(y.^2)). note that 0031 % by default, we subtract the mean of each case of <y> from both <x> 0032 % and <y> before proceeding with the calculation. 0033 % 0034 % the quantity is at most 100 but can be 0 or negative (unbounded). 0035 % note that this metric is sensitive to DC and scale and is not symmetric 0036 % (i.e. if you swap <x> and <y>, you may obtain different results). 0037 % it is therefore fundamentally different than Pearson's correlation 0038 % coefficient (see calccorrelation.m). 0039 % 0040 % NaNs are handled gracefully (a NaN causes that data point to be ignored). 0041 % 0042 % if there are no valid data points (i.e. all data points are 0043 % ignored because of NaNs), we return NaN for that case. 0044 % 0045 % note some weird cases: 0046 % calccod([],[]) is [] 0047 % 0048 % history: 0049 % 2013/08/18 - fix pernicious case where <x> is all zeros and <wantgain> is 1 or 2. 0050 % 2010/11/28 - add <wantgain>==2 case 0051 % 2010/11/23 - changed the output range to percentages. thus, the range is (-Inf,100]. 0052 % also, we removed the <wantr> input since it was dumb. 0053 % 0054 % example: 0055 % x = randn(1,100); 0056 % calccod(x,x+0.1*randn(1,100)) 0057 0058 % input 0059 if ~exist('dim','var') || isempty(dim) 0060 if isvector(x) && size(x,1)==1 0061 dim = 2; 0062 else 0063 dim = 1; 0064 end 0065 end 0066 if ~exist('wantgain','var') || isempty(wantgain) 0067 wantgain = 0; 0068 end 0069 if ~exist('wantmeansub','var') || isempty(wantmeansub) 0070 wantmeansub = 1; 0071 end 0072 0073 % handle weird case up front 0074 if isempty(x) 0075 f = []; 0076 return; 0077 end 0078 0079 % handle 0 case 0080 if dim==0 0081 x = x(:); 0082 y = y(:); 0083 dim = 1; 0084 end 0085 0086 % handle gain 0087 if wantgain 0088 % to get the residuals, we want to do something like y-x*inv(x'*x)*x'*y 0089 temp = 1./dot(x,x,dim) .* dot(x,y,dim); 0090 if wantgain==2 0091 temp(temp < 0) = 0; % if the gain was going to be negative, rectify it to 0. 0092 end 0093 x = bsxfun(@times,x,temp); 0094 end 0095 0096 % propagate NaNs (i.e. ignore invalid data points) 0097 x(isnan(y)) = NaN; 0098 y(isnan(x)) = NaN; 0099 0100 % handle mean subtraction 0101 if wantmeansub 0102 mn = nanmean(y,dim); 0103 y = bsxfun(@minus,y,mn); 0104 x = bsxfun(@minus,x,mn); 0105 end 0106 0107 % finally, compute it 0108 f = 100*(1 - zerodiv(nansum((y-x).^2,dim),nansum(y.^2,dim),NaN,0)); 0109 0110 0111 0112 0113 0114 0115 % JUNK: 0116 % 0117 % % <wantr> (optional) is whether to apply signedarraypower(f,0.5) 0118 % % at the very end, giving something like a correlation coefficient (r). 0119 % % default: 1. 0120 % 0121 % if ~exist('wantr','var') || isempty(wantr) 0122 % wantr = 1; 0123 % end 0124 % 0125 % if wantr 0126 % f = signedarraypower(f,0.5); 0127 % end