
calccorrelation

PURPOSE

function f = calccorrelation(x,y,dim,wantmeansubtract,wantgainsensitive)

SYNOPSIS

function f = calccorrelation(x,y,dim,wantmeansubtract,wantgainsensitive)

DESCRIPTION

 function f = calccorrelation(x,y,dim,wantmeansubtract,wantgainsensitive)

 <x>,<y> are matrices with the same dimensions
 <dim> (optional) is the dimension of interest.
   defaults to 2 if <x> is a (horizontal) vector and to 1 otherwise.
   a special case is 0, which means to calculate globally.
 <wantmeansubtract> (optional) is whether to subtract mean first. default: 1.
 <wantgainsensitive> (optional) is whether to make the metric sensitive
   to gain. default: 0.

 calculate Pearson's correlation coefficient between <x> and <y> for
 each case oriented along <dim>.  this is achieved by taking
 the two vectors implicated in each case, subtracting each vector's mean,
 normalizing each vector to have unit length, and then taking their dot product.
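 as a sketch, the same recipe (subtract each vector's mean, normalize to unit
 length, take the dot product) looks like this in Python/NumPy.  this is an
 illustrative single-pair version with a hypothetical name, not a port of the
 full <dim>-aware MATLAB function:

```python
import numpy as np

def pearson_by_dot_product(x, y):
    # subtract each vector's mean
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    # normalize each vector to unit length (no variance -> NaN)
    xlen, ylen = np.linalg.norm(x), np.linalg.norm(y)
    if xlen == 0 or ylen == 0:
        return np.nan
    # Pearson's r is the dot product of the two unit-length vectors
    return np.dot(x / xlen, y / ylen)
```

 for 1-D inputs this agrees with np.corrcoef(x, y)[0,1].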

 a special case is when <wantmeansubtract>==0, in which case we omit the mean-
 subtraction.  this technically isn't Pearson's correlation any more.

 another special case is <wantgainsensitive>==1, in which case we take the two
 vectors implicated in each case, subtract each vector's mean (if
 <wantmeansubtract>), normalize the two vectors by the same constant
 such that the larger of the two vectors has length 1, and then take the
 dot product.  note that this metric still ranges from -1 to 1 and that any
 gain mismatch can only hurt you (i.e. push you towards a correlation of 0).
 (in retrospect, it is probably better to use calccod.m in order
 to be sensitive to gain, rather than to use <wantgainsensitive> in this function.)
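 the gain-sensitive variant can be sketched in NumPy as follows (an
 illustrative single-pair version with a hypothetical name; the real function
 operates along <dim> via unitlength.m and zerodiv.m):

```python
import numpy as np

def gain_sensitive_correlation(x, y):
    # subtract each vector's mean
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    # normalize BOTH vectors by the same constant,
    # chosen so the larger vector gets length 1
    scale = max(np.linalg.norm(x), np.linalg.norm(y))
    if scale == 0:
        return np.nan
    # the dot product still lies in [-1, 1];
    # any gain mismatch shrinks it toward 0
    return np.dot(x / scale, y / scale)
```

 for example, y = 2*x yields 0.5 here, whereas plain Pearson's correlation
 would yield 1.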

 if there is no variance in one (or more) of the inputs, the 
 result is returned as NaN.

 NaNs are handled gracefully (a NaN causes that data point to be ignored).

 if there are no valid data points (i.e. all data points are
 ignored because of NaNs), we return NaN for that case.
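 the NaN behavior described above can be mimicked like this (a sketch with a
 hypothetical name; the real function propagates NaNs across <x> and <y> and
 then relies on nan-aware helpers):

```python
import numpy as np

def correlation_ignoring_nans(x, y):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # a NaN in either input causes that data point to be ignored
    valid = ~(np.isnan(x) | np.isnan(y))
    if not np.any(valid):
        return np.nan  # no valid data points at all
    x, y = x[valid], y[valid]
    x = x - np.mean(x)
    y = y - np.mean(y)
    xlen, ylen = np.linalg.norm(x), np.linalg.norm(y)
    if xlen == 0 or ylen == 0:
        return np.nan  # no variance in one of the inputs
    return np.dot(x / xlen, y / ylen)
```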

 we don't use caution with respect to cases involving low variance (see unitlength.m).

 be careful of the presumption that the mean and scale of <x> and <y> can be discounted.
 if you do not want to perform this discounting, use calccod.m instead!
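 the discounting can be seen directly: correlation is blind to any gain and
 offset applied to one input (a residual-based metric such as calccod.m would
 instead penalize the mismatch):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100)
y = 5 * x + 3          # same signal up to gain and offset
r = np.corrcoef(x, y)[0, 1]
# correlation discounts the mean and scale, so r is 1 up to float error
```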

 note some weird cases:
   calccorrelation([],[]) is []

 example:
 calccorrelation(randn(1,100),randn(1,100))

CROSS-REFERENCE INFORMATION

This function calls:
This function is called by:

SOURCE CODE

function f = calccorrelation(x,y,dim,wantmeansubtract,wantgainsensitive)

% function f = calccorrelation(x,y,dim,wantmeansubtract,wantgainsensitive)
%
% <x>,<y> are matrices with the same dimensions
% <dim> (optional) is the dimension of interest.
%   defaults to 2 if <x> is a (horizontal) vector and to 1 otherwise.
%   a special case is 0, which means to calculate globally.
% <wantmeansubtract> (optional) is whether to subtract mean first. default: 1.
% <wantgainsensitive> (optional) is whether to make the metric sensitive
%   to gain. default: 0.
%
% calculate Pearson's correlation coefficient between <x> and <y> for
% each case oriented along <dim>.  this is achieved by taking
% the two vectors implicated in each case, subtracting each vector's mean,
% normalizing each vector to have unit length, and then taking their dot product.
%
% a special case is when <wantmeansubtract>==0, in which case we omit the mean-
% subtraction.  this technically isn't Pearson's correlation any more.
%
% another special case is <wantgainsensitive>==1, in which case we take the two
% vectors implicated in each case, subtract each vector's mean (if
% <wantmeansubtract>), normalize the two vectors by the same constant
% such that the larger of the two vectors has length 1, and then take the
% dot product.  note that this metric still ranges from -1 to 1 and that any
% gain mismatch can only hurt you (i.e. push you towards a correlation of 0).
% (in retrospect, it is probably better to use calccod.m in order
% to be sensitive to gain, rather than to use <wantgainsensitive> in this function.)
%
% if there is no variance in one (or more) of the inputs, the
% result is returned as NaN.
%
% NaNs are handled gracefully (a NaN causes that data point to be ignored).
%
% if there are no valid data points (i.e. all data points are
% ignored because of NaNs), we return NaN for that case.
%
% we don't use caution with respect to cases involving low variance (see unitlength.m).
%
% be careful of the presumption that the mean and scale of <x> and <y> can be discounted.
% if you do not want to perform this discounting, use calccod.m instead!
%
% note some weird cases:
%   calccorrelation([],[]) is []
%
% example:
% calccorrelation(randn(1,100),randn(1,100))

% input
if ~exist('dim','var') || isempty(dim)
  if isvector(x) && size(x,1)==1
    dim = 2;
  else
    dim = 1;
  end
end
if ~exist('wantmeansubtract','var') || isempty(wantmeansubtract)
  wantmeansubtract = 1;
end
if ~exist('wantgainsensitive','var') || isempty(wantgainsensitive)
  wantgainsensitive = 0;
end

% handle weird case up front
if isempty(x)
  f = [];
  return;
end

% handle 0 case
if dim==0
  x = x(:);
  y = y(:);
  dim = 1;
end

% propagate NaNs (i.e. ignore invalid data points)
x(isnan(y)) = NaN;
y(isnan(x)) = NaN;

% subtract mean
if wantmeansubtract
  x = zeromean(x,dim);
  y = zeromean(y,dim);
end

% normalize x and y to be unit length
[xnorm,xlen] = unitlength(x,dim,[],0);
[ynorm,ylen] = unitlength(y,dim,[],0);

% if gain sensitive, then do something tricky:
if wantgainsensitive

  % case where x is the bigger vector
  temp = xnorm .* bsxfun(@(x,y) zerodiv(x,y,NaN,0),y,xlen);
  bad = all(isnan(temp),dim);
  f1 = nansum(temp,dim);
  f1(bad) = NaN;

  % case where y is the bigger vector
  temp = bsxfun(@(x,y) zerodiv(x,y,NaN,0),x,ylen) .* ynorm;
  bad = all(isnan(temp),dim);
  f2 = nansum(temp,dim);
  f2(bad) = NaN;

  % figure out which one to use
  f = f1;                              % this is when x is bigger
  f(xlen < ylen) = f2(xlen < ylen);    % this is when y is bigger

% if not, just take the dot product
else
  temp = xnorm .* ynorm;
    % at this point, we want to sum along dim.  however, if a case has all NaNs, we need the output to be NaN.
  bad = all(isnan(temp),dim);  % record bad cases
  f = nansum(temp,dim);        % do the summation
  f(bad) = NaN;                % explicitly set bad cases to NaN
end

Generated on Wed 18-Jun-2014 21:47:41 by m2html © 2005