clear
format compact
format short e
echo on
%Solution to Problem 7
%
%Save the files problem7.dat and linfit.m
%to the appropriate directory before running
%this program!
%
%In this problem we are asked to do a
%combined linear and non-linear regression
%of a set of data. We are trying to get
%two exponential growth factors and two
%preexponential constants. The fitting
%function is of the form:
%
% c = A exp(ka*t) + B exp(kb*t)
%
%We shall do the linear regression part in
%a separate function, and only do the
%non-linear regression (optimization) on the
%two rate constants. This is much more
%efficient than treating the whole problem
%as a four-parameter non-linear optimization.
pause
%We want to pass the arrays t and c, and the
%parameters A and B, to the function linfit,
%which does the linear part of the
%optimization. Thus we use the global
%command:
global tpass cpass capass cbpass
pause
%First we read in the data:
load problem7.dat
t = problem7(:,1);
c = problem7(:,2);
tpass = t;
cpass = c;
%Note that the data must be saved under the
%name problem7.dat in the appropriate
%directory. Let's plot it:
figure(1)
plot(t,c,'o')
hold on
pause
%The routine for finding the optimum is very
%simple:
k = [1,-1];
k = fminsearch('linfit',k);
ca = capass;
cb = cbpass;
%The best-fit values are thus:
[k(1),k(2),ca,cb]
pause
%We can plot the fit:
figure(1)
plot(t,ca*exp(k(1)*t)+cb*exp(k(2)*t))
hold off
%which agrees very well with the data.
pause
%It is interesting to see just how reliable
%this fit is. We can estimate the random
%error in the concentration data:
n = length(t);
cvar = linfit(k)/(n-4);
cstd = cvar^.5
%which is pretty small.
pause
%This does not really answer the question of
%how good the coefficients are, however,
%because it does not include the sensitivity
%of the coefficients to the error in the
%data. The simplest way of calculating the
%covariance matrix is to determine the
%sensitivity of the parameters to variations
%in the measured data points.
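%The function linfit.m is supplied as a
%separate file and is not listed here. Based
%on how it is called in this script, it must
%take the two rate constants, solve the
%linear least-squares problem for A and B,
%store them in the globals capass and cbpass,
%and return the sum of squared residuals. A
%hypothetical sketch (commented out; your
%actual file may differ):
%
% function s = linfit(k)
% global tpass cpass capass cbpass
% X = [exp(k(1)*tpass), exp(k(2)*tpass)]; %design matrix
% ab = X\cpass;                           %linear least squares for A and B
% capass = ab(1);
% cbpass = ab(2);
% s = sum((cpass - X*ab).^2);             %sum of squared residuals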
%Effectively, this means that we are
%calculating grad F, where F is the array of
%best-fit parameters and the gradient is
%taken with respect to the measured value of
%each data point individually. If the
%sensitivity to (or error in) each data point
%is small, then we can use our standard
%error-propagation formula for arbitrary
%non-linear functions!
pause
%We must first calculate the matrix
%corresponding to grad F. We do this
%numerically:
ep = 0.1*cvar^.5;
%We have chosen the perturbation ep to scale
%with the standard deviation of the data
%points. Note that if we make it too small,
%we won't calculate the gradient properly
%because of the finite tolerance of
%fminsearch! We use this to compute the
%derivatives:
gradf = zeros(4,n);
for i=1:n
  cpass = c;
  cpass(i) = c(i)+ep;
  knew = fminsearch('linfit',k);
  gradf(1:2,i) = (knew-k)'/ep;
  gradf(3,i) = (capass-ca)/ep;
  gradf(4,i) = (cbpass-cb)/ep;
  echo off
end
echo on
cpass = c; %restore the unperturbed data for later calls to linfit
pause
%We also need the variance in the measured
%concentrations. We shall assume that it is
%constant and that all of the measurements
%are independent. We already computed it
%above as cvar. Thus we have:
varf = gradf*gradf'*cvar
%or the standard deviations:
[[k,ca,cb]',diag(varf).^.5]
%If the number of data points were very
%large, the time to compute grad F would
%become prohibitive. In that case it might be
%faster to use an alternative technique such
%as the bootstrap or undersampling. In either
%case, it is much harder (computationally
%more expensive) to compute the error than it
%is to get the answer in the first place!
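%As an illustration of the bootstrap idea
%mentioned above (a sketch only, not part of
%the worked solution): resample the residuals
%with replacement, refit each synthetic data
%set, and take the spread of the refitted
%parameters as the error estimate.
%
% nboot = 200;
% cfit = ca*exp(k(1)*t) + cb*exp(k(2)*t);
% res = c - cfit;
% kboot = zeros(nboot,2);
% for ib = 1:nboot
%   cpass = cfit + res(ceil(n*rand(n,1))); %resampled data set
%   kboot(ib,:) = fminsearch('linfit',k);
% end
% cpass = c;  %restore the original data
% kstd = std(kboot) %bootstrap standard deviations of the two rates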
pause
%The 95% confidence intervals for the two
%growth rates are:
sig = diag(varf).^.5;
interval = [k'-2*sig(1:2),k'+2*sig(1:2)]
pause
%As a final note, we can examine
%(graphically) how variations in the two
%growth rates affect the fit to the data:
ka = [k(1)-1:.2:k(1)+1];
kb = [k(2)-1:.2:k(2)+1];
ni = length(ka);
nj = length(kb);
z = zeros(ni,nj);
for i=1:ni
  for j=1:nj
    z(i,j) = linfit([ka(i),kb(j)]);
    echo off
  end
end
echo on
figure(2)
contour(kb,ka,z,30)
xlabel('kb')
ylabel('ka')
pause
%Note that the sum of squares forms a curving
%trough rather than a sharp minimum. For some
%data sets (particularly ones in which both
%growth rates are positive and not too
%different) the minimum is nearly impossible
%to find. Such problems are close to
%degenerate.
echo off