clear
clc
echo on
format compact
% In this example we will reexamine the Michaelis-Menten kinetics model
% demonstrated in Tutorial 5.  The kinetics can be used to describe enzyme
% activity or for cell growth - basically any case where you have a
% reaction which is first order in a substrate at low concentrations, but
% which reaches a limiting reaction rate at high substrate concentrations.
% The reaction rate expression is:
%
% r = Vmax [S] / (Km + [S]) 
% 
% Essentially, the reaction rate (or growth rate) saturates for
% concentrations much greater than Km to a maximum value of Vmax.  The goal
% is to figure out what Km and Vmax are!
pause
% We have a set of measured reaction rates r as a function of substrate
% concentrations S.  We want to solve for the rate via linear regression.
% As was shown in the tutorial, because the model is non-linear in Vmax and
% Km, we have to transform it to use linear regression.  We do this (in
% this case only!) by taking the inverse:
%
% 1/r = 1/Vmax + (Km/Vmax)*(1/[S])
%
% The new modeling parameters are 1/Vmax and Km/Vmax, and the modeling
% functions are just 1 and 1/S.  The "right hand side" measurements are
% just 1/r.  Usually we can measure S pretty accurately.  Unfortunately,
% calculation of the rate involves taking the time derivative of the
% measured product, and this produces a significant scatter.  When you then
% take the inverse, the error in 1/r becomes quite large when r is small.
% In this example we will see how to deal with this.
pause
% Let's generate some data:
Vmaxexact=2;
Kmexact=5;
S=[.5 1 2 4 10 20]';
r=Vmaxexact*S./(Kmexact+S);

% and we add in a little noise:
rn=r+.05*randn(size(r));

% We also have the smooth exact curve:
Sm=[min(S):.01:max(S)]';
rexact=Vmaxexact*Sm./(Kmexact+Sm);

% Now we can plot this up:
figure(1)
plot(S,rn,'o',Sm,rexact)
set(gca,'FontSize',18)
xlabel('[S]')
ylabel('Reaction Rate')

% So the exact relation (what was used to generate the data) matches up
% pretty well with the data - no surprise.
pause
% OK, now we want to do linear regression.  We have the matrix a and the
% measurement vector b:
a=[ones(size(S)),1.0./S]
b=1.0./rn

% and the fitted values:
x=a\b

%and we can plot this up:
figure(2)
am=[ones(size(Sm)),1.0./Sm];
plot(1.0./S,b,'o',1.0./Sm,am*x)
set(gca,'FontSize',18) 
xlabel('1/[S]')
ylabel('1/r')
pause
% The line appears to go through the data, but the constants are way off!
% It will be different every time you run it, but we can compare the
% calculated values to the correct Km and Vmax we generated the data from:
Vmaxfit=1./x(1)
Kmfit=x(2)/x(1)
% which are pretty far from the correct values of 2 and 5, respectively!
% Sometimes they aren't even the correct sign!
%
% The values are so far off because we are weighting all of the data points
% equally (reasonable if we were working with r), but this is incorrect if
% we are working with 1/r!  We can fix this using weighted linear
% regression.
pause
% Suppose we assume that each of the measured rates have the same variance.
% From error propagation, this means that the inverse of the rates have a 
% variance which is proportional to 1/r^4.  Thus, we should weight each 
% data point by r^4.  This yields the weighted problem:
kw=inv(a'*diag(rn.^4)*a)*a'*diag(rn.^4);
% and the solution:
xw=kw*b
Vmaxwfit=1./xw(1)
Kmwfit=xw(2)/xw(1)
% which isn't too bad!
pause
% We can also add these curves to the graphs:
figure(1)
hold on
plot(Sm,1.0./(am*x),'--r',Sm,1.0./(am*xw),'-.k')
hold off
legend('data','exact','unweighted fit','weighted fit','Location','SouthEast')
axis([0 20 0 2])

figure(2)
hold on
plot(1.0./Sm,am*xw,'-.k')
hold off
legend('data','unweighted fit','weighted fit','Location','SouthEast')
pause
% We could also determine the error in the fitted values.  Consistent with
% our assumption that the error in the measured rate is constant, we get:
resid=rn-1.0./(a*xw);
% which we can also plot up:
figure(3)
plot(S,resid,'o')
set(gca,'FontSize',18)
xlabel('[S]')
ylabel('residual in r')
grid on
% which looks nicely scattered.  We can get the variance in r:
rvar=norm(resid)^2/(length(resid)-2);
rstd=rvar.^.5
% which is pretty close to what we used to generate the data.
pause
% The variance in 1/r is obtained from the error propagation formula:
varb=rvar./rn.^4;
% and thus the error in xw is:
varxw=kw*diag(varb)*kw'
% which can be used to get the uncertainty in the desired values of Vmax
% and Km:
gradf=[-1/xw(1)^2,0;-xw(2)/xw(1)^2,1/xw(1)];
varVmaxKm=gradf*varxw*gradf';
Vmaxstd=varVmaxKm(1,1)^.5
Kmstd=varVmaxKm(2,2)^.5
% so we have the 70% confidence intervals:
Vmaxinterval=Vmaxwfit+tinv([.15 .85],(length(rn)-2))*Vmaxstd
Kminterval=Kmwfit+tinv([.15 .85],(length(rn)-2))*Kmstd
% Which, at least 70% of the time, contains the correct values of 2 and 5!
echo off