clear;
clc;
echo on
format compact
% Tutorial 6

% In this tutorial we will review how to do linear regressions by
% looking at the Michaelis Menten model. This model is used to study the
% enzyme kinetics.

% The following reaction takes place:
% E+S <---> ES ---> E+P
% 
% Where E stands for enzyme, S means substrate, ES is a complex that is
% decomposed back into the enzyme itself and a desired product P.
% 
% Leonor Michaelis and Maud Menten proposed the following kinetic model to
% describe the process:
% 
% d[P]/dt = Vmax [S] / (Km + [S]) = r
% 
% Consider the following experimental measurements:

S=[2.5 5.0 10.0 15.0 20.0]'; %Concentration of substrate
r = [0.024 0.036 0.053 0.060 0.064]'; %Rate of reaction

% Let's plot both quantities

figure(1)
plot(S,r,'*')
xlabel('Substrate')
ylabel('Rate of reaction')

% We can see that the reaction rate is not linear with concentration. It
% seems like the rate is leveling off as we increase the substrate
% concentration.

% Let's find out what the parameters "Vmax" and "Km" are. We can first
% linearize the above equation as follows:

% 1/r = (Km + [S])/(Vmax [S])
% 1/r = 1/Vmax + Km/(Vmax[S])

% This is linear in the modeling parameters (now 1/Vmax and Km/Vmax). Note
% that the terminology "linear" refers only to the model's dependence on
% its parameters.  This is known as the Lineweaver-Burke Plot (or
% linearization).

Sinv = S.^-1;
rinv = r.^-1;


figure(2)
plot(Sinv,rinv,'*')
xlabel('1/S')
ylabel('1/r')

% This is a linear function and we can easily find the parameters!
matrix = [ones(length(rinv),1),Sinv];

x=matrix\rinv

vmax=1/x(1)
km=x(2)/x(1)

% We add the fitted curve to the two figures:

figure(1)
hold on
myconc=[2.5:0.1:20.0];
myrate= vmax*myconc./(km+myconc);
plot(myconc,myrate,'r')
legend('data','Lineweaver-Burke Fit','location','SouthEast')
title('Michaelis Menten Kinetics')
hold off

figure(2)
hold on
plot(Sinv,matrix*x,'r')
legend('data','Lineweaver-Burke Fit','location','SouthEast')
title('Lineweaver-Burke Plot')
hold off

% Let's recompute the parameters using the normal equations and extract 
% from these the error of km and vm.

myrate=vmax*S./(km+S);
myrate_inv=myrate.^-1;

A=matrix;  %We use the same matrix
kmatrix=(A'*A)^-1*A';
newx=kmatrix*rinv
% Which is the same, of course!

% Now we want to calculate the standard deviation of these parameters.  
% First, we need to estimate the error in the residual - the difference
% between the fitted values and the measurements:

resid=rinv-A*x;

% We need to plot up the residuals to see if they are random and normally
% distributed!

figure(3)
plot(1./S,resid,'o')
grid on
xlabel('1/S')
ylabel('residual')
title('Plot of Residuals for Lineweaver-Burke Linearization')

% Assuming that the residuals are all independent and normally distributed
% with the same standard deviation (this may -not- be correct due to the 
% linearization procedure!), we get:

varrinv=norm(resid)^2/(length(resid)-2);

% where the -2 comes from the loss of two degrees of freedom due to the
% two fitting parameters.

% Because x is just kmatrix*rinv, the matrix of covariance is just:

varx=kmatrix*diag(varrinv)*kmatrix'

% Note that we have assumed that the measurement errors are all independent,
% otherwise the matrix of covariance of rinv wouldn't be diagonal!
%
% The diagonal elements of the matrix are the variances in x(1) and x(2).
% Because we are interested in the uncertainty in vmax and km, though, the
% off diagonal elements (covariances) matter too!
%
% Using the error propagation formula, we have for the gradients (dependence
% of vmax and km on x(1) and x(2):

gradf=[-1/x(1)^2,0;-x(2)/x(1)^2,1/x(1)]

% The first row is the gradient in vmax, and the second is the gradient in
% km.  Each column is the derivative with respect to x(1) and x(2).
%
% So we have the error in f:

varf=gradf*varx*gradf'

% The diagonal elements are the variances of vmax and km:

sigvmax=varf(1,1)^.5
sigkm=varf(2,2)^.5

% Now we need to get a 95% confidence interval.  We only had 5 data points,
% and lost two degrees of freedom from the fitting, so we only have 3
% degrees of freedom left.  Thus we use the t-statistic:

vmaxinterval=vmax+sigvmax*tinv([0.025 0.975],3)
kminterval=km+sigkm*tinv([0.025 0.975],3)

% Therefore, we have strong grounds for believing that the Vm and Km will
% lie somewhere between the aforemetioned values.
echo off