clc; clear; format compact echo on; %Tutorial 5 %In this tutorial we'll study the error propagation in estimating %the equilibrium constant for a gas phase reaction. Suppose that an %undergraduate researcher is studying the following reaction: % 2NO(g) + 2H2(g) <--> N2(g)+2H2O(g) %The student measured the concentrations once the system reached %equilibrium. The average measurements are the following: H2O=1.38 N2=0.19 NO=0.62 H2=0.12 %The equilibrium constant can be calculated as: %K = [N2]*[H2O]^2 / [NO]^2*[H2]^2 %We write a little function 'get_constant.m' to use these values to %calculate the equilibrium constant. It will be convenient to write this %function to take in all these values as an array 'concentrations': concentrations =[H2O,N2,NO,H2]; %Lets calculate the equilibrium constant the_constant =get_constant(concentrations) %We want to calculate the error in the equilibrium constant. Our %researcher estimated the uncertainty in each of these concentration %measurements, either by doing multiple measurements of each, or by other %means. We put these standard deviations in the vector sigma. % %sigma contains the magnitude of the std deviation for the water, Nitrogen, %nitrous oxide and Hydrogen concentrations, respectively. sigma = [0.005,0.003,0.004,0.006]; %For the error calculation we will do a numerical derivative to get the % gradient of the equilibrium constant with respect to each of the %concentrations. We'll use a forward difference scheme. %Forward difference scheme: % dy/dx ~= (y2-y1)/(x2-x1); %Where y2 is given by y2=f(x1+dx), dx is a small number. Because the %magnitude of all the x(i)'s are different, we choose a dxi which scales %with the sigma of each variable. Thus: ep=10^-4; %Small value gradf=zeros(size(concentrations)); %We initialize the array. m=length(concentrations); for i=1:m xt=concentrations; %Note that we have to put this -inside- the loop! dxi=ep*sigma(i); xt(i)=xt(i)+dxi; %We increase the ith measurement by a factor dx new_constant=get_constant(xt); gradf(i)=(new_constant-the_constant)/dxi; echo off end echo on %The calculated error is just that obtained from the error propagation %formula. The measurements are considered independent, so: varconst=gradf*diag(sigma.^2)*gradf'; %and we get the 1 sigma error bars: error=varconst^.5 %The error is pretty large, so we want to check to see if it is still %normally distributed (e.g., that our error propagation formula - that %ignores higher order terms - is still valid). We can do this using a %simple simulation. %We initialize the array of simulated concentration measurements: n=4000; concentration_matrix=zeros(n,length(concentrations)); for i=1:m %Loop over each measurement concentration_matrix(:,i)=concentrations(i)+randn(n,1)*sigma(i); echo off end echo on %We have written the function so that it can take in all the concentrations %for all the simulated experiments at once! Thus we get: eqmconstants=sort(get_constant(concentration_matrix)); %Where we sort the calculated values from smallest to largest. We can use %this to plot up the cumulative probability and compare it to a gaussian %distribution: cumprob=[.5:n-.5]/n; %We have the calculated normal distribution from the error propagation %formula too: cumprobest=(0.5 + erf((eqmconstants-the_constant)/error/sqrt(2))/2); figure(1) plot(eqmconstants,cumprob,'b',eqmconstants,cumprobest,'r') xlabel('equilibrium constant') ylabel('cumulative probability') legend('simulation','error calculation','Location','SouthEast') grid on %Comparison of the two results shows that the error formula is a bit off. %We can look at the 95% (2 sigma) confidence intervals. That calculated %from the error formula is: conf_interval=[the_constant-2*error,the_constant+2*error] %While that from the simulation is: sim_interval=[eqmconstants(round(n/40)),eqmconstants(round(n*39/40))] %Thus, the interval is shifted slightly to higher equilibrium constants. %This deviation would vanish for smaller values of measurement error %(divide sigma by 10 and re-run the code to see this). In general, the %error calculation is even more uncertain than the mean (error is a random %variable too!) so this deviation may not be significant for these %parameters. % %As a final note, we have assumed that the measurement errors are all %independent. This is probably incorrect, as all would likely be measured %by the same GC. Thus, they may well be positively covarying, and the %matrix of covariance of the concentration measurements won't be diagonal. %This would change the calculated result - either increasing or decreasing %the uncertainty. Modeling the matrix of covariance of the measurements is %often quite tricky...