clear
reset(0)
echo on
%In this example we demonstrate the
%techniques of undersampling and the
%bootstrap to calculate the error in
%fitted regression parameters. We use
%a linear regression model so that we can
%compare the error with the exact formula
%derived in class; however, these
%techniques work equally well on non-linear
%regression problems.
pause
%We use our ball-in-the-air example
%again. We set up an array of 101 times:
t=[0:.025:2.5]';
%and the corresponding "data set":
b=(-0.5*t.^2+1.25*t)*9.8+randn(size(t));
%and we solve the regression problem:
a=[ones(size(t)),t,t.^2];
k=inv(a'*a)*a';
x=k*b
pause
%We can estimate the regression error using
%our standard formulae:
[n,m]=size(a);
varb=norm(a*x-b)^2/(n-m);
%and thus:
varx=varb*k*k'
pause
%Now we try the undersampling approach.
%Let's undersample the data set 10 times,
%so we have to solve the regression problem
%(a smaller one anyway) p=10 times. Thus:
p=10;
for i=1:p
  bus=b(i:p:n);
  aus=a(i:p:n,:);
  xus(:,i)=aus\bus;
  echo off
end
echo on
pause
%We now have a matrix xus of fitted
%parameter values. Each column of xus holds the
%parameter values corresponding to one of the
%subsamples of the data set. We may calculate the
%averages:
avexus=mean(xus,2)
%which is close to the value we had before:
ratio=avexus./x
%particularly for the value of the acceleration.
%The parameter x(1) is a bit off because it has
%an expectation value of zero for this particular
%problem, so its ratio is noisy.
pause
%Now we are ready to compute the error in the
%fitted parameters using our matrix xus. Each
%subsample uses only about n/p points, so the
%sample covariance of the p estimates is roughly
%p times varx; dividing it by p leaves a single
%factor of 1/(p-1):
for i=1:m
  for j=1:m
    varxus(i,j)=(xus(i,:)*xus(j,:)'/p-avexus(i)*avexus(j))/(p-1);
    echo off
  end
end
echo on
varxus
pause
%Finally, we compare this to varx:
ratiovar=varxus./varx
%which matches to within about a factor
%of two. The number of data points would have to be
%much larger for a more accurate estimate.
pause
%Now let's look at the technique of bootstrapping.
%Here we conceptually replicate our data
%periodically an infinite number of times and then
%draw n data points from this infinite sample. We
%calculate the regression parameters from this new
%set, and then repeat the operation a total of
%pb times. We calculate the covariance matrix of
%the parameters by comparing these values.
pause
%Rather than actually generating an infinite array
%of times and positions, we simply generate a vector
%of random indices pointing into the times and positions.
%Thus:
pb=100;
for i=1:pb
  index=ceil(rand(n,1)*n);
  bboot=b(index);
  aboot=a(index,:);
  xboot(:,i)=aboot\bboot;
  echo off
end
echo on
%Note that this took longer: in total we are
%solving for pb times as many points as in the
%undersampling approach!
pause
%Now we evaluate these pb sets of fitting parameters
%as before. First we compute the averages:
avexboot=mean(xboot,2)
%which is close to the value we had before:
ratio=avexboot./x
pause
%Now we are ready to compute the error in the
%fitted parameters using our matrix xboot. Each
%bootstrap sample uses all n points, so the sample
%covariance of the pb estimates approximates varx
%directly, with no extra factor of 1/pb:
for i=1:m
  for j=1:m
    varxboot(i,j)=(xboot(i,:)*xboot(j,:)'/pb- ...
      avexboot(i)*avexboot(j))*pb/(pb-1);
    echo off
  end
end
echo on
varxboot
pause
%Finally, we compare this to varx:
ratiovar=varxboot./varx
%which again matches to within about a factor
%of two. The number of samples would have to be
%much larger for a more accurate estimate. You
%will also get a different answer every time you
%run this!
echo off
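echo on
%As an optional extra check, we can tabulate the
%standard errors of the fitted parameters from all
%three methods (the exact formula, undersampling,
%and the bootstrap) side by side; the columns
%should agree to within roughly a factor of two:
sigmas=[sqrt(diag(varx)),sqrt(diag(varxus)),sqrt(diag(varxboot))]
echo off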