For each of the following exercises create the specified files with your answer(s). Submit your homework using the course submit utility.
Complete reconstruction is rarely possible to do, as we explain. The convolution in the Fourier domain is
(1) S[i][j] = F[i][j] * G[i][j]where F and G are "FT'd" arrays of the image f(x,y) and filter g(x,y), respectively, and S is the convolved result. Deconvolution of the smoothed image is thus
(2) F[i][j] = S[i][j] / G[i][j].What if G[i][j] has a value of zero? Then the information contained in the Fourier component F[i][j] can not be recovered. When we take the inverse FT, any single Fourier component contributes to all spatial components. So the loss of one component can degrade the entire image.
With our Gaussian filter used above, we never formally have a G[i][j] value which is zero. But care is needed. The G-values can get very close to zero - even to the point that they are unrepresentable in double precision. Any noise in the smoothed image S at these wavenumbers is then severely amplified. Indeed, if you do such a straightforward deconvolution, you may find the result completely unrecognizable.
Suppose noise creeps into the convolved image prior to the deconvolution step:
(3) C[i][j] = S[i][j] + N[i][j],where N is the FT of the noise component. If you do not know what N is a priori, and you attempt to just deconvolve C (i.e., divide it by G), you are risking trouble. So the deconvolution is often hugely unstable in the presence of even a small amount of noise.
At this point you may ask, why is there noise in the smoothed image at all? We have not added any explicitly. But we have done some "nonlinear" processing, just by converting the smoothed signal to a PPM image, wherein double precision numbers are rounded to integers between 0 and 255. The maximum error in the signal is everywhere less than unity on this scale (approximately 1% on average), but quite enough to spoil a straightforward deconvolution.
A cure is to perform "optimal filtering" (see NR, Section 13), whereby we estimate the original, unconvolved signal from
(4) U[i][j] = C[i][j] * Phi[i][j] / G[i][j].The "optimal" filter Phi (its FT, in this case) would be designed so that
(5) Phi[i][j] = 1 - N[i][j]/C[i][j]This will suppress modes for which the noise dominates the signal (N->C). The trick is to figure out a good model for N -- not always possible, but sometimes we can make a reasonable guess. Consider the power spectrum of the smoothed image from the previous exercise:
This figure shows the average power (absolute square of the signal) as a function of the magnitude of the 2-D wave vectors. (In 2-D, each mode i,j corresponds to a wave vector k such that
(6) k = (k_i,k_j) = (dk*wrap(i,nx)/nx, dk*wrap(j,ny)/ny)
dk = 2*pi/(delta)
where delta is the spacing between pixels (which we take to be
1). The sca;ed wavenumber (the horizontal axis) is here defined as
|k|/dk, (i.e. we have removed the 2*pi/delta factor for
convenience, leaving a value that ranges up to a maximum of
1/sqrt(2)). The power spectrum above shows the average value of
|C|^2 for the smoothed image as a function of |k|/dk.)
The spectrum shows that there are two components of S, one at
small scales which falls off exponentially, and another which is
roughly flat. The constant part is very likely noise. How much noise
do we expect? A noise of 1 grayscale unit per spatial pixel
translates with our normalization of the FT to a noise of
1*nx*ny per Fourier component. That puts the noise level at
about 2x105 for our image, roughly consistent with the
flat part of the power spectrum. So we could model the noise as a
constant and build the optimal filter. However, if you try it even
with the small degree of smoothing we did, you may find that the
deconvolved image is a mess. Part of the problem is that the effect of
noise is severely amplified because the filter, or rather its inverse,
rapidly approaches infinity at a modest wave vector magnitude.
We can still make progress. In the spirit of optimal filtering, try a function Phi that is unity at small k-values and cuts off much more agressively at larger values. At a scaled wavenumber of around 0.16, the power spectrum shows that the noise begins to dominate the signal; at higher wavenumbers, it completely dominates, so we may as well set Phi to be identically zero, or at least very small, falling off as exp(-const*k^2), to prevent the inverse filter (1/G) from amplifying the residual noise. Thus try
/ exp{-0.5*[(kn-kcut)/ksig]^2} {if (kn>kcut)}
(7) Phi[i][j] = |
\ 1.0 {otherwise}
where kn is |k/dk|, kcut=0.16, and
ksig=0.01. This has Phi going rapidly to zero much
beyond a wavenumber of 0.18.
At last, here is the exercise. Write a C++ code, invarf.cc and its Makefile (both to be handed in), to deconvolve the smoothed image from the previous exercise, according to equation (4). Use the filter Phi in equation (7), but make your own best choice for the values of kcut and ksig.
Tips
1. When dividing by G(k), beware of cases in which
G(k) is zero, or essentially zero and devise an appropriate
method for handling this case. Suggestion: If |G(k)| < eps
for a predefined tiny constant, say eps = 10-16,
set G(k) = (eps,0).
2. You may get image artifacts if you use the ppm.h functions to
write your arrays to a PPM file. These can be removed by clipping the
gray-scale, setting the min and max values of the arrays to be 0 and
255. Of course it would be better to rescale the entire image to this
range, but for present purposes clipping is also acceptable.
Turn in invarf.cc, Makefile and your deconvolved/reconstructed image in JPEG format, as invarf.jpg.
sr = sum over r' gr,r' fr'.where gr,r' is a symmetric function only of the relative coordinates r - r'.
Thought of this way, deconvolution means inverting the linear transform matrix g. The practical problem here is that the matrix can be seriously ill-conditioned. So we turn to SVD to help stabilize the inversion. Because it is real and symmetric, the singular value decomposition of g is equivalent to its diagonalization. Furthermore, the diagonalization process is nothing but the process of doing the Fourier transform!
g = U G Uadj
or in terms of components
gr,r' = sum over k Ur,k G(k) U*r',k
Here G(k) is the complex Fourier transform and U(r,k) =
exp(-ir*k)/sqrt(nx*ny). While SVD for real vector spaces is
formulated in terms of orthogonal transformations, note that our
Fourier transform conventions involve complex numbers, so U is
unitary and the Fourier components G are complex. However, we
could have changed conventions and done a real-valued Fourier
transform, and the analysis would be essentially the same. It is more
convenient to stick with our complex convention.
Now the SVD inverse of g is just
ginv = U (1/G) Uadj
which requires taking the reciprocal 1/G of the singular
values. This is where we run into trouble, since the singular values
can be nearly zero. The SVD expedient is to set the reciprocal to
zero when the singular value is too small. What does this do to the
restoration process
finv = ginv s ?Recall that the zero singular values correspond to vectors in the null space of g and near zero singular values to vectors that are "nearly" null. The null space contributions are the most troublesome because they are so sensitive to noise in s. Setting the reciprocal of the singular values to zero "tames" these null space contributions. The output image vector finv is no longer guaranteed to be identical to f. But it will be the vector that minimzes the magnitude of the difference |f - finv| among all vectors in the nonnull domain of g. That seems to be a desirable convention for image reconstruction.
So for this exercise, revise your code to create svdarf.cc for handing in. Choose a minimum value of G below which 1/G is replaced by zero. Experiment with this choice. Hand in your best result svdarf.jpg.
Write the complete program to simulate a random walk in d dimensions (d may be as large as 5). The walks take place on a rectangular grid with unit grid step size. The parameters for the walk are
ranwalk.cc MakefileYour makefile should then have the rules for making all of the code for this assignment.
All calculations are to be done in DOUBLE precision.
Here is how the program should operate
Here is how to simulate the walk. The algorithm is very simple. The state of the walker at any given step is defined by a vector r(i) for i = 1,2,...,d dimensions. Since the walk takes place on a grid with unit spacing, the vector components can be taken to be integers. You start by setting all components of the vector to zero. That means the walker starts at the origin. At each stage of the walk, the walker can take a step forwards or backwards in any of the d directions, or the walker may stay at the same place. The probability of taking a step in the +1 direction is p. The probability of taking a step in the -1 direction is also p. The same is true for all other directions. There are 2*d possible directions to take a step. Thus the remaining probability of not taking any step is 1 - 2*d*p. So the main challenge here is to figure out how to use the random number generator to help decide whether to take a step and, if so, in which direction to do it. Then you simply repeat the process for the requested number of steps (or nonsteps) in the walk.
a. Simulate 1000 walks in 1 dimension, each walk taking 25 steps (or nonsteps) with probability p = 0.1. Generate a histogram for your 1000 distance values. The histogram should range from -0.5 to 20.5 with bin width 1. Estimate the standard deviation of your distribution (based on the half-Gaussian shape) (use the 68% rule) and compare it with the expected standard deviation based on the diffusion equation (see the handout "Random Walk Diffusion"). In a file ranwalk.txt report your histogram, your estimate and the expected value.
b. Repeat the simulation, but change the number of steps to 250. Notice how the distribution has broadened. Again report your estimated and expected standard deviation. Then make a detailed comparison of the histogram bin counts you obtain with the expected Gaussian distribution (see the handout "Random Walk Diffusion"), and discuss any discrepancies. In particular, discuss the reason for the low bin at zero. In the file ranwalk.txt, append a table comparing histogram bin counts with the expected counts from the diffusion equation displayed next to each value. Put your discussion of the detailed comparison at the end of the file.
Hint: Use Maple to generate your expected values. You can cause Maple to generate a table of Gaussian values with the command
> for j from 0 to 20 do j, evalf(exp(-j^2/10)) od;The formula is not exactly what you want, but it is close.