\documentclass[doublespace,letterpaper,fullpage,12pt]{book}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{amscd}
\usepackage{latexsym}
\usepackage{graphicx}
\usepackage{doublespace}
\newtheorem{proposition}{Proposition}[section]
% Set dimensions of columns, gap between columns, and paragraph indent 
\input{/usr/figs/extracommands.tex}
\input{/usr/figs/tinycaptionfont.sty}
\input{/usr/figs/psfig.sty}
\input{/usr/figs/figcommands.tex}
\renewcommand{\baselinestretch}{1.2}

\topmargin      =20.mm          % beyond 25.mm
\oddsidemargin  =0.mm           % beyond 25.mm
\evensidemargin =0.mm           % beyond 25.mm
%\headheight     =0.mm
%%\headsep        =0.mm
\textheight     =210.mm
\textwidth      =165.mm

\usepackage{times}
\renewcommand{\ttdefault}{cmtt}

\begin{document}
\begin{titlepage}
  \begin{center}
  \Large

  Comparametric Lightspace Imaging and Analysis

  \normalsize
  \end{center}
  \begin{center}
  \vspace{1.25in}
  by\\
  \vspace{1.25in}
  Corey Manders\\
  \vspace{1.5in}
  A thesis submitted in conformity with the requirements\\
  for the degree of Master of Applied Science\\
  Department of Electrical and Computer Engineering\\
  University of Toronto\\
  \vspace{1in}
  Copyleft by Corey Manders, 2002
  \end{center}
\end{titlepage}

\doublespace



%\maketitle

% Title Page
     \author{
\textbf{Corey Manders}\\
\textit{corey@eyetap.org}\\
ECE Department, University of Toronto,\\ Toronto, Ontario, Canada\\
}

     \title{Comparametric Lightspace Imaging and Analysis}
%     \pagestyle{empty}
\date{}
%\newpage
\thispagestyle{empty} %gets rid of page number 2 on back of title page
\cleardoublepage %leaves a blank page for back of title page
\pagestyle{plain} %only centered page number at bottom 
\pagenumbering{roman}  %swithc to roman numbers for preamble stuff
%\addtocounter{page}{2} %abstract should be page ii
  \begin{center}
 
  \end{center}
  \begin{center}
  Corey Manders\\
  corey@eyetap.org\\
  Master of Applied Science\\
  2001\\
  Department of Electrical and Computer Engineering\\
  University of Toronto\\
  \vspace{0.5in}
  \large{Abstract\\}
  \end{center}
   Traditional processing on digitally collected images is done on
   pixel values with little or no concern as to what the pixels
   represent. This thesis first investigates what a pixel value
   really is. These values are converted to photoquantities which
   in essence linearize pixels, bringing them to mathematically
   meaningful values. Images represented in photoquantities may be
   composed, producing high-dynamic range images or manipulated
   such that when brought back into pixels (for viewing purposes)
   are superior. The function which brings pixel values into
   photoquantities is the camera response function. The thesis offers
   several methods for finding this function using analytical and
   numerical means. Given that photoquantities give mathematical
   meaning to images, existing methods of motion analysis are
   examined using photoquantities rather than pixels. The results
   of using photoquantities instead of pixel values are shown to 
   yield superior results.
%% 136 words
  

\newpage
\vspace{4in}
\begin{center}
\Large{Acknowledgements\\}
\end{center}
Steve Mann, who's previous work in photoquantimetric work made 
writing this thesis interesting. Also for his direction in writing
papers for the scientific community which made writing this thesis
relatively easy. Professor Mann's supervision and ability to focus
work and find interesting paths is unparalleled.

James Fung, Felix Tang, Mir Adnan Ali, Samir, Chris Aimone
\tableofcontents
\listoffigures
\listoftables

\newpage
\chapter{Introduction}
\pagestyle{headings}
\pagenumbering{arabic}

Image processing in current study commonly uses data captured from
digital cameras. These cameras may be video cameras or simply cameras
which take still digital images. In the current literature, very 
little is mentioned as to what pixel values really are. Such calculations
as motion estimation or image corrections and manipulations do not
take into account what the data is. It is certainly dubious to expect
that any meaningful calculations can be performed without first 
considering what the data represents.

For example, if one considers two simple greyscale images, we may have
one image with all of it's pixels with a value of 2. 
Similarly, in the second image
all of the pixels may have the value 4. One may ask, is the second image
twice as bright as the first? Or, how much light does it take to produce
these two images. These questions themselves start to beg other questions
 such as: what does twice as bright mean? These questions are rather difficult
to answer, and are largely qualitative judgments. 

Rather than answering these questions, this thesis will consider a more
scientific approach. If a camera is considered purely as an instrument
which measures light, then how are the pixel values which are recorded
relate to the amount of light which fell on the camera's sensor array?

This thesis shall first consider what it is that the camera is measuring.
Then, given there is an accurate understanding of what is being measured, 
the thesis will show several methods of determining how to relate pixel
values back to measured quantities collected by the camera's sensor array.
Finally, commonly used computations on images and manipulations in general
are re-evaluated using new techniques. 

Over the last few years, the author along with Professor Mann has published
much of the work in various journals and conferences. Many programs were
needed to carry out the work in this thesis since current programs 
do computations on pixel values. Previously, there were no programs
to aid in the discovery of a camera's response function. These programs
were developed along with 
much software for dealing with images using these new techniques. All
of these programs are freely available under the GNU public license and
may be found at http:$//$comparametric.sourceforge.net. 
Each result of this 
thesis may be rederived using the programs found there. Each program directory
also contains the data which was used in this thesis.


\chapter{What a Camera Measures and a reasonable unit to quantify this measurement}
\section{What a Camera Measures}
The question ``what does a camera measure?'' seems to at first have
a simple answer. Certainly a camera measures light, but the light measured
by a camera lies across a broad spectrum. The selenium cells which 
are used to measure the amount of light do not in fact react equally
across this spectrum. However, it must react predictably, or two images
of the same subject matter would look very different even though no
settings have changed on the camera. 


It is known that a camera actually acts as an array of sensors 
that each measure light.
In an electronic camera these sensors may simply be the pixels.  
Before the production of digital cameras, cameras simply used film
or glass plates, having some kind of light sensitive
chemical such as silver salts deposited thereupon. In this case, the
sensing surface was continuous rather than being on a discrete
lattice of pixels. For the purpose of this thesis, only the digital
discrete case will be addressed. However, it would be possible to
get similar results in the digital case
by simply considering a small patches of the sensing surface, and analysing
each patch. In the end, the collection of patches would in essence 
discretize the light sensitive surface.  

To demonstrate how a camera works, it is possible to consider
a simple one pixel camera. Once an
understanding of  what one pixel measures, it will be easier to 
understand what is measured by many pixels working together. 

\subsection{Constructing a One Pixel Camera as a simple experiment}
Perhaps the simplest way to construct a one pixel camera is to use
a light sensor.  The cheapest and most common kinds of light sensors
are usually photoresistors, such as the cadmium sulphide photocells
found in the small controllers that automatically turn streetlights
on after dark.

Cameras measure electromagnetic energy, but
unlike an ideal radio receiving antenna that one might connect to a
spectrum analyzer, a camera measures electromagnetic radiation only
within a certain region of the electromagnetic spectrum.
Typically cameras are sensitive to that which we call light,
e.g. to electromagnetic energy falling close to the visible,
the infrared, or the ultraviolet portion of the spectrum.  Moreover, the
camera's sensitivity to light is not very flat across the
spectrum over which it is sensitive.  Thus a camera is certainly not
a radiometer, and the measurements that it makes are certainly not
radiometric.

Moreover, cameras have spectral sensitivities that are quite different
from that of the human eye, and are therefore certainly not
photometers.

\subsubsection{Cameras as Quantimetric Sensing Instruments}

Different kinds of cameras have different spectral sensitivity profiles,
that are neither flat (radiometric) nor match the human eye (photometric),
and therefore cameras are referred to as quantimetric devices~\cite{procieee}.
A quantimetric measurement refers to a measurement that can be quantified
in some measurable units, where the camera itself can be used as the
measurement instrument.  The quantimetric unit, typically
denoted by the letter $q$, is usually made relative
to some reference value, $q_0$, so that it can be expressed as a ratio, or
as a logarithmic ratio (often in decibels).


The quantity, $q$, has been referred to in image processing literature
as a {\em photoquantigraphic} quantity~\cite{procieee}, or just the
photoquantity (or photoq) for short. This quantity is neither radiometric
(e.g. neither {\em radiance} nor {\em irradiance}) nor
photometric (e.g. neither {\em luminance} nor {\em illuminance}).
Most notably, since the camera will not necessarily have the same spectral
response as the human eye, or, in particular, that of the
photopic spectral luminous efficiency function
as determined by the CIE and standardized in 1924, $q$
is neither brightness, lightness, luminance, nor illuminance.
Instead, quantigraphic imaging measures the quantity of light
integrated over the spectral response of the particular camera system,
\begin{equation}
  q=\int_0^\infty q_s(\lambda) s(\lambda) \, d\lambda,
  \label{eq:spectralq}
\end{equation}
where $q_s(\lambda)$ is the actual light falling on the image sensor and
$s$ is the spectral sensitivity of an element of the sensor array.
It is assumed that the spectral sensitivity does not vary across the
sensor array.

It is this measurement $q$ which shall be dealt with in the following
chapters of the thesis. When dealing with the output of a given
camera, this quantity is easier to deal with than any other derived
measurement.

\subsubsection{Using a photoresistor as a single pixel}
Photoresistors are devices in which resistance is a function
of incident light.
Typically an increase in light results in a decrease in resistance.
Most are known to obey an empirical law: 
\begin{equation}
R = R_0 q^{-\gamma},
\label{eq:photoresistor}
\end{equation}
where $\gamma$ is usually a positive constant less than one.
The fact that $\gamma$ is less than one indicates that the
photocell tends to compress dynamic range.  Cameras also compress
dynamic range in a similar way.


If the assumption is made that a camera is an instrument for converting light
into numbers, a one pixel camera may be constructed from the photocell
by simply connecting it to an ohm meter.

It is more convenient to consider conductance, which is the
reciprocal of resistance, and to define the camera's response
function, $f$, as conductance of the photocell. Equation 
(\ref{eq:photoresistor}) now becomes:
\begin{equation}
  f(q) = \beta q^\gamma,
\end{equation}
where $\beta=1/R_0$.
The quantimetric function, $f$,
increases with increasing quantity of light, $q$.
Thus the ohm meter will show  $f=0=q$,
and will show larger values as $f$ and $q$ increase.

What is required is to determine
$f$ as a function of $q$ (e.g. suppose that we did not already know
the relationship between $f$ and $q$).
For the purpose of the experiment, all that is necessary is
a lamp and the camera (photocell plus ohm meter),
together with a piece of black cardboard that can be used to cover half of the
lamp.
An assumption is made that the  lamp is round or has some other shape 
that makes it
easy to cover up exactly half of it.

It is therefore possible to cover half of the lamp, and then
point the lamp at the camera (sensor), and take a reading $f(q_0)$,
(See Fig~\ref{fig:photocell_experiment}(a)), and then
uncover the lamp while leaving it exactly in the same place.
Uncovering the lamp exactly doubles the quantity of light received
by the camera (sensor), so that we then know what $f(2q_0)$ is.
(See Fig~\ref{fig:photocell_experiment}(b).)
\begin{figure}
 \figc{photocell_experiment/photocell_experiment.idraw,width=3in}
 \caption{Photocell experiment: for each of  various distances between
          the lamp and the camera, a reading is taken with the lamp
          exactly half covered, versus with the lamp not covered.
          Other than the camera, lamp, and black card, no other devices
          (such as rulers or other measurement instruments) are needed
          in order to characterize the response of the photocell to light.}
 \label{fig:photocell_experiment}
\end{figure}
Although the absolute quantity $q_0$ is not known, the
relative quantity $2q_0/q_0 = 2$ is known.
The fact that the quantity of light was exactly doubled, 
implies that it is now possible to record sets of ordered
pairs $(f(q_0),f(2q_0))$.  The lamp may be placed at various different
distances from the camera (sensor) and for each such lamp position,
another ordered pair will be produced.


Suppose
that ten such ordered pairs of measurements are taken, e.g. continuing with
$(f(q_1),f(2q_1))$ and so on, all the way up to $(f(q_9),f(2q_9))$.
These ordered pairs may now be plotted on a graph, as shown in
Fig~\ref{fig:photocell_experiment_comparametric_plot}.
\begin{figure}
 \figc{photocell_experiment/photocell_experiment_comparametric_plot.idraw,width=3in}
 \caption{Datapoints from photocell experiment: ordered pairs of points
          taken by reading the camera with the lamp uncovered
          and exactly half covered.}
 \label{fig:photocell_experiment_comparametric_plot}
\end{figure}
This is just like an $(x,f(x))$ plot, except that the axes are actually
functions, $f(q)$, and $f(2q)$ instead of just scalar quantities.
The first axis is $f$ so
it is convenient to use the next letter of the alphabet, $g$, after $f$
in order to denote the other axis.  Thus we have an $(f,g)$ plot ---
a plot of a function against a plot of a function of a function, where
$g=f(2q)$.

It is now possible to use a linear least squares algorithm to
fit a line (or curve) through the points.
The solid line shows one possible curve, namely
a straight line of slope approximately $1.68$.
The dotted line shows another possible curve.  Repeated measurements,
however, lead us to believe that the relationship is simply $g=1.68f$,
as shown by the solid line. However, to prove this using this technique
would imply the use of an uncountably infinite number of ordered pairs.

This is simply a plot of the photocell's response function $f(q)$ against
a contracted (squashed in) version of the same function $f(2q)$.
Such a plot is called a {\em comparametric plot}.

The notion of a parametric plot is certainly a familiar concept.
A parametric plot of a circle, for example, is simply a set of
ordered pairs $(rcos(\theta),rsin(\theta))$.
A comparametric plot is just a special kind of parametric plot, where
both axes pertain to the same function at different rates.

\subsubsection{Solving comparametric equations}
From the comparametric plot shown in
Fig~\ref{fig:photocell_experiment_comparametric_plot}
it has been determined that $g=f(2q)=1.68f(q)$.  This equation,
$g=1.68f$ is called a comparametric equation.
In general, solving a comparametric equation $g(f(q))=f(kq)$,
for some comparametric ratio $k$ (in this case $k=2$)
means analytically determining a family of possible functions $f(q)$ that
satisfy this equation.

In the  case above, therefore, what needs to be determined is what possible
functions $f(q)$ give a straight line (of slope $1.68$) when plotted
against themselves contracted (by a factor of 2).

Fig~\ref{fig:photocell_experiment_unrolled} depicts two plots,
one being a smooth function $f(q)$ who's comparametric plot is a
straight line of slope $1.68$, and the other being a quasi-periodic function
who's comparametric plot is also a straight line of slope $1.68$.
\begin{figure}
 \figc{photocell_experiment/photocell_experiment_unrolled.idraw,width=3in}
 \caption{Two examples of possible solutions to the comparametric function
          $g=1.68f$.  One solution, plotted as the thick line,
          is given by $f(q)=q^0.75$.
          Another solution plotted as the thin line, consists of
          a triangular wave, square-shaped wave, and round shaped wave
          between $q$ and $2q$ so that we can see how the waveform
          has comparametric periodicity.  In each ``period'', the function
          is stretched by
          a factor of two as we go to the right (or contracted by a factor of
          two as we go to the left), and
          composed by a factor of $g()$ as we go to the right,
          or $g^{-1}()$ as we go to the left.  This phenomenon is called
          comparametric periodicity. %% (or comparametric ambiguity).
          Thus there is a comparametric uncertainty in the solution to
          a comparametric equation.
          %%of Fig.~\protect\ref{fig:photocell_experiment_comparametric_plot}.
         }
 \label{fig:photocell_experiment_unrolled}
\end{figure}
Both of these functions have the same comparametric plot.
Both are solutions to the comparametric equation $g=1.68f$.

The quasi-periodic function illustrates that any function
can be specified on, for example, the interval from $q$ to $2q$,
and then merely replicated into the interval to the
right scaling by $(2*,g\circ)$, e.g.
by stretching to twice the width and composing to
$g()$ of the height,
in this case, merely multiplying by $1.68$ times the height, since
in this specific case, $g$ is linear.
This recipe can be applied recursively in both the left and right
directions.  Therefore, in general, solutions to comparametric equations
are not unique.  However, it may be possible to choose a 
function that is semimonotonic,
with semimonotonic slope, semimonotonic curvature,
(and so-on, including possibly further derivatives),
of which $f(q)=\beta q^\gamma$ is the preferred general solution
to $g=2^\gamma f$.

\subsection{Directly solving a comparametric equation by unrolling while
         collecting the data}
In the previous section, the response function of the photocell was known,
e.g. knowing the solution of the comparametric equation, and then
confirming that this known function was in fact a solution.
Now suppose the solution is not known, and did not know
how to solve the comparametric equation, in general.
In this case, we can  a simple way of simultaneously collecting 
and constructing
comparametric data, and arriving at a numerical solution to the
comparametric equation, namely, to obtain samples of the function $f(q)$.

Refer back to Fig~\ref{fig:photocell_experiment}(a) where half the
lamp is blocked with the black cardboard, to obtain $f(q_0)$.
Now suppose we unblock the lamp, as shown in
Fig~\ref{fig:photocell_experiment}(b) to obtain
$f(2_{q_0})$.  The next step is to now cover exactly half the lamp again
and then move the lamp toward the photocell until the observed meter reading
is exactly the same as it was when the lamp was not covered.
Thus the situation as shown in Fig~\ref{fig:photocell_experiment}(c)
will be that $q_1=2q_0$.

The procedure is repeated.  Uncover the lamp to obtain
$f(2q_1)=f(4q_0)$.  Then cover it again, and move it still closer
to the sensor, until the reading is the same as it was in
Fig~\ref{fig:photocell_experiment}(d).
Then uncover the lamp again, to obtain
$f(4q_1)=f(8q_0)$.

Plotting these data points will provide the eight points denoted as
filled in black circles in Fig~\ref{fig:photocell_experiment_unrolled}.
Of course there still exists the comparametric uncertainty of what should
be inserted between the points, but if a power law is suspected, we
can assume the smooth monotonic function of the form
$f=\beta q^\gamma$ and determine the value of $\gamma$ from the
data.

\subsection{Doing the experiment in bulk}
With an actual camera, there are multiple pixels, not just one.
So rather than exactly doubling the exposure by using a black card
to cover half the lamp, suppose that we take two pictures of the same
subject matter, the two pictures differing only
in exposure.  Suppose that one picture is exactly twice the exposure of the
other.

Pictures usually consist of a two dimensional lattice of pixels,
upon which falls a two dimensional distribution of light $q(x,y)$ for
the first picture, and $2q(x,y)$ for the second picture (because the
exposure is twice as much for the second picture.
Thus the two pictures may be written as:
\begin{eqnarray}
  v_f &=& f(q(x,y))\\
  v_g &=& f(2q(x,y)) = g(q(x,y)).
\end{eqnarray}

A typical size for a picture is an array that is
480 pixels high and 640 pixels wide (same aspect ratio
as standard NTSC television aspect ratio, namely 4:3).
Let us consider a simpler example, namely two pictures that are each
3 pixels high and 4 pixels wide:
\begin{equation}
 v_f = \left[ \begin{array}{cccc}
                           1 & 3 & 2 & 3\\
                           3 & 2 & 1 & 2\\
                           0 & 0 & 2 & 0
                    \end{array}
            \right];
 v_g = \left[ \begin{array}{cccc}
                           2 & 3 & 3 & 3\\
                           2 & 2 & 2 & 2\\
                           0 & 1 & 3 & 0
                    \end{array}
            \right].
\end{equation}

In this example, assume that we have a 2 bit camera, such that it
has grey values that go from 0 to 3.

We now introduce the so-called comparagram.
The comparagram is a two dimensional array of size $M$ by $N$
where $M$ is the number of grey values in the first image, and
$N$ is the number of grey values in the second image,
where entry $J[m,n]$ is a count of how many times a pixel
in image 1 has greyvalue $m$ and the corresponding pixel in
image 2 has greyvalue $n$.
In this case
both images have 4 grey values, so the comparagram is a 4 by 4 matrix:
\begin{equation}
	\nonumber
	\begin{bmatrix}
		2& 1& 0& 0 \\
		0& 0& 2& 0 \\
		0& 0& 2& 2 \\
		0& 0& 1& 2 
	\end{bmatrix}.
\end{equation}

Summing across rows gives the histogram of the first image:
\begin{equation}
	\nonumber
	h_f = \begin{bmatrix}3 &2& 4& 3\end{bmatrix}.
\end{equation}

Summing down columns gives the histogram of the second image,
\begin{equation}
	\nonumber
	h_g = \begin{bmatrix}2& 1& 5& 4\end{bmatrix}.
\end{equation}

\section{Typical cameras}
Some video cameras function much like the photocell, but with a value of
$\gamma = 0.54$ instead of the $\gamma$ values of $0.6$ to $0.9$ that
are typical of photocells. However, most cameras follow a more 
complicated law than the simple
power law.  In particular, various laws such as a simple 2 parameter law
$(\frac{q^a}{q^a+1})^c$ have been proposed~\cite{intelligentimageprocessing}
and used in various research applications such as wearable imaging
systems~\cite{intelligentimageprocessing}, computer vision, and
robotics~\cite{candocia2} %Candocia_FCRAR2002.pdf

Initially, no specific assumptions are made about the
response function, $f$, and a comparagram is constructed  from
two differently exposed pictures of the same subject matter.
See Fig.~\ref{inputpictures}(a) and Fig.~\ref{inputpictures}(b).
\begin{figure*}
 \figlcrabc{2.2in}{hart_house_soldiers_tower_newer/s013.eps,width=2in}
           {2.2in}{hart_house_soldiers_tower_newer/s016.eps,width=2in}
           {2.2in}{hart_house_soldiers_tower_newer/s035.eps,width=2in}
 \caption{Differently exposed pictures of the same subject matter
          arising from changes in camera settings or from camera motion.
          (a) When light colored objects enter the field of view,
          such as when a bright sky takes up a good portion of the image,
          the exposure will typically be reduced.
          (b) To obtain detail in darker areas of the image the exposure
          may be increased.  Here the exposure has been increased by a factor
          of two.
          (c) A change in composition (e.g. looking at darker subject matter
          to the right) gives rise to a change in exposure (in this case,
          the exposure also increased by a factor of two).
         }
 \label{inputpictures}
\end{figure*}
Typically a difference in exposure will result in any two
successive frames, due to some amount of camera motion,
in the sense that most cameras have some kind of automatic exposure mechanism.
Typically, due to noise in images (sensor noise, as well as
the inter-frame motion artifacts, etc), comparagrams do not provide points
that only lie on a slender curve.
More typically, comparagrams
define a cloud of points clustered along an underlying curve.
Thus slenderization of the comparagram~\cite{intelligentimageprocessing}
is a typical first step in recovery of the comparametric function, $g(f)$.
See Fig.~\ref{comparaslim}


%\section{Determining $q$}
%Unfortunately, the photoquantity $q$ explained in the last section is
%not immediately given by a camera. Rather, the output of a camera is
%usually JPEG images which later are decompressed to pixel values. Research
%in the past has been done by Mann and others to determine the photoquantity
%$q$ given only differently exposed images of the same subject matter.
%Recently, this work has been improved to determine the response function
%even if the images are not spatially aligned. More to this point, the
%author developed a program to quickly determine a camera response function
%even with images which are not spatially aligned.

%To determine a set of photoquantities (up to a scalar multiple), the most
%logical method is to determine the camera response function. The camera 
%response function is a monotonically increasing function which takes
%a photoquantity $q$ and returns a pixel value. The collection of returned 
%pixel values is commonly referred to as imagespace. Consequently, the domain
%of the camera response function (f) is referred to as lightspace.
%The camera response function (f) takes as input photoquantimetric values
%and returns pixel values.

\subsection{The subcatagorization of light into Red, Green and Blue}
To this point, one may have observed a crucial difference between the
current discourse and the typical output of a camera. The greatest 
difference being that a camera separates  it's output as quantities
of red, green and blue. 
The study of the function $f$ has typically disregarded all of the data given 
by the camera. The images are usually in RGB (Red Green Blue) format,
however are converted to greyscale. The composition of of color images
is in itself a special problem. Using the content of this thesis, one
should be able to compose two images with far better results than
traditional means. But to accurately compose two color images and produce
a likely color image with the same quality is a topic of further research.
The problem becomes particularly evident in later sections of the thesis
where computations such as local mean and variance statistics are done.
If these computations are done on the red, green and blue channels of
an image, the results are undesirable. 

\section{Comparagrams and Comparametric Equations}

Knowing that camera response functions are monotonically non-decreasing\footnote{
The fact that camera response functions are non-decreasing arises from the fact
that in an image, an increase in brightness in a scene should never result
in a lower pixel value. For example, given two image of exactly the same
scene with the exception that the second image has a longer exposer time
than the first, should result in all of the pixel values in the second image
begin larger in magnitude than then corresponding pixels in the first image}
gives rise to an inverse camera response function $f^{-1}(x)$. 
The collection of pixels collectively
representing an image may be thought of as an ``imagespace''\footnote{Though
it is common to say that images represented by a camera's output as 
a ``vector'' lie in imagespace, it should be noted that this is not a vector
space in the sense that lightspace is a true vector space. For example,
pixel values typically have values from 0 to 255, thus imagespace is an
R-module\cite{artin} at best. Even this is complicated by the fact that
pixel composition and scalar multiplication are ill-defined.}  representation of 
the sensor data. However, if the inverse camera response function
is used to transform the imagespace
representation into photoquantimetric values, the transformed data lies
in ``lightspace''. 
After applying the transformation, given two lightspace values $q$ and 
$2q$, the lightspace
value $2q$ has twice the photoquantimetric value of $q$. The converse however is not 
true in imagespace. More generally, we may call images which have been 
transformed from
imagespace into lightspace as lightvectors of a true vector space. As each
of the entries
of a lightvector obeys the operations of multiplication and addition of the field of
real numbers $\mathbb{R}$, the ordered collections of photoquantimetric values
are indeed a vector space. Comparagrams for a series of images is shown
in Fig.~\ref{fig:comparagram_pyramid}
 

\begin{figure*}
  \figc{/comparagram_pyramid/pyramid-final.eps,width=\textwidth}
  \caption{A sequence of
           differently exposed pictures of overlapping subject matter.
           Such variable gain sequences give rise to a family of
           comparagrams.  In this sequence the gain happens to have
           increased from left to right.
           The square matrix $J$ (called a comparagram) is given
           pairwise for the images. The first row shows the
	   relative exposure difference for $k=2^1=2$.
           The next row shows pairwise comparagrams for skip=2
           e.g. $k=2^2=4$, and then for skip=3, e.g. $k=2^3=8$.
           Various skip values give rise to families of comparagrams that
           capture all the necessary exposure difference information.
           Each skip value provides a family of comparagrams.
          }
  \label{fig:comparagram_pyramid}
\end{figure*}


\subsection{The mathematical importance and significance of the set of negative real
            numbers in quantimetric imaging}

Many observers of the previous section would question the use of
the complete set of reals as photoquantimetric values in the space 
we have called lightspace. Certainly the thought
of collecting a lightvector by use of a digital camera which produces 
negative vector
entries is indeed humorous (photocells emitting light when exposed to a 
scene). However,
though such vectors do not exist physically, the vectors do exist mathematically. For
example, consider two images which have been captured digitally and transformed into
lightspace vectors $v_1$ and $v_2$. We may compose these image in lightspace
producing $v_1+v_2$. This composition is certainly mathematically 
sound and has been shown also
to be physically sound \cite{barfieldbook19}. Now consider the additive 
inverse of $v_2$ added
to $v_1+v_2$. $v_2$ being a typical lightvector will ensure that
$-v_2$ will have  all non-positive entries.
Though this vector is not physically possible, adding $-v_2$ to $v_1+v_2$ is trivially
seen to produce $v_1$. 

Not only is the inclusion of $\mathbb{R}^-$ important to maintain the mathematically
tractable structure of a field, it promotes our larger structure to the 
status of a vector field and introduces an added ability, 
that of subtracting light vectors, into the capabilities of
lightspace imaging. Thus, what may initially be considered absurd in traditional 
imaging is in fact present and an effective tool in quantimetric imaging.    

\section{Determining a Camera Response Function}
Finding a camera's response function is not a trivial task. One method
is to use a chart of known reflectivity, such as that offered by DCS
labs of Toronto. However, often such a chart is not available. It is 
also unreasonable to expect that every scientist  who wants to conduct
studies on the output of a camera should have to obtain and maintain
such a chart.

As an alternative to using such a chart, methods have been developed 
for determining a camera's response function without such a chart, only
using the camera itself. If there exists two images taken by the same camera
which differ only in exposure, then this data may be used 
to find the response function.

The first step in this process is to determine what is known as a 
comparagram. The comparagram may be slimmed to produce a comparagraph.
It is this comparagraph which shall be used in the following chapters 
to determine a given camera's response function. 

\chapter{Analytic Methods of Solving Comparametric Equations}

The concept of a comparagraph presented in the previous chapters is 
a plot of a functional equation. That is to say, given a plot of 
a function $f(q)$ against $f(kq)$, not knowing the function $f$, we
wish to know what set of functions could produce the comparagraph.
Usually, when a function is plotted, $x$ is plotted against $f(x)$.
However, the following proposition may be used from
\cite{intelligentimageprocessing}:

{\proposition When a a function $f(q)$ is monotonic, the comparametric
plot $(f(q),f(kq))$ can be expressed as a monotonic function
$g(f))$ not involving q.}

Using the function $g$ from the proposition yields the following
diagram:

\begin{equation}
\begin{CD}
q @>k>>  kq \\
@VfVV    @VVfV\\
f(q) @>g>> f(kq) @=g(f(q))
\end {CD}
\label{eq:comm_eq}
\end{equation}

Using this, we now have that $g \circ f = f \circ k$. Or 
equivalently $g = f \circ k \circ f^{-1}$. This form of an equation
is referred to as a comparametric equation. 

\section{Finding a comparametric equation given it's solution}

Comparametric equations being a subset of functional equations
are usually very difficult to solve and generally never yield a
unique solution. There are other methods of solving comparametric 
equations. One other method is the use of a numerical solution,
discussed in the next chapter. However, much insight of the problem
can be gained from starting with a solution and then finding the 
comparametric equation to which the solution fits.

For example, often traditional image processing uses gamma correction
to brighten or darken an image in imagespace. This method uses a
constant $\gamma$ to which each pixel-value (in imagespace) is 
raised. This gives rise to the following comparametric 
equation:

\begin{equation}
   \begin{aligned}
	g(q) &= f(kq) &= (f(q))^\gamma.
   \end{aligned}
\end{equation}

The solution is:

\begin{equation}
    \begin{aligned}
	f(q) &= exp(q^\Gamma) &= exp(q^{log(\gamma)/log(k)})
    \end{aligned}
\end{equation}

where $\Gamma = \log_k(\gamma)$.

Starting with the solution $f(q)=\exp(q^\Gamma)$, then it is possible
to show that $(f(q))^\gamma$ is a solution simply by considering
$f(kq)$. Then

\begin{eqnarray*}
	f(kq) &=& \exp((kq)^\Gamma) \\
	      &=& \exp(k^\Gamma q^\Gamma)\\
	      &=& \exp(q^\Gamma)^{k^\Gamma}\\
	      &=& (f(q))^{k^\Gamma}\\
	      &=& (f(q))^\gamma.
\end{eqnarray*}

As mentioned earlier, the other direction is far from trivial. The general
problem of functional equations has been study primarily by Janos Acz\'{e}l in 
such works as \cite{aczel}. Because analytic solutions to comparametric
equations must be considered of great importance, this topic is mentioned
in the chapter entitled ``Future Directions''. It is also of interest in that
it creates a tie between contemporary image processing and an interesting
but somewhat obscure area of contemporary pure mathematics.

\section{Existing Solutions to Comparametric Equations}

Fortunately, in \cite{intelligentimageprocessing}, Mann includes a
table of comparametric solutions and they're solutions for many model
functions which are appropriate for various camera. Existing 
solutions to comparametric solutions are shown in 
Table~\ref{comparametictable}.

\begin{table*} \begin{center} \begin{tabular}{l|l} \hline \hline
comparametric    & solution\\
equation         & (camera response function) \\
$g(f(q))=f(kq)$  & $f(q)$    \\
\hline
\hline
$g = f^\gamma$   & $f=\exp(q^\Gamma), \;\;\; \gamma = k^\Gamma$ \\ \hline
$g = k^\gamma f$ & $f=q^\gamma$ \\ \hline
$g = af+b,\; \forall a\neq 1, \mbox{or} \;\; b=0$       & $f=\alpha + \beta q^\gamma, \;\;\;
                   a=k^\gamma, b=\alpha(1-k^\gamma)$ \\ \hline
$g=f+b$          & $f=B\log(\beta q), \;\;\; b=B\log k$ \\ \hline
$g=\left(\sqrt[\gamma]{f}+\log k\right)^\gamma$ 
                 & $f=\log^\gamma q$ \\ \hline
$
  g = \left\{ \begin{array}{ll}
    \left((2^\zeta-1) f + 1 \right)^\frac{1}{\zeta} -1, & \forall \zeta \neq 0
    \\
    2^f - 1,                                & \mbox{for} \; \zeta = 0
  \end{array} \right. %right}suppressed
$                & $\frac{e^q - 1}{2^\zeta - 1}$ \\ \hline
$g=e^bf^a=e^{\alpha(1-k^\gamma)}f^{(k^\gamma)}$
                 & $\log f = \alpha+\beta q^\gamma$  \\ \hline
$g=\exp((\log f)^{(k^b)})$
                 & $f=\exp(a^{(q^b)})$ \\ \hline
$g=\exp(\log^k f)$& $f=\exp(a^{bq})$ \\ \hline
$g=\frac{2}{\pi}\arctan \left(k\tan(\frac{\pi}{2}f) \right)$ &
$f=\frac{2}{\pi}\arctan(q)$  \\ \hline
$g=\frac{1}{\pi}
         \arctan(b \pi\log k + \tan((f-1/2)\pi))
            + \frac{1}{2}$
            & $f= \left\{ \begin{array}{ll}
             \frac{1}{\pi}\arctan(b \pi \log q)
                    + \frac{1}{2}, & \forall q\neq 0
             \\
             0, & \mbox{for} \; q=0
            \end{array} \right.
              $
\\ \hline
$g=\left(\frac{\sqrt[c]{f} \;\; k^a}{\sqrt[c]{f}(k^a-1)+1}\right)^c$
            & $f=\left(\frac{e^b q^a}{e^b q^a + 1}\right)^c
                = \left\{ \begin{array}{ll}
            \left(\frac{1}{1+e^{-(a\log q + b)}}\right)^c, & \forall q \neq 0
                     \\
            0,                                & \mbox{for} \; q = 0
            \end{array} \right. %right}suppressed
              $ \\ \hline
$g=exp\Bigg(\left(\frac{\sqrt[c]{log f}\;\;k^a}
  {\sqrt[c]{log f}(k^a-1)+1}\right)^c\Bigg)$
                & $f=exp\bigg(\left(\frac{e^b q^a}{e^b q^a + 1}\right)^c\bigg)
                = \left\{ \begin{array}{ll}
            exp\bigg(\left(\frac{1}{1+e^{-(a\log q + b)}}\right)^c\bigg),&
                     \forall q \neq 0
                     \\
            0,                                & \mbox{for} \; q = 0
            \end{array} \right. %right}suppressed
              $
\\ \hline \hline
\end{tabular} \end{center}
\caption[]{\small   Illustrative or
         useful examples of comparametric equations and their solutions.
         The third from the top and second from the bottom were found to
         describe a large variety of cameras and
         have been used in a wide variety of quantigraphic image processing
         applications.  The second one from the bottom is the one that is
         most commonly used by the author.}
\label{comparametictable}
\end{table*}








\chapter{Numerical Methods of Solving Comparametric Equations}

The last chapter ultimately presented a table which may be used
to find a solution to a comparametric equation. This implies
first fitting one of the functions in the table to a comparagraph.
Due to the construction of cameras by various manufacturers, a
sensible model from the table may not fit. This chapter describes
numerical methods of finding a camera's response function given 
a comparagram and it's corresponding comparagraph.

\section{The Comparagram and it's Importance}
Though a comparagram was derived in the experiment in chapter 2,
an method of generating a comparagram (and why we want to generate
it) is now discussed. 

If there are two images which lie in imagespace of the same subject
matter with different exposure, we may regard this as two sets of 
data. If the assumption is also made that the images have the same
dimensions, then each pixel value has a corresponding (x,y) coordinate
associated with it. One method of capturing the differences between the
two images is to create a plot of the changes in corresponding pixel
values. To do this, first a $M \times N$ matrix is constructed where the
number of rows corresponds to the number of possible grey values in the
first image, and the number of columns corresponds to the possible number
of grey values in the second image. Data from a typical camera will result
in a matrix of dimension $256 \times 256$. The matrix starts out initially as
the zero matrix. Given the data in the two images, entries in the matrix 
are incremented depending on the data present.
For example, if at location $(a,b)$ in the first image a pixel values of 
$\alpha$ is observed and at location $(a,b)$ in the second image, a pixel
value of $\beta$ is observed, the the entry $(\alpha,\beta)$ is incremented
in the matrix. This is done repeatedly until all data from both images has
been accounted for. This derived matrix is referred to as a comparagram.

To get an initial understanding of the collected data and the differences which
lie between the two images, the data may be compressed logarithmically to
fall in the range $[0,255]$. The purpose of doing this is that the value may
be rounded to integers, a simple pnm header may be added, and the comparagram may
be displayed using an image viewer. An example of a comparagram which has been 
manipulated in this manner is shown in Fig.~\ref{fig:comparagram}. Note that 
this logarithmic compression is only done for the sake of viewing the 
comparagrams. When computations are done using the comparagram, the raw data
is used with no logarithmic compression.
 
\begin{figure} %%%[!h]
 \figc{xsummed.eps,width=2.5in}
 \caption{A comparagram compressed into the range $[0,255]$ to be 
          easily displayed}
 \label{fig:comparagram}
\end{figure} 

What is evident about the the comparagram is that a ``blurred'' function is
being displayed. That is to say, the camera response function is underlying
the comparagram but is often being distorted by noise of various sorts. Some
of the noise may just be sensor noise, while other causes of the noise may be 
such factors as JPEG compression. Fortunately, the comparagram contains many 
data points, therefore methods to remove noise may be quite simple. 

If a method is chosen to reduce the noise, what is observed is itself a true
function referred to as a comparagraph. One simple method to compute first
moments along the columns of the comparagram. Thus, every column is reduced
to a single real number, leaving a one dimensional array. 


Recent work has shown that often it is possible to get relatively accurate
comparagraphs from images even when they are not spatially aligned, but have
some region of overlap. This will be discussed within this chapter.

\section{Deriving a Camera Response Function directly from a Comparagram}
If we know the exposure ratio between two images $k$, then from 
(\ref{eq:comm_eq}) we have that

\be
  \frac{1}{k} f^{-1} (f_1) = f^{-1} (f_0)
\ee

Taking the logarithm of both sides,
\be
  F^{-1} (f_1) - K_i = F^{-1} (f_0), 
\ee
where $K=log(k)$, and $F^{-1}$ is the logarithmic inverse camera response
function. 

Re-arranging, we have
\be
  F^{-1} (f_1) - F^{-1} (f_0) = K.
  \label{eq:imagepairsequation}
\ee
%Equation~\ref{eq:imagepairsequation} 
This relation suggests a way to estimate the camera response function, $f$,
from a pair of differently exposed images of the same subject matter.
%We consider now various special cases of estimating $f$ and $k_i$.

\subsection{Estimation in the presence of noise}
%with known exposure values $k_i$}

Given (\ref{eq:comm_eq}),
two images are related to  the actual quantity $q$ as well as to each
other, by:

\be
f_1 ({\bf x}) = f_0\left( k q \left(
%                          \frac{
%                                A{\bf x+b}
%                               }{
%                                 {\bf c^Tx}+d
%                                }
{\bf x}
                   \right)
      +n_{q} \right) +n_{f}
  \label{eq:noisemodel}
\ee
where $n_{q}$ includes sensor noise,
and $n_{f}$ includes image noise due to
quantization, compression, transmission.
(For precise definitions of these two noise sources, see~\cite{comparam}.)

In the presence of noise,
each picture provides an estimate of the actual quantity of light
falling on the image sensor:
\be
  \hat{q}({\bf x}) = \frac{1}{\hat{k}}
  \hat{f}^{-1}\left(f(
   %{\bf x\hat{c}_i^T - \hat{A}_i})^{-1}({\bf \hat{b}_i} - \hat{d}_i{\bf x}
   {\bf x})\right)
  \label{eq:photoqwyckoffestimate}
\ee
%where $\hat{A}_i$, $\hat{\bf b}_i$, $\hat{bf c}_i$, and $\hat{d}$
%are the parameters of projective coordinate transformation between the two
%images, which are simultaneously estimated, as previously described in the
%literature\cite{procieee}.
where $\hat{k}$ is an estimate of the actual exposure constant $k$,
and $\hat{f}$ is an estimate of the true camera response function $f$,
assuming $n_{q} >> n_{f}$~\cite{comparam}.

%When we have more than one of these estimates, they
%More than one of these estimates
%Multiple such estimates of the actual quantity of light falling on the
Multiple estimates of the actual quantity of light falling on the
image sensor may be combined as follows:
%{\footnotesize
\be
\hat{q}({\bf x})
=\frac{
       \sum_i \hat{c}_i \hat{q}_i({\bf x})
      }
      {
       \sum_i \hat{c}_i
      }
  \label{eq:wyckoffprojectivecomposite}
\ee

IMPORTANT \\
$c$ is undefined, talk to Steve about this\\
!!!!!!!!!\\



%}% end footnotesize
%inverse of ax+b/cx+d  is    $(yc^T - A)^{-1}(b-dy)$

Photographic film is traditionally characterized by the so-called
``Density versus log Exposure'' {\em characteristic
curve}\cit{wyckoff}\cite{wyckofftr}.
Similarly, in the case of electronic imaging,
we may also use logarithmic exposure units, $Q=\log(q)$,
so that one image will be $K = log(k)$ units darker than the other:
\be
  \log(f^{-1}(f_0({\bf x}))) =Q =\log(
                                      f^{-1}(
                                             f_1(
                                             %    \frac{
                                             %          A{\bf x+b}
                                             %         }
                                             %         {
                                             %          {\bf c^Tx}+d
                                             %         }
                                             {\bf x}
                                                )
                                            )
                                     ) - K
\ee

Since the logarithm function is also monotonic,
the problem comes down to estimating the semimonotonic function
$F()=\log(f^{-1}())$ and the scalar constant $K$.

The unknowns ($F$ and $K$)
may be solved in a least squares
sense\footnote{In a typical imaging situation with two 480 by 640
               images and 256
               grey values, this amounts to solving 307200 equations in
               257 unknowns: 256 for F and one for K.
              }.

\subsection{Estimating camera response function}
%Suppose for the moment that $K$, ${A}$, ${\bf b}$, ${\bf c}$, and $d$ are known
 From (\ref{eq:imagepairsequation}),
we may find $F$ by minimizing the sum of squared errors:
\be
  \epsilon=\sum_x \sum_y \left( F(f_0({\bf x})) 
   - F\left(f_1\left(   %\frac{
                        %      A{\bf x+b}
                        %     }
                        %     {
                        %      {\bf cx}+d
                        %     }
                        {\bf x}
                           \right)\right)
   + K \right) ^2
  \label{eq:likeriemann}
\ee

Since $F$ can only be found up to a single unknown scalar constant,
$F(N)$ is fixed at
zero\footnote{This choice is arbitrary, as $F$ could have been
              fixed at any point.  However, it is common practice, when working
              with logarithmic units, to define the maximum quantity as zero
              (e.g. audio and video recorders typically have signal measurement
              meters calibrated so maximum signal input corresponds to 0dB),
              and since $F$ is assumed to be monotonic, the constraint
              makes $F$ entirely negative when images are of increasing
              exposure ($K>0$).
             },
where $N$ is the number of greyvalues (typically $N=256$).

Equation~\ref{eq:imagepairsequation} gives a set of equalities of the form:
%by finding $F$ that minimizes the norm $\|AF+K\|^2$.
${\bf A} {\bf F} = - {\bf K}$, where
%This minimization can be expressed in matrix form:
%\be
%  \min ({\bf A} {\bf F} + {\bf K})^T ({\bf A} {\bf F} + {\bf K})
%  \label{eq:likeriemannmatrix}
%\ee
${\bf A}\in \mathbb{R}^{L+1\times N}$,
and $L$ is the number of pixels in one of the images $f_1$.
The first $L$ rows of ${\bf A}$ are constructed by inserting $1$ in
the column index corresponding to the pixel value of $f_0$
and inserting $-1$
into the column index corresponding to the pixel value of $f_1$:
\bea
  {\bf A}(x+wy,f_0(x,y))&=&1\nonumber\\
%  {\bf A}(x+wy,f_2(\check{x},\check{y}))&=&-1
  {\bf A}(x+wy,f_1(x,y))&=&-1
  \label{eq:reimanna}
\eea
%where $[\check{x},\check{y}]^T
% = \frac{\hat{A}[x,y]^T+\hat{{\bf b}}}{\hat{{\bf c}}^T[x,y]^T+\hat{d}}$,
where $w$ is the width of one of the images,
and the last row of $\bf A$ is all zeros except its last entry which is 1:
\be
 {\bf A}(L+1,N)=1
\ee
All unspecified entries of matrix $\bf A$ are zero.
Vector $\bf K$ is constructed by placing the value $K$ in the first
$L$ entries and $0$ in the last entry.
This is an over-determined system of equations.

The solution that minimizes the
error $\varepsilon = \|AF + K\|^2$ in (\ref{eq:likeriemann})
is the maximum likelihood solution according to the noise model of
(\ref{eq:noisemodel}),
assuming $n_{q_i} >> n_{f_i}$~\cite{comparam}, and is given by:
\be
 \frac{d\epsilon}{d{\bf F}} = 2{\bf A}^T{\bf AF} + 2{\bf A}^T{\bf K} = 0
 \label{eq:normaleq}
\ee
giving
\be
  {\bf F}=({\bf A}^T{\bf A})^{-1}{\bf A}^T (-{\bf K}),
  \label{eq:normaleqsolution}
\ee
%but, rather than computing the above {\em pseudo inverse} or,
%equivalently, 
%computing a Singular Value Decomposition (SVD), (\ref{eq:normaleq})
%is instead generally solved using Gaussian elimination,
%since this leads to fewer floating point operations.
%, which requires $2n^3/3+{\cal O}(n^2) floating point operations.
assuming additive white Gaussian noise.
This solution gives us a way of estimating the camera response function from
two or more differently exposed pictures of overlapping subject matter.

Although this system is massively over-determined, the constraints
follow a comparametric form that admits solutions having
sinusoidal components (e.g. solutions tend to be ``wavy'')~\cite{comparam}.
(Indeed, comparametric forms are very similar to difference equations
as can be seen from the form of (\ref{eq:imagepairsequation}) which
constrains the function over an interval step size of $K_i$ allowing ripples
in the solution.)

\subsubsection{Smoothness and monotonicity constraints}
We can make an inference that the true $F$ is probably smooth, and,
as mentioned previously, $F$ {\bf must} be semimonotonic.

Thus the estimate of $F$ can be constrained in both smoothness and
monotonicity.  A smoothness constraint may be formulated
by appending $N-L_s$ extra rows to ${\bf A}$ and the same
number of extra zeros to the end of $\bf K$,
where $L_s$ is the length of the appropriate
smoothness {\em filter}, $\bf s$, and then
solving (\ref{eq:normaleq}).
The extra rows appended to $\bf A$ are constructed as follows:
Let ${\bf A_s}\in \mathbb{R}^{N-L_s\times N}$ denote the portion appended to $\bf A$,
and create
$\bf A_s$ as a toeplitz matrix in which the first $L_s$ elements
of the first row are the {\em filter} coefficients, and the remaining
rows are appropriately shifted versions of the coefficients:
\be
 A_s =
 \left[\begin{array} {cccccccc}
       s_1& s_2& s_3& \ldots& 0& 0& 0& \ldots\\
       0&   s_1& s_2& s_3& \ldots& 0& 0& \ldots\\
       \ldots\\
       0& 0& 0& \ldots & s_1& s_2& s_3& \ldots
    \end{array}
 \right]
\ee
In order to achieve smoothness, the {\em filter} needs to be a
{\em highpass filter}.
(The intuition for this comes from the fact that by ``looking''
at the function through a highpass filter,
this makes it ``expensive'' for the curve
to have high frequency (non-smoothness) content since the
right hand side vector for this portion of the matrix equations is zero.)

The simplest filter is a three-tap filter $\lambda [1, -2, 1]$ for which
the effect of appending the corresponding
$\bf A_s$ to $\bf A$ is to impose a penalty for
nonzero second derivatives (inflection) of the curve $F$.
The {\em amplitude}, $\lambda$, of the filter, determines
how heavily the smoothness constraint is weighted.
Additionally, a monotonicity constraint may be imposed using
Quadratic Programming (QP).

Examples of determining the response function, $F$, from two differently
exposed images, are shown in Fig~\ref{fig:fig_Finv_s012_s015}(a).
\begin{figure}
  \figlrab{3.2in}{fig_Finv_s012_s015.eps,width=3in}
          {3.2in}{fig_Finv_s012_s015_certainties.eps,width=3in}
  \caption{(a) Response functions estimated from the data for no smoothing
           (solid line), smoothing with $\lambda=100$ (dotted line),
           and smoothing with $\lambda=1000$ (dotted line).
           (b) Derivatives of the response functions make the effects of
           smoothing more evident.  The derivative of the response function
           is called the certainty function, because it shows the sensitivity
           of pixel integer output as a function of changes in
           light~\protect\cite{comparam}.
           Slight ripple in the response function becomes magnified
           and visible as periodicity components in these certainty functions.
           Note how the harmonic contributions have a period of $K$
           (normalized to stepsize $K=1$ in this plot).
          }
  \label{fig:fig_Finv_s012_s015}
\end{figure}
The rippling is particularly evident when we consider the derivatives
of the response functions as shown in Fig~\ref{fig:fig_Finv_s012_s015}(b).

\subsubsection{Computational efficiency}
The least squares formulation of (\ref{eq:normaleqsolution})
can be solved using the standard approach for the normal equations.
Debevec\cit{debevec} presented a similar least squares
formulation, but his was much more computationally intensive because he
introduced extra unnecessary variables corresponding to our $Q$ for
each exposure.  Due to his large number of unnecessary variables he was
forced to use a small number of hand--selected points to calculate
a computationally tractable solution.

In contrast, our method (and earlier methods such as that of~~\cite{mannist})
is computationally tractable, such that it uses the information at all the
pixels and hence could be expected to be much more immune to noise.

%It should be noted that the solution for $F$ is similar to that
%provided in\cit{debevec},
%when the simplest 3-tap smoothness-constraint filter is used, when no
%monotonicity constraint is applied.
%However, the solution
%provided in this paper requires far fewer floating point
%operations\footnote{For example, using images from a PhotoCD,
%                    for which L=6291456 (since PhotoCD images are 2048
%                    by 3072 pixels) the method of\protect\cit{debevec}
%                    requires more than two billion times as many
%                    floating point operations as the method presented here.}.
%Thus, while the method presented in\cite{debevec} is not practical to implement
%on entire images and is therefore typically implemented by hand-selecting very
%small representative portions of the images\cite{debevec},
Our method
is also fully automatic and does not require any hand selection of image
portions in order to make it computationally tractable.

The method presented in this paper can be be made more efficient
(typically hundreds of times more efficient than it already is)
by the following:
\begin{itemize}
  \item Recognizing that many rows of $A$ are repeated, these
        rows can be grouped together into a single row
        and weighted by the square
        root of the number of repetitions.  By weighting $K$
        also by the same factor, the same numerical solution results.
  \item Rows that are zero, which happen when the corresponding pixel
        location in both images has the same value, may be eliminated,
        further reducing the size of the matrix $\bf A$.
\end{itemize}

\subsubsection{Robust statistics and yet more computationally
               efficient solutions: From `` Lebesgue summation'' to
               comparagram}
Another way to interpret the error presented in (\ref{eq:likeriemann})
is to rewrite the summation
as a sum over the range rather than over the domain of the
images\footnote{This summation over level sets is reminiscent of Lebesgue
                integration (e.g. the form of summation
                in (\protect\ref{eq:likelesbegue}) is to the earlier form of
                the summation given in (\protect\ref{eq:likeriemann}) as
                Lebesgue integration is to Riemann integration, so we
                will call it ``Lebesgue summation''.}:
\be
  \epsilon =
   \sum_{m=0}^{N} \sum_{n=0}^{N} J(m,n) ( F(f_0(x,y)) - F(f_1(x,y)) + K ) ^2
  \label{eq:likelesbegue}
\ee
summing over the values of $(x,y)$ for which
$F(f_0(x,y))=m$ and $F(f_0(x,y))=n$.

The quantity $J$ provides a weighting for the number of
pixels in image $f_0$ that are equal to $m$ and
at the same coordinates $(x,y)$ equal to $n$ in image $f_1$.
%%%%($J$ is analogous to the {\em measure} of the Lebesgue integral.)

It will be easier to understand the form of
(\ref{eq:likelesbegue}) by grouping (counting)
all of the pixel values for which $f_0(x,y)=m$
and, in the corresponding location $f_1(x,y)=n$.
The result, denoted by the letter $J$,
is called the {\em comparagram}\cite{comparam}
of the two images, and has dimensions $N$ by $N$.
It should be noted that $J$ captures all of the information
about the tonal relationship between the two pictures while discarding
all of the spatial information in the images.
This tonal relationship is all that is
needed to solve (\ref{eq:likeriemann}) while discarding a great
deal of irrelevant information.
%For example, if we have two
%PhotoCD-sized images (k048 by 3072 pixels), these are reduced
%to a single array of dimension 256 by 256.  This reduction from
%two 6291456 pixel images to one 65536 element array is a reduction
%by a factor of 192, which results in a tremendous computational savings.
To solve (\ref{eq:likelesbegue}), the
matrix ${\bf A} \in \mathbb{R}^{N^2 \times N}$
and the vector ${\bf K}$ are
constructed to have one row for each element of $J$, as follows:
\bea
  {\bf A}(mN+n,m)  &=&\sqrt{J(m,n)}\nonumber\\
  {\bf A}(mN+n,n)  &=&-\sqrt{J(m,n)}\\
  {\bf K}(mN+n)    &=&\sqrt{J(m,n)}
  \label{eq:lesbeguea}
\eea
with the additional constraints appended to $A$ as before,
resulting in the same method of solution as before (least-squares).

Looking at comparagrams of typical sets of differently exposed images,
the comparagrams tend to be very sparse.
Thus working with comparagrams
provides us with a very tidy way of greatly increasing the computational
efficiency, as well as performing robust statistics (pruning low entries).

Thus the computation can be made still more efficient by removing rows of
$A$ and $K$ that are zero, which is done by only including nonempty
bins of $J$ that are off the main diagonal (zero entries result from
either empty bins of $J$ or from diagonal entries of $J$).
Jointly, the comparagram across two images
seldom contains more than ten percent nonzero
entries, so the resulting computational savings is very significant.

Robust statistics provide improved immunity to noise but this immunity
is usually obtained at some computational expense.
However, the formulation of (\ref{eq:likelesbegue}) with solution
(\ref{eq:lesbeguea}) may be made both computationally
more efficient as well as more robust by
thresholding $J$.  This is done by setting any
entries in $J$ that are less than a certain number equal to zero.
A threshold of 10, for example, means that any bins containing less than
10 counts are set to zero prior to solving (\ref{eq:lesbeguea}).
Interestingly enough, this method of throwing away outliers
results in further increases in computational efficiency.


\section{Fitting Comparametric Data to a Known Function}
It has been shown~\cite{intelligentimageprocessing} that the function 
\begin{equation}
 f(q)=\left(\frac{e^b q^a}{e^bq^a +1}\right)^c
 \label{eq:camera_func}
\end{equation}
is able to model response functions from a  variety of cameras.
Not only does the function have the general shape of camera response
functions, it has a shape that, given reasonable parameters, is able
to fit data collected in sensor experiments from several studies.

Using \ref{eq:camera_func}, we may use it and its inverse
\begin{equation}
  f^{-1}(p) = \left(\frac{\root c \of{p}}{e^b(1- \root c \of{p}}\right)^{\frac{1}{a}}
\end{equation}
to find optimum values of $a$ and $c$ using a non-linear solver. This may be
accomplished by first asking ``what quantimetric value $q$ will produce 
a pixel value of $x$'', where $x$ is a pixel value normalized to the range $[0,1]$.
This may be done for pixel values $[0,255]$. Knowing $q$ for each value (by use
of $f^{-1}$), it is now trivial to multiply the values of $q$ by $k$ and produce 
the comparagram which results from $a$ and $c$ as parameters. Using a slimmed 
version of the comparagram, the derived comparagram may be subtracted from the
true comparagram by treating each element as a vector in $\mathbb{R}$$^{255}$.
Taking
any $p$ norm of the resulting vector gives a value which may be minimized 
by a non-linear solver (on the values $a$, $c$, and possibly $k$).

This procedure used on data collected from a Kodak DC260 camera produced the
approximation to the response function shown in Fig. \ref{fig:comparafit_plot}  
\begin{figure*} %%%[!h]
 \figc{/dsc_testpattern/ground_truth_d1_response_function_with_comparasolve_result.eps,width=4in}
 \caption{Using a non-linear solver to attain values for $a$ and $c$ to the
          model function $f(q)=\left(\frac{e^b q^a}{e^bq^a +1}\right)^c$}
 \label{fig:comparafit_plot}
\end{figure*}

where the method is plotted against ground truth data. The ground truth data
in this case was derived using a test pattern of known reflectivity.
\subsection{Downfalls of fitting data to a Known Function}
Though the function presented in (\ref{eq:camera_func}) does approximate
a large number of camera response functions, many cameras have been
manipulated by the manufacturer to produce somewhat unnatural camera
response functions. An example of the is Fig.~\ref{fig:weirdcomparagram},
where it is not hard to imagine that large errors will be incurred when
attempting to fit (\ref{eq:camera_func}) to such data. In this respect,
though the method benefits from a simple closed form solution which is
relatively easy to find, it is not generalizable to all camera response
functions. 

 \begin{figure}
  \figc{Jsum.eps,width=3in}
  \caption{A comparagram (comparasummed) of a camera with a non-standard
           response function.}
  \label{fig:weirdcomparagram}
\end{figure}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Estimation using Log Unrolling of a Comparagram
 with Spline Interpolation}

If the experiment from Chapter 2 is expanded to full images, a method
exists for numerically recovering a camera's response function. This
method is described within this section.

\subsection{A simple example of log unrolling of a comparametric plot
with the aid of spline interpolation}

The purpose of the following example is to demonstrate the concept of
log unrolling on a simple function which produces a known comparametric
plot. The subsequent section will use this technique to approximate a 
camera's response function given a comparagram.

As another illustrative example, this time to explain and demonstrate
log unrolling with spline interpolation,
consider the function
\begin{equation}
  f(q) = \sqrt{q}.
  \label{eq:sqrtexample}
\end{equation}
The comparametric plot of $f(q)$ against $f(kq)$ is given by:
\begin{align*}
%%\begin{eqnarray*}
    f(q) &= \sqrt{q}\\
    f(kq)&= \sqrt{2q}&\text{note $k = 2$}\\
         &= \sqrt{2} \sqrt{q}\\
         &= \sqrt{2} f\\
%%\end{eqnarray*}
\end{align*}
Therefore, the comparametric plot of $f(q)$ against $f(kq)$ 
is simply a straight line, given by:
\begin{equation}
  g=\sqrt{2} f.
  \label{eq:sqrtexamplerolledup}
\end{equation}

Now let us pretend that we had no knowledge of~(\ref{eq:sqrtexample}), e.g.
that we did not know where~(\ref{eq:sqrtexamplerolledup}) came from.

It will now be shown that comparametric unrolling
of~(\ref{eq:sqrtexamplerolledup})
recovers the function $f(q) = \sqrt{q}$. 
Using an assumption which has been shown~\cite{intelligentimageprocessing}
to be valid for camera response functions
$f(0)=0$, the unrolling of $f$ will start at
some small value $\epsilon$ where we also will assume
that $f(\epsilon)=\epsilon$.
This assumption may indeed be incorrect for an arbitrary function, however,
the results will only incorrectly scale the estimate of 
$f$ and may be corrected by a 
linear shift (in log space) after the unrolling is completed.
In this simple example, we let $\epsilon=1$.
Looking at
the comparametric plot at $q=1$,  $f(1) = \sqrt{2} = f(k)$. 
Finding $\sqrt{2}$ along the x-axis, the comparametric plot may again
be evaluated at $q=\sqrt(k)$ to find $f(4)$. 
Not surprisingly, $f(4) = 2$. Continuing the unrolling process,  the following
values are determined
\begin{align*}
    f(0) &=0\\
    f(1) &=1\\
    f(k) &=\sqrt{2}\\
    f(4) &=2\\
    f(8) &=2 \sqrt{2}\\
    f(16)&=4\\
    f(32)&=4 \sqrt{2}\\
\end{align*} 
and so on.

To generate a continuous and differentiable approximation to the camera
response function, the unrolled points may be used as data points from 
which to construct a spline 
interpolation/approximation to the original function. The results of this
method and the derived approximating function are compared to the original
function in Fig~\ref{fig:sqrmeth}.
%\begin{figure}[!h]
\begin{figure*}
  \begin{center}
 \includegraphics[width=3in,angle=270]{/usr/figs/comparam_solution_to_g.eq.sqrtf.eps}
 %\figc{comparam_solution_to_g.eq.sqrtf.eps,width=3in}
 \caption{The result of a spline interpolation with data points derived
          from the log unrolling method.}
 \label{fig:sqrmeth}
  \end{center}
\end{figure*}

\subsection{Recovering a camera's response function using a method of
Forward and Reverse Unrolling with spline interpolation}
The method in the previous section was shown to be effective in 
approximating a function given a comparametric plot of the function. 
Since the unrolling started at the left side of the comparametric plot
and continued to the right, this method of unrolling a function shall
be referred to as forward unrolling.  An alternate method of unrolling is 
to start from some point on the comparagram and unroll in the opposite
direction. This process shall be known as reverse unrolling. 
Due to the fact that when unrolling a function, cumulative errors may
be incurred,  combined forward and reverse unrolling
will be used when applied to a comparagram. 
%\begin{figure}[!h]
%\figc{xsummed.eps,width=2in}   
%\caption{A \emph{comparasummed comparagram} in which the exposure difference
%is 2}
%\label{fig:xsummed}
%\end{figure}

If a comparasum (the sum of the comparagrams
over various exposure ranges) is considered such as in
Fig~\ref{fig:comparagram_pyramid}, 
an approximating function is constructed. The function ideally returns
the mean $f(kq)$ for any $f(q)$ at a discrete point on $[0,255]$
and interpolates sensibly for any intermediate value. 
Many possibilities exist and were attempted in constructing the approximating
function. Among these were a least-squares low degree, (up to 6) polynomial,
spline fits to various points on the comparagram, etc. 
The best approximation to an arbitrary comparagram was found as follows. 
First, bin counts lower than a given value were
disregarded (to remove outliers resulting from sensor noise, JPEG compression
noise and the like. Then first moments were taken for each column of the
comparagram. Finally, splines
with third derivatives being continuous at the second and second
last point (option 'not-a-knot' in Octave's\footnote{the GNU Octave program
may be found at http:$//$www.octave.org} csape)
are used across each column. For
the process which follows, it is helpful to have a continuous function.
Splines give the required continuity and are a sensible method for finding
intermediate values of $f(kq)$. For simplicity, the continuous function
which approximates the comparagram shall be $\phi(q)$ where $q$ is a value
from $[0,255]$.

Having a function which approximates the comparagram, some 
assumptions are made about the underlying response function which gave rise 
to the comparagram.  Recall that we generally assume that the response function 
is such that $f(0) = 0$.  We also assume that $f(1) = 255$, such that
if we normalize the function, we may assume $f(1) = 1$
(the maximum pixel value of an image).
However, to simplify computations, this normalization will not
be applied until the response function has been determined. 
Since the quantity of light $q$ may be rescaled later, 
an assumption shall be made that $\hat f_1(1) = \epsilon$ 
(this is the first approximation to the response function $f$).
In practice, $\epsilon$ will be some small number greater than machine epsilon.
Using the approximation to the comparagram, $\phi$, 
the comparagram may be unrolled in the forward direction. 
Remembering that the comparagram is a plot of $f(q)$ against $f(kq)$,
it is now possible to find $\hat f_1(k)$ by evaluating $\phi(\epsilon)=y_1$. 
The function may continue to be unrolled in the forward direction. To find 
$\hat f_1(4)=y_2$, $\phi$ may be evaluated at $y_1$. If
the comparagram is monotonic (which it should be given the physical 
applications), then
we may continue in this manner until $\phi(y_n) \geq 255$. 
This will imply $\hat f_1(k^{(n-1)})=255$ for some $n \in \mathbb{N}$. %\Bbb{N}$
Since the function $\phi$ is itself an approximation to the comparagram,
which has included sensor noise,compression noise, etc., $\hat f_1$ 
shall be the notation for the function which approximates the true 
camera response function $f$, using the forward unroll procedure. 

Since this unrolling only gives discrete points, to achieve a continuous and
differentiable approximation cubic splines may be used to interpolate between
the unrolled points.

One downfall of the method is that if we have a small error in the 
approximation of the comparagram due to noise, compression,etc., then this 
error will propagate through the calculation. One 
method to counter this error is to find the point in the comparagram which first
attains 255. The value for which this first occurs shall be $2^n$ for some 
$\gamma_1$ such that $\phi (\gamma_1)=\phi (k^n)=255$
where $n$ is to be determined later.  If the assumption is made that
$\hat f_2(k^n) = 255$, then $\hat f_2(k^{n-1})=\gamma_1$. $\hat f_2(k^{n-1})$
may be recovered by locating $\gamma_2$ such that $\phi(\gamma_2)=\gamma_1$. 
Corresponding to what has been done in the forward
unroll, we shall end the procedure at some point when $\phi(\gamma_m)
\approx 0$.  We may then assign the value of $n$ appropriately
(where $\hat f_2(1)$ is the smallest non-zero value achieved during the
reverse unroll and $\hat f_2(k^n) = 255$). 

It is natural to expect that the forward unroll will be accurate for values
close to $0$, but will lose accuracy as $\hat f_1(x)$ approaches 255.
Similarly, it is expected that the reverse unroll will be accurate for 
$\hat f_2(x)$ which yields values close to 255, but will degrade   
for values of $x$ such that $\hat f_2(x)$ is close to 0.

At this point, both approximations were normalized so that the domain 
of both $f_1$ and $f_2$ is $[0,1]$. 
Both the forward and reverse
results were put into a logarithmic scale, and once again splines
were used to achieve a continuous and differentiable approximation to the 
true camera response function.
To achieve a compromise between the two approximations that benefits from the 
strength of both solutions, both approximations were combined into
a single approximation
which was weighted to transition
from $\hat f_1$ for values close to 0, shifting 
to $\hat f_2$ as the function approaches 255.

The results of this procedure are shown in \ref{fig:unrollapprox}, 
where they are compared to the camera function as recovered from use 
of a test chart containing a pattern of
known reflectivity (ground truth).  
%\begin{figure}[!h]
\begin{figure}
 \includegraphics [width=4in, angle=270] {/usr/figs/dsc_testpattern/ground_truth_d1_response_function_with_comparaspline_result.eps} 
 %%\figc{dsc_testpattern/ground_truth_d1_response_function_with_comparaspline_result.eps,width=3in}
 \caption{The results of the forward and reverse unrolled data compared to
          known ground truth data. The ground truth data was recovered by
          images taken of patterns of known reflectivity (DCS Labs test
          pattern) by the same camera from which the comparagram was derived.} 
\label{fig:unrollapprox}
\end{figure}

\section{Comparagrams from non-spatially aligned images}
As mentioned previously, recent work has been done to derive comparagraphs
from images which are not spatially aligned. 

Ideally, a comparagram is a function which plots $f(q)$ against $f(2q)$ 
which appears much like the solid line in 
Fig.~\ref{fig:photocell_experiment_comparametric_plot}.
Unfortunately, the comparagram does not usually appear as this ideal function.
Rather, the comparagram appears as a cloud of points around the function.
Three methods are commonly used for  recovering the underlying comparametric
function, for example, first moments may be calculated down each of the 
columns of the comparagram or across each row of the comparagram. Or,
a maximum likelihood estimator can be used to  pick the indices of
the comparagram which have the highest values as entries. 
Alternatively, a combination of row and column calculations could be taken.
However, due to the nature of the comparagram,  
he marginals of the comparagram may be used. This method is referred to
as the cumulagram algorithm.

\subsection{The cumulagram and it's computation}
Looking at the 
example from the 12 pixel camera, the histogram of the 
first image is: $\{3,2,4,3\}$. The corresponding 
histogram of the second image is $\{2,1,5,4\}$. If each position of the
histogram is numbered from $0$ to $3$, the first histogram has 3 entries
in it's first position. We then ask, ``how far must one look into
the second histogram to account for all of the data in the first
histogram's first position?''. 
To simplify this task the histograms are converted to 
cumulatives, such that each entry of the cumulative expresses how much of
the data is present at each point of the corresponding histogram. 
This implies that the cumulative of the first histogram is
$\{3,5,9,12\}$. The cumulative of the second histogram is $\{2,3,8,12\}$.

Looking at the first entry of the first cumulative, there are 3 data points.
We now progress through the
second cumulative until all three data points have been account for. The 
first entry in the second cumulative is $2$, which does not account for 
all of the data. The second entry is $3$ which accounts for all of the 
data in the first cumulagram entry exactly. For this reason the first cumulagram
entry is 1 (as 1 is the index of the second cumulative where all of the
data has been accounted for). Continuing the process, the second entry in
the first cumulative is 5. Looking at the second cumulative, the second 
entry is 3 and therefore does not account for enough data, however, the third 
entry is 8, which accounts for too much of 
the data. Thus, we must interpolate between these two points (1 and 2) to 
find a reasonable value. Following this method,  
the third value will fall between 
2 and 3, and finally the last value will be exactly 3.

\begin{figure}
 \figc{photocell_experiment/comparagram_to_comparagraph.idraw,width=2.7in}
 \caption{{\bf Slenderizing a comparagram by marginalization:}
          A comparagram is constructed from the two differently exposed
          pictures of Fig.~\protect\ref{inputpictures}(a)
          and Fig.~\protect\ref{inputpictures}(b).  The dynamic range of
          comparagrams tends to be quite high, so here it is displayed on
          a logarithmic scale, $log(J+\epsilon)-log(\epsilon)$,
          $epsilon>0$ preventing calculation of $log(0)$ (and thresholding
          noise below -60dB or so).  Next summing down columns produces
          the marginal $h_f$ along the $f$ axis.
          Summing along rows produces the marginal $h_g$ along the $g$ axis.
          Whatever tone scale is selected for the optimal
          display of the comparagram (typically logarithmic or power law)
          will affect $h_f$ and $h_g$, so $h_f$ and $h_g$ are typically
          logarithmic histograms (or power law compressed histograms).
          Cumulative sums are taken (giving logarithmic cumulatives
          or power law cumulatives).  The comparagram is then re-constructed
          from only its marginals, such that only a slender ridge remains.
          When the width of this slender ridge is collapsed to zero,
          the result is referred to as a {\em comparagraph} of the underlying
          {\em comparamateric function}\protect\cite{intelligentimageprocessing}.
         }
 \label{comparaslim}
\end{figure}


\section{Graphical Aids for Unrolling}
After the method in the previous section was developed, a program using
these methods was constructed. As previously mentioned, the program
is available at http:$//$comparametric.sourceforge.net. The program
uses a GTK+ interface and the analytic methods above to unroll 
comparagrams to approximate response functions. The program is shown
in Fig.~\ref{fig:gunroll_pic}.

\begin{figure*}
 %%\includegraphics [width=2.2in, angle=270] {/usr/figs/gunroll_pic.eps} 
 \figc{gunroll_pic.eps,width=7in}
 \caption{The gunroll interface. Gunroll is able to quickly generate
          an approximate response function given comparagrams generated
          by the comparagram program.} 
\label{fig:gunroll_pic}
\end{figure*}

\chapter{Multimedia Imaging using Photoquantities}

The previous chapters discuss how to accurately recover the camera response
function of a given camera. This allows for taking pixel values
and transforming these values into photoquantimetric values. If this
transformation is applied to a image, the photoquantimetric map lies
in lightspace as opposed to imagespace. In lightspace, the transformed
images are lightvectors, which may be manipulated ordinarily as vectors
in any other vectorspace.

\section{Multimedia Imaging}

Methods of image manipulation in the software in current use manipulate
pixels and not photoquantities. These programs (such as Adobe's 
Photoshop or the superior GNU Image Manipulation Program, the GIMP),
experience problems in compositing images. Composited images often 
become ``washed-out'' due to working in imagespace. This section
demonstrates methods using photoquantities which yield superior results.
Compositing images in lightspace give tonally increased images that
qualitatively look better than those manipulated in imagespace.


Fig.~\ref{fig:agcsequence} illustrates how a camera with
automatic exposure control takes in a typical scene.
%\begin{figure}[h!]
\begin{figure}
%\begin{figure*}
  \figlcrabc{2.2in}{/hart_house_soldiers_tower_newer/s008.eps,width=2.10in}
            {2.2in}{/hart_house_soldiers_tower_newer/s054.eps,width=2.10in}
            {2.2in}{/hart_house_soldiers_tower_newer/s075.eps,width=2.10in}
  \caption{Automatic exposure as the cause of differently exposed pictures
           of the same (overlapping) subject matter:
           (a) Looking from inside Hart House Soldier's Tower,
               out through an open doorway,
               when the sky is dominant in the picture, the exposure is
               automatically reduced, and we can see the texture (clouds, etc.)
               in the sky.  We can also see University College and the CN Tower
               to the left.
           (b) As we look up and to the right, to take in subject matter
               not so well illuminated, the exposure automatically increases
               somewhat.  We can no longer see detail in the sky, but new
               architectural details inside the doorway start to become visible.
           (c) As we look further up and to the right, the dimly lit
               interior dominates the scene, and the exposure is automatically
               increased dramatically.  We can no longer see any detail in the
               sky, and even the University College building, outside, is
               washed out (overexposed).  However, the inscriptions on the wall
               (names of soldiers killed in the war) now become visible.
           (a,b,c) The differently exposed pictures of overlapping subject
           matter can be combined to extend dynamic range and tonal definition,
           or to provide a true photographic quantity ``lightspace'' for
           intelligent vision systems.
          }
  \label{fig:agcsequence}
\end{figure}
%\end{figure*}
As we look straight ahead we see mostly sky, and the exposure is quite
small.  Looking to the right, at darker subject matter, the exposure is
automatically increased.  Since the differently exposed pictures depict
overlapping subject matter, we have (once the images are registered
in regions of overlap) differently exposed pictures of identical
subject matter.  Registration typically also includes correction of barrel
distortion, correction for darkening at the corners of the image, 
etc., to make the camera become
a truly quantimetric instrument. Image registration is covered in greater 
detail in the next chapter.
In this example, we have three very differently exposed pictures depicting
parts of the University College building and surroundings.

\begin{figure*}[t!]
  \figc{hart_house_soldiers_tower_newer/s103445556575dk.eps,width=6in}
  \caption{Successive video frames of Fig.~\protect\ref{fig:agcsequence}
           are spatiotonally aligned, so that they all appear in the same
           spatial (projective) coordinates, as well as in the same tonal
           range (e.g. the same $k$ value).
          }
  \label{fig:agcsequencealigned}
\end{figure*}

\begin{figure*}[t!]
  \figlcrabc{2in}{hart_house_soldiers_tower_newer/virtualcamera0.5.eps,width=1.8in}
            {2in}{hart_house_soldiers_tower_newer/virtualcamera2.0.eps,width=1.8in}
            {2in}{hart_house_soldiers_tower_newer/virtualcamera8.0.eps,width=1.8in}
  \caption{The virtual camera shows an image composite having all the
           information from the images combined together, so that the
           exposure settings of the camera can be set retroactively
           (e.g. after the picture is taken).  Thus at picture taking time,
           we merely measure the quantities of light, and then render these
           into an image at the desired exposure later.
           (a) If we desire to see outside through the archway,
           we can adjust the exposure settings on the virtual
           camera appropriately.  These settings are roughly equivalent to
           what the automatic camera exposure would select, when pointed
           out through the open doorway.
           Here we can see the buildings outside, together with the
           needle-like object to the left of the buildings (Toronto's CN Tower).
           (b) An intermediate exposure shows some detail in the bright
           exterior, and some of detail in the darker interior, but neither
           is ideal.
           (c) The exposure may be adjusted appropriately to show the textual
           inscription on the wall inside.  This exposure is roughly equivalent
           to that of the automatic camera when pointed primarily at the
           text on the wall.
          }
  \label{fig:virtualcameras}
\end{figure*}



\subsection{Composing Images with Certainty Functions}
The inverse camera response function $f^{-1}$ can be used to convert pixel
values into photoquantities, and vice-versa (with $f$). 
%ability to go from imagespace (where image co-ordinate values yield pixels),
%to lightspace (where images co-ordinate values yield photoquantities), as well
%as from lightspace to imagespace. The first transformation may be achieved
%using the inverse of the camera response function $f^{-1}$ whereas the second
%simply uses the derived camera response function $f$. Note that the function
%does indeed have an inverse as it is monotonic and non-decreasing. 
Once the camera response function is estimated, along with
the projective coordinate transformations between successive frames
of video, the images may be brought together into a common coordinate space,
as shown in Fig~\ref{fig:agcsequencealigned}.
Each image may be brought into the same photoquantimetric range by
multiplicative scaling by an appropriate scalar exposure constant $k$.
Each pair of images provides an estimate of $q$ in areas of overlap.
Looking at
Fig.~\ref{fig:unrollapprox}, we see that the camera is most accurate as
a photoquantimetric tool in the middle of it's range (where the derivative
of the function is relatively high). Using the information about certainty
functions from the earlier sections of the paper, we note that this corresponds
to sections in the image of high certainty. Thus to combine two or more images,
a logical method is to multiplicatively scale each image to that of one 
reference frame, then assign a quantimetric value given appropriate weightings
of certainty. This procedure will have the effect of favoring  
photoquantimetric values  where the camera was most accurate at collecting data.
Consequently this gives lower significance to data which was collected in less
responsive areas. This gives us the following equation for a particular 
photoquantimetric value:
\begin{equation}
      X = \frac{\sum_{s=1}^{d} \omega_{s_X} q_{s_X}}{\sum_{s=1}^{d} \omega_{s_X}}
\end{equation}
Where $q_{s_X}$ is the photoquantimetric value of lightspace image $s$ at location
$X=(x,y)$ given $d$ lightspace images. And $\omega_{s_X}$ is the certainty of the 
photoquantimetric value at that location.

\subsection{Viewing High-Dynamic Range Images}
After composing the images, we have a single array of immense
dynamic range (as well as range resolution) and spatial extent (as well
as spatial resolution).
To display such an image on a printed page, we may wish to
reduce the dynamic range down to that of typical print media, which we can
do by presenting the data on a $Q=log(q)$ scale, as shown in
Fig~\ref{fig:logscale_new}. Similarly, a ``virtual camera'' program may
be used to display photoquantimetric maps at different camera $f$-stops 
retroactively as shown in Fig.~\ref{fig:virtualcameras}.
\begin{figure}
  \figc{hart_house_soldiers_tower_newer/s103445556575logscale.eps,width=3in}
  \caption{Cemented video frames of Fig.~\protect\ref{fig:agcsequence}
           after being spatiotonally aligned, are now displayed on a logarithmic
           Quantimetric scale, $Q$.  Here the entire dynamic range is visible,
           but the image lacks the pleasing effect of high contrast, which we
           often desire in images.
          }
  \label{fig:logscale_new}
\end{figure}

Now we seek to produce an image that shows the entire dynamic range on a
high contrast, visually appealing tone scale.  To achieve such a dynamic range
compression, we
relax the {\em monotonicity constraint}\cite{intelligentimageprocessing},
globally, while maintaining it locally.


\subsection{Mean Regression}

The first step in producing a tonally appealing image is to shift the mean
quantimetric value of local regions towards a desired global mean.  This is
done in order to reduce the maximum dynamic range so that the image lies
within some displayable range of values, while still preserving local detail.
It is also desired that brighter regions maintain a higher mean quantimetric
value than darker regions, so that the image retains the \textit{impression}
of global monotonicity (even though global monotonicity may be violated).  

Thus, regions of high quantimetric mean have their mean value reduced, while
regions of low quantimetric mean have their means increased.  In this fashion, local means are moderated toward the desired global mean.  This shift must
also be dependent upon on the original regional mean.  For instance, if all
means are moved to a single midrange value, the image will have an
undesirable flat grey quality.  Thus it is desired that regions which are
extremely light or dark have their means shifted proportionally less to keep
them proportionally lighter or darker.  

To achieve this, the local means are \textit{regressed} towards some midrange
value.  This is regression in the true sense of the word since it moderates
local means by regressing them towards a global mean\cite{galton86}.  The
formulation for the regression used was:
\be
  \alpha = (\frac{\mu_g}{\mu_l})^{a}
\ee
where $\mu_g$ is the global mean, $\mu_l$ is the local mean, and $0<a<1$ is a
scaling constant which affects the severity of the regression.  $\alpha$ is
then used as a constant multiplier on the quantimetric each of the values in
the local region, so $\alpha$ regresses the local mean $\mu_l$ towards the
global mean $\mu_g$.

The fraction $\mu_g/\mu_l$ determines the amount of shift by comparing the
local mean with global mean.  When $\mu_l > \mu_g$, the fraction $\mu_g/\mu_l <
1$ and so $\alpha < 1$ which means the local mean will be shifted downwards.
Similarly, when $\mu_l < \mu_g$, the fraction $\mu_g/\mu_l > 1$  and the local
mean is shifted upwards.  The constant $a$ affects the severity of the shift.
A typical value for $a$ is $a=1/2$, which takes the square root of
$\mu_g/\mu_l$.  This moderates the amount of shift by making the extreme light
and dark regions shift proportionally less.  Local means which are further
from the global mean are thus shifted proportionally less.   It was found that
this type of mean regression was able to bring all regions of the image into a
displayable dynamic range while preserving the impression of global
monotonicity.

\subsection{Variance Scaling}

After the local means have been regressed to the global means, 
all the light and dark regions are
visible and have detail present.  However, the local detail is not very
pronounced nor does it exhibit much local dynamic range.  To enhance local
details after the mean regression, the local image variance is increased.  The
variance is increased about the regressed mean so that the impression of
global monotonicity is preserved even though local detail will likely violate
global monotonicity.

A dilation factor to adjust local variance was calculated as: 
%dilation sounds like morphological filters - is this term OK?

%\be
%  \beta = \beta_m\log(\frac{\sigma_g}{\sigma_l}+\sigma_a)
%\ee
\be
  \beta = \beta_m\log(\frac{\sigma_g}{\sigma_l})+\sigma_a
\ee

where $\sigma_l$ is the local variance, $\sigma_g$ is the maximum variance
found in any of the local regions of the image, $\sigma_a$ and $\beta_m$ are
constants.  $\beta$ is then used as a constant multiplier on the pixel
variance (about the shifted mean) to increase the local variance.

The logarithm is used to properly scale the large range of image variances.
In some areas the variance is sufficient before processing and so no dilation
factor is used.  $\sigma_a$ is a constant, below which no dilation is
applied ($\beta=1$), and above which, the local variance is multiplied by $\beta>1$.
%which region of the logarithm is
%used since it will shift the argument $\sigma_l/\sigma_g$ of the logarithm.  
Thus, $\sigma_a$ defines a smooth threshold above which variances are increased.  $\beta_m$ is a constant
which determines the maximum amount of variance dilation which can occur in
any region.



The result is shown in Fig~\ref{fig:nonmonotonicscale_new}
\begin{figure*}
  \figc{cemented_3.42k2.87b40s1g0.0758577575029184d8.08f1n0.010606und.eps,width=6in}
  \caption{Cemented video frames of Fig.~\protect\ref{fig:agcsequence}
           after being spatiotonally aligned, are now dynamic-range compressed
           in a way that maintains high contrast by relaxing the
          {\em monotonicity constraint}\protect\cite{intelligentimageprocessing}
           so that the entire dynamic range is visible and the contrast is high,
           resulting in a visually pleasing image, typical of what might be
           desired in many multimedia applications.
          }
   \label{fig:nonmonotonicscale_new}
\end{figure*}

\subsection{Determining and processing of neighborhoods
            of large dynamic range}
Having regressed the means and expanded the variances, there
 are cases where adjustment may need to
be moderated. In particular, local regions which contain
large differences in photoquantity will result in neighborhoods
of high variance. The methods used to scale the means and variances in these
local regions tend to be particularly sensitive to local 
variance. Specifically, even small changes in variance in such an 
area produces an ``over-processed'' look. 
The first challenge in dealing with this problem is
to determine local regions which possess this characteristic.
In the images presented up to this point in the  paper
(such as \ref{fig:agcsequencealigned}),
any local region around the archway will demonstrate this
phenomena. Similarly, neighborhoods around the border of the sky and
the top of building to the right will possess this troublesome
property.

   To detect a high variance region which contains sharp
differences in lighting, certainly many techniques may
be used. Among these are harmonic analysis, Fourier analysis,
etc., but are typically computationally expensive. This fact is made
worse by the high resolution needed to produce superior
quality images as well as the floating point precision
needed to store the high dynamic range lightspace images.
For this reason, one logical approach is to calculate a
\emph{general non-uniformity factor} for each local neighborhood.
To be more specific, a local neighborhood shall be an
$n \times n$ block, calculated at intervals of one pixel in both
the vertical and horizontal directions for the entire
image. For each $n \times n$ region, first order spatial gradients
are computed, and the largest spatial gradient in the block
(i.e. the largest pixel difference in either the horizontal
or vertical direction) is compared to a threshold value.
If the largest spatial gradient is greater then the given
threshold, the block is accordingly characterized as non-uniform.
The factor by which the gradient is greater than the threshold
is similarly characterized thus accounting for its description.

   A slightly more computationally expensive method would be to
find subsections of the lightspace image which differ greatly in
photoquantity. Given that such subsections may be determined, the
question still remains as to what factor this should effect the
corresponding calculations and similarly how the size of the
the subsections should factor in.

   Given that non-uniform regions can be consistently determined,
we may then deal with the region in a variety of manners. One
possible method of dealing with the region is to vary the
block size (or a generalized local neighborhood) upon which compressive
adjustments are made, thereby reducing the local variance. Any method
chosen to act on a region in lightspace will thereby be tempered
in a logical manner. Alternatively, we may calculate a
general non-uniformity factor and directly apply it to either
a mean-shifting method, a variance shifting method or both.

\section{Developed Free-Source Tool}

A free-source perl-tk program was developed which uses the techniques
in this chapter. The application is shown in 
Fig~\ref{fig:virtual_camera}. Sliders allow the user to retroactively
change the exposure of the image, as well as the local means and 
variances. This program, called ``virtual camera'' is available from
the http:$//$comparametric.sourceforge.net site in the videorbits
package.
\begin{figure*}[h!]
  \includegraphics[width=\textwidth]{/usr/figs/virtual_camera.eps}
  \caption{The Virtual Camera Program}
   \label{fig:virtual_camera}
\end{figure*}


\chapter{Motion Analysis using Photoquantities}

The previous sections demonstrated methods of determining a 
camera's response function using a plethora of methods which give 
accurate results. Now that the output of a camera can be measured
objectively, previous results from various fields of image analysis
may be reimplemented using photoquantities rather than ambiguous
pixel values.

One of the most obvious reimplementations may be found in image
analysis, or specifically motion analysis. One common algorithm
used in motion analysis is the brightness constancy constraint equation.
\section{Improvements in motion calculation}


Traditionally, most analysis of images captured by cameras of various sort
is done on the resulting pixel values. In particular, a typical camera may 
quantify incoming light as red, green and blue values defined by the amount
of light falling on a sensor array located on the image plane of the camera.
For simplicity, this chapter will only be concerned with 
greyscale images. It should
be observed that all arguments may easily 
be generalized to RGB (red, green, blue) 
images or the like.  The problem with existing analysis 
on the resulting pixels is
that often a linearity is assumed, or is complicated by the 
fact that the pixel values
are non-linear, particularly at the extremes of the 
range of the camera's response
function. 


\subsection{The goal of lightspace imaging in terms of motion analysis}

If the pixel values are converted into photoquantities, 
many functions such as blur,
sharpening or composition yield better results. 
The focus of this chapter will show that
motion estimation in imagespace (using the brightness 
constancy constraint equation
\cite{hornandschunk}),
when done in lightspace (hereafter referred to as the 
lightspace constancy constraint
equation or LCCE)
also yield better results. Computations in lightspace 
also have the benefit of easy
estimates of exposure differences between images. 
This simple computation will also be shown in the
course of the paper. This feature is without question a benefit. 
One particular example
is the presence of AGC (automatic gain control) in 
contemporary commercial cameras. 
The computations in lightspace allow for scalar corrections, 
canceling the effect of
unwanted gain in calculations.


Finally, because of the newfound linearity in lightspace a new derivation of the 
lightspace constraint equation is presented in log lightspace where changes in gain
may be viewed as additive constants.  



\section{Applications of comparametric image processing:
         Motion Estimation}
Although most of image processing is currently done in imagespace,
many forms of image processing can be greatly improved by working
in lightspace rather than imagespace.
Lightspace provides a methodology
for treatment of the camera as a scientific measurement instrument,
as well as providing a for a new genre of visual art,
in which the camera reports the actual quantity of light
arriving from each direction in space.

To align the images spatially, the lightspace constancy constraint equation
is used. This is done by pairwise estimation of the spatiotonal coordinate transformation
between successive pairs of images in the video sequence. To accomplish the
estimation lightspace, the videoorbits algorithm \cite{intelligentimageprocessing}
was modified to accept portable lightspace maps (plms as opposed to ppms).
All calculations were done in double precision IEEE arithmetic. This
modified source code is available at http://comparametric.sourceforge.net.

\subsection{Motion Estimation of images of the same exposure in lightspace}
As a first measure of comparison between the new method proposed and current

imagespace methods, consider the {\em equally exposed} images in Fig.~\ref{fig:dataset1}.
\begin{figure}
 \figlrab{3in}{svv091.eps,width=2.9in}
         {3in}{svv098.eps,width=2.9in}
 \caption{data set for BCCE vs. LCCE comparison in order to show LCCE is much
          better even with no exposure difference. 
          used for lightspace and imagespace methods}
 \label{fig:dataset1}
\end{figure}

The estimation of the motion between the images was done first in imagespace
(using ppm images) and then in lightspace (using plms). It should be noted
that the average pixel distance between the two images is substantial using
the imagespace techniques, though not so for the lightspace technique. 
The videoorbits algorithm fails to converge to the correct parameters, where
the lightspace algorithms does not have a problem. The results in both cases
are shown in Fig.~\ref{fig:lightagainstimage} with a third image in which the
computations were done in log lightspace.
\begin{figure*}
 \figlcrabc{2in}{data_set_1_imagespace.eps,height=3in}
         {2.4in}{data_set_1_lightspace3.eps,height=3in}
         {2.5in}{data_set_1_loglightspace.eps,height=3in}
 \caption{the leftmost image (a) shows the result of the videoorbits algorithm 
          in imagespace where the center image (b) shows the result of the
          videoorbit program in lightspace. Finally, the rightmost image (c) shows
          the result of the videoorbits program using calculations in log 
          lightspace.
          }
 \label{fig:lightagainstimage}
\end{figure*}
As hard evidence to show the improvement using lightspace calculations, the
calculation of the mean squared error using imagespace maps (ppms) was
$525.5$ while in lightspace, the mean squared error was $0.075$ after
converting the estimate into a measure of pixels to fairly compare the 
differences.

  
\subsection{Motion Estimation of differently exposed images in lightspace}

Considering the fact that photoquantimetric values follow the usual
methods of composition in lightspace, we may roughly estimate the difference
in exposure between two images by simply adding all of the photoquantimetric 
values in the first image $\eta=\sum_{\text{image}}X$ (where $X$ is the
photoquantimetric value at position $(x,y)$ in the image), adding all of the
photoquantimetric values of the second image $\theta=\sum_{\text{image}}X$ and
taking the ratio $\eta/\theta$.

Following the procedure of estimating the exposure differences between images
at every iteration, the video orbits program \footnote{available at
http:$//$comparametric.sourceforge.net} 
 allows fast convergence to correct projective
parameters. To see this difference consider the differently exposed images  
of data set 2 Fig.\ref{fig:dataset2}
\begin{figure}
 \figlrab{3in}{s036.eps,width=2.9in}
         {3in}{s044.eps,width=2.9in}
 \caption{data set for BCCE vs. LCCE comparison in order to show LCCE 
          is superior in finding projective parameters in differently
          exposed images.}
 \label{fig:dataset2}
\end{figure}

The results of the videoorbits algorithm when carrying out operations in 
imagespace, lightspace and log lightspace is shown in
Fig.~\ref{fig:pairwisecemented}. Again comparing the mean squared errors,
the MSE for the imagespace computations was $2191.9$ while for
the case of the plms (lightspace) was merely $3.5$
\begin{figure*}
 \figlcrabc{2.4in}{cemented36and44_in_imagespace.eps,width=2.4in}
           {2.4in}{cemented36and44in_lightspace.eps,width=2.4in}
           {2.5in}{cemented36and44_in_log_lightspace.eps,width=2.4in}
 \caption{Comparison of the proposed method with traditional methods.
          (a) Motion estimation in imagespace, e.g. by using the
              Brightness Constancy Constraint Equation (BCCE)
              for motion estimation
              between $f_1$ and $f_2$ gives poor results when there is a large
              jump from one image to the next.
          (b) Parameter estimation in lightspace, e.g. by using the
              Lightspace Change Constraint Equation (LCCE) to estimate the
              motion between $f^{-1}(f_1)$ and $f^{-1}(f_2)$
              gives much better results.
          (c) Parameter estimation in log lightspace, e.g. by using the
              Log Lightspace Change Constraint Equation (LLCCE) to estimate the
              motion between $F^{-1}(f_1)$ and $F^{-1}(f_2)$
              gives the best results.  Moreover, the mathematical formulation
              in log lightspace is much simpler than that of either imagespace
              or lightspace.
         }
 \label{fig:pairwisecemented}
\end{figure*}


\subsection{Compositing images spatiotonally in lightspace}

Each image may be brought into the same photoquantimetric range by
multiplicative scaling by an appropriate scalar exposure constant $k$
(found using the estimation technique of the previous section).
Each image provides an estimate of $q$ in areas of overlap.
Looking at
Fig.~\ref{fig:unrollapprox}, we see that the camera is most accurate as
a photoquantimetric tool in the middle of it's range (where the derivative
of the function is relatively high). Using the information about certainty
functions from the earlier sections of the paper, we note that this corresponds
to sections in the image of high certainty. Thus to combine two or more images,
a logical method is to multiplicatively scale each image to that of one 
reference frame, then assign a quantimetric value given appropriate weightings
of certainty. This procedure will have the effect of favoring  
photoquantimetric values  where the camera was most accurate at collecting data.
Consequently this gives lower significance to data which was collected in less
responsive areas. This gives us the following equation for a particular 
photoquantimetric value:
\begin{equation}
      X = \frac{\sum_{s=1}^{d} \omega_{s_X} q_{s_X}}{\sum_{s=1}^{d} \omega_{s_X}}
\end{equation}
Where $q_{s_X}$ is the photoquantimetric value of lightspace image $s$ at location
$X=(x,y)$ given $d$ lightspace images. And $\omega_{s_X}$ is the certainty of the 
photoquantimetric value at that location.

Once the camera response function is estimated, along with
the projective coordinate transformations between successive frames
of video, the images may be brought together into a common coordinate space,
as shown previously in Fig~\ref{fig:agcsequencealigned}.

%\begin{figure}
% \figc{hart_house_soldiers_tower_newer/s103445556575logscale.eps,width=6in}
%  \caption{Cemented video frames of Fig.~\protect\ref{fig:agcsequence}
%           after being spatiotonally aligned, are now displayed on a logarithmic
%           Quantimetric scale, $Q$.  Here the entire dynamic range is visible,
%           but the image lacks the pleasing effect of high contrast, which we
%           often desire in images.
%          }
%  \label{fig:logscale}
%\end{figure}


Finally, the images may be composed correctly in lightspace to produce the
figure shown in the previous chapter as Fig~\ref{fig:nonmonotonicscale_new}. 
%A preliminary image shown in
%log lightspace is presented in Fig.~\ref{fig:logscale}.
%\begin{figure*}[t!]
%  \figc{cemented_3.42k2.87b40s1g0.0758577575029184d8.08f1n0.010606und.eps,width=6in}
%  \caption{Cemented video frames of Fig.~\protect\ref{fig:agcsequence}
%           after being spatiotonally aligned, are now dynamic-range compressed
%           in a way that maintains high contrast by relaxing the
%          {\em monotonicity constraint}\protect\cite{intelligentimageprocessing}
%           so that the entire dynamic range is visible and the contrast is high,
%           resulting in a visually pleasing image, typical of what might be
%           desired in many multimedia applications.
%          }
%  \label{fig:nonmonotonicscale}
%\end{figure*}


\section{Motion estimation with the
         the Lightspace Change Constraint Equation (LCCE)}
Hundreds of papers have been published on the problems of motion
estimation and frame alignment~\cite{barron95}, and much of this work
is based on the so-called Brightness Constancy Constraint Equation
(BCCE) of Horn and Schunk~\cite{hornandschunk}.
However, a more general formulation, suitable for mediated reality,
is based on the Lightspace Change Constraint Equation
(LCCE)~\cite{intelligentimageprocessing}.

Tsai and Huang~\cite{tsai81} pointed out that the elements of the
projective {\it group} give the true camera motions with respect to a
planar surface, hence the exact match between our chosen model,
and the situation depicted in Fig.~\ref{fig:agcsequence}.
Tsai and Huang explored the group structure associated with
images of a 3-D rigid planar patch, as well as the associated {\em Lie
algebra}, although they assume that the correspondence problem has
been solved.  The solution presented in this paper (which does not
require prior solution of correspondence) also relies on projective
group theory~\cite{artin}.

When the change from one image to another is small, optical
flow~\cite{hornandschunk} may be used.  In 1-D, the traditional
optical flow formulation assumes each point $x$ in frame $t$ is a
translated version of the corresponding point in frame $t+\Delta t$,
and that $ \Delta x \, \mbox{and} \,
\Delta t$ are chosen in the ratio $\Delta x / \Delta t = u$, the
translational flow velocity of the point in question.

It will be shown in the computations below
that it is advantageous to use the log quantimetric quantity, $Q$
in calculations of quantimetric flow~\cite{intelligentimageprocessing}
on $Q(x,t)$ as described by
\bea
  F(x,t) = E(x+\Delta x,t+\Delta t), \; \;
  \forall (x,t)
  \label{eq:motionofeachpointinimage}
\eea
In \ref{eq:motionofeachpointinimage}, $u$ is the translational flow velocity,
${\bf x}=[x,y]$ is the spatial coordinate,
and $F(x,t) = F(Q(x,t))$ is the logarithm of the camera response function
of the log photoquantity, $Q$.

Expanding the right hand side of (\ref{eq:motionofeachpointinimage})
in a Taylor series, and canceling 0th order terms gives:
$uF_x+F_t + h.o.t. =0$, where
$F_x = dF(x,t)/dx$ and $F_t = dF(x,t)/dt$
are the spatial and temporal derivatives respectively, and $h.o.t.$
denotes higher order terms.
Typically, the higher order terms are neglected, giving the expression
for the lightflow at each point in one of the two images:
\be
  u F_x + F_t \approx 0
  \label{eq:bccenoagc}
\ee
However, when automatic gain control is involved (e.g. when there can be
a gain change between successive frames of video), we have
\be
  u F_x + F_t \approx -K
  \label{eq:bcce}
\ee


Consider the lightflow velocity given by (\ref{eq:bcce})
For `projective-flow' (`p-flow')~\cite{intelligentimageprocessing}, substitute
$u=\frac{ax+b}{cx+1}-x$ into (\ref{eq:bcce}).
If we do not wish to apply the ad-hoc weighting
scheme of~\cite{intelligentimageprocessing}, to solve this equation,
we may still estimate the parameters of projectivity in a simple manner,
still based on solving a linear system of equations.
To do this, we write
the Taylor series of $u$:
\be
  u + x = b + (a-bc)x + (bc-a)c x^2 + (a-bc)c^2x^3+ \cdots
  \label{eq:tayloroftaylor}
\ee
and use the first 3 terms, obtaining
enough degrees of freedom to account for the 3 parameters being
estimated.
Letting the squared error due to higher order terms in the Taylor series
approximation be
$\varepsilon=\sum (-h.o.t.)^2 = \sum ((b + (a-bc-1)x + (bc-a)cx^2)E_x+E_t)^2 $,
${\bf q}_2=(bc-a)c$, ${\bf q}_1=a-bc-1$, and ${\bf q}_0=b$,
and differentiating with respect to each of
the 4 parameters of $\bf q$, setting the derivatives equal to zero, and
solving, gives the linear system of equations for `unweighted projective flow':
{\small
\bea
  \left[ \! \!
    \begin{array} {cccc}\sum x^4F_x^2 &\!\!\!\! \sum x^3F_x^2 &\!\!\!\! \sum x^2 F_x^2 &\!\!\!\! \sum x^2 F_x \!\! \\
                        \sum x^3F_x^2 &\!\!\!\! \sum x^2F_x^2 &\!\!\!\! \sum x   F_x^2 &\!\!\!\! \sum x   F_x \!\! \\
                        \sum x^2F_x^2 &\!\!\!\! \sum x  F_x^2 &\!\!\!\! \sum     F_x^2 &\!\!\!\! \sum     F_x \!\! \\
                        \sum x^2F_x   &\!\!\!\! \sum x  F_x   &\!\!\!\! \sum     F_x   &\!\!\!\! \sum     1   \!\!
    \end{array}  \! \!
  \right] \!\!\!
  \left[ \!
    \begin{array} {c} \!\!\!  \alpha \!\! \\
                      \!\!\!  \beta  \!\! \\
                      \!\!\!  \gamma \!\! \\
                      \!\!\!  K
    \end{array} \! \!
  \right] \!\!\!
  = \!
  - \!\!
  \left[ \! \!
    \begin{array} {c}  \sum x^2 F_x F_t \!\! \\
                       \sum x   F_x F_t \!\! \\
                       \sum     F_x F_t \!\! \\
                       \sum         F_t \!\! \\
    \end{array} \! \!
  \right]
  \label{eq:pflow}
\eea
}%end\small
where $\alpha=b$ , $\beta=(a-bc)$, and $\gamma=(bc-a)c$,
and where the two dimensionality of the images is, for simplicity, not shown
explicitly.

\section{Why the LCCE out performs the BCC}
The previous sections show empirically that the LCCE is superior to
the BCC. To some extent, the BCC may be improved by reducing the non-linearity
of the camera response function by excluding pixels on the extremes of the
camera response function (i.e. pixels of value $<25$ and pixels $>230$).
However, as shown previously, cameras may not even respond linearly in
the central region of their associated response function. The fact that
the BCC (and the LCCE) are both derived from first-order Taylor expansions
implies a linearity in the pixels which is not generally present. The 
first chapters show relatively easy methods of discovering the response
function of a given camera along with a host of free-source programs
for computing them. For this reason, it would make no sense not to 
transform imagespace vectors into lightspace vectors. The added benefit
of sensibly dealing with exposure differences certainly adds to this 
argument. 

Earlier, it was shown that a camera does not in fact record radiance
or irradiance. However, the mathematics carried out implies that a
camera does infact record irradiance. To point out one case,
Trucco and Verri state ``It is common experience that, under most
circumstances, the appearent brightness of moving objects remain
constant. We have seen $\ldots$ that the image irradiance is 
propotional to the scene radiance in the direction of the optical
axis of the camera; if we assume the proportiality factor is the 
same across the entire image plane, the constancy of the apparent
brightness of the observed scene can be written at the stationarity
of the image brightness $E$ over time $\ldots$'' \cite{truccoandverri}.
From this initial flaw, later computations become inaccurate. Working
in the correct units, photoquantities corrects for this flaw.



\chapter{Future Directions}
\section{Analytically solving comparametric equations}

The work above has indicated some of the future directions of the
work. The first being that comparagram lead to comparametric 
equations which are relatively easy to solve numerically. However,
solving analytically are very interesting problem which have eluded
even the best mathematicians in the field of functional equations.
In the passed year, Professor Janos Acz\'{e}l was presented with the
problem of comparametric equations. Some information about the 
problem was added by Professor Acz\'{e}l, however, to this date there
has been no clear method of finding the complete family of functions
which are solutions to a given comparametric equation. More research
in this area of mathematics may lead to clear analytical solutions to 
comparametric equations.

\section{Improved Error Estimation}

The method of log unrolling data using splines as of yet has had
no error estimation calculated. This should be non-trivial but 
possible to calculate. If some inital error is assumed $\epsilon$,
then the propagation of the error in further points may be calculated.
The error between points may calculated by noting that  the error
introduced by the spline may be bounded using standard methods. This
implies that the error throughout the entire response function may
be bounded at any point.
 
\section{From Lightvectors to Lightmodules}
A clear statement was made early in this thesis that calculations
would only be done on greyscale values. From this lightvectors were
formulated which lie in a corresponding vector space. When the theory
is expanded into RGB values, the problem becomes more complicated.
Instead of a vectors in a vectorspace, the problem becomes R-modules in
a module space. Virtually no work has been done in this regard and 
is certainly worthy of more research.

\section{Improved Tools}
Many of the programs used for multimedia imaging, such as GIMP, allow
for plug-ins. Given some understanding of the existing code base, it
is possible to construct image plug-ins would would correctly account 
for camera response functions. A tremendous amount of code was developed
in conjuction with this thesis and the scientific papers published
around the time of this thesis. The essential programs needed to demonstrate
the results are working and quite easy to use. The developement of 
\emph{Lightspace GIMP plug-ins} would allow more casual users of 
image manipulation programs to enjoy the benefits of calculations 
in lightspace. 

\section{Adding certainty to the LCCE}
The chapter on motion estimation extensively used the video orbits
algorithm to estimate projective changes of two or more images. At
this point, the certainty function (the derivative of the camera
response function), was mentioned as a means of 
estimating motion more effectively. Theoretically, this would favor
photoquantimetric values where the camera is most accurate. This too
would improve the algorithm. However, at the time of writing this
document, no scientific tests have been done to confirm this 
hypothesis. This feature is yet to be wrtten into the video orbits 
program and is undoubtedly worthy of research.


\chapter{Conclusion}

The previous chapters have shown not only that the transformation
of images into lightspace is easy, but working in lightspace gives a more
true meaning to mathematical operations on images.
Consequently processed images,
or computations on images, such as in computer vision, etc.,
are far superior to those done in imagespace. In fields such as motion
estimation and multimedia imaging in which mathematics is the basis 
of so much of the work, it is truly surpising that so little attention
has been given to the data collected by a camera and what it actually 
represents. Not paying attention to such a critical factor as a camera's
response function largely invalidates many of the mathematical derivations
based on the data. A good example of this is motion estimation. It is
also surprising that a large percentage of those who use image 
manipulation utilities such as Adobe's Photoshop or GIMP allow such 
poor results in such actions as image composition. 

Furthermore, it is evident that users of such programs do in fact care
about the mathematics ``behind the scenes'' by preference for Gaussian 
blur over other less gentle methods. However, when using functions such
as Gaussian blur or the sharpen function, the actions are not mathematically
truthful unless the images are brought into lightspace. Only then are the
effects what they are intended to be.

The use of photoquantites in calculations on images strengthens the
mathematics involved. The use of lightspace ensures that the mathematical
framework of these calculations is true from the very start. Not doing
so creates uncertainty in any calculations, as the data itself can only
be deemed unreliable. 

Though much data has been presented throughout this thesis supporting the
mathematics behind lightspace calculations, the superiority is also 
seen in collecing the data. For example, when the data for motion estimation
was collected, algorithms converged several times faster when lightspace
images were used. Though imagespace computations may also converge, the 
process largely stumbled to a reasonable answer.


\bibliographystyle{apalike}
\bibliography{\path/conf/chirplet}

\end{document}








