Periodicity-in-perspective gives rise to the so-called "p-chirp" (a special kind of chirp comprising projected harmonic components). By "p-chirp", we mean the kind of effect you see if you look down a railway line toward the vanishing point.
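As a rough 1-D illustration, a p-chirp is what a pure sinusoid becomes under a projective change of coordinates: its spatial frequency drifts the way railway ties appear to crowd together toward the vanishing point. The particular one-parameter warp x → x/(cx + 1) and the numbers below are assumptions for demonstration only:

```python
import numpy as np

# A 1-D sketch of a "p-chirp": a pure sinusoid whose coordinate has been
# projectively transformed, x -> x / (c*x + 1). The values of f and c are
# illustrative, not taken from any particular image.
f = 10.0    # spatial frequency of the underlying periodic structure
c = 0.5     # strength of the perspective (chirping) effect

x = np.linspace(0.0, 1.0, 1000)
p_chirp = np.cos(2.0 * np.pi * f * x / (c * x + 1.0))
```

Setting c = 0 recovers an ordinary periodic sinusoid; increasing c makes the local frequency drift more strongly across the signal.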
If we take a picture of that view down a railway line, set that picture on a flat surface, and take a picture of that picture, we observe that not only are the borders of the picture no longer parallel, but the "chirp rate" has changed. A picture of a picture often looks strange, which is not surprising; what is surprising is that a picture of a picture of a flat object (such as a whiteboard, or a wall with writing on it) can also look kind of strange. Twice-projected planar surfaces "live" inside an 8-parameter group action of coordinate transformation operators.
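To make the 8-parameter group action concrete in coordinates: a planar homography can be written as a 3×3 matrix with its overall scale fixed (here by pinning the bottom-right entry to 1), which leaves 8 free parameters. The matrix entries below are made up purely for illustration:

```python
import numpy as np

# Sketch: apply an 8-parameter planar homography (projective coordinate
# transformation) to 2-D points via homogeneous coordinates. Fixing the
# bottom-right entry at 1 leaves 8 free parameters -- the group in which
# twice-projected planar surfaces "live".
H = np.array([[1.10, 0.02,  5.0],
              [0.01, 0.95, -3.0],
              [1e-4, 2e-4,  1.0]])   # illustrative values

def apply_homography(H, pts):
    """Map an Nx2 array of points through homography H."""
    ones = np.ones((pts.shape[0], 1))
    homog = np.hstack([pts, ones]) @ H.T     # lift to homogeneous coords
    return homog[:, :2] / homog[:, 2:3]      # divide out the projective scale

# Corners of a hypothetical 640x480 picture, before and after projection.
corners = np.array([[0.0, 0.0], [640.0, 0.0], [640.0, 480.0], [0.0, 480.0]])
warped = apply_homography(H, corners)
```

The group structure is what keeps a "picture of a picture" in the same family: composing two such transformations is just the matrix product `H2 @ H1`, which is again a homography.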
In a picture of a picture, the location of the vanishing point has changed with respect to the point where the edges of the new image meet. (In the original image the edges were parallel, and so met only at infinity, because the original picture was rectangular; the new image is NOT rectangular, so its edges meet at a finite point.)
A special class of "picture-of-picture" will result in an image where objects that were periodic in the original scene become periodic again in the "picture-of-picture". We refer to this recovering of periodic structures as "dechirping" (using the language of machine vision, we say that we have rectified the image with the appropriate homography of the plane that maximizes Fourier spectral sharpness).
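In one dimension, finding the warp that maximizes Fourier spectral sharpness can be sketched as a simple parameter search. Everything here is an illustrative assumption: the signal, the single-parameter warp x → x/(cx + 1) in place of the full 8-parameter homography, and the sharpness measure (fraction of spectral energy in the strongest bin):

```python
import numpy as np

# Sketch of "dechirping" in 1-D: search over a candidate projective
# parameter c_hat, resample the chirped signal in the corresponding
# rectified coordinates, and keep the warp whose Fourier spectrum is
# most concentrated. All parameter values are illustrative.
f, c_true = 10.0, 0.5
x = np.linspace(0.0, 1.0, 4096)
chirped = np.cos(2.0 * np.pi * f * x / (c_true * x + 1.0))

def spectral_sharpness(sig):
    """Fraction of spectral energy in the single strongest bin."""
    power = np.abs(np.fft.rfft(sig)) ** 2
    return power.max() / power.sum()

u = np.linspace(0.0, 0.5, 4096)           # candidate dechirped coordinate
best_c, best_score = 0.0, -1.0
for c_hat in np.linspace(0.0, 1.0, 101):
    x_of_u = u / (1.0 - c_hat * u)        # inverse of the projective warp
    dechirped = np.interp(x_of_u, x, chirped)
    score = spectral_sharpness(dechirped)
    if score > best_score:
        best_c, best_score = c_hat, score
```

With these values the search should land near c_true, and the dechirped signal's spectrum should be markedly sharper than that of the chirped original, since the rectified signal is (nearly) a pure tone.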
When we dechirp the image, we note that the spatial frequency is no longer changing: there are the same number of railway ties per pixel all the way from the top to the bottom of the "dechirped" image. Of course, the ties that were closer to the camera (on Mass Ave) are sharper and better defined than those further from the camera.
Superresolution image mosaicking, or resolution enhancement as it is traditionally called in the literature, involves the use of multiple low-resolution video frames to produce a single high-resolution frame by filling in some of the spaces between the pixels.
Traditionally, resolution enhancement has been accomplished by shifting images around to make them fit together, or by using a more general affine model. However, the traditional Euclidean or affine models fail to account for the "chirping" effect that arises from camera panning and tilting.
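The traditional shift-based approach can be sketched minimally as "shift-and-add": several low-resolution frames, each offset by a known sub-pixel amount, are interleaved onto a finer grid. Assuming, for simplicity, that the shifts are known integers on the fine grid (real systems must estimate them, and the projective case replaces the shifts with homographies):

```python
import numpy as np

# Minimal "shift-and-add" resolution enhancement sketch: each low-res
# frame lands on its own sub-grid of a factor-times-finer image, filling
# in spaces between the pixels of any single frame.
def shift_and_add(frames, shifts, factor):
    """Interleave low-res frames, offset by integer sub-pixel shifts,
    onto a factor-times-finer grid and average overlapping samples."""
    h, w = frames[0].shape
    high = np.zeros((h * factor, w * factor))
    count = np.zeros_like(high)
    for frame, (dy, dx) in zip(frames, shifts):
        high[dy::factor, dx::factor] += frame
        count[dy::factor, dx::factor] += 1
    count[count == 0] = 1          # leave never-sampled cells at zero
    return high / count
```

With four frames shifted by (0,0), (0,1), (1,0), and (1,1) on a twice-finer grid, every high-resolution pixel is sampled exactly once, so the original fine-grained scene is recovered exactly in this idealized setting.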
Suppose, for example, that we took a second picture of the railway tracks from that foot bridge you see off in the distance, running above the tracks, but looking back toward the first camera (so that both pictures were of the same objects, namely the railway ties between Mass Ave and the bridge). We could then "dechirp" the second image and put the two together. Notice how the original image is sharp at Mass Ave, medium sharp between Mass Ave and the bridge, and then blurry under the bridge.
The composite image would go from sharp to medium to sharp again (sharp near Mass Ave and near the bridge, but medium sharp in between); the railway ties near Mass Ave would get their sharpness from the first picture, and those near the bridge would get theirs from the second picture. With many more pictures (as in video) it is not hard to imagine how one can make a very sharp picture, since each picture captures some aspect of the scene very clearly, and even when it does not, the combined effect of all the images gives a "pixel-filling-in" effect like that of Synthetic Aperture Radar (SAR).
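The take-the-sharpest-source idea can be sketched as a per-pixel weighted average of two registered, dechirped images, with weights given by a crude local sharpness measure. Both the squared-gradient measure and the weighting scheme below are illustrative assumptions, not a specific published method:

```python
import numpy as np

# Sketch of sharpness-weighted compositing: at each pixel, weight each
# registered image by a crude local detail measure (squared gradient
# magnitude), so regions that are sharp in one source dominate there.
def local_sharpness(img):
    gy, gx = np.gradient(img)
    return gx ** 2 + gy ** 2

def sharpness_composite(a, b, eps=1e-12):
    wa = local_sharpness(a) + eps   # eps keeps weights positive in flat areas
    wb = local_sharpness(b) + eps
    return (wa * a + wb * b) / (wa + wb)
```

In the railway example, the ties near Mass Ave would carry large weight in the first (dechirped) image and the ties near the bridge in the second, so the composite inherits the best of each.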
This general framework using periodicity-in-perspective is based on the concept of p-chirps. A related concept of q-chirps has also appeared with applications in radar. You can take a look at some p-chirps and q-chirps.