It’s hard to take a photo through a window without picking up reflections of the objects behind you. To solve that problem, professional photographers sometimes wrap their camera lenses in dark cloths affixed to windows by tape or suction cups. But that’s not a terribly attractive option for a traveler using a point-and-shoot camera to capture the view from a hotel room or a seat in a train.
MIT researchers will present a new algorithm that, in a broad range of cases, can automatically remove reflections from digital photos. The algorithm exploits the fact that photos taken through windows often feature two nearly identical reflections, slightly offset from each other. It works only when the window produces such a double reflection, which typically happens with double-paned or very thick windows. With a double-paned window, one reflection comes from the inner pane and a second from the outer pane; a thick window likewise produces one reflection from its inner surface and another from its outer surface.
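The double-reflection setup described above can be sketched as a simple image-formation model: the captured photo is the transmitted scene plus a reflection and a slightly offset, attenuated copy of that same reflection. This is a toy illustration of the idea, not the researchers' actual formulation; the function name, offset, and attenuation factor are assumptions.

```python
import numpy as np

def composite_through_window(transmitted, reflection, dx, dy, alpha=0.7):
    """Toy double-reflection model (an illustrative assumption, not the
    paper's exact math): the photo is the scene behind the glass plus a
    reflection and an offset, attenuated copy of that reflection."""
    # Second bounce off the other glass surface: same reflection,
    # shifted by (dx, dy) pixels and dimmed by `alpha`.
    shifted = np.roll(reflection, shift=(dy, dx), axis=(0, 1))
    return transmitted + reflection + alpha * shifted
```

A single bright pixel in the reflection thus shows up twice in the composite, at two nearby positions with different intensities, which is exactly the regularity the algorithm exploits.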
Both the reflected image and the image captured through the window have the statistical regularities of so-called natural images. The basic intuition is that at the level of small clusters of pixels, in natural images — unaltered representations of natural and built environments — abrupt changes of color are rare. And when they do occur, they occur along clear boundaries. So if a small block of pixels happens to contain part of the edge between a blue object and a red object, everything on one side of the edge will be bluish, and everything on the other side will be reddish.
In computer vision, the standard way to try to capture this intuition is with the notion of image gradients, which characterize each block of pixels according to the chief direction of color change and the rate of change. The researchers found, however, that this approach didn't work well for separating reflections from the scene behind the glass.
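The gradient characterization mentioned above can be sketched with finite differences: for each block, estimate the chief direction of color change and the average rate of change. This is a standard construction for illustration, not the researchers' specific recipe.

```python
import numpy as np

def block_gradient(block):
    """Summarize one pixel block by its chief direction of change and its
    average rate of change (a standard gradient sketch, not the paper's
    exact method)."""
    gy, gx = np.gradient(block.astype(float))     # per-pixel finite differences
    magnitude = np.hypot(gx, gy).mean()           # average rate of change
    direction = np.arctan2(gy.mean(), gx.mean())  # chief direction, in radians
    return direction, magnitude
```

For a block whose intensity ramps up left to right, this yields a direction of 0 radians (horizontal) and a magnitude equal to the ramp's slope; a flat block yields zero magnitude.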
The algorithm instead divides images into 8-by-8 blocks of pixels; for each block, it calculates the correlation between each pixel and each of the others. The aggregate statistics over all the 8-by-8 blocks in 50,000 training images proved a reliable way to distinguish reflections from images shot through glass.
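The per-block correlation statistic can be sketched as follows: collect the non-overlapping 8-by-8 patches of an image, treat each of the 64 pixel positions as a variable, and compute the 64-by-64 matrix of correlations between positions across patches. This is a sketch of the idea; how the researchers aggregate these statistics over 50,000 training images may differ.

```python
import numpy as np

def block_correlations(image, block=8):
    """Correlation between each pixel position and each of the others,
    estimated across all non-overlapping `block`-by-`block` patches
    (an illustrative sketch of the statistic, not the paper's pipeline)."""
    h, w = image.shape
    # Flatten each patch into a 64-element row vector.
    patches = [
        image[y:y + block, x:x + block].ravel()
        for y in range(0, h - block + 1, block)
        for x in range(0, w - block + 1, block)
    ]
    data = np.asarray(patches, dtype=float)   # shape: (num_patches, 64)
    # Columns are pixel positions; corrcoef gives their 64x64 correlations.
    return np.corrcoef(data, rowvar=False)
```

In a natural image, nearby positions tend to be strongly correlated; a composite containing a doubly reflected image perturbs these statistics, which is what lets the training data separate the two layers.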
The ideas here could find their way into routine photography if the algorithm is made more robust and incorporated into the toolboxes used in digital photography. It could also aid robot vision in the presence of confusing glass reflections.