Wavelet Representations

We have used wavelet representations for much of our work on visual inference.  Such representations are one of many possible choices for the front end of a computer vision system, but we have found that the wavelet framework serves us very well in terms of flexibility, speed and accuracy.

At one level of visual inference, we use algorithms based on complex spatial filters that are similar in spirit to Gabor functions, but have different numerical properties and cannot be specified easily through analytic expressions in space.  They are usually used in a multi-rate pyramid.  Their spatial patterns are similar to those of the Dual Tree Complex Wavelet Transform, but with fewer directional channels, whilst being rotationally separable.  For more of this early work, see the paper by Bharath and Ng.  You may also be interested in the equivalence between convolutional networks and wavelet transforms.  For some hints, see the brief explanation of the origins of Cortexica’s technology.
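To make the idea of a complex oriented filter bank in a multi-rate pyramid concrete, here is a minimal sketch. It is not the filter set described above, which has no simple analytic form; an ordinary complex Gabor kernel is used as a stand-in. The function names, the kernel size, wavelength and sigma values, and the 2x2 averaging used for decimation are all assumptions chosen for illustration. Only the Python standard library is used so that the sketch runs anywhere; a practical system would use an array library.

```python
import cmath
import math


def complex_gabor(size, wavelength, theta, sigma):
    """Zero-mean complex Gabor kernel (size must be odd), as nested lists.

    Stand-in for the filters in the text, which are not analytic.
    """
    half = size // 2
    c, s = math.cos(theta), math.sin(theta)
    env, kern = [], []
    for y in range(-half, half + 1):
        env_row, kern_row = [], []
        for x in range(-half, half + 1):
            u = x * c + y * s  # coordinate along the modulation direction
            e = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            env_row.append(e)
            kern_row.append(e * cmath.exp(2j * math.pi * u / wavelength))
        env.append(env_row)
        kern.append(kern_row)
    # Remove the DC component so that flat image regions give no response.
    ratio = sum(map(sum, kern)) / sum(map(sum, env))
    return [[k - ratio * e for k, e in zip(kr, er)]
            for kr, er in zip(kern, env)]


def filter_valid(image, kernel):
    """Correlate image with kernel over the 'valid' region (no padding)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            acc = 0j
            for a in range(k):
                for b in range(k):
                    acc += image[i + a][j + b] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out


def downsample(image):
    """Halve resolution by 2x2 averaging (a crude low-pass and decimate)."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * i][2 * j] + image[2 * i][2 * j + 1]
              + image[2 * i + 1][2 * j] + image[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w)] for i in range(h)]


def oriented_pyramid(image, levels=2, orientations=4, size=7):
    """Return responses[level][k]: complex response map for orientation k."""
    thetas = [k * math.pi / orientations for k in range(orientations)]
    bank = [complex_gabor(size, 4.0, t, 2.0) for t in thetas]
    pyramid = []
    for _ in range(levels):
        pyramid.append([filter_valid(image, kern) for kern in bank])
        image = downsample(image)  # the same bank is reused at each scale
    return pyramid
```

Applied to an image of vertical stripes with a period of four pixels, the channel whose modulation runs along x responds strongly at the finest level, while the channel modulated along y gives essentially no response. Reusing one small filter bank on successively decimated images is what makes the pyramid cheap to compute.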

We have been using such representations along with simple non-linearities that simulate, in a crude way, the population normalisation and encoding effects found in biology, allowing us to infer orientation.  If you are familiar with computer vision, you can think of this as a generalisation of multiscale gradient fields. We think this approach complements spatial gradients very nicely.
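One simple way to turn a set of oriented channel responses into an orientation estimate is sketched below. It is an illustration under stated assumptions, not the non-linearity used in our systems: the function name encode_orientation, the squared-magnitude divisive normalisation, the constant sigma, and the double-angle vector sum are all choices made for this example. The input is assumed to be the response magnitudes, at one pixel, of K channels tuned to the angles k*pi/K.

```python
import cmath
import math


def encode_orientation(magnitudes, sigma=1e-3):
    """Estimate orientation from K oriented channel magnitudes at one pixel.

    magnitudes[k] is assumed to come from a channel tuned to angle k*pi/K.
    Returns (orientation in [0, pi), dominance in [0, 1)).
    """
    K = len(magnitudes)
    # Divisive normalisation: each channel's energy is divided by the
    # pooled energy of the whole population (plus a small constant).
    pool = sum(m * m for m in magnitudes)
    normalised = [m * m / (sigma * sigma + pool) for m in magnitudes]
    # Orientation is periodic in pi, so angles are doubled before the
    # vector sum and the resulting phase is halved afterwards.
    z = sum(n * cmath.exp(2j * math.pi * k / K)
            for k, n in enumerate(normalised))
    orientation = (0.5 * cmath.phase(z)) % math.pi
    dominance = abs(z)  # near 1: one orientation dominates; near 0: none
    return orientation, dominance
```

With four channels, a response confined to the second channel, encode_orientation([0.0, 1.0, 0.0, 0.0]), gives an orientation of about pi/4 with dominance close to 1, whereas equal responses in all channels give a dominance close to 0. In that case the returned orientation is not meaningful.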

A rendering of orientation encoding is shown immediately below. Examples generated in real time, dating from several years ago, can be found on our Video page.

 

 

Orientation Dominance as a 3D "floating" Render