Image Processing for the Average Person Part 1 – The Human Visual System

There have been a few things in the news about how computers work with images that I feel are a bit misinformed, and I believe these reports mislead the average person about how image processing actually works. Since a huge part of my background, and my current business, involves image processing, I thought I would start a series of posts about how computers manipulate images, from zooming in and out to performing enhancement tasks. I hope to give a decent explanation so that you, the reader, will have a better understanding and will be able to separate fiction from fact and politics from reality.

First I want to start with the most important part: the human visual system. It is indeed a miracle of evolution, and it works pretty well in helping us navigate our environment. You might be surprised to find, however, that it is not quite as good as you think it is.

Our Visual System's Makeup

The Human Visual Pathway, Miquel Perello Nieto, CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0), via Wikimedia Commons

A simplified view of our visual system is that it is made up of the eye, which can be thought of as a camera, the optic nerve (the USB cable), and the brain (the computer). In reality there are many more pieces to this: multiple structures within the eye, different regions of the brain that detect and react to different aspects of what the eye sees, and so on. There are numerous articles online that break this down in much more detail, but for this series the simplified point of view is enough.

Physical Characteristics

Roughly speaking, the specs of our visual system are as follows:

  • Much like a physical camera, the performance of our eyes depends a lot on age and the quality of the parts. As we get older, our lenses stop performing as well, we develop eye floaters, and other issues crop up.
  • Our eyes have receptors at the back (on the retina) that fire when light hits them.
  • Each eye has what is known as a blind spot, located at the optic disc, where the optic nerve exits the eye. There are no light receptors there, so no data can reach the brain from that area. Do not feel bad, though: all vertebrate eyes have a blind spot, so it is not just us humans.
  • Our eyes can adapt to a range of light intensity spanning almost ten orders of magnitude. They cannot operate over all of that range at once, however.
  • While we assume our eyes see equally well everywhere, we actually only see sharply with the fovea. The fovea is the part of the retina that receives light from the central two degrees of our field of view. To get an idea of how small that is, imagine holding a quarter or a half-dollar coin at arm's length.
  • It is hard to assign a resolution such as 1920×1080 to the human eye, since resolution depends on characteristics like sensor size and pixel density. Instead, we can think in terms of how many pixels it would take to make up our vision. Our total field of view works out to roughly 580 megapixels (a rough back-of-the-envelope version of this calculation follows the list). Keep in mind that this number covers our entire field of view, and that only the fovea focuses light sharply.
  • The fovea itself can be thought of as only about seven megapixels. Because our eyes are constantly in motion, they build our field of view by sending many of these small snapshots to the brain. Outside the fovea, each snapshot is estimated to carry only about one megapixel of data.
  • If we want to think in terms of a video frame rate, our eyes and brain can only process a paltry fifteen or so frames a second. We perceive smooth motion thanks to an illusion called beta movement, which is chiefly due to how long the visual cortex holds onto the data coming in from the eyes.
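
To make the megapixel and dynamic-range figures above a bit less mysterious, here is a back-of-the-envelope sketch in Python. The field-of-view and acuity numbers are commonly cited estimates that I am assuming purely for illustration, not measurements of any particular eye, but plugging them in lands right around the ~580 megapixel figure quoted in the list.

```python
# Rough "resolution" of the human visual field. The inputs are assumptions
# for illustration: a roughly 120 x 120 degree field of view and an acuity
# of about 0.3 arc-minutes per "pixel" (commonly cited estimates).
import math

FIELD_OF_VIEW_DEG = 120           # assumed width/height of the field of view
ACUITY_ARCMIN = 0.3               # assumed smallest resolvable detail

# Pixels along one edge = field of view in arc-minutes / acuity
pixels_per_edge = FIELD_OF_VIEW_DEG * 60 / ACUITY_ARCMIN    # 24,000
total_pixels = pixels_per_edge ** 2                         # 576,000,000

print(f"~{total_pixels / 1e6:.0f} megapixels over the whole field of view")

# The dynamic-range bullet: ten orders of magnitude is a factor of 10**10,
# which in photographic "stops" (doublings of light) is about 33 stops.
stops = math.log2(10 ** 10)
print(f"~{stops:.0f} stops of total adaptable range")
```

Run as written, this prints roughly 576 megapixels and 33 stops, the same ballpark as the numbers in the list above.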

Processing Characteristics

Once the signals from our eyes reach the brain, they pass through several systems that build up to us cognitively recognizing what we are looking at. Again, I am not going to get into the weeds here, as there is already plenty of information online about what goes on in regions of the brain such as the visual cortex.

The comparison to a simple camera breaks down here, because our brain has the final say in what we actually see. Parts of the brain work together to help us understand the individual pieces of, say, a chair, but in the end we decide, “Oh, I’m looking at a chair.” The brain can also be fooled in its interpretation of what the physical parts of the visual system are seeing.

Two profiles or a vase? – Ian Remsen, CC0, via Wikimedia Commons

Optical illusions are an example of this trickery. They happen when the brain tries to fill in gaps in the information it needs to make a decision. The brain can also misinterpret the geometric properties of an object, which results in an incorrect analysis.

The brain merges everything the eyes see into our view of the world. Our eyes are constantly moving, making minute adjustments to what they are focusing on even while we look at a single thing. The brain interpolates the incoming information to fill in the gaps left by parts of the eye like the blind spot and faulty receptors. In short, the brain does a lot of processing to generate what we perceive as our ordinary field of view.
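
As a loose analogy for that gap-filling, here is a tiny sketch (my own toy example, assuming NumPy is available; the signal and the “blind” positions are made up for illustration) that estimates missing readings from their neighbours, much as the brain papers over the blind spot.

```python
# Loose analogy, not a model of the brain: estimate missing samples by
# interpolating from the surrounding ones. Assumes numpy is installed.
import numpy as np

positions = np.arange(10)                     # ten "receptor" positions
signal = np.sin(positions / 3.0)              # what each receptor reports
known = np.array([0, 1, 2, 3, 6, 7, 8, 9])    # positions 4 and 5 are "blind"

# Fill the blind positions with a straight-line guess from their neighbours
filled = np.interp(positions, known, signal[known])

print("true values:  ", np.round(signal[4:6], 3))
print("interpolated: ", np.round(filled[4:6], 3))
```

The filled-in values are only an approximation of what is really there, which is the same trade-off the brain makes when it fills in our blind spot.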

That is a lot of information, so the brain takes as many shortcuts as it can when processing our visual data. We may have a supercomputer sitting on our necks, but it can only process so much so quickly. This is another place where comparing our eyes to a camera breaks down: much of what we see is based on perception rather than on purely physical processing. Our brains cannot store every “megapixel” of what we see in memory either, so we remember things as concepts and objects rather than as each individual component of a picture. We simply do not have enough storage to keep everything.

This finely balanced system of optics, processing, and simplification can also break down. We see fast motion as blurred, or, well, as having motion blur. Our eyes cannot move fast enough, and our brain cannot process fast enough, to register the individual images, so the brain blends them into a blur so we understand that something is in motion. On a sufficiently high-frame-rate, high-definition display, however, fast-moving objects can be shown without that blur, which can clash with our brain’s processing and give us a headache. Think of it as the brain trying to keep up and basically hitting a blue screen.
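
Computers produce motion blur by doing something similar on purpose: blending several snapshots of a moving object taken over a short window of time. The sketch below (my own toy example, assuming NumPy is available, with made-up sizes and speeds) shows that averaging the snapshots smears a crisp bar across every position it passed through.

```python
# Toy illustration of motion blur as temporal blending. Assumes numpy.
import numpy as np

HEIGHT, WIDTH = 32, 64
SNAPSHOTS = 8                     # how many "frames" get blended together

frames = []
for t in range(SNAPSHOTS):
    frame = np.zeros((HEIGHT, WIDTH))
    x = 4 + 4 * t                 # a 4-pixel-wide bar moving right each frame
    frame[:, x:x + 4] = 1.0
    frames.append(frame)

# Averaging the snapshots smears the bar over every position it occupied
blurred = np.mean(frames, axis=0)

row = blurred[0]
print("sharp frame peak: ", frames[0].max())            # 1.0   -> crisp bar
print("blurred peak:     ", row.max())                  # 0.125 -> washed out
print("blur width (px):  ", int(np.count_nonzero(row))) # 32    -> smeared trail
```

A high-frame-rate display effectively skips this blending, which is part of why fast motion on one can look unnaturally sharp.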

This is probably a good place to wrap things up for today. I mainly wanted to give a quick explanation of how we see the world, to show that our own eyes are not always perfect and that a lot goes on behind the scenes to enable our vision.

Next time I’ll start going into some specifics, including showing the difference between what we see and what a computer might see.