Wednesday, January 12, 2011

Eye Tracking, Head Tracking, and Ray Tracing

It is possible, with today's technology and processor power, to provide photorealistic 3-D images by combining three rather astounding technologies. All this requires is something I call CYAO (Cheating Your Ass Off).

Eye tracking technology has been around for a long while. Basically, it tracks what your eyes are currently looking at. So for various activities you can gauge what is happening using signals people aren't always aware of: where they look. Most of this is unconscious, and there's some really cool information that comes out of eye tracking. But one of the interesting things you can do is only display stuff where people are looking. If you're not looking somewhere, you are largely unaware that you're not seeing it; your brain fills that stuff in. But if, whenever you look at part of a screen, you see stuff there, you have every reason to think that stuff is still there even when it isn't, or that it has significantly more detail than it actually does. After all, we see very little outside the area we are focused on. Dan Dennett, in "Consciousness Explained" (pages 361-2), relays an early version of this technology involving words: a word on the screen was changed while the person registered a saccade (a quick eye movement to a new location), yet everything looked normal to the person in the contraption, which used a bite bar and a beam of light reflecting off the eyeball (the technology has come along since then, but it's fine for the example). When the eyes moved, the computer figured out where they were going and swapped the word at the destination for a different word. Inside the machine it looks like nothing is wrong, but really the world is being changed "before your eyes".
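
To make the trick concrete, here's a minimal sketch of a gaze-contingent frame: full detail in a small window around the reported gaze point, and a cheap downsampled version everywhere else. The numbers (an 8x downsample, a 100-pixel fovea) are arbitrary choices for illustration, and gaze_xy stands in for whatever your eye tracker reports.

```python
import numpy as np

def gaze_contingent(frame, gaze_xy, fovea_px=100):
    """Sharp near the gaze point, cheap everywhere else.

    frame    -- H x W x 3 array, the fully detailed image
    gaze_xy  -- (x, y) pixel the eye tracker says the viewer is looking at
    fovea_px -- radius of the region kept at full detail
    """
    h, w = frame.shape[:2]
    # Cheap peripheral version: keep one pixel in 64, stretch it back out.
    coarse = frame[::8, ::8].repeat(8, axis=0).repeat(8, axis=1)[:h, :w]
    yy, xx = np.mgrid[0:h, 0:w]
    gx, gy = gaze_xy
    fovea = (xx - gx) ** 2 + (yy - gy) ** 2 < fovea_px ** 2
    out = coarse.copy()
    out[fovea] = frame[fovea]  # paste full detail only where they look
    return out
```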

Now this is just interesting. Just as interesting is head tracking, which is a more general trick for doing some semi-3D viewing. Basically, if you know where somebody's head is, and they move their head, you can treat the monitor like a window and adjust the picture with regard to the person's head, thereby making it seem that they can see new stuff by moving slightly.
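
The "monitor as a window" math is just an off-center viewing frustum. Here's a sketch, assuming you already have the head position in metres relative to the screen's centre (the function name and units are mine):

```python
def window_frustum(head, screen_w, screen_h, near=0.1):
    """Asymmetric frustum that treats the monitor as a window.

    head               -- (x, y, z) of the head relative to the screen
                          centre, in metres, z > 0 toward the viewer
    screen_w, screen_h -- physical screen size in metres
    Returns (left, right, bottom, top) at the near plane: the four
    numbers you'd hand to something like glFrustum.
    """
    x, y, z = head
    s = near / z  # similar triangles: project screen edges onto the near plane
    left = (-screen_w / 2 - x) * s
    right = (screen_w / 2 - x) * s
    bottom = (-screen_h / 2 - y) * s
    top = (screen_h / 2 - y) * s
    return left, right, bottom, top
```

Move your head right and the frustum skews left, so new geometry slides into view at the window's edge, exactly the parallax you get peering through a real window.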

Now that's awesome too. And in fact, you can do that with a webcam and some good software that tracks the face; frankly, even the old bite-bar Egor-style stuff would work. This is all within today's technology. It's entirely possible to create a 3-D image that is foggy and undetailed until you look at it, and becomes crisp wherever you look. That's an interesting set of ideas, but mostly worthless until you bring in a third technology we have: ray tracing.
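
Here's a rough sketch of the webcam half, using OpenCV's stock face detector. The face box's centre stands in for the head position; treating the box width as a distance cue is my own crude assumption:

```python
import cv2

# OpenCV ships a pre-trained frontal-face detector with its data files.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)  # the webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces):
        x, y, w, h = faces[0]
        head_x, head_y = x + w / 2, y + h / 2  # face centre ~ head position
        head_z = 100.0 / w                     # crude: bigger face, closer head
        # hand (head_x, head_y, head_z) to the window-frustum math above
cap.release()
```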

Ray tracing is the rendering algorithm of the future. It's used in films like Lord of the Rings to get photorealistic images. The way rasterization works is that it takes a bunch of numbers and multiplies them to figure out exactly what the scene should look like, applying different effects to core objects and solving the screen image as a problem. This takes a lot of math, and in fact that's what GPUs do: a heck of a lot of really fantastic math, multiplying large matrices of numbers together to figure out exactly what you should see. Ray tracing is different. It works by taking individual pixels on the screen and solving for them. If you were a photon arriving at that exact pixel, what color would you be? To solve this, we take a line in 3D and see what it would hit in the scene. How much of that object's color does the ray pick up? Does the ray bend through a glass of water? Does it reflect? How did the light get there? (Eye-based ray tracing is inverted from real photons, in that we trace backwards to see what light source caused it.) The effect allows for absolutely beautiful scenes, which you can stream to a laptop if you have four rather massive servers solving for each pixel. It's great if you're doing a film and have months to render an image, but if you want it in real time, you need a supercomputer, because it's simply too hard to do.
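
Here's the "solve one pixel" idea boiled down to a toy: one sphere, one light, one ray per pixel, traced backwards from the eye. The whole scene is made up for illustration:

```python
import numpy as np

def trace(origin, direction, centre, radius, light):
    """One ray's answer: if a photon arrived along this line, what color?

    Intersect the ray with a single sphere, then apply simple Lambert
    shading. We trace backwards from the eye toward the light source.
    """
    oc = origin - centre
    b = 2.0 * direction.dot(oc)
    c = oc.dot(oc) - radius ** 2
    disc = b * b - 4.0 * c        # direction is unit length, so a == 1
    if disc < 0:
        return np.zeros(3)        # missed everything: black background
    t = (-b - disc ** 0.5) / 2.0  # nearest intersection along the ray
    if t < 0:
        return np.zeros(3)        # sphere is behind the eye
    hit = origin + t * direction
    normal = (hit - centre) / radius
    to_light = light - hit
    to_light = to_light / np.linalg.norm(to_light)
    return np.array([1.0, 0.4, 0.4]) * max(normal.dot(to_light), 0.0)

# Fire one line through every pixel on the screen and solve it.
W, H = 320, 240
eye = np.array([0.0, 0.0, 0.0])
sphere_c = np.array([0.0, 0.0, -3.0])
light = np.array([2.0, 2.0, 0.0])
image = np.zeros((H, W, 3))
for j in range(H):
    for i in range(W):
        d = np.array([(i - W / 2) / W, (H / 2 - j) / W, -1.0])
        image[j, i] = trace(eye, d / np.linalg.norm(d), sphere_c, 1.0, light)
```

Even this toy fires W x H rays per frame; add reflection, refraction, and many bounces per pixel, and you can see where the four massive servers go.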

Well, it's too hard to do if you actually do it. If you cheat your ass off, you can save yourself 99% of the work and do it on today's technology. What do you need to do? Only show the detail where the person is looking! If you know where they are looking, that's where you need the mind-blowing detail. The rest of the crap can be absolutely fuzzy and phoned in. You don't need to solve a pixel that the person isn't going to see. 99% of the work done by the server is pointless, because the viewer is only looking at 1% of the screen at any point in time. If you know where they are looking, you can give that spot the detail and ignore the rest. You can easily track the person's head and eyes and feed that into the ray tracing algorithm: because you know where the ray's end point is, you know exactly which line to trace. Head tracking pretty much comes as a freebie when you know where the pixel is actually going. You know what angle they are viewing that pixel from, and therefore what should be in the scene at that angle. So rather than a flat picture, you can do the same amount of work and get an insanely good 3-D picture.
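
Here's the cheat as arithmetic: a per-pixel ray budget that spends everything near the gaze point and almost nothing in the periphery. The falloff curve and the numbers are arbitrary assumptions, but the bookkeeping shows where the enormous saving comes from:

```python
import numpy as np

def ray_budget(w, h, gaze_xy, fovea_px=80, peak=64):
    """Rays to spend on each pixel, given where the viewer is looking.

    Full quality (peak rays per pixel) inside the fovea, decaying to a
    single phoned-in ray out in the periphery.
    """
    yy, xx = np.mgrid[0:h, 0:w]
    gx, gy = gaze_xy
    dist = np.sqrt((xx - gx) ** 2 + (yy - gy) ** 2)
    falloff = np.exp(-np.maximum(dist - fovea_px, 0) / fovea_px)
    return np.maximum((peak * falloff).astype(int), 1)

budget = ray_budget(1920, 1080, gaze_xy=(960, 540))
uniform = 1920 * 1080 * 64
print(f"foveated: {budget.sum():,} rays vs uniform: {uniform:,} "
      f"({1 - budget.sum() / uniform:.0%} saved)")
```

And because every ray is already built from the tracked head and eye position, the correct viewing angle for each pixel falls out for free, which is where the 3-D comes in.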

When you combine technologies, you can get absolutely mind-blowing results.

So there's my billion-dollar idea for the day (and one actually worth that billion). It's possible with today's technology and processor power to provide photorealistic 3-D images, because if you know your audience, you know what they want to see (what they are looking at) and can give them what they want (astounding detail) by not giving them massive amounts of what they don't care about (what they aren't looking at).
