The Problem With Photorealism
Many people assume that modern graphics technology is now capable of rendering photorealistic video games. If you define photorealistic as meaning that any still frame is indistinguishable from a real photo, then we can get pretty close. Unfortunately, the problem with video games is that they are not still frames - they move.
What people don’t realize is that modern games rely on faking a lot of stuff, and that means they only look photorealistic in a very tight set of circumstances. They rely on you not paying close attention to environmental details so you don’t notice that the grass is actually just painted onto the terrain. They precompute environmental convolution maps and bake ambient occlusion and radiance information into level architecture. You can’t knock down a building in a game unless it is specifically programmed to be breakable and all the necessary preparations are made. Changes in levels are often scripted, with complex physical changes and graphical consequences being largely precomputed and simply triggered at the appropriate time.
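To make the bake-then-lookup idea concrete, here is a toy sketch of the pipeline described above. Every name in it is illustrative rather than taken from any real engine: expensive lighting work happens once offline, and the runtime does nothing but a lookup - which is exactly why destroying an occluder at runtime leaves stale lighting behind.

```python
def bake_ambient_occlusion(vertices, occluders):
    """Offline pass: darken each vertex by the number of nearby occluders.
    Real bakers cast hundreds of rays per sample point; this stand-in
    just scales brightness by occluder count to show the structure."""
    baked = {}
    for v in vertices:
        blocked = sum(1 for o in occluders if abs(o - v) < 1.0)
        baked[v] = max(0.0, 1.0 - 0.25 * blocked)
    return baked

def shade(vertex, baked):
    """Runtime pass: a single lookup, no lighting math at all.
    If an occluder is knocked down at runtime, this baked value is now
    wrong - which is why destructible geometry must be planned ahead."""
    return baked[vertex]

verts = [0.0, 2.0, 4.0]   # hypothetical 1D vertex positions
walls = [0.5, 3.8]        # hypothetical occluders near the endpoints
lightmap = bake_ambient_occlusion(verts, walls)
print([shade(v, lightmap) for v in verts])  # endpoints darker than middle
```

The asymmetry is the whole trick: the baking pass can be arbitrarily slow because it runs once, while the runtime pass is nearly free because it assumes the world never changes.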
Modern photorealism, like the 3D graphics of ages past, is smoke and mirrors, the result of very talented programmers and artists using tricks of the eye to convince you that a level is much more detailed and interactive than it really is. There’s nothing wrong with this, but we’re so good at doing it that people think we’re a heck of a lot closer to photorealistic games than we really are.
If you want to go beyond simple photorealism and build a game that feels real, you have to deal with a lot of extremely difficult problems. Our best antialiasing methods are perceptual, because doing real antialiasing is prohibitively expensive. Global illumination is achieved by deconstructing a level’s polygons into an octree and using the GPU to cubify moving objects in realtime. Many advanced graphical techniques in use today depend on precomputed values and static geometry. The assumption that most of the world is probably going to stay the same is a powerful one, and enables huge amounts of optimization. Unfortunately, as long as we make that assumption, none of it will ever feel truly real.
Trying to build a world that does not take anything for granted rapidly spirals out of control. Where do you draw the line? Does gravity always point down? Does the atmosphere always behave the same way? Is the sun always yellow? What counts as solid ground? What happens when you blow it up? Is the object you’re standing on even a planet? Imagine trying to code an engine that can take into account all of these possibilities in realtime. This is clearly horrendously inefficient, and yet there is no other way to achieve a true dynamic environment. At some point, we will have to make assumptions about what will and will not change, and these sometimes have surprising consequences. A volcanic eruption, for example, drastically changes the atmospheric composition and completely messes up the ambient lighting and radiosity.
OK, well, at least we have dynamic animations, right? Wrong. Almost all modern games still use precomputed animations. Some fancy technology can occasionally interpolate between them, but that’s about it. We have no reliable method of generating animations on the fly that don’t look horrendously awkward and stiff. It turns out that trying to calculate a limb’s shortest path from point A to point B while avoiding awkward positions and obstacles amounts to solving the Euler-Lagrange equation over an n-dimensional manifold! As a result, it’s incredibly difficult to create smooth animations, because our ability to fluidly shift from one animation to another is extremely limited. This is why we still have weird-looking walk animations and occasional animation jumping.
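The interpolation mentioned above is usually nothing more than a cross-fade. Here is a minimal sketch of that standard trick (the names and joint angles are made up for illustration): blending two precomputed poses linearly, with no notion of obstacles, momentum, or joint limits - which is exactly why transitions can look stiff.

```python
def blend_poses(pose_a, pose_b, t):
    """Linearly cross-fade between two keyframe poses.

    pose_a, pose_b: lists of joint angles (radians) sampled from two
    precomputed animation clips; t: blend factor in [0, 1].
    This is roughly all a typical engine does when transitioning
    between animations - it never asks whether the intermediate
    poses are physically plausible or collision-free.
    """
    return [(1 - t) * a + t * b for a, b in zip(pose_a, pose_b)]

# Cross-fading from a "walk" pose to an "idle" pose over five frames:
walk = [0.50, -0.25, 0.10]   # hypothetical joint angles
idle = [0.00,  0.00, 0.00]
for frame in range(5):
    t = frame / 4
    print(blend_poses(walk, idle, t))
```

Everything in between the two authored clips is just a weighted average; nothing in the math knows what an elbow is allowed to do.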
The worst problem, however, is that of content creation. The simple fact is that at photorealistic detail levels, it takes way too long for a team of artists to build a believable world. Even if we had super amazing 3D modelers that would allow an artist to craft any small object in a matter of minutes (which we don’t), artists aren’t machines. Things look real because they have a history behind them, a reason for their current state of being. We can make photorealistic CGI for movies because each scene is scripted and has a well-defined scope. If you’re building GTA V, you can’t somehow manage to come up with three hundred unique histories for every single suburban house you’re building.
Even if we did invent a way to render photorealistic graphics, it would all be for naught until we figured out a way to generate obscene amounts of content at incredibly high levels of detail. Older games weren’t just easier to render, they were easier to make. There comes a point where no matter how many artists you hire, you simply can’t build an expansive game world at a photorealistic level of detail in just 3 years.
People always talk about realtime raytracing as the holy grail of graphics programming without realizing just what is required to take advantage of it. Photorealism isn’t just about processing power, it’s about content.
First of all, I really enjoyed your article. I feel like I can hear you talking through your text.
Question for you: Are we getting to a point where we can use 3D cameras to help model objects/places? I remember the game LA Noire used 3D cameras for the actors, which made for very convincing faces, expressions, and even the lips matched what was being said.
Certainly not perfect, but it would seem we're getting close to a point where we can take hi-res 3D imagery/video (possibly items on a turntable) and create great models with textures to match?
"I feel like I can hear you talking through your text."
God. This is all I want someone to comment on something I write. High praise indeed, and well deserved.
The thing is, LA Noire is still scripted. The facial expressions were recorded from the actor by multiple high-resolution cameras at multiple angles. The amount of effort that went into it was pretty cool.
Still, it's scripted. Basically, if scene A happens, play the "Annoyed" sequence; if scene B happens, play "Surprised", etc.
Currently, we can take hi-res 3D scanned objects and put them into games, provided you aren't scanning in cloth. Scan a cup, car, object, food - all doable objects - but once you introduce dynamic material into the mix, the calculations for it become troublesome. Scanning objects into a game is done somewhat like you said: put the object on a turntable and scan it. Other times it's having the camera move around a stationary object; a reverse turntable?
I believe that a large part of the problems you mention could be solved by very advanced AI and, probably, quantum computers.
But I feel like it's not a game anymore; it's much closer to a simulation of reality. I don't think that's the point of games: you don't have to interact with everything on the stage to get the most fun; quite the opposite, I think.
I look at the loading scenes in GTAV and I wish the game looked like that, rather than the 'uncanny valley' that it is.
Borderlands II is a great example of not trying to do something you can't; instead it has great art that you can lose yourself in.
Calling 3D modelers artists but not programmers is ridiculous. Maybe you didn't mean it that way. Just a remark.
You say something and just after that say the contrary, or don't apply the same reasoning. I promised my virtual partner not to stay too long in front of the computer because of «someone is wrong». You wrote somewhere else in your blog that «EVERYBODY IS AN IDIOT», so I assume you are too, just like myself.
For instance: «If you're building GTA V, you can't somehow manage to come up with three hundred unique histories for every single suburban house you're building.» You just wrote before: «Modern photorealism, like the 3D graphics of ages past, is smoke and mirrors, the result of very talented programmers and artists using tricks of the eye to convince you that a level is much more detailed and interactive than it really is.» Those issues can be considered the same "problem", even if you come up with different conclusions. "300 unique histories" might not be required; it's not impossible, and it can be done in 3 years.
You seem to have a lot of time to write. I find this *very* *very* admirable. I'm impressed. I'd like to write as much as you.
Had a quick listen to your tunes on Bandcamp; try https://www.youtube.com/watch?v=ijiAas3DI2U
HTH
Can you elaborate on what you meant in this paragraph?
> If you want to go beyond simple photorealism and build a game that feels real, you have to deal with a lot of extremely difficult problems. Our best antialiasing methods are perceptual, because doing real antialiasing is prohibitively expensive.
What do you mean by "real antialiasing"?
Real antialiasing requires you to either calculate precise coverage over a given pixel for shadows, or sample the pixel a couple hundred times. This is obviously not actually *required* for the vast majority of antialiasing, which is why perceptual algorithms usually do the job. It becomes important for things like the sun shining through dense forest foliage - extremely tiny holes can let through a significant amount of light, and those holes are often MUCH smaller than a single pixel, which breaks a lot of our antialiasing algorithms.
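To make the cost concrete, here is a toy version of the brute-force approach: estimating a pixel's coverage by testing a grid of sub-pixel sample points (the `foliage` scene function is purely illustrative). A gap much thinner than a pixel only registers if a sample happens to land in it, which is why hundreds of samples per pixel get expensive fast - and why coarse sampling misses it entirely.

```python
def pixel_coverage(inside, px, py, n=16):
    """Estimate how much of pixel (px, py) is covered by geometry
    by testing an n x n grid of sub-pixel sample points.
    `inside(x, y)` returns True where geometry covers that point.
    Even n=16 means 256 samples for a single pixel."""
    hits = 0
    for i in range(n):
        for j in range(n):
            # Sample at the center of each sub-pixel cell.
            x = px + (i + 0.5) / n
            y = py + (j + 0.5) / n
            if inside(x, y):
                hits += 1
    return hits / (n * n)

# A sub-pixel gap: geometry everywhere except a thin vertical slit,
# like sunlight poking through dense foliage.
def foliage(x, y):
    return not (0.40 < x % 1.0 < 0.45)  # slit is 5% of the pixel wide

print(pixel_coverage(foliage, 0, 0))        # fine grid detects the slit
print(pixel_coverage(foliage, 0, 0, n=2))   # coarse grid misses it entirely
```

The coarse grid reports the pixel as fully covered, so the sliver of light vanishes; perceptual antialiasing algorithms, which work on the already-rendered image, can't recover light that was never sampled in the first place.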