
C# to C++ Tutorial - Part 4: Operator Overload



If you are familiar with C#, you should already know the difference between C#’s struct and class declarations. Namely, a struct is a value type and a class is a reference type, meaning that if you pass a struct to a function, its default behavior is for the entire struct to be copied into the function’s parameter, so any modifications made to it won’t affect whatever was passed in. On the flip side, a class is a reference type, so a reference is passed into the function, and any changes made through that reference will be reflected in the object that was originally passed into the function.

// Takes an integer, or a basic value type
public static int add(int v)
{
  v+=3;
  return 4+v;
}

public struct Oppa
{
  public string gangnam;
}

// Takes a struct, or a complex value type
public static Oppa style(Oppa g)
{
  g.gangnam="notstyle";
  return g;
}

public class Psy
{
  public int style;
}

// Takes a class, or a reference type
public static void change(Psy psy)
{
  psy.style=5;
}

// Takes an integer, but forces it to be passed by reference instead of by value.
public static int addref(ref int v)
{
  v+=3;
  return 4+v;
}

int a = 0;
int b = add(a);
// a is still 0
// b is now 7

int c = addref(ref a); // C# requires the ref keyword at the call site, too
// a is now 3, because it was passed by reference
// c is now 7

Oppa s1;
s1.gangnam="style";
Oppa s2 = style(s1);
//s1.gangnam is still "style"
//s2.gangnam is now "notstyle"

Psy psy = new Psy();
psy.style=0;
change(psy);
// psy.style is now 5, because it was passed by reference
C++ also lets you pass parameters by reference and by value; however, it is more explicit about what is happening, so there is no default behavior to know about. If you simply declare the parameter as the type itself, for example (myclass C, int B), it will be passed by value and copied. If, however, you use the reference symbol that we’ve used before in variable declarations, it will be passed by reference. This happens no matter what - if a reference is passed into a function that takes a value, a copy will still be made.

// Integer passed by value
int add(int v)
{
  v+=3;
  return 4+v;
}

class Psy
{
public:
  int style;
};

// Class passed by value
Psy change(Psy psy)
{
  psy.style=5;
  return psy;
}

// Integer passed by reference
int addref(int& v)
{
  v+=3;
  return 4+v;
}

// Class passed by reference
Psy changeref(Psy& psy)
{
  psy.style=5;
  return psy;
}

int horse = 2;
int korea = add(horse);
// horse is still 2
// korea is now 9

int horse2 = 2;
int korea2 = addref(horse2);
// horse2 is now 5
// korea2 is now 9

Psy psy;
psy.style = 0;
Psy ysp = change(psy);
// psy.style is still 0
// ysp.style is now 5

Psy psy2;
psy2.style = 0;
Psy ysp2 = changeref(psy2);
// psy2.style is now 5
// ysp2.style is also 5
However, in order to copy something, C++ needs to know how to properly copy your class. This gives rise to the copy constructor. By default, the compiler will automatically generate a copy constructor for your class that simply invokes all the default copy constructors of whatever member variables you have, just like C#. If, however, your class is holding on to a pointer, then this is going to cause a giant mess when two classes are pointing to the same thing and one of them deletes what it’s pointing to! By specifying a copy constructor, we can deal with the pointer properly:

class myString
{
public:
  // The copy constructor, which copies the string over instead of copying the pointer
  myString(const myString& copy)
  {
    size_t len = strlen(copy._str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,copy._str,sizeof(char)*len);
  }
  // Normal constructor
  myString(const char* str)
  {
    size_t len = strlen(str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,str,sizeof(char)*len);
  }
  // Destructor that deallocates our string
  ~myString()
  {
    delete [] _str;
  }

private:
  char* _str;
};

This copy constructor can be invoked manually, but it will simply be implicitly called whenever it’s needed. Of course, that isn’t the only time we need to deal with our rogue pointer that screws things up. What happens when we set our class equal to another class? Remember, a reference cannot be changed after it is created. Observe the following behavior:

int a = 3;
int b = 2;
int& ra = a;
int* pa = &a;

b = a; //b is now 3
a = 0; //b is still 3, but a is now 0
b = ra; // b is now 0
a = 5; // b is still 0 but now a is 5
b = *pa; // b is now 5
b = 8; // b is now 8 but a is still 5

ra = b; //a is now 8! This assigns b's values to ra, it does NOT change the reference!
ra = 9; //a is now 9, and b is still 8! ra STILL refers to a, and NOTHING can change that.

pa = &b; // Now pa points to b
a = *pa; // a is now 8, because pointers CAN be changed.
*pa = 7; // Now b is 7, but a is still 8

int*& rpa = pa; //Now we have a reference to a pointer
//rpa = 5; // COMPILER ERROR, rpa is a reference to a POINTER
int** ppa = &pa;
//rpa = ppa; // COMPILER ERROR, rpa is a REFERENCE to a pointer, not a pointer to a pointer!
rpa = &a; //now pa points to a again. This does NOT change the reference!
b = *pa; // Now b is 8, the same as a.
So somehow, we have to overload the assignment operator! This brings us to operator overloading. C# operator overloads are defined globally: they are static functions that take both a left and a right argument. By default, C++ operator overloads take only the right argument; the left side of the expression is implied to be the class itself, so C++ operators are not static. C++ does have global operators, but they are defined outside the class, and the assignment operator isn’t allowed as a global operator - you have to define it inside the class. All the overloadable operators are shown below with appropriate declarations:

class someClass
{
someClass operator =(anything b); // me=other
someClass operator +(anything b); // me+other
someClass operator -(anything b); // me-other
someClass operator +(); // +me
someClass operator -(); // -me (negation)
someClass operator *(anything b); // me*other
someClass operator /(anything b); // me/other
someClass operator %(anything b); // me%other
someClass& operator ++(); // ++me (prefix; conventionally returns *this)
someClass operator ++(int); // me++ (postfix; conventionally returns the old value by copy)
someClass& operator --(); // --me
someClass operator --(int); // me--
// All operators can TECHNICALLY return any value whatsoever, but for many of them only certain values make sense.
bool operator ==(anything b); 
bool operator !=(anything b);
bool operator >(anything b);
bool operator <(anything b);
bool operator >=(anything b);
bool operator <=(anything b);
bool operator !(); // !me
// These operators do not usually return someClass, but rather a type specific to what the class does.
anything operator &&(anything b); 
anything operator ||(anything b);

anything operator ~();
anything operator &(anything b);
anything operator |(anything b);
anything operator ^(anything b);
anything operator <<(anything b);
anything operator >>(anything b);
someClass& operator +=(anything b); // Should always return *this;
someClass& operator -=(anything b);
someClass& operator *=(anything b);
someClass& operator /=(anything b);
someClass& operator %=(anything b);
someClass& operator &=(anything b);
someClass& operator |=(anything b);
someClass& operator ^=(anything b);
someClass& operator <<=(anything b);
someClass& operator >>=(anything b);
anything operator [](anything b); // This will almost always return a reference to some internal array type, like myElement&
anything operator *();
anything operator &();
anything* operator ->(); // This has to return a pointer or some other type that has the -> operator defined.

anything operator ->*(anything a);
anything operator ()(anything a1, anything a2, ...);
anything operator ,(anything b);
operator otherThing(); // Allows this class to have an implicit conversion to type otherThing
void* operator new(size_t x); // These are called when you write new someClass()
void* operator new[](size_t x); // new someClass[num]
void operator delete(void*x); // delete pointer_to_someClass
void operator delete[](void*x); // delete [] pointer_to_someClass

};

// These are global operators that behave more like C# operators, but must be defined outside of classes, and a few operators do not have global overloads, which is why they are missing from this list. Again, operators can technically take or return any value, but normally you only override these so you can handle some other type being on the left side.
someClass operator +(anything a, someClass b);
someClass operator -(anything a, someClass b);
someClass operator +(someClass a);
someClass operator -(someClass a);
someClass operator *(anything a, someClass b);
someClass operator /(anything a, someClass b);
someClass operator %(anything a, someClass b);
someClass& operator ++(someClass& a);
someClass operator ++(someClass& a, int); // Note the unnamed dummy-parameter int - this differentiates between prefix and suffix increment operators.
someClass& operator --(someClass& a);
someClass operator --(someClass& a, int); // Note the unnamed dummy-parameter int - this differentiates between prefix and suffix decrement operators.

bool operator ==(anything a, someClass b);
bool operator !=(anything a, someClass b);
bool operator >(anything a, someClass b);
bool operator <(anything a, someClass b);
bool operator >=(anything a, someClass b);
bool operator <=(anything a, someClass b);
bool operator !(someClass a);
bool operator &&(anything a, someClass b);
bool operator ||(anything a, someClass b);

someClass operator ~(someClass a);
someClass operator &(anything a, someClass b);
someClass operator |(anything a, someClass b);
someClass operator ^(anything a, someClass b);
someClass operator <<(anything a, someClass b);
someClass operator >>(anything a, someClass b);
someClass operator +=(anything a, someClass b);
someClass operator -=(anything a, someClass b);
someClass operator *=(anything a, someClass b);
someClass operator /=(anything a, someClass b);
someClass operator %=(anything a, someClass b);
someClass operator &=(anything a, someClass b);
someClass operator |=(anything a, someClass b);
someClass operator ^=(anything a, someClass b);
someClass operator <<=(anything a, someClass b);
someClass operator >>=(anything a, someClass b);
someClass operator *(someClass a);
someClass operator &(someClass a);

someClass operator ->*(anything a, someClass b);
someClass operator ,(anything a, someClass b);
void* operator new(size_t x);
void* operator new[](size_t x);
void operator delete(void* x);
void operator delete[](void*x);
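To see why these global overloads matter, here is a minimal sketch (Fixed is a made-up type for illustration, not part of the tutorial): with only a member operator, the other type can never appear on the left side.

struct Fixed
{
  int raw; // value scaled by 256 - a hypothetical fixed-point type for illustration
  Fixed operator +(Fixed b) { Fixed r = { raw + b.raw }; return r; } // fixed + fixed
};

// Global overload handling an int on the LEFT side, which a member operator cannot do:
Fixed operator +(int a, Fixed b) { Fixed r = { a*256 + b.raw }; return r; }

Fixed f = { 512 };  // 2.0 in this made-up format
Fixed g = f + f;    // member operator: fine
Fixed h = 3 + f;    // global operator: fine - without it, this line would not compile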
We can see that the assignment operator mimics the arguments of our copy constructor. For the most part, it does the exact same thing; the only difference is that existing values must be destroyed, an operation that should mostly mimic the destructor. We extend our previous class to have an assignment operator accordingly:

class myString
{
public:
  // The copy constructor, which copies the string over instead of copying the pointer
  myString(const myString& copy)
  {
    size_t len = strlen(copy._str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,copy._str,sizeof(char)*len);
  }
  // Normal constructor
  myString(const char* str)
  {
    size_t len = strlen(str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,str,sizeof(char)*len);
  }
  // Destructor that deallocates our string
  ~myString()
  {
    delete [] _str;
  }

  // Assignment operator, does the same thing the copy constructor does, but also mimics the destructor by deleting _str. NOTE: It is considered bad practice to call the destructor directly. Use a Clear() method or something equivalent instead.
  myString& operator=(const myString& right)
  {
    delete [] _str;
    size_t len = strlen(right._str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,right._str,sizeof(char)*len);
    return *this; // Allows chained assignments like a = b = c
  }

private:
  char* _str;
};
These operations take an instance of the class and copy its values to our instance. Consequently, these are known as copy semantics. If this were 1998, we’d stop here, because for a long time, C++ only had copy semantics. Either you passed around references to objects, or you copied them. You could also pass around pointers to objects, but remember that pointers are value types just like integers and floats, so you are really just copying them around too (though, as you may remember from our earlier examples, you can at least take references to pointers). The new standard released in 2011 (C++0x, later renamed C++11) introduces move semantics.
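To make the distinction concrete, here is when each of these operations gets invoked for the myString class above:

myString a("Oppa");  // normal constructor
myString b(a);       // copy constructor: b gets its own allocation
myString c = a;      // ALSO the copy constructor - this is initialization, not assignment
c = b;               // copy assignment operator: c's old buffer is freed, then b is copied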

Move semantics are designed to solve the following problem. If we have a series of dynamic string objects being concatenated, with normal copy constructors we run into a serious problem:

std::string result = std::string("Oppa") + std::string(" Gangnam") + std::string(" Style") + std::string(" by") + std::string(" Psy");
// operator+ is left-associative, so this is evaluated by first creating a new string object with its own memory allocation, copying "Oppa" and " Gangnam" into it, and then deallocating both of them:
//std::string result = std::string("Oppa Gangnam") + std::string(" Style") + std::string(" by") + std::string(" Psy");
// Then another new object is made, and "Oppa Gangnam" and " Style" are deallocated:
//std::string result = std::string("Oppa Gangnam Style") + std::string(" by") + std::string(" Psy");
// And so on and so forth:
//std::string result = std::string("Oppa Gangnam Style by") + std::string(" Psy");
//std::string result = std::string("Oppa Gangnam Style by Psy");
// So just to add 5 strings together, we've had to allocate 4 intermediate strings in the middle of it, all of which are then simply thrown away!
This is terribly inefficient; it would be much more efficient if we could utilize the temporary objects that are going to be destroyed anyway instead of reallocating a bunch of memory over and over again only to delete it immediately afterwards. This is where move semantics come into play. First, we need to define a “temporary” object as one whose scope is entirely contained on the right side of an expression. That is to say, given a single assignment statement a=b, if an object is both created and destroyed inside b, then it is considered temporary. Because of this, these temporary values are also called rvalues, short for “right values”. C++0x introduces the syntax variable&& to designate an rvalue. This is how you declare a move constructor:

class myString
{
public:
  // The copy constructor, which copies the string over instead of copying the pointer
  myString(const myString& copy)
  {
    size_t len = strlen(copy._str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,copy._str,sizeof(char)*len);
  }
  // Move Constructor
  myString(myString&& mov)
  {
    _str = mov._str;
    mov._str=NULL;
  }
  // Normal constructor
  myString(const char* str)
  {
    size_t len = strlen(str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,str,sizeof(char)*len);
  }
  // Destructor that deallocates our string
  ~myString()
  {
    if(_str!=NULL) // Make sure we only delete _str if it isn't NULL!
      delete [] _str;
  }

  // Assignment operator, does the same thing the copy constructor does, but also mimics the destructor by deleting _str. NOTE: It is considered bad practice to call the destructor directly. Use a Clear() method or something equivalent instead.
  myString& operator=(const myString& right)
  {
    delete [] _str;
    size_t len = strlen(right._str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,right._str,sizeof(char)*len);
    return *this;
  }

private:
  char* _str;
};
NOTE: Observe that our destructor functionality was changed! Now that _str can be NULL, we check for that before deleting it. (Strictly speaking, calling delete[] on a NULL pointer is a well-defined no-op, so the check is redundant, but it makes the intent explicit.)

The idea behind a move constructor is that, instead of copying the values into our object, we move them into our object, setting the source to some NULL value. Notice that this can only work for pointers, or objects containing pointers. Integers, floats, and other similar types can’t really be “moved”, so instead their values are simply copied over. Consequently, move semantics are only beneficial for types like strings that involve dynamic memory allocation. However, because we must set the source pointers to NULL, we can’t take a const myString&&, because then we wouldn’t be able to modify the source pointers! This is why a move constructor is declared without a const modifier, which makes sense, since we intend to modify the object.

But wait - just as a copy constructor has a matching copy assignment operator, a move constructor has a matching move assignment operator. Like the copy assignment, the move assignment behaves exactly like the move constructor, but must free the existing contents beforehand. The move assignment operator is declared like this:

myString& operator=(myString&& right)
{
  delete [] _str;
  _str=right._str;
  right._str=NULL;
  return *this;
}

Move semantics can be used for some interesting things, like unique pointers, which only have move semantics - by disabling the copy constructor, you can create an object that is impossible to copy, and can therefore only be moved, which guarantees that there will only be one copy of its contents in existence. std::unique_ptr is an implementation of this provided in C++0x. Note that if a data structure requires copy semantics, std::unique_ptr will trigger a compiler error, instead of simply failing mysteriously like the deprecated std::auto_ptr.
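As a quick sketch of what that looks like in use (std::unique_ptr is real; the names below are just for illustration):

#include <memory>
#include <utility>

std::unique_ptr<int> p(new int(5));
//std::unique_ptr<int> q(p);          // COMPILER ERROR: the copy constructor is disabled
std::unique_ptr<int> q(std::move(p)); // OK: ownership moves into q, and p is left empty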

There is an important detail to keep in mind when you are using move semantics with inheritance or with member objects:

class Substring : public myString
{
public:
  // Since myString has no default constructor, _sub must be moved in the initializer list
  Substring(Substring&& mov) : myString(std::move(mov)), _sub(std::move(mov._sub)) {}

  Substring& operator=(Substring&& right)
  {
    myString::operator=(std::move(right));
    _sub = std::move(right._sub);
    return *this;
  }

  myString _sub;
};
Here we are using std::move(), which takes a variable (either an rvalue or a normal reference) and returns an rvalue for that variable. This is because rvalues stop being rvalues the instant they are passed into a different function, which makes sense, since they now have a name and are no longer on the right-hand side. Consequently, if we were to pass mov above into our base class directly, it would trigger the copy constructor, because mov would be treated as const Substring&, instead of Substring&&. Using std::move lets us pass it in as Substring&& and properly trigger the move semantics. As you can see in the example, you must use std::move when moving any complex member object, or when calling base class move constructors or move assignment operators. Note that std::move allows you to force an object to be moved to another object regardless of whether or not it’s actually an rvalue. This is particularly useful for moving around std::unique_ptr objects.
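As a quick illustration of that rule, consider this sketch with two hypothetical overloads (take is not part of the tutorial’s classes):

#include <utility>

void take(const myString& str); // hypothetical overload that would copy
void take(myString&& str);      // hypothetical overload that would move

void example(myString&& mov)
{
  take(mov);            // calls take(const myString&): mov has a name here, so it is NOT an rvalue
  take(std::move(mov)); // calls take(myString&&): std::move restores its rvalue status
}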

There’s some other weird things you can do with move semantics. This most interesting part is the strange behavior of && when it is appended to existing references.

  • A& & becomes A&
  • A& && becomes A&
  • A&& & becomes A&
  • A&& && becomes A&&

By taking advantage of the second and fourth lines, we can perform perfect forwarding. Perfect forwarding allows us to pass an argument as either a normal reference (A&) or an rvalue (A&&) and then forward it into another function, preserving its status as an rvalue or a normal reference, including whether or not it’s const A& or const A&&. Perfect forwarding can be implemented like so:

template<typename U>
void Set(U && other)
{
  _str=std::forward<U>(other);
}
Notice that this allows us to assign our data object using either the copy assignment, or the move assignment operator, by using std::forward<U>(), which transforms our reference into either an rvalue if it was an rvalue, or a normal reference if it was a normal reference, much like std::move() transforms everything into an rvalue. However, this requires a template, which may not always be correctly inferred. A more robust implementation uses two separate functions forwarding their parameters into a helper function:

class myString
{
public:
  // The copy constructor, which copies the string over instead of copying the pointer
  myString(const myString& copy)
  {
    size_t len = strlen(copy._str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,copy._str,sizeof(char)*len);
  }
  // Move Constructor
  myString(myString&& mov)
  {
    _str = mov._str;
    mov._str=NULL;
  }
  // Normal constructor
  myString(const char* str)
  {
    size_t len = strlen(str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,str,sizeof(char)*len);
  }
  // Destructor that deallocates our string
  ~myString()
  {
    if(_str!=NULL) // Make sure we only delete _str if it isn't NULL!
      delete [] _str;
  }
  void Set(myString&& str)
  {
    _set<myString&&>(std::move(str));
  }
  void Set(const myString& str)
  {
    _set<const myString&>(str);
  }


  // Assignment operator, does the same thing the copy constructor does, but also mimics the destructor by deleting _str. NOTE: It is considered bad practice to call the destructor directly. Use a Clear() method or something equivalent instead.
  myString& operator=(const myString& right)
  {
    delete [] _str;
    size_t len = strlen(right._str)+1; //+1 for null terminator
    _str=new char[len];
    memcpy(_str,right._str,sizeof(char)*len);
    return *this;
  }

  // Move assignment operator from before
  myString& operator=(myString&& right)
  {
    delete [] _str;
    _str=right._str;
    right._str=NULL;
    return *this;
  }

private:
  template<typename U>
  void _set(U && other)
  {
    // Forwards to either the copy assignment (const myString&) or the move
    // assignment (myString&&), depending on what U resolves to.
    *this=std::forward<U>(other);
  }

  char* _str;
};
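Usage then looks something like this (a quick sketch):

myString a("Oppa");
myString b(" Gangnam Style");

a.Set(b);                   // b is an lvalue: resolves to Set(const myString&), which copies
a.Set(myString(" by Psy")); // the temporary is an rvalue: resolves to Set(myString&&), which moves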
Notice the use of std::move() to transfer the rvalue correctly, followed by std::forward<U>() to forward the parameter. By using this, we avoid redundant code, but can still build move-aware data structures that efficiently assign values with relative ease. Now, it’s on to Part 5: Delegated Llamas! Or, well, delegates, function pointers, and lambdas. Possibly involving llamas. Maybe.


7 Problems Raytracing Doesn't Solve


I see a lot of people get excited about extreme concurrency in modern hardware bringing us closer to the magical holy grail of raytracing. It seems that everyone thinks that once we have raytracing, we can fully simulate entire digital worlds, everything will be photorealistic, and graphics will become a “solved problem”. This simply isn’t true, and in fact highlights several fundamental misconceptions about the problems faced by modern games and other interactive media.

For those unfamiliar with the term, raytracing is the process of rendering a 3D scene by tracing the path of a beam of light after it is emitted from a light source, calculating its properties as it bounces off various objects in the world until it finally hits the virtual camera. At least, you hope it hits the camera. You see, to be perfectly accurate, you have to cast a bajillion rays of light out from the light sources and then see which ones end up hitting the camera at some point. This is obviously a problem, because most of the rays don’t actually hit the camera, and are simply wasted. Because this brute force method is so incredibly inefficient, many complex algorithms (such as photon-mapping and Metropolis light transport) have been developed to yield approximations that are thousands of times more efficient. These techniques are almost always focused on attempting to find paths from the light source to the camera, so rays can be cast in the reverse direction. Some early approximations actually cast rays out from the camera until they hit an object, then calculated the lighting information from the distance and angle, disregarding other objects in the scene. While highly efficient, this method produced extremely inaccurate results.
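For illustration, that crude cast-from-the-camera approximation boils down to something like this sketch (the Vec3 type and castRay scene query are made up for the example):

struct Vec3 { float x, y, z; };

static float dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Hypothetical scene query: returns true if the ray hits something, yielding the
// hit distance and the surface normal. A real raytracer spends most of its time here.
bool castRay(Vec3 origin, Vec3 dir, float& distance, Vec3& normal);

// One ray per pixel, shaded from distance and angle alone, ignoring every other
// object in the scene - no shadows, no bounces, hence the inaccurate results.
float shadePixel(Vec3 camera, Vec3 dir, Vec3 lightDir)
{
  float distance; Vec3 normal;
  if(!castRay(camera, dir, distance, normal))
    return 0.0f; // missed everything: background
  float lambert = dot(normal, lightDir); // angle term
  if(lambert < 0.0f) lambert = 0.0f;
  return lambert / (1.0f + distance*distance); // crude distance falloff
}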

It is with a certain irony that raytracing is touted as being a precise, super-accurate rendering method when all raytracing is actually done via approximations in the first place. Pixar uses photon-mapping for its movies. Most raytracers operate on stochastic sampling approximations. We can already do raytracing in realtime if we get approximate enough; it just looks boring and is extremely limited. Graphics development doesn’t just stop when someone develops realtime raytracing, because there will always be room for a better approximation.

1. Photorealism

The meaning of photorealism is difficult to pin down, in part because the term is inherently subjective. If you define photorealism as being able to render a virtual scene such that it precisely matches a photo, then it is almost impossible to achieve in any sort of natural environment where the slightest wind can push a tree branch out of alignment.

This quickly gives rise to defining photorealism as rendering a virtual scene such that it is indistinguishable from a photograph of a similar scene, even if they aren’t exactly the same. This, however, raises the issue of just how indistinguishable it needs to be. This seems like a bizarre concept, but there are different degrees of “indistinguishable” due to the differences between people’s observational capacities. Many people will never notice a slightly misaligned shadow or a reflection that’s a tad too bright. For others, those errors will stand out like sore thumbs and completely destroy their suspension of disbelief.

We have yet another problem in that the entire concept of “photorealism” has nothing to do with how humans see the world in the first place. Photos are inherently linear, while humans experience a much more dynamic, logarithmic lighting scale. This gives rise to HDR photography, which actually has almost nothing to do with the HDR implemented in games. Games simply change the brightness of the entire scene, instead of combining the brightness of multiple exposures to brighten some areas and darken others in the same photo. If all photos are not created equal, then exactly which photo are we talking about when we say “photorealistic”?

2. Complexity

Raytracing is often cited as allowing an order of magnitude more detail in models by being able to efficiently process many more polygons. This is only sort of true in that raytracing is not subject to the same computational constraints that rasterization is. Rasterization must render every single triangle in the scene, whereas raytracing is only interested in whether or not a ray hits a triangle. Unfortunately, it still has to navigate through the scene representation. Even if a raytracer could handle a scene with a billion polygons efficiently, this raises completely unrelated problems involving RAM access times and cache pollution that suddenly become actual performance bottlenecks instead of micro-optimizations.

In addition, raytracing approximation algorithms almost always take advantage of rays that degrade quickly, such that they can only bounce 10-15 times before becoming irrelevant. This is fine and dandy for walking around in a city or a forest, but what about a kitchen? Even though raytracing is much better at handling reflections accurately, highly reflective materials cripple the raytracer, because now rays are bouncing hundreds of times off a myriad of surfaces instead of just 10. If not handled properly, it can absolutely devastate performance, which is catastrophic for game engines that must maintain constant render times.

3. Scale

How do you raytrace stars? Do you simply wrap a sphere around the sky and give it a “star” material? Do you make them all point sources infinitely far away? How does this work in a space game, where half the stars you see can actually be visited, and the other half are entire galaxies? How do you accurately simulate an entire solar system down to the surface of a planet, as the Kerbal Space Program developers had to? Trying to figure out how to represent that kind of information in a meaningful form with only 64 bits of precision, if you are lucky, is a problem completely separate from raytracing, yet of increasingly relevant concern as games continue to expand their horizons more and more. How do we simulate an entire galaxy? How can we maintain meaningful precision when faced with astronomical scales, and how does this factor in to our rendering pipeline? These are problems that arise in any rendering pipeline, regardless of what techniques it uses, due to fundamental limitations in our representations of numbers.

4. Materials

Do you know what methane clouds look like? What about writing an aerogel shader? Raytracing, by itself, doesn’t simply figure out how a given material works; you have to tell it how each material behaves, and its accuracy is wholly dependent on how accurate your description of the material is. This isn’t easy, either - it requires advanced mathematical models and heaps of data collection. In many places we’re actually still trying to figure out how to build physically correct material equations in the first place. Did you know that Dreamworks had to rewrite part of their cloud shader1 for How To Train Your Dragon? It turns out that getting clouds to look good when your character is flying directly beneath them with a hand raised is really hard.

This is just for common lighting phenomena! How are you going to write shaders for things like pools of magic water and birefringent calcite crystals? How about trying to accurately simulate circular polarizers when most raytracers don’t even know what polarization is? Does being photorealistic require you to simulate the Tyndall Effect for caustics in crystals and particulate matter? There are so many tiny little details all around us that affect everything from the color of our iris to the creation of rainbows. Just how much does our raytracer need to simulate in order to be photorealistic?

5. Physics

What if we ignored the first four problems and simply assumed we had managed to make a perfect, magical, photorealistic raytracer? Congratulations: you’ve managed to co-opt the entirety of your CPU for the task of rendering a static 3D scene, leaving nothing left for the physics. All we’ve managed to accomplish is taking the “interactive” out of “interactive media”. Being able to influence the world around us is a key ingredient to immersion in games, and this requires more and more accurate physics, which are arguably just as difficult to calculate as raytracing is. The most advanced real-time physics engine to date is the Lagoa Multiphysics, and it can only just barely simulate a tiny scene in a well-controlled environment before it completely decimates a modern CPU. This is without any complex rendering at all. Now try doing that for a scene with a radius of several miles. Oh, and remember our issue with scaling? This applies to physics too! Except with physics, it’s an order of magnitude even more difficult.

6. Content

As many developers have been discovering, procedural generation is not magic pixie dust you can sprinkle on problems to make them go away. Yet, without advances in content generation, we are forced to hire armies of artists to create the absurd amounts of detail required by modern games. Raytracing doesn’t solve this problem, it makes it worse. In any given square mile of a human settlement, there are billions of individual objects, ranging from pine cones, to rocks, to TV sets, to crumbs, all of which technically have physics, and must be kept track of, and rendered, and even more importantly, modeled.

Despite multiple attempts at leveraging procedural generation, the content problem has simply refused to go away. Until we can effectively harness the power of procedural generation, augmented artistic tools, and automatic design morphing, the advent of fully photorealistic raytracing will be useless. The best graphics engine in the world is nothing without art.

7. AI

<Patrician|Away> what does your robot do, sam
<bovril> it collects data about the surrounding environment, then discards it and drives into walls
— Bash.org quote #240849 (http://bash.org/?240849)
Of course, while we’re busy desperately trying to raytrace supercomplex scenes with advanced physics, we haven’t even left any CPU time to calculate the AI! The AI in games is so consistently terrible it’s turned into its own trope. The game industry spends all its computational time trying to render a scene, leaving almost nothing left for the AI routines, forcing them to rely on techniques from 1968. Think about that - we are approaching the point where AI in games comes down to a 50-year-old technique that was considered hopelessly outdated before I was even born. Oh, and I should also point out that Graphics, Physics, Art, and AI are all completely separate fields with fundamentally different requirements that all have to work together in a coherent manner just so you can shoot headshots in Call of Duty 22.

I know that raytracing is exciting, sometimes simply as a demonstration of raw computational power. But it always disheartens me when people fantasize about playing amazingly detailed games indistinguishable from real life when that simply isn’t going to happen, even with the inevitable development2 of realtime raytracing. By the time it becomes commercially viable, it will simply be yet another incremental step in our eternal quest for infinite realism. It is an important step, and one we should strive for, but it alone is not sufficient to spark a revolution.


1 Found on the special features section of the How To Train Your Dragon DVD.
2 Disclaimer: I've been trying to develop an efficient raytracing algorithm for ages and haven't had much luck. These guys are faring much better.

Analyzing XKCD: Click and Drag


Today, xkcd featured a comic with a comically large image that is navigated by clicking and dragging. In the interests of SCIENCE (and possibly accidentally DDoSing Randall’s image server - sorry!), I created a static HTML file of the entire composite image.1

The collage is made up of 225 images2 that stretch out over a total image area 79872 pixels high and 165888 pixels wide. The images take up 5.52 MB of space and are named with a simple scheme, "ydxd.png", where each d is a cardinal direction appropriate for its axis (n for north or s for south on the y axis; e for east or w for west on the x axis) and y and x are the tile coordinate numbers; for example, "1n1e.png". Tiles are 2048x2048 PNG images with an average size of 24.53 KB. If you were to try and represent this as a single, uncompressed 32-bit 79872x165888 image file, it would take up 52.99 GB of space.

Assuming a human’s average height is 1.8 meters, that would give this image a scale of about 1 meter per 22 pixels. That means the total composite image is approximately 3.63 kilometers high and 7.54 kilometers wide. It would take an average human 1.67 hours to walk from one end of the image to the other. Note that the characters at the far left say they’ve been walking for 2 miles - they are 67584 pixels from the starting point, which translates to 3.072 km or ~1.9 miles, so this seems to indicate my rough estimates here are reasonably accurate.

If Randall spent, on average, one hour drawing each frame, it would take him 9.375 days of constant, nonstop work to finish this. If he instead spent an average of 10 minutes per frame, it would take ~37.5 hours, or almost an entire 40-hour work week.
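These estimates are easy to reproduce; here’s a quick sketch that recomputes the numbers above (the 4.5 km/h walking pace is my own assumption):

#include <cstdio>

int main()
{
  const long long tilesHigh = 39, tilesWide = 81, tile = 2048;
  const long long h = tilesHigh*tile, w = tilesWide*tile; // 79872 x 165888 pixels
  const double uncompressedGB = (double)h*w*4/1e9;        // 32-bit pixels: ~52.99 GB
  const double pxPerMeter = 22.0;                         // derived from a 1.8m-tall human
  const double kmWide = w/pxPerMeter/1000.0;              // ~7.54 km
  const double walkHours = kmWide/4.5;                    // ~1.67 hours at 4.5 km/h
  std::printf("%lld x %lld px, %.2f GB, %.2f km, %.2f h\n", w, h, uncompressedGB, kmWide, walkHours);
  return 0;
}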

Basically I’m saying Randall Munroe is fucking insane.

1 If you are on Firefox or Chrome, right-clicking and selecting "Save as" will download the HTML file along with all 225 images into a separate folder.
2 There are actually 3159 possible images (39 x 81), but all-white and all-black images are not included, instead being replaced by either the default white background or a massive black <div> representing the ground, located 28672 pixels from the top of the image, with a height of 51200.

Coordinate Systems And Cascading Stupidity


Today I learned that there are way too many coordinate systems, and that I’m an idiot (but that was already well-established). I have also learned to not trust graphics tutorials, but the reasons for that won’t become apparent until the end of this article.

There are two types of coordinate systems: left-handed and right-handed coordinate systems. By convention, most everyone in math and science uses right-handed coordinate systems with positive x going to the right, positive y going up, and positive z coming out of the screen. A left-handed coordinate system is the same, but positive z instead points into the screen. Of course, there are many other possible coordinate system configurations, each either being right or left-handed; some modern CAD packages have y pointing into the screen and z pointing up, and screen-space in graphics traditionally has y pointing down and z pointing into the screen.

If you start digging through DirectX and OpenGL, the handedness of the coordinate system being used is ill-defined, due to the reliance on various perspective transforms. Consequently, while DirectX traditionally uses a left-handed coordinate system and OpenGL uses a right-handed one, you can simply use D3DXMatrixPerspectiveRH to give DirectX a right-handed coordinate system, and OpenGL’s shader pipeline actually uses a left-handed coordinate system by default - but all of this is entirely dependent on the handedness of the projection matrices involved. So, technically the coordinate system is whichever one you choose, but unlike the rest of the world, computer graphics has no real standard on which coordinate system to use, and so it’s just a giant mess of various coordinate systems all over the place, which means you don’t know what handedness a given function is for until things start getting funky.

I discovered all this, because today I found out that, for the past 6 or so years (the entire time my graphics engine has ever existed in any shape or form), it has been rotating everything backwards. I didn’t notice.

This happened due to a number of unfortunate coincidences. For many years, I simply didn’t notice because I didn’t know what direction the sign of a given rotation was supposed to rotate in, and even if I did I would have assumed this to be the default for graphics for some strange reason (there are a lot of weird things in graphics). The first hint was when I was integrating with Box2D and I had to reverse the rotation of its bodies to match up with my images. This did trigger an investigation, but I mistakenly concluded that it was Box2D that had it wrong, not me, because I was using atan2 to check coordinates, and I was passing them in as atan2(v.x,v.y). The problem is that atan2 is defined as float atan2(float y, float x), which means my coordinates were reversed and I was getting nonsense angles.

Now, here you have to understand that I was currently using a standard left-handed coordinate system, with y pointing up, x pointing right and z into the screen. The thing is, I wanted a coordinate system where y pointed down, and so I did as a tutorial instructed me to and reversed all of my y coordinates on the low-level drawing functions.

So, when atan2(x,y) gave me bad results, I mistakenly thought “Oh, I forgot to reverse the y coordinate!” Suddenly atan2(x,-y) was giving me angles that matched what my images were doing. The thing is, if you switch x and y and negate y, atan2(x,-y)==-atan2(y,x). One mistake had been incorrectly validated by yet another mistake, caused by yet another mistake!

You see, by inverting those y coordinates, I was accidentally reversing the result of my rotation matrices, which caused them to rotate everything backwards. This was further complicated by how the camera rotates things - if your camera is fixed, how do you make it appear that it is rotating? You rotate everything else in the opposite direction! Hence even though my camera was rotating backwards despite looking like it was rotating forwards, it was actually being rotated the right way for the wrong reason.

While I initially thought the fix for this would require some crazy coordinate system juggling, the actual solution was fairly simple. The fact was, a coordinate system with z pointing into the screen and y pointing down is still right-handed, which means it should play nicely with rotations from a traditional right-handed system. Since the handedness of a coordinate system is largely determined by the perspective matrix, reversing y-coordinates in the drawing functions was actually reversing them too late in the pipeline. Hence, because I used D3DXMatrixPerspectiveLH, I had a left-handed coordinate system, and my rotations ended up being reversed. D3DXMatrixPerspectiveRH negates the z-coordinate to switch the handedness of the coordinate system, but I like positive z pointing into the screen, so I instead hacked the left-handed perspective matrix itself and negated the y-scaling parameter in cell [2,2], then undid all the y-coordinate inversion insanity that had been inside my drawing functions (you could also negate the y coordinate in any world transform matrix sufficiently early in the pipeline by specifying a negative y scaling in [2,2]). Suddenly everything was consistent, and rotations were happening in the right direction again. Now the Camera rotation actually required the negative rotation, as one would expect, and I still got to use a coordinate system with y pointing down. Unfortunately it also reversed several rotation operations throughout the engine, some of which were functions that had been returning the wrong value this whole time so as to match up with the incorrect rotation of the engine - something that will give me nightmares for weeks, probably involving a crazed rabbit hitting me over the head with a carrot screaming “STUPID STUPID STUPID STUPID!”
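In code, that hack amounts to something like this (a sketch using D3DX’s row-major D3DXMATRIX; the parameters are placeholders):

D3DXMATRIX proj;
D3DXMatrixPerspectiveLH(&proj, width, height, nearZ, farZ); // left-handed: +z into the screen
proj._22 = -proj._22; // negate the y-scaling cell [2,2] so +y points down, without touching z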

What’s truly terrifying that all of this was indirectly caused by reversing the y coordinates in the first place. Had I instead flipped them in the perspective matrix itself (or otherwise properly transformed the coordinate system), I never would have had to deal with negating y coordinates, I never would have mistaken atan2(x,-y) as being valid, and I never would have had rotational issues in the first place.

All because of that one stupid tutorial.

P.S. the moral of the story isn’t that tutorials are bad, it’s that you shouldn’t be a stupid dumbass and not write unit tests or look at function definitions.


How Joysticks Ruined My Graphics Engine


It’s almost a tradition.

Every time my graphics engine has been stuck in maintenance mode for 6 months, I’ll suddenly realize I need to push out an update or implement some new feature. I then realize that I haven’t actually paid attention to any of my testing programs, or their speed, in months. This is followed by panic, as I discover my engine running at half speed, or worse. Having made an infinite number of tiny tweaks that all could have caused the problem, I am often thrown into temporary despair, only to almost immediately find some incredibly stupid mistake that was causing it. One time it was because I left the profiler on. Another time it was caused by calling the Render function twice. I’m gearing up to release the first public beta of my graphics engine, and this time is no different.

I find an old backup distribution of my graphics engine and run the tests, and my engine is running at 1100 FPS instead of 3000 or 4000 like it should be. Even the stress test is going at only ~110 FPS instead of 130 FPS. The strange part was that for the lightweight tests, it seemed to be hitting a wall at about 1100 FPS, whereas normally it hits a wall around 4000-5000 due to CPU⇒GPU bottlenecks. This is usually caused by some kind of debugging, so I thought I’d left the profiler on again, but no, it was off. After stepping through the rendering pipeline and finding nothing, I knew I had no chance of just guessing what the problem was, so I turned on the profiler and checked the results. Everything seemed relatively normal except-

PlaneShader::cEngine::_callpresent     1.0   144.905 us   19%    145 us   19%
PlaneShader::cEngine::Update           1.0    12.334 us    2%    561 us   72%
PlaneShader::cEngine::FlushMessages    1.0   546.079 us   70%    549 us   71%
What the FUCK?! Why is 70% of my time being spent in FlushMessages()? All that does is process window messages! It shouldn’t take any time at all, and here it is taking longer to process messages than it does to render an entire frame!
bool cEngine::FlushMessages()
{
  PROFILE_FUNC();
  _exactmousecalc();
  //windows stuff
  MSG msg;

  while(PeekMessageW(&msg, NULL, 0, 0, PM_REMOVE))
  {
    TranslateMessage(&msg);
    DispatchMessageW(&msg);

    if(msg.message == WM_QUIT)
    {
      _quit = true;
      return false;
    }
  }

  _joyupdateall();
  return !_quit; //function returns opposite of quit
}
Bringing up the function, there don’t seem to be many opportunities for it to fail. I go ahead and comment out _exactmousecalc() and _joyupdateall(), wondering if, perhaps, something in the joystick function was messing things up. Lo and behold, my speeds are back to normal! After re-inserting the exact mouse calculation, it is, in fact, _joyupdateall() causing the problem. This is the start of the function:
void cInputManager::_joyupdateall()
{
  JOYINFOEX info;
  info.dwSize=sizeof(JOYINFOEX);
  info.dwFlags= [ Clipped... ];

  for(unsigned short i = 0; i < _maxjoy; ++i)
  {
    if(joyGetPosEx(i,&info)==JOYERR_NOERROR)
    {
      if(_allbuttons[i]!=info.dwButtons)
      {
Well, shit, there isn’t really any possible explanation here other than something going wrong with joyGetPosEx. It turns out that calling joyGetPosEx when there isn’t a joystick plugged in takes a whopping 34.13 µs (microseconds) on average, which is almost as long as it takes me to render a single frame (43.868 µs). There’s probably a good reason for this, but evidently it is not good practice to call it unnecessarily. Fixing this was relatively easy - just force an explicit call to look for active joystick inputs and only poll those - but it’s still one of the weirdest performance bottlenecks I’ve come up against.
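The fix looked something like this (a sketch, not the engine’s real code - _active is a hypothetical member holding the indices of plugged-in joysticks):

// Enumerate which joysticks are actually plugged in once (or when a device changes),
// so the per-frame update only polls active devices instead of paying ~34 µs per empty slot.
void cInputManager::_enumjoysticks()
{
  _active.clear();
  JOYINFOEX info;
  info.dwSize = sizeof(JOYINFOEX);
  info.dwFlags = JOY_RETURNALL;
  for(unsigned short i = 0; i < _maxjoy; ++i)
    if(joyGetPosEx(i, &info) == JOYERR_NOERROR)
      _active.push_back(i);
}

void cInputManager::_joyupdateall()
{
  JOYINFOEX info;
  info.dwSize = sizeof(JOYINFOEX);
  info.dwFlags = JOY_RETURNALL;
  for(size_t j = 0; j < _active.size(); ++j) // only poll joysticks we know exist
    if(joyGetPosEx(_active[j], &info) == JOYERR_NOERROR)
    {
      // ...process button/axis state as before...
    }
}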

Of course, if you aren’t developing a graphics engine, consider that 1/60 of a second is 16666 µs - 550 µs leaves a lot of breathing room for a game to work with, but a graphics engine must not force any unnecessary cost on to a program that is using it unless that program explicitly allows it, hence the problem.

Then again, calculating invsqrt(x)*x is faster than sqrt(x), so I guess anything is possible.

