Erik McClure

Analyzing XKCD: Click and Drag


Today, xkcd featured a comic with a comically large image that is navigated by clicking and dragging. In the interests of SCIENCE (and possibly accidentally DDoSing Randall’s image server - sorry!), I created a static HTML file of the entire composite image.1

The collage is made up of 225 images2 that stretch out over a total image area 79872 pixels high and 165888 pixels wide. The images take up 5.52 MB of space and are named with a simple naming scheme "ydxd.png" where d represents a cardinal direction appropriate for the axis (n for north, s for south on the y axis and e for east, w for west on the x axis) along with the tile coordinate number; for example, "1n1e.png". Tiles are 2048x2048 png images with an average size of 24.53 KB. If you were to try and represent this as a single, uncompressed 32-bit 79872x165888 image file, it would take up 52.99 GB of space.

Assuming a human’s average height is 1.8 meters, that would give this image a scale of about 1 meter per 22 pixels. That means the total composite image is approximately 3.63 kilometers high and 7.54 kilometers wide. It would take an average human 1.67 hours to walk from one end of the image to the other. Note that the characters at the far left say they’ve been walking for 2 miles - they are 67584 pixels from the starting point, which translates to 3.072 km or ~1.9 miles, so this seems to indicate my rough estimates here are reasonably accurate.

If Randall spent, on average, one hour drawing each frame, it would take him 9.375 days of constant, nonstop work to finish this. If he instead spent an average of 10 minutes per frame, it would take ~37.5 hours, or almost an entire 40-hour work week.

Basically I’m saying Randall Munroe is fucking insane.

1 If you are on firefox or chrome, right-clicking and selecting "Save as" will download the HTML file along with all 225 images into a separate folder. 2 There are actually 3159 possible images (39 x 81), but all-white and all-black images are not included, instead being replaced by either the default white background or a massive black <div> representing the ground, located 28672 pixels from the top of the image, with a height of 51200.

Coordinate Systems And Cascading Stupidity


Today I learned that there are way too many coordinate systems, and that I’m an idiot (but that was already well-established). I have also learned to not trust graphics tutorials, but the reasons for that won’t become apparent until the end of this article.

There are two types of coordinate systems: left-handed and right-handed coordinate systems. By convention, most everyone in math and science uses right-handed coordinate systems with positive x going to the right, positive y going up, and positive z coming out of the screen. A left-handed coordinate system is the same, but positive z instead points into the screen. Of course, there are many other possible coordinate system configurations, each either being right or left-handed; some modern CAD packages have y pointing into the screen and z pointing up, and screen-space in graphics traditionally has y pointing down and z pointing into the screen.

If you start digging through DirectX and OpenGL, the handedness of the coordinate systems being used are ill-defined due to its reliance on various perspective transforms. Consequently, while DirectX traditionally uses a left-handed coordinate system and OpenGL uses a right-handed coordinate system, you can simply use D3DPerspectiveMatrixRH to give DirectX a right-handed coordinate system, and openGL actually uses a left-handed coordinate system by default on its shader pipeline - but all of these are entirely dependent on the handedness of the projection matrices involved. So, technically the coordinate system is whichever one you choose, but unlike the rest of the world, computer graphics has no real standard on which coordinate system to use, and so its just a giant mess of various coordinate systems all over the place, which means you don’t know what handedness a given function is for until things start getting funky.

I discovered all this, because today I found out that, for the past 6 or so years (the entire time my graphics engine has ever existed in any shape or form), it has been rotating everything backwards. I didn’t notice.

This happened due to a number of unfortunate coincidences. For many years, I simply didn’t notice because I didn’t know what direction the sign of a given rotation was supposed to rotate in, and even if I did I would have assumed this to be the default for graphics for some strange reason (there are a lot of weird things in graphics). The first hint was when I was integrating with Box2D and I had to reverse the rotation of its bodies to match up with my images. This did trigger an investigation, but I mistakenly concluded that it was Box2D that had it wrong, not me, because I was using atan2 to check coordinates, and I was passing them in as atan2(v.x,v.y). The problem is that atan2 is defined as float atan2(float y, float x), which means my coordinates were reversed and I was getting nonsense angles.

Now, here you have to understand that I was currently using a standard left-handed coordinate system, with y pointing up, x pointing right and z into the screen. The thing is, I wanted a coordinate system where y pointed down, and so I did as a tutorial instructed me to and reversed all of my y coordinates on the low-level drawing functions.

So, when atan2(x,y) gave me bad results, I mistakenly thought “Oh, i forgot to reverse the y coordinate!” Suddenly atan2(x,-y) was giving me angles that matched what my images were doing. The thing is, if you switch x and y and negate y, atan2(x,-y)==-atan2(y,x). One mistake had been incorrectly validated by yet another mistake, caused by yet another mistake!

You see, by inverting those y coordinates, I was accidentally reversing the result of my rotation matrices, which caused them to rotate everything backwards. This was further complicated by how the camera rotates things - if your camera is fixed, how do you make it appear that it is rotating? You rotate everything else in the opposite direction! Hence even though my camera was rotating backwards despite looking like it was rotating forwards, it was actually being rotated the right way for the wrong reason.

While I initially thought the fix for this would require some crazy coordinate system juggling, the actual solution was fairly simple. The fact was, a coordinate system with z pointing into the screen and y pointing down is still right-handed, which means it should play nicely with rotations from a traditional right-handed system. Since the handedness of a coordinate system is largely determined by the perspective matrix, reversing y-coordinates in the drawing functions was actually reversing them too late in the pipeline. Hence, because I used D3DXMatrixPerspectiveLH, I had a left-handed coordinate system, and my rotations ended up being reversed. D3DXMatrixPerspectiveRH negates the z-coordinate to switch the handedness of the coordinate system, but I like positive z pointing into the screen, so I instead hacked the left-handed perspective matrix itself and negated the y-scaling parameter in cell [2,2], then undid all the y-coordinate inversion insanity that had been inside my drawing functions (you could also negate the y coordinate in any world transform matrix sufficiently early in the pipeline by specifying a negative y scaling in [2,2]). Suddenly everything was consistent, and rotations were happening in the right direction again. Now the Camera rotation actually required the negative rotation, as one would expect, and I still got to use a coordinate system with y pointing down. Unfortunately it also reversed several rotation operations throughout the engine, some of which were functions that had been returning the wrong value this whole time so as to match up with the incorrect rotation of the engine - something that will give me nightmares for weeks, probably involving a crazed rabbit hitting me over the head with a carrot screaming “STUPID STUPID STUPID STUPID!”

What’s truly terrifying that all of this was indirectly caused by reversing the y coordinates in the first place. Had I instead flipped them in the perspective matrix itself (or otherwise properly transformed the coordinate system), I never would have had to deal with negating y coordinates, I never would have mistaken atan2(x,-y) as being valid, and I never would have had rotational issues in the first place.

All because of that one stupid tutorial.

P.S. the moral of the story isn’t that tutorials are bad, it’s that you shouldn’t be a stupid dumbass and not write unit tests or look at function definitions.


How Joysticks Ruined My Graphics Engine


It’s almost a tradition.

Every time my graphics engine has been stuck in maintenence mode for 6 months, I’ll suddenly realize I need to push out an update or implement some new feature. I then realize that I haven’t actually paid attention to any of my testing programs, or their speed, in months. This is followed by panic, as I discover my engine running at half speed, or worse. Having made an infinite number of tiny tweaks that all could have caused the problem, I am often thrown into temporary despair only to almost immediately find some incredibly stupid mistake that was causing it. One time it was because I left the profiler on. Another time it was caused by calling the Render function twice. I’m gearing up to release the first public beta of my graphics engine, and this time is no different.

I find an old backup distribution of my graphics engine and run the tests, and my engine is running at 1100 FPS instead of 3000 or 4000 like it should be. Even the stress test is going at only ~110 FPS instead of 130 FPS. The strange part was that for the lightweight tests, it seemed to be hitting a wall at about 1100 FPS, whereas normally it hits a wall around 4000-5000 due to CPU⇒GPU bottlenecks. This is usually caused by some kind of debugging, so I thought I’d left the profiler on again, but no, it was off. After stepping through the rendering pipeline and finding nothing, I knew I had no chance of just guessing what the problem was, so I turned on the profiler and checked the results. Everything seemed relatively normal except-

PlaneShader::cEngine::_callpresent       1.0    144.905 us   19%        145 us   19%
PlaneShader::cEngine::Update             1.0     12.334 us    2%        561 us   72%
PlaneShader::cEngine::FlushMessages    1.0    546.079 us   70%        549 us   71%
What the FUCK?! Why is 70% of my time being spent in FlushMessages()? All that does is process window messages! It shouldn’t take any time at all, and here it is taking longer to process messages than it does to render an entire frame!
bool cEngine::FlushMessages()
{
PROFILE_FUNC();
_exactmousecalc();
//windows stuff
MSG msg;

while(PeekMessageW(&msg, NULL, 0, 0, PM_REMOVE))
{
TranslateMessage(&msg);
DispatchMessageW(&msg);

    if(msg.message == WM_QUIT)
    {
      _quit = true;
      return false;
    }

}

_joyupdateall();
return !_quit; //function returns opposite of quit
}
Bringing up the function, there don’t seem to be many opportunities for it to fail. I go ahead and comment out _exactmousecalc() and _joyupdateall(), wondering if, perhaps, something in the joystick function was messing up? Lo and behold, my speeds are back to normal! After re-inserting the exact mouse calculation, it is, in fact, _joyupdateall() causing the problem. This is the start of the function:
void cInputManager::_joyupdateall()
{
JOYINFOEX info;
info.dwSize=sizeof(JOYINFOEX);
info.dwFlags= \[ Clipped... \];

for(unsigned short i = 0; i < _maxjoy; ++i)
{
if(joyGetPosEx(i,&info)==JOYERR_NOERROR)
{
if(_allbuttons\[i\]!=info.dwButtons)
{
Well, shit, there isn’t really any possible explanation here other than something going wrong with joyGetPosEx. It turns out that calling joyGetPosEx when there isn’t a joystick plugged in takes a whopping 34.13 µs (microseconds) on average, which is almost as long as it takes me to render a single frame (43.868 µs). There’s probably a good reason for this, but evidently it is not good practice to call it unnecessarily. Fixing this was relatively easy - just force an explicit call to look for active joystick inputs and only poll those, but its still one of the weirdest performance bottlenecks I’ve come up against.

Of course, if you aren’t developing a graphics engine, consider that 1/60 of a second is 16666 µs - 550 µs leaves a lot of breathing room for a game to work with, but a graphics engine must not force any unnecessary cost on to a program that is using it unless that program explicitly allows it, hence the problem.

Then again, calculating invsqrt(x)*x is faster than sqrt(x), so I guess anything is possible.


Multithreading Problems In Game Design


A couple years ago, when I first started designing a game engine to unify Box2D and my graphics engine, I thought this was a superb opportunity to join all the cool kids and multithread it. I mean all the other game developers were talking about having a thread for graphics, a thread for physics, a thread for audio, etc. etc. etc. So I spent a lot of time teaching myself various lockless threading techniques and building quite a few iterations of various multithreading structures. Almost all of them failed spectacularly for various reasons, but in the end they were all too complicated.

I eventually settled on a single worker thread that was sent off to start working on the physics at the beginning of a frame render. Then at the beginning of each subsequent frame I would check to see if the physics were done, and if so sync the physics and graphics and start up another physics render iteration. It was a very clean solution, but fundamentally flawed. For one, it introduces an inescapable frame of input lag.

Single Thread Low Load
  FRAME 1   +----+
            |    |
. Input1 -> |    |
            |[__]| Physics   
            |[__]| Render    
. FRAME 2   +----+ INPUT 1 ON BACKBUFFER
. Input2 -> |    |
. Process ->|    |
            |[__]| Physics
. Input3 -> |[__]| Render
. FRAME 3   +----+ INPUT 2 ON BACKBUFFER, INPUT 1 VISIBLE
.           |    |
.           |    |
. Process ->|[__]| Physics
            |[__]| Render
  FRAME 4   +----+ INPUT 3 ON BACKBUFFER, INPUT 2 VISIBLE

Multi Thread Low Load
  FRAME 1   +----+
            |    | 
            |    |
. Input1 -> |    | 
.           |[__]| Render/Physics START  
. FRAME 2   +----+        
. Input2 -> |____| Physics END
.           |    |
.           |    | 
. Input3 -> |[__]| Render/Physics START
. FRAME 3   +----+ INPUT 1 ON BACKBUFFER
.           |____|
.           |    | PHYSICS END
.           |    | 
            |____| Render/Physics START
  FRAME 4   +----+ INPUT 2 ON BACKBUFFER, INPUT 1 VISIBLE

The multithreading, by definition, results in any given physics update only being reflected in the next rendered frame, because the entire point of multithreading is to immediately start rendering the current frame as soon as you start calculating physics. This causes a number of latency issues, but in addition it requires that one introduce a separated “physics update” function to be executed only during the physics/graphics sync. As I soon found out, this is a massive architectural complication, especially when you try to put in scripting or let other languages use your engine.

There is another, more subtle problem with dedicated threads for graphics/physics/audio/AI/anything. It doesn’t scale. Let’s say you have a platformer - AI will be contained inside the game logic, and the absolute vast majority of your CPU time will either be spent in graphics or physics, or possibly both. That means your game effectively only has two threads that are doing any meaningful amount of work. Modern processors have 8 logical cores1, and the best one currently available has 12. You’re using two of them. You introduced all this complexity and input lag just so you could use 16.6% of the processor instead of 8.3%.

Instead of trying to create a thread for each individual component, you need to go deeper. You have to parallelize each individual component separately, then tie them together in a single-threaded design. This has the added bonus of being vastly more friendly to single-threaded CPUs that can’t thread things (like certain phones), because the parallization goes on at a lower level and is invisible to the overall architecture of the library. So instead of having a graphics thread and a physics thread, you simply call the physics update, then call the graphics update, and inside each physics and graphics update you spawn enough worker threads to match the number of cores you have to work with and concurrently process as much stuff as possible. This eliminates latency problems, complicated library designs, and it scales forever. Even if your initial implementation of concurrency won’t handle 32 cores, because the concurrency is encapsulated inside the engine, you can just go back and change it later without ever having to modify any programs that use the graphics engine.

Consequently, don’t try to multithread your games. It isn’t worth it. Separately parallelize each individual component instead and write your game engine single-threaded; only use additional threads for asynchronous activities like resource loading.


1
The processors actually only have 4 or 6 physical cores, but use hyperthreading techniques so that 8 or 12 logical cores are presented to the operating system. From a software point of view, however, this is immaterial.


Why Windows 8 Does The Right Thing The Wrong Way


Yesterday, I saw a superb presentation called “When The Consoles Die, What Comes Next?” by Ben Cousins. It demonstrates that mobile gaming is behaving as a disruptive technology, and is causing the same market decline in consoles that consoles themselves did to arcades in the 1990s. He also demonstrates how TV crushed cinema in a similar manner - we just don’t think of it like that because we don’t remember back when almost 60% of the population was going to the movie theaters on a weekly basis. Today, most people tend to go to the movie theater as a special occasion, so the theaters didn’t completely die out, they just lost their market dominance. The role the movie theater played changed as new technology was introduced.

The game industry, and in fact the software industry as a whole, is in a similar situation. Due to the mass adoption of iPads and other tablets, we now have a mobile computing experience that is distinct from that of say, a console, or even a PC. Consequently, the role of consoles and PCs will shift in response to this new technology. However, while many people are eager to jump on the bandwagon (and it’s a very lucrative bandwagon), we are already losing sight of what will happen to stabilize the market.

People who want to sound futuristic and smart are talking about the “Post-PC Era”, which is a very inaccurate thing to say. PCs are clearly very useful for some tasks, and its unlikely that they will be entirely replaced by mobile computing, especially when screen real-estate is so important to development and productivity, and the difficulty of replicating an ergonomic keyboard. The underlying concept of a PC, in that you sit down at it, have a keyboard and mouse and a large screen to work at, is unlikely to change significantly. The mouse will probably be replaced by adaptive touch solutions and possibly gestures, and the screen might very well turn into a giant glass slab with OLEDs on it, or perhaps simply exist as the wall, but the underlying idea is not going anywhere. It will simply evolve.

Windows 8 is both a surprisingly prescient move on part of Microsoft, and also (not surprisingly) a horrible train wreck of an OS. The key concept that Microsoft correctly anticipated was the unification of operating systems. It is foolish to think that we will continue on with this brick wall separating tablet devices and PCs. The difference between tablets and PCs is simply one of both user interface and user experience. These are both managed by the highest layers of complexity in an operating system, such that it can simply adapt its presentation to suit whatever device it is currently on. It will have to once we introduce monitors the size of walls and OLED cards with embedded microchips. There will be such a ridiculous number of possible presentation mediums, that the idea of a presentation medium must be generalized such that a single operating system can operate on a stupendous range of devices.

This has important consequences for the future of software. Currently we seem to think that there should be “tablet versions” of software. This is silly and inconvenient. If you buy a piece of software, it should just work, no matter what you put it on. If it finds itself on a PC, it will analyze the screen size and behave appropriately. If its on a tablet, it will enable touch controls and reorganize the UI appropriately. More importantly, you shouldn’t have to buy a version for each of your devices, because eventually there won’t be anything other than a computer we carry around with us that plugs into terminals or interacts with small central servers at a company.

If someone buys a game I make, they own a copy of that game. That means they need to be able to get a copy of that game on all their devices without having to buy it 2 or 3 times. The act of buying the game should make it available to install on any interactive medium they want, and my game should simply adapt itself to whatever medium is being used to play it. The difference between PC and tablet will become blurred as they are reduced to simply being different modes of interaction, with the same underlying functionality.

This is what Microsoft is attempting to anticipate, by building an operating system that can work on both a normal computer and a tablet. They even introduce a Windows App Store, which is a crucial step towards allowing you to buy a program for both your PC and your tablet in a single purchase. Unfortunately, the train-wreck analogy is all too appropriate for describing the state of Windows 8. Rather than presenting an elegant, unified tablet and PC experience, they smash together two completely incompatible interfaces in an incoherent disaster. You are either presented with a metro interface, or a traditional desktop interface, with no in-between. The transition is about as smooth as your head smashing against a brick wall. They don’t even properly account for the fact that their new metro start menu is terribly optimized for a mouse, but try to make you use it anyway. It does the right thing, the wrong way.

The game industry has yet to catch on to this, since one designs either a “PC game” or a “mobile game”. When a game is released on a tablet, it’s a special “mobile version”. FL Studio has a special mobile version. There is no unification anywhere, and the two are treated as separate walled gardens. While this is currently an advantage during a time where tablets don’t have the kind of power a PC does, it will quickly become a disadvantage. The convenience of having familiar interfaces on all your devices, with all of the same programs, will trump isolated functionality. There will always be games and programs more suited to consoles, or to PCs, or to tablets, but unless we stop thinking of these as separate devices, and instead one of many possible user experiences that we must adapt our creations to, we will find ourselves on the wrong side of history.


Avatar

Archive

  1. 2024
  2. 2023
  3. 2022
  4. 2021
  5. 2020
  6. 2019
  7. 2018
  8. 2017
  9. 2016
  10. 2015
  11. 2014
  12. 2013
  13. 2012
  14. 2011
  15. 2010
  16. 2009