Erik McClure

Camp Vista - Growing Up Next To Microsoft


This blog was started on LiveJournal shortly after I graduated high school in 2009. It has survived this long because I was very persistent about porting all my posts to Blogger, and then to my own static website. Of course, I have limits. Most of my terrible old posts were removed long ago, because they were so bad they didn't contribute anything. One post was a rant about my high school internship at Microsoft. I took it down because it was bad, but more importantly I took it down because I wrote it before graduating college, while I was still completely delusional.

Let me explain.

I grew up 20 minutes from Microsoft. I had absolutely no idea what an absurdly warped childhood I had until I graduated college and met people who hadn't grown up in a tech bubble. I never questioned how my middle school always somehow had the latest copy of Windows and maintained six different computer labs (one in each classroom hub, plus the library, plus a dedicated lab). I didn't realize how unusual it was for my high school to offer an “intro to game dev course” in 2008, or how ridiculous it was that my AP Computer Science professor had worked on the first version of DirectX at Microsoft. I never thought it was weird that we had Microsoft Connector buses driving around town, creating an entire secondary bus system for the sole purpose of moving around Microsoft employees.

I definitely did not realize how utterly insane it was that, in July 2006, students from my high school (and a few other high schools around the area) were invited to a week long event where we would get to experiment with and “bug test” a beta version of Windows Vista, a full six months before it was released. At 8:00AM every morning, our parents dropped us off at Microsoft Building 40, and we were led into a room filled with rows of desks with computers on them. There were maybe 50 of us, and we were given guest accounts and full internet access while Microsoft employees gave presentations about various new features they had introduced for Vista.

I still remember some of those presentations. One that really stuck with me at the time was Microsoft jumping on the “192kHz sampling rate support” bandwagon, which makes absolutely no sense outside of music production, and in retrospect just seems incredibly dumb. Another presentation had us build wifi mesh networks between the computers, which was touted as a way to share files and communicate with friends in places with little to no internet connectivity. I remember this one both because I thought it was very cool, and because someone managed to bluescreen Windows while attempting to set it up, so they actually had an SDE come in and take a look.

In their attempts to make this a “camp” experience, they gave us all a project on the final day: create our own presentation (using the new Microsoft™ Office™ PowerPoint™ 2007, of course), about some new feature we wanted on Windows or some improvement that wasn't there yet. I don't remember what our presentation was, but I do know we were all terrible. Afterwards we were all invited to the “Windows Vista Consumer Support Assisted Beta”, a predecessor to the Windows Insider program.

This “Camp Vista” thing was a strange, one-time event, but the Hunt The Wumpus competition is still going. If you're lucky enough to be attending school in the Lake Washington School District, you can team up with several other students and get tutoring from a Microsoft employee over the course of several months while you build a very basic video game and compete for Microsoft sponsored prizes. I fondly remember our attempts at building a game using XNA back when it was brand new (it's now dead), and making a bad Descent clone. We tried to use SVN, but weren't allowed to install it on school computers, so we resorted to an FTP folder and e-mailing zip files.

Here we have the crux of the problem with my initial impressions of my high school internship - it's not a normal thing! Students worldwide compete for a chance to get college internships at Microsoft, but the high school interns are just random CS students from the Lake Washington School District. When my team learned I knew quite a bit of programming already, they had to come up with something else for me to do because they had assumed I barely knew how to code. It was just another outreach program for nearby high schools, only available to 31000 kids on the entire planet.

So in the midst of me having an experience that almost no other high school student gets to have, I am complaining about things like “wow, we have too many meetings” and “wow, this software is bad”. Yes, we know Microsoft is a dysfunctional catastrophe, but focusing on issues that are omnipresent in large corporations only serves to detract from the actual crazy parts of the internship, like how the program we were working on, if compiled with no optimizations, took 20 minutes to start. Or the meeting where the entire team spent 15 in-person minutes sitting around an actual table deciding that one of our function names was too long and debating what to rename it, before deciding not to change it. Or that one time the department head (my boss's boss's boss) took me, an intern, to a meeting with his boss, who I think directly reported to a vice president. It got better, though, because one afternoon, there was a 3-hour period of time where my boss, his boss, and the department head were all gone, so I, a 19-year-old high school student, was technically supposed to take my questions about how to use C# PInvoke to the department head's boss.

I was really careful not to break anything for 3 hours.

Looking back at my teenage years has made me realize how easy it is for people to simply miss how some aspect of their upbringing was deeply unusual. For some, it may be a silent hindrance, but for others, it might be a quiet boon, softly sending opportunities their way that almost no one else has access to. How many opportunities did I let slip by, unaware of how unique they were?


Blockchain Is The New JavaScript


Over 25 years ago, Brendan Eich created JavaScript, named after Java simply because it was popular. Its prototypical nature and dynamic typing made it unsuitable for anything other than slow interpreters, forcing Google to invent a way to JIT the language just to make it fast. We even built a package manager for JavaScript which managed to create such an absurd dependency hell that one guy's left-pad function managed to break web development for a day.

Now the entire world economy runs on JavaScript.

This is the kind of dumb future that we live in, and it is the kind of dumb future we can look forward to with blockchain. It is why people who insist that blockchain will never work because of various technological reasons are ignoring the simple fact that humanity is incredibly good at standardizing on the worst possible things, then building our entire future on top of them. In fact, there is a disturbing number of direct parallels between the rise of JavaScript and the rise of blockchain technologies, with a few key differences.

JavaScript was originally going to be Scheme, a Lisp dialect. Sadly, the management demanded that it look more like Java, so Brendan hid Scheme inside a vaguely Java-esque syntax, which became Mocha, which became LiveScript, which then became JavaScript. This has led to many suffering web developers agonizing over the fact that “We could have had Lisp!”. Even worse, CSS originated in DSSSL, which was another dialect of Scheme for constructing style sheets, but it was deemed too complicated (too many parentheses!), so a much simpler version was created, called CSS. Modern CSS, of course, has simply reinvented everything DSSSL had, except worse.

This is what drives the entire internet, and in turn, Amazon's trillion dollar empire.

In 2009, Satoshi Nakamoto unleashed Bitcoin upon an unsuspecting world. Most people ignored it, for good reason - it was completely impractical for any sort of global transaction system, capable of processing a whopping 7 transactions per second. Then the first bubble happened and Bitcoin skyrocketed to $1000 USD in late 2013. It then promptly crashed to $300 in 2014, after which most people thought it had simply been a passing fad, and ignored it for the next 3 years, until the second bubble of 2017.

But not everyone. After his favorite World of Warcraft spell got nerfed by Blizzard, some maniac named Vitalik Buterin argued, at the peak of the 2013 bubble, that the Bitcoin network should support programmable smart contracts. Here, we almost had a repeat of JavaScript, nearly bolting on smart contracts to a blockchain algorithm that was never designed for it. However, history thankfully granted us a reprieve, because the Bitcoin developers told Vitalik to take a hike. So, he invented Ethereum, which is like Bitcoin except slightly less stupid. Unfortunately, it still ran on Proof-of-Work, which means burning coal to solve sudokus.

At this point, you may be expecting me to draw a parallel from Google's V8 JavaScript engine to Proof-of-Stake, but that actually isn't correct. Proof-of-Stake doesn't make the network faster, it just lets the network run without wasting astronomical amounts of energy. Proof-of-Stake is more analogous to the introduction of AJAX in 1999, the foundational JavaScript extension that allowed for asynchronous websites, and in turn, the entire modern web. Proof-of-Stake is a change that finally makes Ethereum usable for large scale smart contracts without burning ludicrous amounts of electricity to run the network. This, however, isn't enough, because Ethereum's network is still painfully slow. Granted, at 14 transactions per second, it's at least twice as fast as Bitcoin, but that doesn't really count for much when you need to be several orders of magnitude faster.

So, naturally, just like we did with JavaScript, a bunch of extremely smart people are inventing ways to make Ethereum's horribly slow network go really fast, either by creating Layer 2 Rollups, or via sharding through the proposed Shard Chains. Individually, these optimizations are expected to yield 2000-3000 transactions per second, and if combined, the optimistic estimate is that it will allow hundreds of thousands of transactions per second. We'll have to wait until we see the true results of the speedups, but even the pessimistic estimations expect a 100x increase in transaction speed with existing Layer 2 rollup solutions, which is a pretty big deal.

Of course, if you give an inch, they'll take a mile, and when Web Developers discovered that we could make JavaScript go fast, they started putting entire software solutions on the web. We got Web Apps. We got Electron. Now I've got JavaScript running in my goddamn IDE and there's no end in sight. We've forgotten how to make native apps (and the lack of good UI solutions for anything other than JavaScript hasn't helped, either), so now the Web isn't just inside your Web Browser, it's in all your programs too. We've created a monster, and there's no getting it back in the box.

This, I think, is something we should keep in mind when criticizing “unnecessary” blockchain solutions. Opponents of blockchain correctly point out that there are, in fact, very few things that actually need to be decentralized. Oftentimes, you can achieve decentralization through federation, or a DHT, or some other option instead, without needing an entire decentralized permanent public ledger. In many cases, having a permanent write-only public ledger is objectively worse than existing solutions.

These criticisms are all 100% true and also don't matter. If software actually needed to work well in order for people to use it, nobody would use anything Oracle made. Our future is filled with blockchains that we have spent obscene amounts of time and effort to make fast so we can create centralized decentralized solutions, private public ledgers, and dumb smart contracts. There are good ideas buried underneath this mess, but if we spend our time railing against blockchain instead of implementing them, all we'll get is more left-pad. We only have one chance to make the future suck less, even if we're just proposing less awful blockchain designs.

This is why I have begun learning the basics of blockchain - not because any of this makes sense, or is even a good idea. I simply recognize that the future is coming, and the future is dumb.


C++ Constructors, Memory, and Lifetimes


C++ Initialization Hell

What exactly happens when you write Foo* foo = new Foo();? A lot is packed into this one statement, so let's try to break it down. First, this example is allocating new memory on the heap, but in order to understand everything that's going on, we're going to have to explain what it means to declare a variable on the stack. If you already have a good understanding of how the stack works, and how functions do cleanup before returning, feel free to skip to the new statement.

Stack Lifetimes

Describing the stack is very often glossed over in many other imperative languages, despite the fact that those languages still have one (functional languages are an entirely different level of weird). Let's start with something very simple:

int foobar(int b)
{
  int a;
  a = b;
  return a;
}
Here, we are declaring a function foobar that takes an int and returns an int. The first line of the function declares a variable a of type int. This is all well and good, but where is the integer? On most modern platforms, int resolves to a 32-bit integer that takes up 4 bytes of space. We haven't allocated any memory yet, because no new statement happened and no malloc() was called. So where does it live?

The answer is that the integer was allocated on the stack. If you aren't familiar with the computer science data structure of the same name, your program is given a chunk of memory by the operating system that is organized into a stack structure, hence the name. It's like a stack of plates - you can push items on top of the stack, or you can remove items from the top of the stack, but you can't remove things from the middle of the stack or all the plates will come crashing down. So if we push something on top of the stack, we're stuck with it until we get rid of everything on top of it.

When we called our function, the parameter int b was pushed on to the stack. Parameters take up memory, so on to the stack they go. Hence, before we ever reach the statement int a, 4 bytes of memory were already pushed onto our stack. Here's what our stack looks like at the beginning of the function if we call it with the number 90 (assuming little-endian):

Stack for b

int a tells the compiler to push another 4 bytes of memory on to the stack, but it has no initial value, so the contents are undefined:

Stack for a and b

a = b assigns b to a, so now our stack looks like this:

Stack for initialized a and b

Finally, return a tells the compiler to evaluate the return expression (which in our case is just a so there's nothing to evaluate), then copy the result into a chunk of memory we reserved ahead of time for the return value. Some programmers may assume the function returns immediately once the return statement is executed - after all, that's what return means, right? However, the reality is that the function still has to clean things up before it can actually return. Specifically, we need to return our stack to the state it was before the function was called by removing everything we pushed on top of it in reverse order. So, after copying our return value a, our function pops the top of the stack off, which is the last thing we pushed. In our case, that's int a, so we pop it off the stack. Our stack now looks like this:

Stack without a

The moment from which int a was pushed onto the stack to the moment it was popped off the stack is called the lifetime of int a. In this case, int a has a lifetime of the entire function. After the function returns, our caller has to pop off int b, the parameter we called the function with. Now our stack is empty, and the lifetime of int b is longer than the lifetime of int a, because it was pushed first (before the function was called) and popped afterwards (after the function returned). C++ builds its entire concept of constructors and destructors on this concept of lifetimes, and they can get very complicated, but for now, we'll focus only on stack lifetimes.

Let's take a look at a more complex example:

int foobar(int b)
{
  int a;
  
  {
    int x;
    x = 3;
    
    {
      int z;
      int max;
      
      max = 999;
      z = x + b;
      
      if(z > max)
      {
        return z - max;
      }
      
      x = x + z;
    }
    
    // a = z; // COMPILER ERROR!
    
    {
      int ten = 10;
      a = x + ten;
    }
  } 
  
  return a;
}
Let's look at the lifetimes of all our parameters and variables in this function. First, before calling the function, we push int b on to the stack with the value of whatever we're calling the function with - say, 900. Then, we call the function, which immediately pushes int a on to the stack. Then, we enter a new block using the character {, which does not consume any memory, but instead acts as a marker for the compiler - we'll see what it's used for later. Then, we push int x on to the stack. We now have 3 integers on the stack. We set int x to 3, but int a is still undefined. Then, we enter another new block. Nothing interesting has happened yet. We then push both int z and int max on to the stack. Then we assign 999 to int max and assign int z the value x + b - if we passed in 900, this means z is now equal to 903, which is less than the value of int max (999), so we skip the if statement for now. Then we assign x to x + z, which will be 906.

Now things get interesting. Our innermost block (the one whose variables are on top of the stack) ends with a } character. This tells the compiler to pop all variables declared inside that block. We pushed int z on to the stack inside this block, so it's gone now. We cannot refer to int z anymore, and doing so will be a compiler error. int z is said to have gone out of scope. However, we also pushed int max on to the stack, and we pushed it after int z. This means that the compiler will first pop int max off the stack, and only afterwards will it then pop int z off the stack. The order in which this happens will be critical for understanding how lifetimes work with constructors and destructors, so keep it in mind.

Then, we enter another new scope. This new scope is still inside the first scope we created that contains int x, so we can still access x. We define int ten and initialize it with 10. Then we set int a equal to x + ten, which will be 916. Then, our scope ends, and int ten goes out of scope, being popped off the stack. Immediately afterwards, we reach the end of our first scope, and int x is popped off the stack.

Finally, we reach return a, which copies a to our return value memory segment, pops int a, and returns to our caller, who then pops int b. That's what happens when we pass in 900, but what happens if we pass in 9000?

Everything is the same until we reach the if statement, whose condition is now satisfied, which results in the function terminating early and returning z - max. What happens to the stack?

When we reach return z - max, the compiler evaluates the statement and copies the result (8004) out. Then it starts popping everything off the stack (once again, in the reverse order that things were pushed). The last thing we pushed on to the stack was int max, so it gets popped first. Then int z is popped. Then int x is popped. Then int a is popped, the function returns, and finally int b is popped by the caller. This behavior is critical to how C++ uses lifetimes to implement things like smart pointers and automatic memory management. Rust actually uses a similar concept, but it uses it for a lot more than C++ does.
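To make the pop order visible, here is a rough sketch that swaps the plain integers for a small helper that writes to a log whenever it is pushed or popped. Tracer and traced() are names I made up for this illustration, and it leans on constructors and destructors, which we haven't covered yet, so treat it as a preview rather than a definition:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: logs when it is pushed (constructed) and popped (destroyed).
struct Tracer
{
  Tracer(const std::string& n, std::vector<std::string>& l) : name(n), log(l)
  {
    log.push_back("push " + name);
  }
  ~Tracer()
  {
    log.push_back("pop " + name);
  }

  std::string name;
  std::vector<std::string>& log;
};

// Mirrors the nesting of foobar(): a, then x, then z and max in the inner block.
int traced(int b, std::vector<std::string>& log)
{
  Tracer a("a", log);
  {
    Tracer x("x", log);
    {
      Tracer z("z", log);
      Tracer max("max", log);
      if(b > 999)
      {
        return 1; // early return: max, z, x and a are popped, in that order
      }
    } // normal exit from the block: max is popped, then z
  }   // x is popped here
  return 0; // a is popped after the return value has been copied out
}
```

Whether you take the early return or fall through to the end, the log should end with max, z, x and a being popped in exactly that order, which is the reverse of the order they were pushed in.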

new Statements

Okay, now we know how lifetimes work and where variables live when they aren't allocated, but what happens when you do allocate memory? What's going on with the new statement? To look at this, let's use a simplified example:

int* foo = new int();
Here we have allocated a pointer to an integer on the stack (which will be 8 bytes if you're on a 64-bit system), and assigned the result of new int() to it. What happens when we call new int()? In C++, the new operator is an extension of malloc() from C. This means it allocates memory from the heap. When you allocate memory on the heap, it never goes out of scope. This is what most programmers are familiar with in other languages, except that most other languages handle figuring out when to deallocate it and C++ forces you to delete it yourself. Memory allocated on the heap is just there, floating around, forever, or until you deallocate it. So this function has a memory leak:
int bar(int b)
{
  int* a = new int();
  *a = b;
  return *a;
}
This is the same as our first example, except now we allocate a on the heap instead of the stack. So, it never goes out of scope. It's just there, sitting in memory, forever, until the process is terminated. The new operator looks at the type we passed it (which is int in this case) and calls malloc for us with the appropriate number of bytes. Because int has no constructors or destructors, it's essentially equivalent to this (the only difference being that new int() also zeroes the integer):
int bar(int b)
{
  int* a = (int*)malloc(sizeof(int));
  *a = b;
  return *a;
}
Now, people who are familiar with C will recognize that any call to malloc should come with a call to free, so how do we do that in C++? We use delete:
int bar(int b)
{
  int* a = new int();
  *a = b;
  int r = *a;
  delete a;
  return r;
}
IMPORTANT: Never mix new and free or malloc and delete. The new/delete operators can use a different allocator than malloc/free, so things will violently explode if you treat them as interchangeable. Always free something from malloc and always delete something created with new.

Now we aren't leaking memory, but we also can't do return *a anymore, because it's impossible for us to do the necessary cleanup. If we were allocating on the stack, C++ would clean up our variable for us after the return statement, but we can't put anything after the return statement, so there's no way to tell C++ to copy the value of *a and then manually delete a without introducing a new variable r. Of course, if we could run arbitrary code when our variable went out of scope, we could solve this problem! This sounds like a job for constructors and destructors!

Constructors and delete

Okay, let's put everything together and return to our original statement in a more complete example:

struct Foo
{
  // Constructor for Foo
  Foo(int b)
  {
    a = b;
  }
  // Empty Destructor for Foo
  ~Foo() {}
  
  int a;
};

int bar(int b)
{
  // Create
  Foo* foo = new Foo(b);
  int a = foo->a;
  // Destroy
  delete foo;
  return a; // Still can't return foo->a
}
In this code, we still haven't solved the return problem, but we are now using constructors and destructors, so let's walk through what happens. First, new allocates memory on the heap for your type. Foo contains a 32-bit integer, so that's 4 bytes. Then, after the memory is allocated, new automatically calls the constructor that matches whatever parameters you pass to the type. Your constructor doesn't need to allocate any memory to contain your type, since new already did this for you. Then, this pointer is assigned to foo. Then we delete foo, which calls the destructor first (which does nothing), and then deallocates the memory. If you don't pass any parameters when calling new Type(), or you are creating an array, C++ will simply call the default constructor (a constructor that takes no parameters). This is all equivalent to:
int bar(int b)
{
  // Create
  Foo* foo = (Foo*)malloc(sizeof(Foo));
  new (foo) Foo(b); // Special new syntax that ONLY calls the constructor function (this is how you manually call constructors in C++)
  int a = foo->a; 
  // Destroy
  foo->~Foo(); // We can, however, call the destructor function directly
  free(foo);
  
  return a; // Still can't return foo->a
}
This uses a special new syntax that doesn't allocate anything and simply lets us call the constructor function directly on our already allocated memory. This is what the new operator is doing for you under the hood. We then call the destructor manually (which you can do) and free our memory. Of course, this is all still useless, because we can't return the integer we allocated on the heap!
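If you want to convince yourself that the two versions really do the same steps in the same order, here is a small sketch that logs each constructor and destructor call. Logged, with_new() and by_hand() are hypothetical names for this example only. One difference worth calling out: malloc reports failure by returning a null pointer, while new throws std::bad_alloc, so the manual version has to check for that itself.

```cpp
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Hypothetical type that logs its constructor and destructor calls.
struct Logged
{
  explicit Logged(std::vector<std::string>& l) : log(l) { log.push_back("constructor"); }
  ~Logged() { log.push_back("destructor"); }

  std::vector<std::string>& log;
};

// The ordinary way: new allocates and constructs, delete destroys and frees.
void with_new(std::vector<std::string>& log)
{
  Logged* p = new Logged(log);
  delete p;
}

// The same steps done by hand.
void by_hand(std::vector<std::string>& log)
{
  void* memory = std::malloc(sizeof(Logged));
  if(!memory)
    return;                             // malloc signals failure with a null pointer
  Logged* p = new (memory) Logged(log); // placement new: only runs the constructor
  p->~Logged();                         // only runs the destructor
  std::free(memory);                    // only releases the memory
}
```

Both functions should leave the same two entries in the log: one constructor call followed by one destructor call.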

Destructors and lifetimes

Now, the magical part of C++ is that constructors and destructors are run when things are pushed or popped from the stack [1]. The fact that constructors and destructors respect variable lifetimes allows us to solve our problem of cleaning up a heap allocation upon returning from a function. Let's see how that works:

struct Foo
{
  // Default constructor for Foo
  Foo()
  {
    a = new int();
  }
  // Destructor frees memory we allocated using delete
  ~Foo()
  {
    delete a;
  }
  
  int* a;
};

int bar(int b)
{
  Foo foo;
  *foo.a = b;
  return *foo.a; // Doesn't leak memory!
}
How does this avoid leaking memory? Let's walk through what happens: First, we declare Foo foo on the stack, which pushes 8 bytes on to the stack (Foo holds a single pointer, which is 8 bytes on a 64-bit system), and then C++ calls our default constructor. Inside our default constructor, we use new to allocate a new integer and store it in int* a. Returning to our function, we then set the integer that foo.a points to equal to b. Then, we return the value stored in foo.a from the function[2]. This copies the value out of foo.a first by dereferencing the pointer, and then C++ calls our destructor ~Foo before Foo foo is popped off the stack. This destructor deletes int* a, ensuring we don't leak any memory. Then the function returns and the caller pops int b off the stack. If we could somehow do this without constructors or destructors, it would look like this:
int bar(int b)
{
  Foo foo;
  foo.a = new int();
  *foo.a = b;
  int retval = *foo.a;
  delete foo.a;
  return retval;
}
The ability to run a destructor when something goes out of scope is an incredibly important part of writing good C++ code, because when a function returns, all your variables go out of scope when the stack is popped. Thus, all cleanup that is done during destructors is guaranteed to run no matter when you return from a function. Destructors are guaranteed to run even when you throw an exception! This means that if you throw an exception that gets caught farther up in the program, you won't leak memory, because C++ ensures that everything on the stack is correctly destroyed when processing exception handling, so all destructors are run in the same order they normally are.
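Here's a minimal sketch of the exception case. Guard, risky() and run() are made-up names for this example: the function throws while two objects are alive on the stack, and the log shows that both destructors run, in reverse order, before the catch block ever executes.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper whose destructor records that cleanup happened.
struct Guard
{
  Guard(const std::string& n, std::vector<std::string>& l) : name(n), log(l) {}
  ~Guard() { log.push_back("cleanup " + name); }

  std::string name;
  std::vector<std::string>& log;
};

// Throws while two Guards are still alive on the stack.
void risky(std::vector<std::string>& log)
{
  Guard first("first", log);
  Guard second("second", log);
  throw std::runtime_error("something went wrong");
}

// Catches the exception farther up and returns what was logged.
std::vector<std::string> run()
{
  std::vector<std::string> log;
  try
  {
    risky(log);
  }
  catch(const std::runtime_error&)
  {
    log.push_back("caught");
  }
  return log;
}
```

Calling run() should give you "cleanup second", then "cleanup first", then "caught". This only holds when the exception is actually caught somewhere, which is the case the paragraph above is describing.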

This is the core idea behind smart pointers - if a pointer is stored inside an object, and that object deletes the pointer in the destructor, then you will never leak the pointer because C++ ensures that the destructor will eventually get called when the object goes out of scope. Now, if implemented naively there is no way to pass the pointer into different functions, so the utility is limited, but C++11 introduced move semantics to help solve this issue. We'll talk about those later. For now, let's talk about different kinds of lifetimes and what they mean for when constructors and destructors are called.
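As a rough preview, this is what our bar() function could look like using the standard library's std::unique_ptr, which is one such smart pointer. The function names bar_smart(), consume() and bar_moved() are just for this sketch, and the std::move line is only a taste of the move semantics mentioned above:

```cpp
#include <memory>
#include <utility>

// Same job as bar(), but the unique_ptr's destructor calls delete for us.
int bar_smart(int b)
{
  std::unique_ptr<int> a(new int());
  *a = b;
  return *a; // value is copied out first, then a's destructor frees the int
}

// Takes ownership of the pointer; the int is freed when p goes out of scope.
int consume(std::unique_ptr<int> p)
{
  return *p;
}

// Hands the pointer to another function instead of copying it.
int bar_moved(int b)
{
  std::unique_ptr<int> a(new int(b));
  return consume(std::move(a)); // a is empty after this; consume() owns the int
}
```

Notice that return *a works here without the extra variable r we needed when calling delete by hand.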

Static Lifetimes

Because any struct or class in C++ can have constructors or destructors, and you can put structs or classes anywhere in a C++ program, this means that there are rules for how to safely invoke constructors and destructors in all possible cases. These different possible lifetimes have different names. Global variables, or static variables inside classes, have what's called “static lifetime”, which means their lifetime begins when the program starts and ends once the program exits. The exact order these constructors are called, however, is a bit tricky. Let's look at an example:

struct Foo
{
  // Default constructor for Foo
  Foo()
  {
    a = new int();
  }
  // Destructor frees memory we allocated using delete
  ~Foo()
  {
    delete a;
  }
  
  int* a;
  static Foo instance;
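};

// Out-of-class definition; the declaration inside the struct won't link by itself
Foo Foo::instance;
static Foo GlobalFoo;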

int main()
{
  *GlobalFoo.a = 3;
  *Foo::instance.a = *GlobalFoo.a;
  return *Foo::instance.a;
}
When is instance constructed? When is GlobalFoo constructed? Can we safely assign to GlobalFoo.a immediately? The answer is that all static lifetimes are constructed before your program even starts, or more specifically, before main() is called. Thus, by the time your program has reached your entry point (main()), C++ guarantees that all static lifetime objects have already been constructed. But what order are they constructed in? This gets complicated. Basically, static variables are constructed in the order they are declared in a single .cpp file. However, the order these .cpp files are constructed in is undefined. So, you can have static variables that rely on each other inside a single .cpp file, but never between different files.
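The single-file rule is easy to check for yourself. In this sketch (all names are hypothetical), each global records the order it was constructed in, and a third global safely reads from the first one because it is declared later in the same file:

```cpp
// A plain int with a constant initializer is set up before any constructor runs.
static int constructed = 0;

// Hypothetical type that remembers when it was constructed.
struct Ordered
{
  Ordered() : order(++constructed) {}
  int order;
};

static Ordered first_global;  // constructed first
static Ordered second_global; // constructed second

// Safe only because first_global is declared earlier in this same file.
struct Dependent
{
  Dependent() : seen(first_global.order) {}
  int seen;
};

static Dependent depends_on_first;
```

By the time main() starts, first_global.order should be 1, second_global.order should be 2, and depends_on_first.seen should be 1. If Dependent lived in a different .cpp file, you could not count on that last value.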

Likewise, all static lifetime objects get destructed after your main() function returns, and once again, the order between different files is not something you can control, although it is always the reverse of the order they were constructed in. Technically this should be respected even if an exception occurs, but because the compiler can assume the process will terminate immediately after an unhandled exception occurs, this is unreliable.

Static lifetimes still apply for shared libraries, and are constructed the moment the library is loaded into memory - that's LoadLibrary on Windows and dlopen on Linux. Most operating systems provide a custom function that fires when the shared library is loaded or unloaded (DllMain on Windows), and these functions fall outside of the C++ standard, so there's no guarantee about whether the static constructors have actually been called when you're inside one. Almost nobody actually needs to worry about those edge cases, though, so for any normal code, by the time any function in your DLL can be called by another program, you can rest assured all static and global variables have had their constructors called. Likewise, they are destructed when the shared library is unloaded from memory.

While we're here, there are a few gotchas in the previous example that junior programmers should know about. You'll notice that I did not write static Foo* GlobalFoo = new Foo(); - this will leak memory! In this case, C++ doesn't actually call the destructor because Foo doesn't have a static lifetime, the pointer it's stored in does! So the pointer will get its constructor called before the program starts (which does nothing, because it's a primitive), and then the pointer will have its destructor called after main() returns, which also does nothing, which means Foo never actually gets destructed or deallocated. Always remember that C++ is extremely picky about what you do. C++ won't magically extend Foo's lifetime to the lifetime of the pointer, it will instead do exactly what you told it to do, which is to declare a global pointer primitive.

Another thing to avoid is accidentally writing Foo::instance.a = GlobalFoo.a;, because this doesn't copy the integer, it copies the pointer from GlobalFoo to Foo::instance. This is extremely bad, because now Foo::instance will leak its pointer and instead try to free GlobalFoo's pointer, which was already deleted by GlobalFoo, so the program will crash, but only AFTER successfully returning 3. In fact, it will crash outside of the main() function completely, which is going to look very weird if you don't know what's going on.

Implicit Constructors and Temporary Lifetimes

Lifetimes in C++ can get complicated, because they don't just apply to function blocks, but also function parameters, return values, and expressions. This means that, for example, if we are calling a function, and we construct a new object inside the function call, there is an implicit lifetime that exists for the duration of the function call, which is well-defined but very weird unless you're aware of exactly what's going on. Let's look at a simple example of a function call that constructs an object:

struct Foo
{
  // Implicit constructor for Foo
  Foo(int b)
  {
    a = b;
  }
  // Empty Destructor for Foo
  ~Foo() {}
  
  int a;
};

int get(Foo foo)
{
  return foo.a;
}

int main()
{
  return get(3);
}
To understand what's going on here, we need to understand implicit constructors, which are a “feature” of C++ you never wanted but got anyway. In C++, any constructor that can be called with exactly 1 argument is implicit unless it is marked explicit, which means the compiler will attempt to call it to satisfy a type conversion. In this case, we are trying to pass 3 into the get() function. 3 has the type int, but get() takes an argument of type Foo. Normally, this would just cause an error, because the types don't match. But because we have a constructor for Foo that takes an int, the compiler actually calls it for us, constructing an object of type Foo and passing it into the function! Here's what it looks like if we do this ourselves:
int main()
{
  return get(Foo(3));
}
C++ has “helpfully” inserted this constructor for us inside the function call. So, now that we know our Foo object is being constructed inside the function call, we can ask a different question: When does the constructor get called, exactly? When is it destructed? The answer is that all the argument expressions in your function call are evaluated first, before the function is entered. Our expression allocated a new temporary Foo object by pushing it onto the stack and then calling the constructor. However, do be aware that the standard does not specify the order in which the arguments themselves are evaluated. Many compilers happen to go left-to-right or right-to-left, but you can't rely on either.

So, once all expressions inside the parameters have been evaluated, we then push the parameters on to the stack and copy the results of the expressions into them, allocate space on the stack for the return value, and then we enter the function. Our function executes, copies a return value into the space we reserved, finishes cleaning up, and returns. Then we do something with the return value and pop our parameters off the stack. Finally, after all the function parameter boilerplate has been finished, our expressions go out of scope in reverse order. This means that destructors are called in the reverse order of construction after the function returns. This is all roughly equivalent to doing this:

int main()
{
  int b;
  {
    Foo a = Foo(3); // Construct Foo
    b = get(a); // Call function and copy result
  } // Deconstruct Foo
  return b;
}
This same logic works for all expressions - if you construct a temporary object inside an expression, it exists for the duration of the expression. However, the exact order that C++ evaluates expressions is extremely complicated and not always defined, so this is a bit harder to nail down. Generally speaking, an object gets constructed right before it's needed to evaluate the expression, and gets deconstructed afterwards. These are “temporary lifetimes”, because the object only briefly exists inside the expression, and is deconstructed once the expression is evaluated. Because C++ expressions are not always ordered, you should not attempt to rely on any sort of constructor order for arbitrary expressions. As an example, we can inline our previous get() function:
int main()
{
  return Foo(3).a;
}
This will allocate a temporary object of type Foo, construct it with 3, copy out the value from a, and then destruct the temporary object at the end of the return expression, before control actually leaves main(). For the most part, you can just assume your objects get constructed before the expression happens and get destructed after it happens - try not to rely on ordering more specific than that. The specific ordering rules were also tightened in C++17, which means how strict the ordering is will depend on what compiler and language version you're using until everyone implements the standard properly.
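If you want to watch a temporary lifetime happen, a small logging sketch helps. The Tracer type and trace() function below are made up for illustration. The function takes its argument by const reference so that exactly one temporary is created, whichever language version you compile with:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical type that records its construction and destruction in a log.
struct Tracer
{
  Tracer(int v, std::vector<std::string>& l) : a(v), log(l) { log.push_back("construct"); }
  ~Tracer() { log.push_back("destroy"); }

  int a;
  std::vector<std::string>& log;
};

int get(const Tracer& t)
{
  t.log.push_back("inside get");
  return t.a;
}

std::vector<std::string> trace()
{
  std::vector<std::string> log;
  int b = get(Tracer(3, log)); // the temporary lives until the end of this statement
  log.push_back("after call");
  assert(b == 3);
  return log;
}
```

The log comes back as construct, inside get, destroy, after call: the temporary outlives the function call it was created for, but not the statement containing it.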

For the record, if you don't want C++ “helpfully” turning your constructors into implicit ones, you can use the explicit keyword to disable that behavior:

struct Foo
{
  explicit Foo(int b)
  {
    a = b;
  }
  ~Foo() {}
  
  int a;
};
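A quick way to check what explicit actually changes is to ask the compiler whether an int converts to the type. This sketch uses two hypothetical structs, ImplicitFoo and ExplicitFoo, and std::is_convertible from the standard <type_traits> header:

```cpp
#include <cassert>
#include <type_traits>

struct ImplicitFoo
{
  ImplicitFoo(int b) : a(b) {}
  int a;
};

struct ExplicitFoo
{
  explicit ExplicitFoo(int b) : a(b) {}
  int a;
};

int get(ExplicitFoo foo) { return foo.a; }

// An int silently converts to ImplicitFoo, but not to ExplicitFoo.
static_assert(std::is_convertible<int, ImplicitFoo>::value, "implicit conversion allowed");
static_assert(!std::is_convertible<int, ExplicitFoo>::value, "implicit conversion blocked");

int call_explicitly()
{
  // get(3) would no longer compile; the conversion has to be spelled out.
  int result = get(ExplicitFoo(3));
  assert(result == 3);
  return result;
}
```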

Static Variables and Thread Local Lifetimes

Static variables inside a function (not a struct!) operate by completely different rules, because this is C++ and consistency is for the weak.

struct Foo
{
  explicit Foo(int b)
  {
    a = b;
  }
  ~Foo() {}
  
  int a;
};

int get()
{
  static Foo foo(3);
  
  return foo.a;
}

int main()
{
  return get() + get();
}
When is foo constructed? It's not when the program starts - it's actually only constructed the first time the function gets called. C++ injects some magic code that stores a global flag saying whether or not the static variable has been initialized yet. The first time we call get(), it will be false, so the constructor is called and the flag is set to true. The second time, the flag is true, so the constructor isn't called. So when does it get destructed? After main() returns and the program is exiting, just like global variables!
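You can observe this yourself by counting constructor calls. The sketch below adds a global counter named constructed (an addition for illustration, not part of the example above) and bumps it inside the constructor:

```cpp
#include <cassert>

// Counts constructor calls so we can see when the static is initialized.
int constructed = 0;

struct Foo
{
  explicit Foo(int b) : a(b) { ++constructed; }
  int a;
};

int get()
{
  static Foo foo(3); // constructed on the first call only
  return foo.a;
}

int sum_twice()
{
  int sum = get() + get();
  assert(constructed == 1); // two calls, one construction
  return sum;
}
```

Before the first call the counter is still zero, which shows that foo is not constructed at program start the way a global would be.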

Now, this static initialization is guaranteed to be thread-safe, but that's only useful if you intend to share the value through multiple threads, which usually doesn't work very well, because only the initialization is thread-safe, not accessing the variable. C++11 introduced a new lifetime called thread_local which is even weirder. Thread-local static variables only exist for the duration of the thread they belong to. So, if you have a thread-local static variable in a function, it's constructed the first time you call the function on a per-thread basis, and destroyed when each thread exits, not the program. This means you are guaranteed to have a unique instance of that static variable for each thread, which can be useful in certain concurrency situations.
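Here is a small sketch of the per-thread behavior, using a made-up next_id() function and a std::thread. Each thread that calls next_id() starts counting from its own fresh copy of counter:

```cpp
#include <cassert>
#include <thread>

// Each thread gets its own copy of the counter inside next_id().
int next_id()
{
  thread_local int counter = 0; // one instance per thread
  return ++counter;
}

// Calls next_id() twice on a fresh thread and returns the second result.
int second_call_on_new_thread()
{
  int result = 0;
  std::thread worker([&result] {
    next_id();
    result = next_id();
  });
  worker.join(); // after join, the worker's write to result is visible here
  assert(result == 2);
  return result;
}
```

The worker thread sees 1 and then 2 no matter how many times the calling thread has already used next_id(), because the two threads never share a counter.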

I'm not going to spend any more time on thread_local because to understand it you really need to know how C++ concurrency works, which is out of scope for this blog post. Instead, let's take a brief look at Move Semantics.

Move Semantics

Let's look at C++'s smart pointer implementation, unique_ptr<>.

#include <memory>

int get(int* b)
{
  return *b;
}

int main()
{
  std::unique_ptr<int> p(new int());
  *p = 3;
  int a = get(p.get());
  return a;
}
Here, we allocate a new integer on the heap by calling new, then store it in unique_ptr. This ensures that when our function returns, our integer gets freed and we don't leak memory. However, the lifetime of our pointer is actually excessively long - we don't need our integer pointer after we've extracted the value inside get(). What if we could change the lifetime of our pointer? The actual lifetime that we want is this:
int get(int* b)
{
  return *b;
  // We want the lifetime to end here
}

int main()
{
  // Lifetime starts here
  std::unique_ptr<int> p(new int());
  *p = 3;
  int a = get(p.get());
  return a;
  // Lifetime ends here
}
We can accomplish this by using move semantics:
#include <memory>
#include <utility>

// Taking the unique_ptr by value makes the parameter the new owner
int get(std::unique_ptr<int> b)
{
  return *b;
  // Lifetime of our pointer ends here
}

int main()
{
  // Lifetime of our pointer starts here
  std::unique_ptr<int> p(new int());
  *p = 3;
  int a = get(std::move(p));
  return a;
  // Lifetime of p ends here, but p is now empty
}
By using std::move, we transfer ownership of our unique_ptr to the function parameter. Now the get() function owns our integer pointer, so as long as we don't move it around again, it will go out of scope once get() returns, which will delete it. Our previous unique_ptr variable p is now empty, and when it goes out of scope, nothing happens, because it gave up ownership of the pointer it contained. This is how you can implement automatic memory management in C++ without needing to use a garbage collector, and Rust actually uses a more sophisticated version of this built into the compiler.
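To check that ownership really moves, here is a sketch with a made-up Payload type that counts destructor calls. Note that it assumes the function takes the unique_ptr by value: a parameter declared as an rvalue reference would only refer to the caller's pointer and would not take it over.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical payload that counts destructor calls.
struct Payload
{
  static int destroyed;
  explicit Payload(int v) : value(v) {}
  ~Payload() { ++destroyed; }
  int value;
};
int Payload::destroyed = 0;

// Taking the unique_ptr by value means the parameter becomes the new owner.
int take(std::unique_ptr<Payload> b)
{
  return b->value;
}

// Returns true if p was left empty after ownership moved into take().
bool moved_out()
{
  std::unique_ptr<Payload> p(new Payload(3));
  int a = take(std::move(p));
  assert(a == 3);
  return p == nullptr; // p gave up its pointer
}
```

By the time moved_out() returns, the Payload has been destroyed exactly once, by the parameter inside take() rather than by p.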

Move semantics can get very complex and have a lot of rules surrounding how temporary values work, but we're not going to get into all that right now. I also haven't gone into the many different ways that constructors can be invoked, and how those constructors interact with the different ways you can initialize objects. Hopefully, however, you now have a grasp of what lifetimes are in C++, which is a good jumping off point for learning about more advanced concepts.


[1] Pedantic assembly-code analysts will remind us that the stack allocations usually happen exactly once, at the beginning of the function, and then are popped off at the very end of the function, but the standard technically doesn't even require a stack to exist in the first place, so we're really talking about pushing and popping off the abstract stack concept that the language uses, not what the actual compiled assembly code really does.

[2] We're dereferencing the pointer here because we want to return the value of the pointer, not the pointer itself! If you tried to return the pointer itself from the function, it would point to freed memory and crash after the function returned. Trying to return pointers from functions is a common mistake, so be careful if you find yourself returning a pointer to something. It's better to use unique_ptr to manage lifetimes of pointers for you.


Factorio Is The Best Technical Interview We Have


There's been a lot of hand-wringing over The Technical Interview lately. Many people realize that inverting a binary tree on a whiteboard has basically zero correlation to whether or not someone is actually a good software developer. The most effective programming test anyone's come up with is still Fizzbuzz. One consequence of this has been an increased emphasis on Open Source Contributions, but it turns out these aren't a very good metric either, because most people don't have that kind of time.

The most effective programming interview we have now is usually some kind of take-home project, where a candidate is asked to fix a bug or implement a small feature within a few days. This isn't great because it takes up a lot of time, and they could receive outside help (or, if the feature is sufficiently common, google it). On the other hand, some large companies have instead doubled-down on whiteboard style interviews by subjecting prospective engineers to multiple hour-long online coding assessments, with varying levels of invasive surveillance.

All these interviewing methods pale in comparison to a very simple metric: playing Factorio with someone. Going through an entire run of Factorio is almost the best possible indication of how well someone deals with common technical problems. You can even tweak the playthrough based on the seniority of the position you're hiring for to get a better sense of how they'll function in that role.

Factorio?

Factorio is a game about automation. The best introduction is probably this trailer, but in essence, your job is to build an automated factory capable of launching a rocket into space.

You begin with nothing. You mine stone manually to craft a smelter that can smelt iron ore you mined into iron plates, which you then use to build a coal-driven automatic miner. You could grab the iron ore from the miner and put it in the smelter yourself, but it's more efficient to use an inserter to do the inserting for you. Then you can use the iron this gives you to make another miner, which automates coal mining. Then you can use belts to take the coal and use an inserter to put it in the iron miner. Then you use the iron plates this tiny factory produces to make a third miner to start gathering copper, which then lets you craft copper wire, which lets you craft a circuit, which lets you build a water pump. Combined with a boiler and a steam engine, you can then produce power, and use this power to run a research facility to unlock new technology, like assembly machines. Once you've unlocked assembly machines, you can use your circuits to craft an assembly machine that can craft copper wire for you, and insert this into an assembly machine that crafts circuits for you.

Eventually you unlock trains and robots and logistic systems which help you deal with the increasing logistic complexity the game demands, until you finally manage to launch a rocket into space.

Self-Direction

The beginning of the game starts with no goals and barely any direction. A senior developer should be able to explore the UI and figure out a goal, then establish a plan for accomplishing that goal. A junior developer should be able to perform a task that a senior developer has provided for them. An intern is expected to require quite a bit of mentoring, but a junior developer should be able to troubleshoot basic problems with their own code before requiring assistance from the senior developer. An intermediate developer should be able to operate independently once given a task, but is not expected to do any architecture design.

In more concrete terms, you might expect the following:

  • An Intern is generally expected to be able to fill in a pre-placed blueprint, and use belts to hook up their blueprint with something else, like an ore patch.
  • A Junior Developer should be able to build a production line by themselves, although it probably won't be very optimal. They may need assistance from the senior developer on how to route the belts properly to all of the intermediate assembly machines.
  • An Intermediate Developer should be capable of designing a near-optimal production line (without beacons) once given direction, with minimal oversight.
  • The Senior Developer needs no direction, and is capable of determining what goals need to happen and designing a plan of action, then delegating these tasks to other coders.

Teamwork

A critical aspect of software development is the ability to work on a team. This means coordinating your efforts with other people, accommodating the needs of other people's designs and cooperating with the team, instead of simply running off on your own and refusing to adjust your design to help integrate it with someone else's work. This, naturally, arises all the time in Factorio, because base layout designs are limited by physical space. As a result, you need to carefully consider what other people are doing, and sometimes adjust your design to fit in size constraints or deal with someone else's design that took more room than anticipated.

Anyone who simply runs off and starts doing things themselves or fixing problems without telling people is going to quickly earn the ire of their fellow players, for the exact same reasons cowboy programmers do. Luckily, Factorio includes a built-in equivalent to git blame, by showing you the last player that modified any entity. Thus, when people duct tape temporary solutions and don't inform the team about the problem they were fixing, when their temporary solution finally blows up, people will find out. If people want to win the game, they'll have to learn to cooperate well with their teammates.

Debugging

One of the most important skills for any programmer is their ability to debug problems. This is perhaps the most obvious parallel between Factorio and real software engineering. Something can go wrong very far away from the actual source of the problem. Being able to rapidly home in on the real problem is a critical skill, and the thinking process is almost identical to tracing the cause of a crash in an actual program. If an assembly machine has stopped working, first you have to see if there are multiple outputs that got backed up. Then you have to check what ingredient it's missing. Then you have to trace the ingredient back through your factory to find out where you're making it, and repeat ad nauseam.

Factorio's debugging gets fairly complicated quite quickly. As soon as you start working on oil processing you'll be dealing with cracking, where you're dealing with 3 different outputs and if any of them get backed up for any reason, the entire thing stops. There are cases where your entire factory can grind to a halt because you started researching something that doesn't require yellow science, which stopped using up robot frames, which stopped using up electric engines, which stopped using lubricant, which stopped consuming heavy oil, which backed up and stopped oil production, which made you run out of petroleum, which broke plastic, which broke red circuits, which broke the rest of the factory. Seasoned players will anticipate scenarios like this and use circuits to construct self-balancing oil cracking to ensure the system is balanced and will only back up if petroleum backs up. A new player who is a good programmer, when presented with a factory that has collapsed, will usually be able to trace the issue back to the source, realize what's happened, and promptly attempt to figure out a solution. On the other hand, if someone simply plops down a few storage tanks, unless they can provide a good reason (they are very confident we will never stop consuming lubricant in the future), then this is a red flag for how they approach problem solving in their programs.

Situations like these allow Factorio to closely mimic the complex interdependencies that programmers routinely deal with, and the complexity simply increases the more gameplay concepts are added. This closely follows the increased complexity that additional layers of abstraction introduce when attempting to debug a crash that could have potentially occurred deep inside one of the frameworks you use.

Code Reviews

Often, initial designs need to be tweaked for performance or throughput. Good programmers will not only accept critique of their designs, but incorporate that feedback into their future work. If they disagree with a proposed change, they will provide a concrete reason for why they disagree so that the team can more accurately weigh the pros and cons of the proposed change.

Resisting feedback without providing good reasons is a well-known red flag, but what can also be problematic is a programmer who begrudgingly accepts proposed changes, but refuses to adjust future designs accordingly. They end up requiring constant reminders to adhere to some standard way of solving a problem while giving no reason for why they don't seem to like the way the team is doing things. These can be ticking time-bombs in organizations, because when left unsupervised they can rapidly accumulate technical debt for other team members. This kind of problem is almost impossible to catch in a traditional interview, unless it's an internship.

Code Style and Frameworks

Refusing to incorporate feedback is often just a slice of a much larger problem, where someone is unable to integrate properly into an existing framework being used. There are many ways to build a factory in Factorio, and each one requires standard methods of building pieces. Failing to adhere to standards can very quickly jam up an entire factory, often in subtle ways that aren't necessarily obvious to a careless developer.

In the Main Belt design, a set of 4-8 chunks of belts, divided by 2 spaces to allow for underground belts, is placed in the center of the factory, and all production happens perpendicular to the belt. This design relies on several rules that can wreak havoc if not followed correctly. One, players must always use a splitter to pull items off of a belt, never redirecting the entire belt, otherwise using the empty space for a different belt of items means you'll have permanently lost one entire belt of resources, even after upgrading belts. Two, all factories must be scalable in a direction perpendicular to the main belt. Failing to do this will rapidly result in either a massive waste of space, or a production line that cannot be scaled up because it's surrounded by other production lines.

There are also different ways of building logistic networks. The simplest method is with passive provider chests, but another method uses a storage chest with a filter, which is used to solve the trashed item problem. Both of these methods require properly setting limiters in the right location. Passive provider chests generally are limited by chest space. Storage chests require hooking the inserter for the chest up to the logistics network and ensuring that less than N of an item exists before inserting it. Forgetting to perform these limiting steps is a massive waste of resources. Consistently forgetting to put limiters on outputs is a red flag for someone who is careless about performance in real-world applications.

In other cases, the team may be using some pre-designed blueprints, like a nuclear reactor design, or a bot factory. These can be extremely complex, but as long as people are willing to learn how to use them, they can be huge time-savers. Beware of candidates who don't want to learn how to set up a new item in the bot factory simply because they can't debug the complex logic that drives it, or ones that get frustrated learning how to use a bot factory despite the clear and obvious benefits.

Multithreading

Trains in Factorio are a direct analogue to multithreading: one train is one thread of execution, and each train intersection or train stop is a place in memory where two threads could potentially write at the same time. Train signals are locks, or mutexes. All bugs in train networks manifest in exactly the same way software race conditions do, because they're literally physical race conditions. All of the tradeoffs apply here as well - if you make a lock too large, it slows down your throughput, because now the intersection is blocked for a longer period of time. Incorrectly signaled tracks routinely cause train deadlocks that are exactly the same as a software deadlock, because you end up with a circular lock dependency. The most common deadlock is when a train is too long and unexpectedly blocks a second intersection while waiting to enter one. This second intersection then prevents another train from leaving, preventing the first intersection from ever being unblocked.
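For readers who haven't run into the software version of this, here is a minimal sketch of the circular lock dependency and one standard way to avoid it. The Account type and transfer() function are invented for illustration. Two threads move money in opposite directions, which would deadlock if each one locked its "from" account first and then waited on the other. std::lock acquires both mutexes together using a deadlock-avoidance algorithm, so neither thread can end up holding one lock while waiting forever on the other.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

struct Account
{
  std::mutex lock;
  int balance = 0;
};

// Locks both accounts in one step, so the order we name them in doesn't matter.
void transfer(Account& from, Account& to, int amount)
{
  std::lock(from.lock, to.lock);
  std::lock_guard<std::mutex> a(from.lock, std::adopt_lock);
  std::lock_guard<std::mutex> b(to.lock, std::adopt_lock);
  from.balance -= amount;
  to.balance += amount;
}

// Two threads transfer in opposite directions, the classic circular-wait setup.
int run_transfers()
{
  Account x, y;
  x.balance = 1000;
  y.balance = 1000;
  std::thread t1([&] { for (int i = 0; i < 1000; ++i) transfer(x, y, 1); });
  std::thread t2([&] { for (int i = 0; i < 1000; ++i) transfer(y, x, 1); });
  t1.join();
  t2.join();
  assert(x.balance + y.balance == 2000); // nothing lost or duplicated
  return x.balance;
}
```

In train terms, this is roughly the difference between reserving an entire path through both intersections up front and grabbing them one at a time.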

The number of lanes of track in your network is equivalent to the number of cores available in your CPU. A single rail line is difficult to scale beyond a few threads because the entire system gets throughput limited very quickly, even with wait areas. The most common design is a two-lane design where each lane is a single direction, but this will eventually suffer from throughput issues when you need trains constantly being unloaded. Thus, large bases tend to have at least 4 lanes, with two outer lanes acting as bypasses to avoid the intersection whenever possible.

Missing signal problems in these systems can take a ridiculous amount of time to actually show up. A single missing signal in one rail network once caused a deadlock after functioning correctly for two weeks. This is reminiscent of difficult-to-pin-down race conditions in software that only occur once a month or so when under high contention.

Scaling

Just like in software, scaling up production in Factorio introduces new problems with initial designs, and often requires complete redesigns that can pipe resources into factories as fast as possible, while taking advantage of production modules and speed module beacons. Belt limits become problematic even at the fastest belt speed, forcing players to find ways to split designs up so that more belts can be put in later down the line, or split up their factories into modules.

Handling your logistics network itself becomes a logistics problem in the late game because of how problematic expansive bot networks are. You generally need to start segmenting the logistics network and either using trains to transport items between them, or building a requester chest/provider chest that propagates items across boundaries.

Managing trains in the late game necessitates switching to a pull architecture from a push architecture, because the push architecture can't function in high throughput. This inevitably requires taking advantage of the Train Limits feature and learning how circuit networks can be used to encode basic logic, such that a station only requests a train when it is actually ready to completely fill the train with resources, instead of the common early game tactic of simply telling a bunch of trains to go to stations named “Iron Pickup”. This minimizes the number of trains you need while making sure all stops are served on the network.

Oftentimes, limitations in the number of possible inputs to an assembly machine and inserter speed require redesigning factories around them, just like how high-speed computing requires being aware of subtle bottlenecks in how your CPU works. These bottlenecks are almost never a problem until you reach a certain scale, at which point they begin to dominate your efficiency.

Microservices and Plugin Architectures

Eventually, factories get so enormous they must abandon a simple main belt or spaghetti design and use a more scalable framework instead. To reach Megabase-scale, factories generally either use a train system or a module system, which corresponds roughly to microservices or a plugin-architecture.

A train-based megabase is sometimes referred to as a “city-block” design, where trains surround factory blocks and control all input and output. Thus, each individual city-block is isolated from all other city-blocks, since all their input is “pure” in that it comes from the train network. This is almost identical to a micro-services architecture (over HTTP) or a multi-process design (using IPC), and has similar potential issues with input and output latency, because results cannot be continually provided, they must be emitted in “packets”, or trains along the network.

The plugin architecture seeks to maintain some semblance of a main-belt, but instead splits belts off through the factory and uses modular blocks that take standard inputs and standard outputs. Sometimes this can be achieved entirely through bots, but materials usually need to be belted over long distances. This closely resembles a plugin system for a monolithic application, and has similar tradeoffs.

These megabases mark the extreme upper end of a vanilla Factorio server. However, there are plenty of mods to make things much more complex.

Distributed Systems

Space Exploration is an overhaul of Factorio that adds an entire space segment of the game, and makes planets have limited resources, requiring players to colonize other planets and use rockets to transfer resources between planets. Because of the enormous latency involved with shipping materials between planets, coordinating these different bases winds up having similar problems to a globally distributed database system. Even the circuit network has to contend with latency, because an automatic request system loses track of items that have been launched but haven't yet reached the target planet. Not accounting for this will result in a double-request for all the items you wanted, which is the exact same problem that distributed systems have when trying to ensure consistency.
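The double-request problem is easy to see in a few lines of code. This is only a sketch with made-up function names, not how the mod or any particular database actually works: the naive version looks only at what has arrived, while the latency-aware version also subtracts whatever has been launched but not yet delivered.

```cpp
#include <algorithm>
#include <cassert>

// Naive request: only looks at what has already arrived.
int naive_request(int wanted, int on_site)
{
  return std::max(0, wanted - on_site);
}

// Latency-aware request: also subtracts items already launched but not yet delivered.
int request_with_in_transit(int wanted, int on_site, int in_transit)
{
  assert(in_transit >= 0);
  return std::max(0, wanted - on_site - in_transit);
}
```

With 100 items wanted, 20 on site, and 80 already in flight, the naive version asks for another 80 while the latency-aware version asks for nothing.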

Conclusion

Collectively, the software industry simply has no idea how to hire software developers. Factorio is probably the best technical interview we have right now, and that's embarrassing. It is also wildly impractical, taking over 20 hours in an initial multiplayer playthrough, or 8 hours if you have a lot of people and know what you're doing. What's the takeaway from this? I don't know. We certainly can't switch to using Factorio as an interviewing method - you might as well just give a candidate a take-home assignment.

At the very least, we can do better than whiteboard interviews.


Why You Can't Use Prebuilt LLVM 10.0 with C++17


C++17 introduced an alignment argument to ::operator new(). It's important to note that if you allocate something using aligned new, you absolutely must deallocate it using aligned delete, or the behavior is undefined. LLVM 10.x takes advantage of this alignment parameter, if the compiler supports it. That means if you are compiling on Windows with MSVC set to C++14, __cpp_aligned_new is not defined and the extra argument isn't passed. Otherwise, if it's compiled with MSVC set to C++17, __cpp_aligned_new is defined and the extra argument is passed.

There's just one teeny tiny little problem - the check was in a header file.

Now, if this header file were completely internal and LLVM itself only ever called ::operator new() from inside its libraries, this wouldn't be a problem. As you might have guessed, this was not the case. LLVM was calling allocate_buffer() from inside header files. The corresponding deallocate_buffer call was inside the same header file, but sadly, it was called from a destructor, and that destructor had been called from inside one of LLVM's libraries, which meant it didn't know about aligned allocations… Oops!

This means, if your program uses C++17, but LLVM was compiled with C++14, your program will happily pass LLVM aligned memory, which LLVM will then pass into the wrong delete operator, because it's calling the unaligned delete function from C++14 instead of the aligned delete function from C++17. This results in strange and bizarre heap corruption errors because of the mismatched ::operator new and ::operator delete. Arguably, the real bug here is LLVM calling any allocation function from a header file in the first place, as this is just begging for ABI problems.
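As a rough illustration of the pairing rule, here is a simplified sketch. The two helpers borrow the names allocate_buffer and deallocate_buffer from above, but they are stand-ins written for this example, not LLVM's actual code. Both functions branch on __cpp_aligned_new, which is harmless when they are compiled together. The trouble starts when the allocating half is compiled in one translation unit as C++17 and the deallocating half in another as C++14, because the two halves then take different branches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Simplified stand-in: uses aligned new only when the compiler advertises it.
void* allocate_buffer(std::size_t size, std::size_t alignment)
{
#ifdef __cpp_aligned_new
  return ::operator new(size, std::align_val_t(alignment));
#else
  (void)alignment;
  return ::operator new(size);
#endif
}

// Must take the same branch as allocate_buffer, or the behavior is undefined.
void deallocate_buffer(void* ptr, std::size_t alignment)
{
#ifdef __cpp_aligned_new
  ::operator delete(ptr, std::align_val_t(alignment));
#else
  (void)alignment;
  ::operator delete(ptr);
#endif
}

// Allocates and frees a buffer, reporting whether the allocation looked sane.
bool roundtrip()
{
  void* p = allocate_buffer(256, 64);
  assert(p != nullptr);
  bool ok = true;
#ifdef __cpp_aligned_new
  ok = reinterpret_cast<std::uintptr_t>(p) % 64 == 0; // aligned new honors the request
#endif
  deallocate_buffer(p, 64);
  return ok;
}
```

Keeping both halves of an allocation on the same side of a library boundary is the simplest way to make sure they always agree.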

Of course, one might ask why a C++17 application would be linking against LLVM compiled with C++14. Well, you see, the prebuilt LLVM binaries were compiled against C++14… OOPS! Suddenly if you wanted to avoid compiling LLVM yourself, you couldn't use C++17 anymore!

Luckily this has now been fixed after a bunch of people complained about the extremely unintuitive errors resulting from this mismatch. Unfortunately, as of this writing, LLVM still hasn't provided a new release, which means you will still encounter this problem using the latest precompiled binaries of LLVM. Personally, I recompile LLVM using C++17 for my own projects, just to be safe, since LLVM does not guarantee ABI compatibility in these situations.

Still, this is a particularly nasty case of an unintentional ABI break between C++ versions, which is easy to miss because most people assume C++ versions are backwards compatible. Be careful, and stop allocating memory in header files that might be deallocated inside your library!

