Erik McClure

We Need New Motherboards Before GPUs Collapse Under Their Own Gravity


You can't have a 4-slot GPU. You just can't.

We have finally left sanity behind, with Nvidia's 4000 series cards yielding a “clown car” of absurd GPU designs, as GamersNexus put it. These cards are so huge they need “GPU Support Sticks”, which are an actual real thing now. The fact that we insist on having the GPU interface with the system while hanging off of a single, increasingly absurd PCIe 6.0 x16 slot that can push 128 GBps is completely insane. There is no real ability to just pick the GPU you want and then pair it with a cooler that is actually attached to the motherboard. The entire cooling solution has to be in the card itself, and we are fast reaching the practical limitations here due to gravity and the laws of physics. Top-heavy GPUs are now essentially giant levers pulling on the PCIe slot, with the only possible anchor point above the center of mass being the bracket on one side.

A 4090 series card will demand a whopping 450 W, which dwarfs the Ryzen 9 5900X peak power consumption of only 140 W. That's over 3 times as much power! The graphics card is now drawing more power than the entire rest of the computer! We'll have to wait for benchmarks to be sure, but the laws of thermodynamics suggest that the GPU will now also be producing more heat than every other component of the PC, combined. And this is the thing we have hanging off of a PCIe slot that doesn't have any other way of mounting a cooling solution to the motherboard?!

What the FUCK are we doing?!

Look, I'm not a hardware guy. I just write all the shader code that makes GPUs cry. I don't actually know how we should fix this problem, because I don't know what designs are thermally efficient or not. I do know, however, that something has to change. Maybe we can make motherboards with a GPU slot next to the CPU slot and have a unified massive radiator sitting on top of them - or maybe it's a better idea to put the two processor units on opposite ends of the board. I don't know, just do something so I can use a cooling solution that is actually screwed into the fucking motherboard instead of requiring a “GPU Support Stick” so gravity doesn't rip it out of the PCIe slot.

As an example of alternative approaches, the MXM form factor for laptops allows manufacturers to provide custom cooling solutions appropriate for each laptop.

In fact, the PCIe spec itself actually contains a rear-bracket mount that, if anyone was paying attention, would help address this problem:

[Diagram from the PCIe spec: a card with a rear-bracket retention mechanism, labeled “2”]

See that funky looking metal thing labeled “2” on the diagram? That sure looks like a good alternative to a “support stick” if anyone ever actually paid attention to the spec. Or maybe this second bracket doesn't work very well and we need to rethink how motherboards work entirely. Should we have GPU VRAM slots alongside CPU RAM slots? Is that even possible? (Nope.) Or maybe we can come up with an alternative form factor for GPU cards that you can actually attach to the motherboard with screws?

I have no idea what is or isn't practical, but please, just do something before the GPUs collapse under their own gravity and create strange new forms of matter inside my PC case.


I'll Never Respect My Elders After What They've Done


21 years ago, I walked into my 5th grade teacher's classroom and it was dead silent. The TV was on. Something was happening that I didn't understand, but my instincts told me that This Was Very Wrong. That moment is carved into my memory not because of what actually happened on September 11th, which I was too young to fully grasp, but because the way the adults were behaving signaled to my young mind that this was very important and I needed to remember it.

I have one other distinct memory from that time, which was either someone on the news or a teacher in a classroom explaining that “the goal of the terrorist attack is to make us scared. If we let fear rule us, the terrorists have won.”

I guess I should congratulate the terrorists for winning the war on terror.

The utter lack of rational behavior that followed was something that continued to bother me as I grew up and went to college. I could tell something was wrong, but I wasn't quite sure what it was. As I realized college wasn't going to teach me anything useful, I just started taking more and more difficult classes in an effort to glean math jargon I could Google instead. My faith in institutions was shaken, and continued to decline when I got a job and realized how much of the business world was complete bullshit. However, Obama was president during this time, and as a result I still had some faith that while we clearly had problems, these problems were surmountable. We even signed the Paris agreement in 2015, and I hoped we would finally do something about Climate Change.

Then we elected Trump.

I'm not going to rehash the Trump years, we all know it was a clown show. But somehow, I still believed that Trump had only been elected thanks to massive voter suppression efforts. Then Covid happened, and it taught me a very important lesson: People don't give a shit. We managed to politicize one of the worst pandemics in modern history. We completely refused to obey even a partial lockdown, let alone a full lockdown, or even slow the spread with properly implemented contact tracing and testing, because this required people to act for the sake of their community, instead of themselves. At this point, America is completely incapable of collective action.

Every single year that September 11th rolls around, people point out that 3000 people dying really isn't that many when you compare it to, say, 1750 people dying from firearm homicides every month, which we continue to do nothing about, and every year people say this is in “bad taste” or is “disrespectful”. But now, idiots on social media are trying to say that this somehow isn't the right time to point out that Covid-19 killed ONE MILLION PEOPLE in the United States alone, because we're supposed to stop and remember the 3000 people who died because a terrorist from a group of rebels that we set up in the first place blew up a skyscraper?!

ARE YOU FUCKING KIDDING ME?

You want my respect? How about you explain to me why my friend's boyfriend had to die from a preventable disease just because they lived in Venezuela and never had access to a vaccine that America was hoarding after scientists said this was a bad idea? He's dead now. He's just as dead as whoever was in the World Trade Center when it collapsed. Why are we honoring the people who died from being torn apart in an explosion and not the people who are being torn apart in slow motion by a trillion tiny viruses? Where are the Covid-19 memorials? Why does this country need something to fucking explode before we care about it?!

One of my friends is transgender. Their mom is, shall we say, not supportive, to say the least. Two years ago during the pandemic lockdown, her mom (who works as a nurse) showed up uninvited at her doorstep, without a mask, and essentially barged into her apartment. They messaged me for help, and I showed up to make sure her mom didn't pull any stunts, because we already knew she was extremely manipulative. Meeting this slimy ghoul of a woman for the first time is something I will also never forget, because she made my skin crawl in a way that I can't explain. It was like every word out of her mouth was dripping with deceit. Everything she said was an attempt to control the conversation or weaponize the situation against us. The entire time she misgendered and deadnamed her daughter while showing zero respect to anyone. All she did was act “polite” and then tried to pretend that saying please and thank you means being respectful. As the argument continued, she eventually said The Words.

“You should respect your elders.”

In that moment, it took every ounce of self-control I had to not tell this horrible woman that she had no right to tell me anything. Instead, I simply said “excuse me?”, because this was not my fight - I was there to support my friend's decisions, not my own, and to ensure her mom actually listened to those decisions. We did eventually get her mom out of there, but the fact that someone actually said that has stuck with me, because it reeks of the bullshit I have been fed for the past decade from old people who demand respect yet have done nothing to earn it.

I want to make this absolutely crystal clear: I don't respect any elders. They're destroying the planet and they want me to respect them? Get out. I don't think the older generations understand that you don't ask for my respect, you earn it. One of the very few people from older generations that I do respect are my parents, because they never told me I should respect them. They didn't tell me what to do. They simply tried to be good parents and respected me as a person by letting me make my own decisions, and as a result I deeply respect them, and they respect me, because respect is a two-way street.

The bullshit that conservatives say nowadays reminds me of my friend's mom. It's so patently disingenuous, so obscenely manipulative, so obviously bad-faith that the only reason I take it seriously is because a real person said the same things to me in real life. I don't think they respect anyone. I don't think they care about anyone other than themselves. To conservatives, social behavior is just a game of pretend where everyone lies about everything all the time. It's performance, where you say the words you're supposed to say and do the things you're supposed to do, and if you don't you're a Bad Person.

The worst part is that this bleeds into programming. When I was a senior in high school, I argued that garbage-collected languages were too inefficient for core game systems because they would cause memory fragmentation. I was brushed off because people said that magic heap compaction from the garbage collector would take care of it. 15 years later, all mainstream game engines are still C++ and many have data-oriented architectures specifically designed to minimize memory fragmentation! Despite this, I still see people brushing off younger programmers pointing out that these same problems are precisely why Rust is necessary, but when I step in and make the exact same points, they listen to me, because I'm older and have a job at some game company they've heard of. This means people are not actually listening to the technical substance of an argument, but judging it by the credentials of the author. JeanHeyd Meneide mentions this exact phenomenon in a 2020 talk about problematic behavior in the C++ standards committee, a committee that is supposedly purely about technical merit yet is plagued by intense hypocrisy.

Long ago, we had a hole in the ozone layer caused by CFCs. World governments got together and banned them right before I was born, and now the ozone layer has partially recovered. Today, we can't even hit the Paris Agreement goals, we're dismantling nuclear power and replacing it with coal, and we continue building car-dependent infrastructure while refusing to maintain any of it. We are the architects of our own demise, except I didn't vote for any of this garbage, and other millennials didn't, either. The boomers won't give us health care, they tell us to go to college but won't pay for it, they buy up all the houses, and for good measure they almost took down the entire financial system.

These people want my respect?

Performative social behaviors are not respect, and you are not entitled to my respect, or anyone else's. If you want respect, you can earn it, by respecting the agency of the other person, by actually listening to what they're saying, and by having an honest dialogue instead of arguing over semantics. Maybe if boomers actually cared about other people, they'd get some respect.


Camp Vista - Growing Up Next To Microsoft


This blog was started on LiveJournal shortly after I graduated high school in 2009. It has survived this long because I was very persistent about porting all my posts to Blogger, and then to my own static website. Of course, I have limits. Most of my terrible old posts were removed long ago, because they were so bad they didn't contribute anything. One post was a rant about my high school internship at Microsoft. I took it down because it was bad, but more importantly I took it down because I wrote it before graduating college, while I was still completely delusional.

Let me explain.

I grew up 20 minutes from Microsoft. I had absolutely no idea what an absurdly warped childhood I had until I graduated college and met people who hadn't grown up in a tech bubble. I never questioned how my middle school always somehow had the latest copy of Windows and maintained six different computer labs (one in each classroom hub, plus the library, plus a dedicated lab). I didn't realize how unusual it was for my high school to offer an “intro to game dev course” in 2008, or how ridiculous it was that my AP Computer Science professor had worked on the first version of DirectX at Microsoft. I never thought it was weird that we had Microsoft Connector buses driving around town, creating an entire secondary bus system for the sole purpose of moving around Microsoft employees.

I definitely did not realize how utterly insane it was that, in July 2006, students from my high school (and a few other high schools around the area) were invited to a week long event where we would get to experiment with and “bug test” a beta version of Windows Vista, a full six months before it was released. At 8:00AM every morning, our parents dropped us off at Microsoft Building 40, and we were led into a room filled with rows of desks with computers on them. There were maybe 50 of us, and we were given guest accounts and full internet access while Microsoft employees gave presentations about various new features they had introduced for Vista.

I still remember some of those presentations. One that really stuck with me at the time was Microsoft jumping on the “192kHz sampling rate support” bandwagon, which makes absolutely no sense outside of music production, and in retrospect just seems incredibly dumb. Another presentation had us build wifi mesh networks between the computers, which was touted as a way to share files and communicate with friends in places with little to no internet connectivity. I remember this one both because I thought it was very cool, and because someone managed to bluescreen Windows while attempting to set it up, so they actually had an SDE come in and take a look.

In their attempts to make this a “camp” experience, they gave us all a project on the final day: create our own presentation (using the new Microsoft™ Office™ PowerPoint™ 2007, of course), about some new feature we wanted on Windows or some improvement that wasn't there yet. I don't remember what our presentation was, but I do know we were all terrible. Afterwards we were all invited to the “Windows Vista Consumer Support Assisted Beta”, a predecessor to the Windows Insider program.

This “Camp Vista” thing was a strange, one-time event, but the Hunt The Wumpus competition is still going. If you're lucky enough to be attending school in the Lake Washington School District, you can team up with several other students and get tutoring from a Microsoft employee over the course of several months while you build a very basic video game and compete for Microsoft sponsored prizes. I fondly remember our attempts at building a game using XNA back when it was brand new (it's now dead), and making a bad Descent clone. We tried to use SVN, but weren't allowed to install it on school computers, so we resorted to an FTP folder and e-mailing zip files.

Here we have the crux of the problem with my initial impressions of my high school internship - it's not a normal thing! Students worldwide compete for a chance to get college internships at Microsoft, but the high school interns are just random CS students from the Lake Washington School District. When my team learned I knew quite a bit of programming already, they had to come up with something else for me to do because they had assumed I barely knew how to code. It was just another outreach program for nearby high schools, only available to 31000 kids on the entire planet.

So in the midst of me having an experience that almost no other high school student gets to have, I am complaining about things like “wow, we have too many meetings” and “wow, this software is bad”. Yes, we know Microsoft is a dysfunctional catastrophe, but focusing on issues that are omnipresent in large corporations only serves to detract from the actual crazy parts of the internship, like how the program we were working on, if compiled with no optimizations, took 20 minutes to start. Or the meeting where the entire team spent 15 minutes in person, sitting around an actual table, deciding that one of our function names was too long, debating what to rename it, and then deciding not to change it at all. Or that one time the department head (my boss's boss's boss) took me, an intern, to a meeting with his boss, who I think directly reported to a vice president. It got better, though, because one afternoon, there was a 3-hour period of time where my boss, his boss, and the department head were all gone, so I, a 19-year-old high school student, was technically supposed to take my questions about how to use C# PInvoke to the department head's boss.

I was really careful not to break anything for 3 hours.

Looking back at my teenage years has made me realize how easy it is for people to simply miss how some aspect of their upbringing was deeply unusual. For some, it may be a silent hindrance, but for others, it might be a quiet boon, softly sending opportunities their way that almost no one else has access to. How many opportunities did I let slip by, unaware of how unique they were?


Blockchain Is The New JavaScript


Over 25 years ago, Brendan Eich created JavaScript, named after Java simply because it was popular. Its prototype-based nature and dynamic typing made it unsuitable for anything other than slow interpreters, forcing Google to invent a way to JIT the language just to make it fast. We even built a package manager for JavaScript which managed to create such an absurd dependency hell that one guy's left-pad function managed to break web development for a day.

Now the entire world economy runs on JavaScript.

This is the kind of dumb future that we live in, and it is the kind of dumb future we can look forward to with blockchain. It is why people who insist that blockchain will never work because of various technological reasons are ignoring the simple fact that humanity is incredibly good at standardizing on the worst possible things, then building our entire future on top of them. In fact, there is a disturbing number of direct parallels between the rise of JavaScript and the rise of blockchain technologies, with a few key differences.

JavaScript was originally going to be Scheme, a Lisp dialect. Sadly, management demanded that it look more like Java, so Brendan hid Scheme inside a vaguely Java-esque syntax, which became Mocha, which became LiveScript, which then became JavaScript. This has led to many suffering web developers agonizing over the fact that “We could have had Lisp!”. Even worse, CSS originated in DSSSL, which was another dialect of Scheme for constructing style sheets, but it was deemed too complicated (too many parentheses!), so a much simpler version was created, called CSS. Modern CSS, of course, has simply reinvented everything DSSSL had, except worse.

This is what drives the entire internet, and in turn, Amazon's trillion dollar empire.

In 2009, Satoshi Nakamoto unleashed Bitcoin upon an unsuspecting world. Most people ignored it, for good reason - it was completely impractical for any sort of global transaction system, capable of processing a whopping 7 transactions per second. Then the first bubble happened and Bitcoin skyrocketed to $1000 USD in late 2013. It then promptly crashed to $300 in 2014, after which most people thought it had simply been a passing fad, and ignored it for the next 3 years, until the second bubble of 2017.

But not everyone. After his favorite World of Warcraft spell got nerfed by Blizzard, some maniac Vitalik Buterin argued that the Bitcoin network should support programmable smart contracts, at the peak of the 2013 bubble. Here, we almost had a repeat of JavaScript, nearly bolting on smart contracts to a blockchain algorithm that was never designed for it. However, history thankfully granted us a reprieve, because the Bitcoin developers told Vitalik to take a hike. So, he invented Ethereum, which is like Bitcoin except slightly less stupid. Unfortunately, it still ran on Proof-of-Work, which means burning coal to solve sudokus.

At this point, you may be expecting me to draw a parallel from Google's V8 JavaScript engine to Proof-of-Stake, but that actually isn't correct. Proof-of-Stake doesn't make the network faster, it just lets the network run without wasting astronomical amounts of energy. Proof-of-Stake is more analogous to the introduction of AJAX in 1999, the foundational JavaScript extension that allowed for asynchronous websites, and in turn, the entire modern web. Proof-of-Stake is a change that finally makes Ethereum usable for large scale smart contracts without burning ludicrous amounts of electricity to run the network. This, however, isn't enough, because Ethereum's network is still painfully slow. Granted, at 14 transactions per second, it's at least twice as fast as Bitcoin, but that doesn't really count for much when you need to be several orders of magnitude faster.

So, naturally, just like we did with JavaScript, a bunch of extremely smart people are inventing ways to make Ethereum's horribly slow network go really fast, either by creating Layer 2 Rollups, or via sharding through the proposed Shard Chains. Individually, these optimizations are expected to yield 2000-3000 transactions per second, and if combined, the optimistic estimate is that it will allow hundreds of thousands of transactions per second. We'll have to wait until we see the true results of the speedups, but even the pessimistic estimations expect a 100x increase in transaction speed with existing Layer 2 rollup solutions, which is a pretty big deal.

Of course, if you give an inch, they'll take a mile, and when Web Developers discovered that we could make JavaScript go fast, they started putting entire software solutions on the web. We got Web Apps. We got Electron. Now I've got JavaScript running in my goddamn IDE and there's no end in sight. We've forgotten how to make native apps (and the lack of good UI solutions for anything other than JavaScript hasn't helped, either), so now the Web isn't just inside your Web Browser, it's in all your programs too. We've created a monster, and there's no getting it back in the box.

This, I think, is something we should keep in mind when criticizing “unnecessary” blockchain solutions. Opponents of blockchain correctly point out that there are, in fact, very few things that actually need to be decentralized. Oftentimes, you can achieve decentralization through federation, or a DHT, or some other option instead, without needing an entire decentralized permanent public ledger. In many cases, having a permanent write-only public ledger is objectively worse than existing solutions.

These criticisms are all 100% true and also don't matter. If software actually needed to work well in order for people to use it, nobody would use anything Oracle made. Our future is filled with blockchains that we have spent obscene amounts of time and effort to make fast so we can create centralized decentralized solutions, private public ledgers, and dumb smart contracts. There are good ideas buried underneath this mess, but if we spend our time railing against blockchain instead of implementing them, all we'll get is more left-pad. We only have one chance to make the future suck less, even if we're just proposing less awful blockchain designs.

This is why I have begun learning the basics of blockchain - not because any of this makes sense, or is even a good idea. I simply recognize that the future is coming, and the future is dumb.


C++ Constructors, Memory, and Lifetimes


C++ Initialization Hell

What exactly happens when you write Foo* foo = new Foo();? A lot is packed into this one statement, so let's try to break it down. First, this example is allocating new memory on the heap, but in order to understand everything that's going on, we're going to have to explain what it means to declare a variable on the stack. If you already have a good understanding of how the stack works, and how functions do cleanup before returning, feel free to skip to the new statement.

Stack Lifetimes

The stack is very often glossed over in other imperative languages, despite the fact that those languages still have one (functional languages are an entirely different level of weird). Let's start with something very simple:

int foobar(int b)
{
  int a;
  a = b;
  return a;
}
Here, we are declaring a function foobar that takes an int and returns an int. The first line of the function declares a variable a of type int. This is all well and good, but where is the integer? On most modern platforms, int resolves to a 32-bit integer that takes up 4 bytes of space. We haven't allocated any memory yet, because no new statement happened and no malloc() was called. Where is the integer?

The answer is that the integer was allocated on the stack. If you aren't familiar with the computer science data structure of the same name, your program is given a chunk of memory by the operating system that is organized into a stack structure, hence the name. It's like a stack of plates - you can push items on top of the stack, or you can remove items from the top of the stack, but you can't remove things from the middle of the stack or all the plates will come crashing down. So if we push something on top of the stack, we're stuck with it until we get rid of everything on top of it.

When we called our function, the parameter int b was pushed on to the stack. Parameters take up memory, so on to the stack they go. Hence, before we ever reach the statement int a, 4 bytes of memory were already pushed onto our stack. Here's what our stack looks like at the beginning of the function if we call it with the number 90 (assuming little-endian):

[Diagram: the stack containing int b]

int a tells the compiler to push another 4 bytes of memory on to the stack, but it has no initial value, so the contents are undefined:

[Diagram: the stack containing int a (undefined) and int b]

a = b assigns b to a, so now our stack looks like this:

[Diagram: the stack after a = b, with both int a and int b holding 90]

Finally, return a tells the compiler to evaluate the return expression (which in our case is just a so there's nothing to evaluate), then copy the result into a chunk of memory we reserved ahead of time for the return value. Some programmers may assume the function returns immediately once the return statement is executed - after all, that's what return means, right? However, the reality is that the function still has to clean things up before it can actually return. Specifically, we need to return our stack to the state it was before the function was called by removing everything we pushed on top of it in reverse order. So, after copying our return value a, our function pops the top of the stack off, which is the last thing we pushed. In our case, that's int a, so we pop it off the stack. Our stack now looks like this:

[Diagram: the stack after int a has been popped, leaving only int b]

The moment from which int a was pushed onto the stack to the moment it was popped off the stack is called the lifetime of int a. In this case, int a has a lifetime of the entire function. After the function returns, our caller has to pop off int b, the parameter we called the function with. Now our stack is empty, and the lifetime of int b is longer than the lifetime of int a, because it was pushed first (before the function was called) and popped afterwards (after the function returned). C++ builds its entire concept of constructors and destructors on this concept of lifetimes, and they can get very complicated, but for now, we'll focus only on stack lifetimes.

Let's take a look at a more complex example:

int foobar(int b)
{
  int a;
  
  {
    int x;
    x = 3;
    
    {
      int z;
      int max;
      
      max = 999;
      z = x + b;
      
      if(z > max)
      {
        return z - max;
      }
      
      x = x + z;
    }
    
    // a = z; // COMPILER ERROR!
    
    {
      int ten = 10;
      a = x + ten;
    }
  } 
  
  return a;
}
Let's look at the lifetimes of all our parameters and variables in this function. First, before calling the function, we push int b on to the stack with the value of whatever we're calling the function with - say, 900. Then, we call the function, which immediately pushes int a on to the stack. Then, we enter a new block using the character {, which does not consume any memory, but instead acts as a marker for the compiler - we'll see what it's used for later. Then, we push int x on to the stack. We now have 3 integers on the stack. We set int x to 3, but int a is still undefined. Then, we enter another new block. Nothing interesting has happened yet. We then push both int z and int max on to the stack. Then we assign 999 to int max and assign int z the value x + b - if we passed in 900, this means z is now equal to 903, which is less than the value of int max (999), so we skip the if statement for now. Then we set x to x + z, which makes it 906.

Now things get interesting. Our topmost block ends with a } character. This tells the compiler to pop all variables declared inside that block. We pushed int z on to the stack inside this block, so it's gone now. We cannot refer to int z anymore, and doing so will be a compiler error. int z is said to have gone out of scope. However, we also pushed int max on to the stack, and we pushed it after int z. This means that the compiler will first pop int max off the stack, and only afterwards will it then pop int z off the stack. The order in which this happens will be critical for understanding how lifetimes work with constructors and destructors, so keep it in mind.

Then, we enter another new scope. This new scope is still inside the first scope we created that contains int x, so we can still access x. We define int ten and initialize it with 10. Then we set int a equal to x + ten, which will be 916. Then, our scope ends, and int ten goes out of scope, being popped off the stack. Immediately afterwards, we reach the end of our first scope, and int x is popped off the stack.

Finally, we reach return a, which copies a to our return value memory segment, pops int a, and returns to our caller, who then pops int b. That's what happens when we pass in 900, but what happens if we pass in 9000?

Everything is the same until we reach the if statement, whose condition is now satisfied, which results in the function terminating early and returning z - max. What happens to the stack?

When we reach return z - max, the compiler evaluates the statement and copies the result (8004) out. Then it starts popping everything off the stack (once again, in the reverse order that things were pushed). The last thing we pushed on to the stack was int max, so it gets popped first. Then int z is popped. Then int x is popped. Then int a is popped, the function returns, and finally int b is popped by the caller. This behavior is critical to how C++ uses lifetimes to implement things like smart pointers and automatic memory management. Rust actually uses a similar concept, but it uses it for a lot more than C++ does.

new Statements

Okay, now we know how lifetimes work and where variables live when they aren't allocated, but what happens when you do allocate memory? What's going on with the new statement? To look at this, let's use a simplified example:

int* foo = new int();
Here we have allocated a pointer to an integer on the stack (which will be 8 bytes if you're on a 64-bit system), and assigned the result of new int() to it. What happens when we call new int()? In C++, the new operator is an extension of malloc() from C. This means it allocates memory from the heap. When you allocate memory on the heap, it never goes out of scope. This is what most programmers are familiar with in other languages, except that most other languages handle figuring out when to deallocate it and C++ forces you to delete it yourself. Memory allocated on the heap is just there, floating around, forever, or until you deallocate it. So this function has a memory leak:
int bar(int b)
{
  int* a = new int();
  *a = b;
  return *a;
}
This is the same as our first example, except now we allocate a on the heap instead of the stack. So, it never goes out of scope. It's just there, sitting in memory, forever, until the process is terminated. The new operator looks at the type we passed it (which is int in this case) and calls malloc for us with the appropriate number of bytes. Because int has no constructors or destructors, it's actually equivalent to this:
int bar(int b)
{
  int* a = (int*)malloc(sizeof(int));
  *a = b;
  return *a;
}
Now, people who are familiar with C will recognize that any call to malloc should come with a call to free, so how do we do that in C++? We use delete:
int bar(int b)
{
  int* a = new int();
  *a = b;
  int r = *a;
  delete a;
  return r;
}
IMPORTANT: Never mix new and free or malloc and delete. The new/delete operators can use a different allocator than malloc/free, so things will violently explode if you treat them as interchangeable. Always free something from malloc and always delete something created with new.

Now we aren't leaking memory, but we also can't do return *a anymore, because it's impossible for us to do the necessary cleanup. If we were allocating on the stack, C++ would clean up our variable for us after the return statement, but we can't put anything after the return statement, so there's no way to tell C++ to copy the value of *a and then manually delete a without introducing a new variable r. Of course, if we could run arbitrary code when our variable went out of scope, we could solve this problem! This sounds like a job for constructors and destructors!

Constructors and delete

Okay, let's put everything together and return to our original statement in a more complete example:

struct Foo
{
  // Constructor for Foo
  Foo(int b)
  {
    a = b;
  }
  // Empty Destructor for Foo
  ~Foo() {}
  
  int a;
};

int bar(int b)
{
  // Create
  Foo* foo = new Foo(b);
  int a = foo->a;
  // Destroy
  delete foo;
  return a; // Still can't return foo->a
}
In this code, we still haven't solved the return problem, but we are now using constructors and destructors, so let's walk through what happens. First, new allocates memory on the heap for your type. Foo contains a 32-bit integer, so that's 4 bytes. Then, after the memory is allocated, new automatically calls the constructor that matches whatever parameters you pass to the type. Your constructor doesn't need to allocate any memory to contain your type, since new already did this for you. Then, this pointer is assigned to foo. Then we delete foo, which calls the destructor first (which does nothing), and then deallocates the memory. If you don't pass any parameters when calling new Type(), or you are creating an array, C++ will simply call the default constructor (a constructor that takes no parameters). This is all equivalent to:
int bar(int b)
{
  // Create
  Foo* foo = (Foo*)malloc(sizeof(Foo));
  new (foo) Foo(b); // Special new syntax that ONLY calls the constructor function (this is how you manually call constructors in C++)
  int a = foo->a; 
  // Destroy
  foo->~Foo(); // We can, however, call the destructor function directly
  free(foo);
  
  return a; // Still can't return foo->a
}
This uses a special new syntax that doesn't allocate anything and simply lets us call the constructor function directly on our already allocated memory. This is what the new operator is doing for you under the hood. We then call the destructor manually (which you can do) and free our memory. Of course, this is all still useless, because we can't return the integer we allocated on the heap!

Destructors and lifetimes

Now, the magical part of C++ is that constructors and destructors are run when things are pushed or popped from the stack [1]. The fact that constructors and destructors respect variable lifetimes allows us to solve our problem of cleaning up a heap allocation upon returning from a function. Let's see how that works:

struct Foo
{
  // Default constructor for Foo
  Foo()
  {
    a = new int();
  }
  // Destructor frees memory we allocated using delete
  ~Foo()
  {
    delete a;
  }
  
  int* a;
};

int bar(int b)
{
  Foo foo;
  *foo.a = b;
  return *foo.a; // Doesn't leak memory!
}
How does this avoid leaking memory? Let's walk through what happens: First, we declare Foo foo on the stack, which pushes 8 bytes on to the stack (the size of a pointer on a 64-bit system), and then C++ calls our default constructor. Inside our default constructor, we use new to allocate a new integer and store it in int* a. Returning to our function, we then set the integer that foo.a points to, to b. Then, we return the value stored in foo.a from the function[2]. This copies the value out of foo.a first by dereferencing the pointer, and then C++ calls our destructor ~Foo before Foo foo is popped off the stack. This destructor deletes int* a, ensuring we don't leak any memory. Then we pop off int b from the stack and the function returns. If we could somehow do this without constructors or destructors, it would look like this:
int bar(int b)
{
  Foo foo;
  foo.a = new int();
  *foo.a = b;
  int retval = *foo.a;
  delete foo.a;
  return retval;
}
The ability to run a destructor when something goes out of scope is an incredibly important part of writing good C++ code, because when a function returns, all your variables go out of scope when the stack is popped. Thus, all cleanup that is done during destructors is guaranteed to run no matter when you return from a function. Destructors are guaranteed to run even when you throw an exception! This means that if you throw an exception that gets caught farther up in the program, you won't leak memory, because C++ ensures that everything on the stack is correctly destroyed when processing exception handling, so all destructors are run in the same order they normally are.
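
To see that guarantee in action, here's a minimal sketch (the names Unwind and thrower are made up for illustration) where a destructor runs during stack unwinding before the exception is even caught:
#include <cstdio>
#include <stdexcept>

struct Unwind
{
  Unwind() { a = new int(); }
  ~Unwind()
  {
    delete a;                // no leak, even though we never reach a return statement
    printf("~Unwind ran\n");
  }
  
  int* a;
};

void thrower()
{
  Unwind u;                           // u's lifetime begins here
  throw std::runtime_error("oops");   // stack unwinding destroys u before the handler runs
}

int main()
{
  try
  {
    thrower();
  }
  catch(const std::runtime_error&)
  {
    printf("caught\n"); // prints "~Unwind ran" first, then "caught"
  }
  return 0;
}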

This is the core idea behind smart pointers - if a pointer is stored inside an object, and that object deletes the pointer in the destructor, then you will never leak the pointer because C++ ensures that the destructor will eventually get called when the object goes out of scope. Now, if implemented naively there is no way to pass the pointer into different functions, so the utility is limited, but C++11 introduced move semantics to help solve this issue. We'll talk about those later. For now, let's talk about different kinds of lifetimes and what they mean for when constructors and destructors are called.
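
Before moving on, here's a rough sketch of that smart pointer idea - a simplified, hypothetical wrapper, not the real std::unique_ptr:
struct IntPtr
{
  explicit IntPtr(int* p) { ptr = p; } // take ownership of a heap allocation
  ~IntPtr() { delete ptr; }            // free it whenever we go out of scope
  
  // Forbid copies, otherwise two owners would both try to delete the same pointer
  IntPtr(const IntPtr&) = delete;
  IntPtr& operator=(const IntPtr&) = delete;
  
  int* ptr;
};

int bar(int b)
{
  IntPtr a(new int()); // the heap allocation is owned by a stack object
  *a.ptr = b;
  return *a.ptr;       // ~IntPtr runs after the return value is copied, so nothing leaks
}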

Static Lifetimes

Because any struct or class in C++ can have constructors or destructors, and you can put structs or classes anywhere in a C++ program, this means that there are rules for how to safely invoke constructors and destructors in all possible cases. These different possible lifetimes have different names. Global variables, or static variables inside classes, have what's called “static lifetime”, which means their lifetime begins when the program starts and ends once the program exits. The exact order these constructors are called, however, is a bit tricky. Let's look at an example:

struct Foo
{
  // Default constructor for Foo
  Foo()
  {
    a = new int();
  }
  // Destructor frees memory we allocated using delete
  ~Foo()
  {
    delete a;
  }
  
  int* a;
  static Foo instance;
};

Foo Foo::instance; // the static member declared inside Foo still needs a definition at namespace scope

static Foo GlobalFoo;

int main()
{
  *GlobalFoo.a = 3;
  *Foo::instance.a = *GlobalFoo.a;
  return *Foo::instance.a;
}
When is instance constructed? When is GlobalFoo constructed? Can we safely assign to GlobalFoo.a immediately? The answer is that all static lifetimes are constructed before your program even starts, or more specifically, before main() is called. Thus, by the time your program has reached your entry point (main()), C++ guarantees that all static lifetime objects have already been constructed. But what order are they constructed in? This gets complicated. Basically, static variables are constructed in the order they are declared in a single .cpp file. However, the order these .cpp files are constructed in is undefined. So, you can have static variables that rely on each other inside a single .cpp file, but never between different files.
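
Here's a small sketch of that within-one-file ordering, using hypothetical Logger and Database types:
struct Logger
{
  Logger() { /* constructed first, before main() runs */ }
};

struct Database
{
  Database() { /* constructed second - it could safely use log, because log is declared above it */ }
};

// Both declared in the same .cpp file, so they are constructed top-to-bottom
static Logger log;
static Database db;

// If log and db lived in two different .cpp files, the construction order between them
// would be undefined, and db's constructor could not rely on log existing yet.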

Likewise, all static lifetime objects get deconstructed after your main() function returns, and once again the exact order is not fully specified across files, although it will be the reverse of the order they were constructed in. Technically this should be respected even if an exception occurs, but because the compiler can assume the process will terminate immediately after an unhandled exception occurs, this is unreliable.

Static lifetimes still apply for shared libraries, and are constructed the moment the library is loaded into memory - that's LoadLibrary on Windows and dlopen on Linux. Most platforms provide a custom function that fires when the shared library is loaded or unloaded, and these functions fall outside of the C++ standard, so there's no guarantee about whether the static constructors have actually been called when you're inside DllMain, but almost nobody actually needs to worry about those edge cases, so for any normal code, by the time any function in your DLL can be called by another program, you can rest assured all static and global variables have had their constructors called. Likewise, they are destructed when the shared library is unloaded from memory.

While we're here, there are a few gotchas in the previous example that junior programmers should know about. You'll notice that I did not write static Foo* GlobalFoo = new Foo(); - this will leak memory! In this case, C++ doesn't actually call the destructor because Foo doesn't have a static lifetime, the pointer it's stored in does! So the pointer will get its constructor called before the program starts (which does nothing, because it's a primitive), and then the pointer will have its destructor called after main() returns, which also does nothing, which means Foo never actually gets deconstructed or deallocated. Always remember that C++ is extremely picky about what you do. C++ won't magically extend Foo's lifetime to the lifetime of the pointer, it will instead do exactly what you told it to do, which is to declare a global pointer primitive.

Another thing to avoid is accidentally writing Foo::instance.a = GlobalFoo.a;, because this doesn't copy the integer, it copies the pointer from GlobalFoo to Foo::instance. This is extremely bad, because now Foo::instance will leak its own pointer and instead try to free GlobalFoo's pointer, which was already deleted by GlobalFoo, so the program will crash, but only AFTER successfully returning 3. In fact, it will crash outside of the main() function completely, which is going to look very weird if you don't know what's going on.
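
To spell that difference out with the same variables, the safe and unsafe versions differ by a single dereference:
*Foo::instance.a = *GlobalFoo.a; // copies the integer value - safe
Foo::instance.a = GlobalFoo.a;   // copies the pointer - both destructors now try to delete the same allocation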

Implicit Constructors and Temporary Lifetimes

Lifetimes in C++ can get complicated, because they don't just apply to function blocks, but also function parameters, return values, and expressions. This means that, for example, if we are calling a function, and we construct a new object inside the function call, there is an implicit lifetime that exists for the duration of the function call, which is well-defined but very weird unless you're aware of exactly what's going on. Let's look at a simple example of a function call that constructs an object:

class Foo
{
public:
  // Implicit constructor for Foo
  Foo(int b)
  {
    a = b;
  }
  // Empty Destructor for Foo
  ~Foo() {}
  
  int a;
};

int get(Foo foo)
{
  return foo.a;
}

int main()
{
  return get(3);
}
To understand what's going on here, we need to understand implicit constructors, which are a “feature” of C++ you never wanted but got anyway. In C++, all constructors that take exactly 1 argument are implicit, which means the compiler will attempt to call them to satisfy a type conversion. In this case, we are trying to pass 3 into the get() function. 3 has the type int, but get() takes an argument of type Foo. Normally, this would just cause an error, because the types don't match. But because we have a constructor for Foo that takes an int, the compiler actually calls it for us, constructing an object of type Foo and passing it into the function! Here's what it looks like if we do this ourselves:
int main()
{
  return get(Foo(3));
}
C++ has “helpfully” inserted this constructor for us inside the function call. So, now that we know our Foo object is being constructed inside the function call, we can ask a different question: When does the constructor get called, exactly? When is it destructed? The answer is that all the expressions in your function call are evaluated first, from left-to-right. Our expression allocated a new temporary Foo object by pushing it onto the stack and then calling the constructor. However, do be aware that compilers aren't always so great about respecting initialization order in function calls or other initialization lists. But, ostensibly, they're supposed to be evaluated from left-to-right.

So, once all expressions inside the parameters have been evaluated, we then push the parameters on to the stack and copy the results of the expressions into them, allocate space on the stack for the return value, and then we enter the function. Our function executes, copies a return value into the space we reserved, finishes cleaning up, and returns. Then we do something with the return value and pop our parameters off the stack. Finally, after all the function parameter boilerplate has been finished, our expressions go out of scope in reverse order. This means that destructors are called from right-to-left after the function returns. This is all roughly equivalent to doing this:

int main()
{
  int b;
  {
    Foo a = Foo(3); // Construct Foo
    b = get(a); // Call function and copy result
  } // Deconstruct Foo
  return b;
}
This same logic works for all expressions - if you construct a temporary object inside an expression, it exists for the duration of the expression. However, the exact order that C++ evaluates expressions is extremely complicated and not always defined, so this is a bit harder to nail down. Generally speaking, an object gets constructed right before it's needed to evaluate the expression, and gets deconstructed afterwards. These are “temporary lifetimes”, because the object only briefly exists inside the expression, and is deconstructed once the expression is evaluated. Because C++ expressions are not always ordered, you should not attempt to rely on any sort of constructor order for arbitrary expressions. As an example, we can inline our previous get() function:
int main()
{
  return Foo(3).a;
}
This will allocate a temporary object of type Foo, construct it with 3, copy out the value from a, and then deconstruct the temporary object before the return statement is evaluated. For the most part, you can just assume your objects get constructed before the expression happens and get destructed after it happens - try not to rely on ordering more specific than that. The specific ordering rules are also changing in C++20 to make it more strict, which means how strict the ordering is will depend on what compiler you're using until everyone implements the standard properly.

For the record, if you don't want C++ “helpfully” turning your constructors into implicit ones, you can use the explicit keyword to disable that behavior:

struct Foo
{
  explicit Foo(int b)
  {
    a = b;
  }
  ~Foo() {}
  
  int a;
};

Static Variables and Thread Local Lifetimes

Static variables inside a function (not a struct!) operate by completely different rules, because this is C++ and consistency is for the weak.

struct Foo
{
  explicit Foo(int b)
  {
    a = b;
  }
  ~Foo() {}
  
  int a;
};

int get()
{
  static Foo foo(3);
  
  return foo.a;
}

int main()
{
  return get() + get();
}
When is foo constructed? It's not when the program starts - it's actually only constructed the first time the function gets called. C++ injects some magic code that stores a global flag saying whether or not the static variable has been initialized yet. The first time we call get(), it will be false, so the constructor is called and the flag is set to true. The second time, the flag is true, so the constructor isn't called. So when does it get destructed? After main() returns and the program is exiting, just like global variables!
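
Conceptually, the injected code behaves something like this hand-written sketch (the real compiler-generated version also makes the initialization thread-safe and arranges for the destructor to run at program exit):
int get()
{
  static bool initialized = false; // the hidden flag
  static Foo* foo = nullptr;
  
  if(!initialized)
  {
    foo = new Foo(3); // the constructor only runs on the first call
    initialized = true;
  }
  
  return foo->a;
}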

Now, this static initialization is guaranteed to be thread-safe, but that's only useful if you intend to share the value through multiple threads, which usually doesn't work very well, because only the initialization is thread-safe, not accessing the variable. C++ has introduced a new lifetime called thread_local which is even weirder. Thread-local static variables only exist for the duration of the thread they belong to. So, if you have a thread-local static variable in a function, it's constructed the first time you call the function on a per-thread basis, and destroyed when each thread exits, not when the program does. This means you are guaranteed to have a unique instance of that static variable for each thread, which can be useful in certain concurrency situations.
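
Here's a short sketch of the difference, using a hypothetical next_id() function:
#include <cstdio>
#include <thread>

int next_id()
{
  // One counter per thread, constructed the first time each thread calls next_id()
  thread_local int counter = 0;
  return ++counter;
}

int main()
{
  printf("main thread: %d\n", next_id()); // prints 1
  printf("main thread: %d\n", next_id()); // prints 2
  
  std::thread t([]{
    printf("other thread: %d\n", next_id()); // prints 1 again - this thread gets its own counter
  });
  t.join();
  return 0;
}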

I'm not going to spend any more time on thread_local because to understand it you really need to know how C++ concurrency works, which is out of scope for this blog post. Instead, let's take a brief look at Move Semantics.

Move Semantics

Let's look at C++'s smart pointer implementation, unique_ptr<>.

int get(int* b)
{
  return *b;
}

int main()
{
  std::unique_ptr<int> p(new int());
  *p = 3;
  int a = get(p.get());
  return a;
}
Here, we allocate a new integer on the heap by calling new, then store it in unique_ptr. This ensures that when our function returns, our integer gets freed and we don't leak memory. However, the lifetime of our pointer is actually excessively long - we don't need our integer pointer after we've extracted the value inside get(). What if we could change the lifetime of our pointer? The actual lifetime that we want is this:
int get(int* b)
{
  return *b;
  // We want the lifetime to end here
}

int main()
{
  // Lifetime starts here
  std::unique_ptr<int> p(new int());
  *p = 3;
  int a = get(p.get());
  return a;
  // Lifetime ends here
}
We can accomplish this by using move semantics:
int get(std::unique_ptr<int>&& b)
{
  return *b;
  // Lifetime of our pointer ends here
}

int main()
{
  // Lifetime of our pointer starts here
  std::unique_ptr<int> p(new int());
  *p                = 3;
  int a             = get(std::move(p));
  return a;
  // Lifetime of p ends here, but p is now empty
}
By using std::move, we transfer ownership of our unique_ptr to the function parameter. Now the get() function owns our integer pointer, so as long as we don't move it around again, it will go out of scope once get() returns, which will delete it. Our previous unique_ptr variable p is now empty, and when it goes out of scope, nothing happens, because it gave up ownership of the pointer it contained. This is how you can implement automatic memory management in C++ without needing to use a garbage collector, and Rust actually uses a more sophisticated version of this built into the compiler.

Move semantics can get very complex and have a lot of rules surrounding how temporary values work, but we're not going to get into all that right now. I also haven't gone into the many different ways that constructors can be invoked, and how those constructors interact with the different ways you can initialize objects. Hopefully, however, you now have a grasp of what lifetimes are in C++, which is a good jumping off point for learning about more advanced concepts.


[1] Pedantic assembly-code analysts will remind us that the stack allocations usually happen exactly once, at the beginning of the function, and then are popped off at the very end of the function, but the standard technically doesn't even require a stack to exist in the first place, so we're really talking about pushing and popping off the abstract stack concept that the language uses, not what the actual compiled assembly code really does.

[2] We're dereferencing the pointer here because we want to return the value of the pointer, not the pointer itself! If you tried to return the pointer itself from the function, it would point to freed memory and crash after the function returned. Trying to return pointers from functions is a common mistake, so be careful if you find yourself returning a pointer to something. It's better to use unique_ptr to manage lifetimes of pointers for you.

