Garbage collectors suck for games

"Sure, it might beat hard-to-find (mostly C++) memory leaks, but if you make it a habit of writing the free statement right after the create, you've just solved that problem. And some profiling will take care of the rest."

Things are rarely that simple in real applications.

Personally, I've written millions of lines of assembler and probably tens of millions of lines of C/C++. I'm very aware of the errors people make, and I still don't write error-free memory management code 100% of the time.

The obvious traditional issue in games is a messaging system. Messaging systems usually keep lists of recipients, but while walking the list and sending the messages, recipients in the list can be created or destroyed. The messaging system doesn't usually own the objects, but the implementation becomes an order of magnitude easier if it at least partially owns them.
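To make that concrete, here is a minimal C++ sketch (class and method names are mine, not from any real engine) of the usual workaround: the bus queues subscribe/unsubscribe requests and applies them between dispatches, so handlers can safely add or remove recipients while the list is being walked.

```cpp
#include <algorithm>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical message bus: edits to the recipient list are deferred, so the
// list is frozen while it is being walked.
class MessageBus {
public:
    using Handler = std::function<void(int)>;

    int subscribe(Handler h) {
        int id = next_id_++;
        pending_adds_.push_back(Recipient{id, std::move(h)});
        return id;
    }

    void unsubscribe(int id) { pending_removes_.push_back(id); }

    void dispatch(int msg) {
        apply_edits();               // edits queued since the last dispatch
        for (auto& r : recipients_)  // safe: no mutation during the walk
            r.handler(msg);
        apply_edits();               // edits the handlers themselves made
    }

private:
    struct Recipient { int id; Handler handler; };

    void apply_edits() {
        for (auto& r : pending_adds_) recipients_.push_back(std::move(r));
        pending_adds_.clear();
        for (int id : pending_removes_)
            recipients_.erase(
                std::remove_if(recipients_.begin(), recipients_.end(),
                               [id](const Recipient& r) { return r.id == id; }),
                recipients_.end());
        pending_removes_.clear();
    }

    std::vector<Recipient> recipients_;
    std::vector<Recipient> pending_adds_;
    std::vector<int> pending_removes_;
    int next_id_ = 0;
};
```

Note the bus still doesn't own the recipients here; it only owns the list entries, which is the "partial ownership" compromise described above.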

There's always the ownership question between the "Game object manager" and the hierarchy of in-game objects. It invariably leads to notification patterns, weak pointers, etc.

And that's before you get into 3rd party libraries or just internal code written by someone else with unclear memory management semantics.
Yes, I know, I've encountered many things like that as well. It might be better to put it like this:

"First figure out the lifecycle of objects and write the code to create, handle and dispose them, before actually using them."

There are lots of border cases, but I tend to always create a manager for each (class of) object(s). And yes, COM objects (or in general everything using an interface), or a factory or library that isn't very specific about who creates/owns/disposes anything, are a pain to manage.
 
What do you mean by this? An instance property or a type property? If it's the latter, you have that with "base" in C# or "MyBase" (ugh) in VB. If you mean the former, you could add one if your model requires it. EDIT: On second thought, you can't mean this.
The global .NET class hierarchy doesn't have a property to record which object created an object, it uses reflection instead.

Which is a big pain, as you first have to figure out which component model it resides in (for example, WPF uses a different one than plain .NET), which object hierarchy should be used to track the owner (WPF uses multiple ones, depending), which class it actually is (that's easy), and then cascade through all the base classes until you find one you recognize (as bad as it sounds). And all the different hierarchies have different functions/owner-class methods to do that!

A simple property that stores the owner, and a simple check whether one of its base classes is the supplied one, would solve a lot of problems.
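A minimal C++ sketch of that idea (names are illustrative, nothing here is a real .NET API): a common base class stores a non-owning back-pointer to its creator, and an ancestor check is just a walk up the owner chain.

```cpp
// Hypothetical base class: every object records who created it, so
// "is X (transitively) owned by Y?" is a simple pointer walk, no reflection.
class Owned {
public:
    explicit Owned(Owned* owner = nullptr) : owner_(owner) {}
    virtual ~Owned() = default;

    Owned* owner() const { return owner_; }

    // True if `candidate` appears anywhere in this object's owner chain.
    bool is_owned_by(const Owned* candidate) const {
        for (const Owned* o = owner_; o != nullptr; o = o->owner_)
            if (o == candidate) return true;
        return false;
    }

private:
    Owned* owner_;  // non-owning back-pointer; the owner controls lifetime
};
```

The back-pointer is deliberately non-owning: whoever created the object remains responsible for disposing it, so the property records ownership without changing it.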

I created my own object hierarchy from scratch for a project that had a class hierarchy of close to 100 classes, many of them (close to) equal to an existing .NET class, simply because it was too complex to track the hierarchy.

And they had to be serialized, which is a big problem as well (for the same reason). I ended up writing my own enumerator and serializer for all of them, too.

I may be misunderstanding you, but there is static analysis in the compiler; that's exactly why you have to test and then cast so that the compiler lets you access a method from an object with a different (parent) statically declared type. It's only when you dwell in the dark recesses of the new dynamic language features that this (obviously) starts to break down. Example: you only need to check whether an instance has a method before calling it if you're using siblings WITHOUT using interfaces. The only other reason would be declaring your instances as Object rather than their specific type (or super-type). Both are drawbacks/advantages of OOP, not of any specific language. In fact, C++ tries to work around some of that with multiple inheritance, but that brings problems of its own.
Well, that pretty much breaks down completely if you're using factories, COM, WCF, WPF and the other new additions, because in that case the type checking is done at RUNTIME, through reflection. Which means you have to do it yourself if you don't want your program to break.

If you find yourself continuously checking for type/method topology, mayhap your model needs rearrangement?
I think the .NET model needs it. :)

You avoid this problem by always instantiating objects rather than declaring them first and assigning the ref later. Although that isn't always optimal, it's an option. It's the same reason I always assign a value to primitives even though I know Java/.NET pre-clears the memory to 0x0; old C habits die hard :|.
Agreed. I do the same as well.

Also, see your comment on your other post:

Instantiate most, profile/debug the rest. Your method flow is also important. The use of out parameters in .NET is another robust way to guarantee the ref you were just handed is not null, and it's a little more elegant than instantiating something just for the sake of it.
Yes, and that works well, unless you're trying to call unmanaged code. Even worse: the declaration of that unmanaged function tends to depend on the .NET version used.

I don't know how Minecraft was coded, but from my dealings with XNA, you never want to instantiate during the game loop, similar to never allocating buffers mid-frame in C/C++. This way the GC has very little to do when it matters most. Obviously you're trading load times for framerate stability, but in games this is worth it.
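The standard way to get "no allocation during the loop" is an object pool. A minimal C++ sketch (the Particle type and pool API are hypothetical): all slots are allocated up front at load time, so acquire/release inside the frame never touch the heap (or, in a managed language, never give the GC new work).

```cpp
#include <cstddef>
#include <vector>

struct Particle {
    float x = 0, y = 0;
    bool alive = false;
};

// Fixed-capacity pool: one big allocation at load time, then only index
// bookkeeping at runtime.
class ParticlePool {
public:
    explicit ParticlePool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        for (std::size_t i = 0; i < capacity; ++i) free_.push_back(i);
    }

    Particle* acquire() {
        if (free_.empty()) return nullptr;  // pool exhausted: caller decides
        Particle* p = &slots_[free_.back()];
        free_.pop_back();
        p->alive = true;
        return p;
    }

    void release(Particle* p) {
        p->alive = false;
        free_.push_back(static_cast<std::size_t>(p - slots_.data()));
    }

    std::size_t in_use() const { return slots_.size() - free_.size(); }

private:
    std::vector<Particle> slots_;    // allocated once, at load time
    std::vector<std::size_t> free_;  // indices of unused slots
};
```

The capacity cap is the trade-off being described: you pay memory and load time up front to keep the per-frame cost flat.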

Having said that, a GC isn't an excuse to neglect memory management. I've seen lots of managed code that simply forgets about handles.
Or writing and calling a dispose method for anything that even touches things like handles ;).
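The C++ answer to "a dispose method you can't forget to call" is RAII. A minimal sketch wrapping a C stdio handle (the File class name is mine): the destructor plays the role of Dispose and runs automatically when the wrapper leaves scope, even on an exception path.

```cpp
#include <cstdio>

// RAII wrapper: the handle is closed in the destructor, so "forgetting the
// dispose call" is not possible.
class File {
public:
    File(const char* path, const char* mode)
        : f_(std::fopen(path, mode)) {}
    ~File() { if (f_) std::fclose(f_); }

    File(const File&) = delete;             // exactly one owner of the handle
    File& operator=(const File&) = delete;

    bool is_open() const { return f_ != nullptr; }
    std::FILE* get() const { return f_; }

private:
    std::FILE* f_;
};
```

This is roughly what C#'s `using`/`IDisposable` pattern emulates, except there the compiler won't stop you from skipping the `using`.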

Not knowing the details, this could mean a number of things. For instance, were you using direct DB access, ADO.NET's managed controls, a web service, etc. to access the DB? The most similar app I did in .NET was an "OLAP" cube filter with a 2 GB dataset of just 7 dimensions in MSSQL, using ADO.NET's bulky managed controls, running on a 1 GB VM. I didn't have memory problems, for what it's worth. :shrug: The thing took over 10 seconds for some of the simpler queries and chewed through the page file like crazy, though.
At that time, the GC only kicked in when less than 20% of system memory was free, and it did so at its leisure (i.e.: slowly). And ADO.NET was very dynamic: it created/freed every single resource it needed. (MS SQL Reporting Services is written in C#.)

When the total amount of memory used by the .NET runtime exceeded the total amount of RAM available, Windows killed it.
 
The global .NET class hierarchy doesn't have a property to record which object created an object, it uses reflection instead. <snip> A simple property that stores the owner, and a simple check whether one of its base classes is the supplied one, would solve a lot of problems.

Thanks, I understand now. Let me ask you: why exactly do you need to know which instance created which other instance? Which problems would it solve? I recognise the value in determining object lifetime, etc., but aside from plumbing, what are the advantages you are thinking about? See below, calling unmanaged code.

Well, that pretty much breaks down completely if you're using factories, COM, WCF, WPF and the other new additions, because in that case the type checking is done at RUNTIME, through reflection. Which means you have to do it yourself if you don't want your program to break.

Hate COM. But WCF, WPF, even WinForms in some places are highly abstract, so some measure of dynamic type validation must happen.

I think the .NET model needs it. :)

Sure, no argument there, but I doubt any other framework of this size and capability is much better, heh. Btw, have you looked at .NET 4 in this regard?

Yes, and that works well, unless you're trying to call unmanaged code. Even worse: the declaration of that unmanaged function tends to depend on the .NET version used.

I haven't had many opportunities or need to employ P/Invoke and marshalling, mostly because any time I have to look at MFC or even SDK code I get flashbacks to the bad C++ days. I mostly use it for the 7/Vista bling-bling MS keeps leaving out of WinForms. Since you need lower-level access, you'll definitely require more flexibility and more information on the object tree.

Or writing and calling a dispose method for anything that even touches things like handles ;).

I take it you don't like having to debug why your program crashes reading from disk with a null exception only 10% of the time. :devilish: Dispose is Chaotic Evil. I'd rather have objects taking up memory than try to subvert the GC and have it blow up in my face!

When the total amount of memory used by the .NET runtime exceeded the total amount of RAM available, Windows killed it.

Physical RAM? How odd.
 
Just a note on the .NET GC, unmanaged interop and memory usage.

My biggest problem is the totally passive way it does collection; it will wait until it gets within some threshold of the limit before freeing objects.

You can actually restrict the maximum amount of memory the .NET runtime will use, but you have to write a host. It's <100 lines of C++.

If the runtime was killed, it was because it couldn't allocate enough memory for the "Out Of Memory" exception to be thrown. This means the GC could not free any memory, so you had a leak.

If you're making calls into unmanaged code, especially through COM interop with pinned memory, it's extremely easy to write leaks, and they are extremely hard to track down.

Having had to write COM wrappers for external libraries, it's astonishing to me how P/Invoke just works most of the time.
 
If the runtime was killed, it was because it couldn't allocate enough memory for the "Out Of Memory" exception to be thrown. This means the GC could not free any memory, so you had a leak.
Except that doesn't happen: the OutOfMemoryException is a singleton, so it won't fail to be thrown. Any unhandled exception will cause Windows to terminate the process, even an OutOfMemoryException thrown in the handler of an OutOfMemoryException which tried to allocate a new object.
 
The global .NET class hierarchy doesn't have a property to record which object created an object, it uses reflection instead.
Introducing a global parent property would be highway one into leak hell. In order to have an object safely collected by the GC you would have to delete all references to that object and null the owner property. As long as ownership is clear that might not be a problem, but when objects start getting passed around between domain code, framework code and third-party libraries things would get ugly in no time. And I haven't even mentioned the non-obvious cases like anonymous delegates that get passed around. So, sorry, but dumb idea.

And what do you mean with "it uses reflection instead"? Where does .NET use reflection in order to find an owner object?

At that time, the GC only kicked in when less than 20% of system memory was free, and it did so at its leisure (i.e.: slowly). And ADO.NET was very dynamic: it created/freed every single resource it needed. (MS SQL Reporting Services is written in C#.)

When the total amount of memory used by the .NET runtime exceeded the total amount of RAM available, Windows killed it.
Sounds like a leak to me.

Just a note on the .NET GC, unmanaged interop and memory usage.

My biggest problem is the totally passive way it does collection; it will wait until it gets within some threshold of the limit before freeing objects.

You can actually restrict the maximum amount of memory the .NET runtime will use, but you have to write a host. It's <100 lines of C++.
You can actually influence (lower) the threshold by calling GC.AddMemoryPressure(). It's far from perfect, but it's a start.
 
Introducing a global parent property would be highway one into leak hell. In order to have an object safely collected by the GC you would have to delete all references to that object and null the owner property. As long as ownership is clear that might not be a problem, but when objects start getting passed around between domain code, framework code and third-party libraries things would get ugly in no time. And I haven't even mentioned the non-obvious cases like anonymous delegates that get passed around.
Exactly. That's the whole idea about ownership, and it's exactly what you want to happen, unless you have a very weird idea about object/memory management.

So, sorry, but dumb idea.
So, sorry, but dumb idea.

:D

And what do you mean with "it uses reflection instead"? Where does .NET use reflection in order to find an owner object?
Well, everywhere and all the time? Even the GC uses it.

Sounds like a leak to me.
Well, it happened with the standard MS SQL Reporting Services, as written by Microsoft themselves. I tried to patch things up to stop it from happening, but had little success.

So, if it's a leak, it was as intended (TM), or at least not mine.

You can actually influence (lower) the threshold by calling GC.AddMemoryPressure(). It's far from perfect, but it's a start.
Microsoft said:
.NET Framework

Supported in: 4, 3.5, 3.0, 2.0
 
I'm curious which games use GC'ed VMs. Lua seems to be used all over the place. The first game I've seen use a JVM was Vampire: The Masquerade - Redemption. It installed a JVM 1.1.7, and from what I've seen all the game logic was in Java.

UnrealScript? :mrgreen:

There is no denying you have to treat a garbage collected language differently.

All my experience with .NET is that memory isn't your problem unless you have done something badly wrong, whereas (for example) managed<->unmanaged transitions will kill performance.
 
I forgot to respond, so here it is:

Thanks, I understand now. Let me ask you: why exactly do you need to know which instance created which other instance? Which problems would it solve? I recognise the value in determining object lifetime, etc., but aside from plumbing, what are the advantages you are thinking about? See below, calling unmanaged code.
The specific project I was talking about was a .docx report generator. There are many ways to create something like that, but the OfficeXML standard is a very loose one, whose interpretation greatly depends on the type and state of the elements around each element (many different state machines, and containers that can contain a variety of children from many different layers in the model). So it wasn't feasible to simply iterate through a description of the document, and I needed an object (class) structure that took all those discrepancies into account.

Which is like the object hierarchy you want for many other projects, like games, where you need to be able to assume that every object (instance) also manages all its children, no matter what type they are (anything from a simple sprite up to a flow shader).

First you have to build your abstract model, be it a document or a game scene; then you have to provide manipulators (methods) to insert the actual data, which can be used through the UI. And at the highest level, you want something like an "execute" method that creates the actual draw calls, document, or database manipulations.

In such a model, it is paramount that all the object (instance) management happens in a strictly top-down way, completely transparent to the higher levels. Which requires that you can create, fill, use and dispose a variety of children through a ripple-down effect.

But the other direction is equally important: a child has to be able to communicate with its parent, for status changes as well as for handling exceptions (which can be quite mundane and not the "raising" kind).



If you want to scale your application over multiple threads/processors/servers, there are basically two models: thread spawning (which requires shared memory, something you really don't want if you can prevent it), or job spawning/stream processing, in which you create independent jobs that go and execute somewhere and sometime convenient. All of this also requires the same up- and downward object interactions.


Or, in short: you need to be able to keep track of your instance hierarchy, and be able to talk with children, parents and siblings. So you can simply fill it up with data, and call "execute", after which all actions needed ripple through the model and serialize and output all data as needed.
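A minimal C++ sketch of that shape (node names and the string-based "execute" are purely illustrative stand-ins for draw calls or document output): each node owns its children, execute ripples top-down, and a child can notify upward through a non-owning back-pointer.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical model node: strict top-down ownership, with an upward
// notification channel for status changes.
class Node {
public:
    explicit Node(std::string name, Node* parent = nullptr)
        : name_(std::move(name)), parent_(parent) {}

    Node* add_child(std::string name) {
        children_.push_back(std::make_unique<Node>(std::move(name), this));
        return children_.back().get();
    }

    // Top-down ripple: serialize self, then all children, recursively.
    void execute(std::string& out) const {
        out += "<" + name_ + ">";
        for (const auto& c : children_) c->execute(out);
        out += "</" + name_ + ">";
    }

    // Bottom-up: a child reports a status change all the way to the root.
    void notify(const std::string& status) {
        if (parent_) parent_->notify(status);
        else last_status_ = status;  // the root records it
    }

    const std::string& last_status() const { return last_status_; }

private:
    std::string name_;
    Node* parent_;                                 // non-owning back-pointer
    std::vector<std::unique_ptr<Node>> children_;  // owned, disposed with us
    std::string last_status_;
};
```

Disposal rides the same structure: destroying the root destroys the whole tree, which is the "ripple-down" create/fill/use/dispose lifecycle described above.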
 
Is it that difficult to pass "this" as a parameter during object creation?

I don't know how complex .docx files are (others say they are ridiculously complex, though), and even if there is a need for child objects to communicate with parents by means other than return values and exceptions, letting children have a reference to their parent is something I usually avoid for several reasons, so I see several reasons for this not to be the default behavior.
 
In languages like C/C++, one of the most important things in design is having a clear delineation of ownership, understanding when ownership is handed off, etc. Ref-counting doesn't absolve the designer of that; GC does.

It could perhaps be more explicit, but I would assume that returning a smart pointer is handing off ownership (or at least offering it), returning a ref is not. I think the semantic suggestion here is probably just as valuable as the functionality.
 
The issue is that returning a smart pointer often isn't handing off ownership, it's just returning a smart pointer. Any non-trivial set of structures will usually result in smart pointers creating ref loops, because they tend to be used as safe references in classes rather than as actual indicators of strong ownership.
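The ref-loop problem, and the standard fix, in a minimal C++ sketch: if both links between two nodes were shared_ptr, the pair would keep itself alive after the last outside reference was dropped. Making the back-reference a weak_ptr keeps it a safe reference without making it a claim of ownership.

```cpp
#include <memory>

struct Node {
    std::shared_ptr<Node> next;  // strong: expresses real ownership
    std::weak_ptr<Node> prev;    // weak: a safe reference, no ownership
};

// Returns true if both nodes were destroyed once the external
// shared_ptrs went out of scope (i.e., no ref loop kept them alive).
bool cycle_is_collected() {
    std::weak_ptr<Node> probe;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;   // a owns b
        b->prev = a;   // b merely refers back to a
        probe = a;
    }                  // last external references dropped here
    return probe.expired();
}
```

Change `prev` to a `shared_ptr` and the same function returns false, which is exactly the "safe reference used as ownership" leak described above.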

With garbage collectors there is no real delineation between strong ownership and just references to the object, because ref loops don't generally cause leaks (though it's quite possible to write leaks). I do agree that it's still important to understand ownership at a semantic level even in a GC language.

I personally work around the ownership issue in C/C++ by using API conventions, if a function takes a ** to a class as an output it's assumed the caller is responsible for cleanup, otherwise the creator is responsible for cleanup.
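A small C sketch of that convention (Thing and the function names are made up for illustration): a `Thing**` out-parameter signals "the caller now owns the result and must free it", while plain pointer parameters are borrowed.

```cpp
#include <cstdlib>

struct Thing { int value; };

// Double-pointer output: per the convention above, the CALLER owns *out
// afterwards and is responsible for calling destroy_thing().
bool create_thing(int value, Thing** out) {
    *out = static_cast<Thing*>(std::malloc(sizeof(Thing)));
    if (!*out) return false;
    (*out)->value = value;
    return true;
}

void destroy_thing(Thing* t) { std::free(t); }

// Plain pointer parameter: borrowed; the callee must NOT free it.
int read_thing(const Thing* t) { return t->value; }
```

The point of the convention is that ownership is visible in the signature itself, so a reviewer can audit cleanup responsibility without reading the implementation.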

I've wasted more hours of my life cleaning up dangling pointer bugs than I care to count; it's too easy a bug to write, it often has totally random symptoms, and it's often extremely difficult to find.
 
No, it's not a GC error, it's a user error.
The classic one is someone creates a manager class of some sort and never removes things from the manager. The manager retains a ref even though the resource is no longer in use. You can argue whether it's the same sort of leak or not, but the effect is the same: something that is no longer in use hangs around indefinitely.
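The same "manager leak" can be reproduced in C++ terms with shared_ptr standing in for the GC's reachability (class names are illustrative): as long as the manager's registry holds a strong reference, the resource can never be reclaimed, no matter how long ago the rest of the program dropped it.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

struct Resource { int id; };

// The manager's registry plays the role of a GC root: anything it holds
// stays reachable, and therefore alive, forever.
class Manager {
public:
    std::shared_ptr<Resource> create(int id) {
        auto r = std::make_shared<Resource>(Resource{id});
        registry_.push_back(r);  // manager retains a strong reference
        return r;
    }

    // The step people forget: without this, every Resource lives forever.
    void unregister(const std::shared_ptr<Resource>& r) {
        registry_.erase(std::remove(registry_.begin(), registry_.end(), r),
                        registry_.end());
    }

    std::size_t tracked() const { return registry_.size(); }

private:
    std::vector<std::shared_ptr<Resource>> registry_;
};
```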
 
Is that dificult to pass "this" as a parameter during object creation?
That's what I did, but to be able to do that I had to create my own class hierarchy from scratch, as very many things in .NET require constructors without parameters, and most default classes have no property that can be used to store that pointer (and some are non-inheritable).

I don't know how complex .docx files are (others say they are ridiculously complex, though)
That's an understatement :)

But it's more a general problem, if you want to store object references that have no common (inherited) parent that supplies the methods and properties needed.

Best then to start from scratch with a base class that does.

.. and even if there is a need for child objects to communicate with parents by means other than return values and exceptions, letting children have a reference to their parent is something I usually avoid for several reasons, so I see several reasons for this not to be the default behavior.
Agreed.

It depends on how you want to use those classes. If you mostly use them as "smart" functions, there is no need. But if you want to use them to build an abstract representation or model of something, you need all the cogs to be able to give feedback.
 
I'm clearly missing some nuance of the discussion here, but why not dynamic_cast in this situation? You are trying to get a ref to the parent, correct?
 
About garbage collection and performance:

1) Reference counting has a big performance overhead.
Every time you take a new ref to an object or remove a ref, you pay overhead, and when doing function calls with the objects you practically always do. And as the overhead of removing a ref includes an if (refcount == 0) branch, it's much worse for performance than a simple update of a value.

And if you try to use ref counting with objects that were not designed for it (not derived from some refcountable base class/template), you have to add an additional intermediate layer, which means you have two-level memory references, which makes those objects slower to use.
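The per-release cost described in point 1, in a minimal intrusive-refcount sketch (C++, names are illustrative): every retain is an increment, and every release is a decrement plus the unavoidable zero-check branch.

```cpp
struct RefCounted {
    int refs = 1;  // the creator starts with one reference
};

inline void retain(RefCounted* p) { ++p->refs; }

// Returns true if this release freed the object.
inline bool release(RefCounted* p) {
    if (--p->refs == 0) {  // the branch paid on EVERY release
        delete p;
        return true;
    }
    return false;
}
```

(Thread-safe ref counting is worse still: the increment/decrement become atomic operations, which cost far more than the plain arithmetic shown here.)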

2) Generational garbage collection makes memory allocation much faster.
If you have a generational garbage collector, an allocation is just one addition and one comparison (the whole allocation overhead is about the same as the overhead of removing a reference when ref counting).

3) There are real-time garbage collectors.
If you are using a real-time garbage collector, there are NONE of those "GC pauses" people seem to be so afraid of.
 