Saturday, May 3, 2008

How the Garbage Collector works - Part 2

How the Garbage Collector works - Part 1

Now let’s go deeper and look at how the Garbage Collector (GC) actually collects dead objects and how this can affect performance.

Collecting the Garbage

The GC can collect garbage in two ways: full collections (searching the entire managed heap for dead objects) and partial collections (searching only a given generation and the younger ones).
When the GC starts collecting, whether it performs a full or a partial collection, the first thing it does is stop the application’s execution. So, at least from this point of view, collecting dead objects is an expensive task! During a full collection of a large heap, the application can stay stopped for a noticeably long time.
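To make the difference between the two kinds of collection concrete, here is a minimal sketch that forces each one and reports how the per-generation collection counters move. It relies only on GC.Collect, GC.CollectionCount and GC.MaxGeneration; the CollectionKinds class and the CollectAndCount helper are names made up for this illustration. Forcing collections like this is for experiments only, and the exact counts depend on the runtime and on whatever else the GC decides to do in the meantime.

```csharp
using System;

static class CollectionKinds
{
    // Forces a collection of `generation` (and the younger ones) and returns
    // how much GC.CollectionCount grew for each generation as a result.
    public static int[] CollectAndCount(int generation)
    {
        var before = new int[GC.MaxGeneration + 1];
        for (int g = 0; g <= GC.MaxGeneration; g++)
            before[g] = GC.CollectionCount(g);

        GC.Collect(generation);

        var delta = new int[GC.MaxGeneration + 1];
        for (int g = 0; g <= GC.MaxGeneration; g++)
            delta[g] = GC.CollectionCount(g) - before[g];
        return delta;
    }

    static void Main()
    {
        // Partial collection: only Gen0 is requested.
        Console.WriteLine("GC.Collect(0) deltas:   " + string.Join(", ", CollectAndCount(0)));

        // Full collection: every generation is requested.
        Console.WriteLine("Full collection deltas: " + string.Join(", ", CollectAndCount(GC.MaxGeneration)));
    }
}
```

The first line should normally show movement only in the Gen0 counter, while the second should show the oldest generation's counter moving as well.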

The second step is to identify the roots. A root is a reference that lives outside the managed heap, for example in a static field, in a local variable on a thread’s stack or in a CPU register, so the object it points to is alive even if no other object references it. The global (static) members of an application are typical roots. Starting with these roots, the GC follows each reference they contain and recursively inspects all the child objects. In this way the GC finds every reachable, or live, object. The other objects, the unreachable ones, are now condemned to be collected.
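The marking step can be pictured as a plain graph traversal. The sketch below is a toy model, not the runtime's actual implementation: Node is a hypothetical stand-in for a managed object, and MarkSketch.Mark uses an explicit stack instead of recursion to collect everything reachable from a set of roots.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a managed object: a name plus outgoing references.
sealed class Node
{
    public string Name { get; }
    public List<Node> References { get; } = new List<Node>();
    public Node(string name) { Name = name; }
}

static class MarkSketch
{
    // Returns every node reachable from the given roots (the "live" set).
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var live = new HashSet<Node>();
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            Node current = pending.Pop();
            if (!live.Add(current))
                continue; // already marked; this also stops cycles from looping forever
            foreach (Node child in current.References)
                pending.Push(child);
        }
        return live;
    }

    static void Main()
    {
        var a = new Node("a");
        var b = new Node("b");
        var c = new Node("c");
        var orphan = new Node("orphan");
        a.References.Add(b);
        b.References.Add(c);
        c.References.Add(a);      // a cycle is still live if any root reaches it
        orphan.References.Add(a); // referencing a live object does not make you live

        HashSet<Node> live = Mark(new[] { a }); // "a" plays the role of the only root
        foreach (Node n in new[] { a, b, c, orphan })
            Console.WriteLine(n.Name + ": " + (live.Contains(n) ? "live" : "condemned"));
    }
}
```

Here a, b and c are reported as live and orphan as condemned: what matters is being reachable from a root, not holding references to other objects.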

If a partial collection is performed, the GC walks only through objects of the same age or younger. For example, a Gen1 root may have child objects in Gen1 or Gen0. By the same logic, inspecting Gen2 roots is equivalent to performing a full collection, which is very expensive, because Gen2 objects may reference children in Gen1 and Gen0.

All the live objects that have been found have their age incremented and are promoted to the next generation, if necessary. Promoting an object to the next generation can involve moving its data to a different memory location in the managed heap. To limit this cost, only objects that actually survive a collection of their current generation are promoted; in the .NET runtime a single survival is normally enough, and objects that reach Gen2 stay there.
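One way to watch an object age is to ask the runtime which generation it currently belongs to. The sketch below uses GC.GetGeneration around forced collections; PromotionDemo and GenerationsAfterCollections are names invented for the example. The numbers printed depend on the runtime, the GC mode and the build configuration, so treat them as an observation rather than a guarantee.

```csharp
using System;

static class PromotionDemo
{
    // Records the generation of one surviving object before and after
    // each of `collections` forced garbage collections.
    public static int[] GenerationsAfterCollections(int collections)
    {
        var survivor = new object();
        var generations = new int[collections + 1];
        generations[0] = GC.GetGeneration(survivor);

        for (int i = 1; i <= collections; i++)
        {
            GC.Collect();
            generations[i] = GC.GetGeneration(survivor);
        }

        GC.KeepAlive(survivor); // make sure the object stays reachable until here
        return generations;
    }

    static void Main()
    {
        int[] generations = GenerationsAfterCollections(3);
        for (int i = 0; i < generations.Length; i++)
            Console.WriteLine("After " + i + " collection(s): Gen" + generations[i]);
    }
}
```

The generation reported should never go down between collections and never exceed GC.MaxGeneration.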

All the condemned objects are checked for a finalizer. A finalizer is an optional special method that only the framework can call, and its purpose is to release any unmanaged resources that the object may use. In C# you declare the finalizer (the destructor) with the ~Class syntax.
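As an illustration of the ~Class syntax, here is a minimal sketch of a class that owns a block of unmanaged memory and frees it in its finalizer. NativeBuffer is a hypothetical class written for this example; Marshal.AllocHGlobal and Marshal.FreeHGlobal are the standard interop calls for allocating and freeing unmanaged memory.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
sealed class NativeBuffer
{
    private IntPtr _memory;

    public NativeBuffer(int size)
    {
        _memory = Marshal.AllocHGlobal(size); // unmanaged: the GC knows nothing about it
    }

    public bool IsAllocated
    {
        get { return _memory != IntPtr.Zero; }
    }

    // Finalizer (destructor). User code never calls it directly; the runtime
    // invokes it on the finalization thread some time after the object dies.
    ~NativeBuffer()
    {
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
    }
}

static class FinalizerSyntaxDemo
{
    static void Main()
    {
        var buffer = new NativeBuffer(1024);
        Console.WriteLine("Allocated: " + buffer.IsAllocated);
        // No explicit cleanup here: releasing the memory is left to ~NativeBuffer(),
        // which runs at a moment the program does not control.
    }
}
```

The finalizer body is deliberately tiny: it only releases the one unmanaged resource the object owns and touches no other managed objects.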

Objects without a finalizer are killed immediately and their memory is released. For the others, things are a little more complicated.

How finalization affects performance

When the garbage collector first encounters an object that is otherwise dead but still needs to be finalized it must abandon its attempt to reclaim the space for that object at that time. The object is instead added to a list of objects needing finalization and, furthermore, the collector must then ensure that all of the pointers within the object remain valid until finalization is complete.

This means that no child object referenced by an object with a finalizer can be killed until the finalizer has run. This is especially bad if the finalizable object creates a lot of temporary objects (Gen0 objects) and keeps references to them. Normally, killing Gen0 objects is cheap and the memory is released immediately, but in this case all the temporary objects must live until the parent object is finalized. A lot of memory is locked and can’t be released!

Once the collection is complete, the finalization thread will go through the list of objects needing finalization and invoke the finalizers. When this is done the objects once again become dead and will be naturally collected in the normal way on the next collection.
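The two-step life of a finalizable object can be observed with "long" weak references, which stay valid until the object's memory is really reclaimed. The following is a sketch under a few assumptions: Parent and FinalizationDemo are made-up names, the objects are allocated in a separate non-inlined method so that no local variable keeps them alive, and the output reflects typical runtime behavior rather than a documented guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical finalizable object that owns a child buffer.
sealed class Parent
{
    public static int FinalizedCount;
    public readonly byte[] Child = new byte[1024];

    ~Parent() { Interlocked.Increment(ref FinalizedCount); }
}

static class FinalizationDemo
{
    // Allocates in its own stack frame so nothing in the caller roots the objects.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Allocate(out WeakReference parent, out WeakReference child)
    {
        var p = new Parent();
        // trackResurrection = true gives a long weak reference: it is cleared
        // only when the memory is reclaimed, not when the object becomes unreachable.
        parent = new WeakReference(p, true);
        child = new WeakReference(p.Child, true);
    }

    static void Main()
    {
        WeakReference parent, child;
        Allocate(out parent, out child);

        GC.Collect(); // 1st collection: Parent is dead but still has to be finalized
        Console.WriteLine("After 1st collection: parent in memory = " + parent.IsAlive
                          + ", child in memory = " + child.IsAlive);

        GC.WaitForPendingFinalizers(); // let the finalization thread run ~Parent()
        Console.WriteLine("Finalizers run so far: " + Parent.FinalizedCount);

        GC.Collect(); // 2nd collection: now both objects can really go away
        Console.WriteLine("After 2nd collection: parent in memory = " + parent.IsAlive
                          + ", child in memory = " + child.IsAlive);
    }
}
```

On a typical run, both objects are still in memory after the first collection and gone after the second, even though the child buffer has no finalizer of its own.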

To conclude about destructors:
  • Finalizable objects live a lot longer than regular ones.
  • Things get worse if the finalizable object is a Gen1 or even a Gen2 object.
  • The finalizer should do as little work as possible; otherwise the finalization thread takes longer to run, which affects the application’s performance.
So, think twice before adding a destructor to your classes!



