Garbage Collection, Part 2: Automatic Memory Management in the Microsoft .NET Framework
Jeffrey Richter
This article assumes you're familiar with C and C++
SUMMARY: The
first part of this two-part article explained how the garbage
collection algorithm works, how resources can clean up properly when
the garbage collector decides to free a resource's memory, and how to
force an object to clean up when it is freed. The conclusion of this
series explains strong and weak object references that help to manage
memory for large objects, as well as object generations and how they
improve performance. In addition, the use of methods and properties for
controlling garbage collection, resources for monitoring collection
performance, and garbage collection for multithreaded applications are
covered.
Last month, I described the motivation for garbage-collected
environments: to simplify memory management for the developer. I also
discussed the general algorithm used by the common language runtime
(CLR) and some of the internal workings of this algorithm. In addition,
I explained how the developer must still explicitly handle resource
management and cleanup by implementing Finalize, Close, and/or Dispose
methods. This month, I will conclude my discussion of the CLR garbage
collector. I'll start by exploring a feature called weak
references, which you can use to reduce the memory pressure placed on
the managed heap by large objects. Then, I'll discuss how the garbage
collector uses generations as a performance enhancement. Finally, I'll
wrap up by discussing a few other performance enhancements offered by
the garbage collector, such as multithreaded collections and
performance counters exposed by the CLR that allow you to monitor the
garbage collector's real-time behavior.
Weak References
When a root points to an object, the object cannot be collected because the application's code can reach it; such a root is called a strong reference to the object. However, the garbage collector also supports weak
references. Weak references allow the garbage collector to collect the
object, but they also allow the application to access the object. How
can this be? It all comes down to timing. If only weak
references to an object exist and the garbage collector runs, the object is collected; when the application later attempts to access the object through the weak reference, the access fails. To access a weakly
referenced object, the application must obtain a strong reference to
the object. If the application obtains this strong reference before the
garbage collector collects the object, then the garbage collector can't
collect the object because a strong reference to the object exists. I
know this all sounds somewhat confusing, so let's clear it up by
examining the code in Figure 1. Why
might you use weak references? Well, there are some data structures
that are created easily, but require a lot of memory. For example, you
might have an application that needs to know all the directories and
files on the user's hard drive. You can easily build a tree that
reflects this information and as your application runs, you'll refer to
the tree in memory instead of actually accessing the user's hard disk.
This procedure greatly improves the performance of your application. The
problem is that the tree could be extremely large, requiring quite a
bit of memory. If the user starts accessing a different part of your
application, the tree may no longer be necessary and is wasting
valuable memory. You could delete the tree, but if the user switches
back to the first part of your application, you'll need to reconstruct
the tree again. Weak references allow you to handle this scenario quite
easily and efficiently. When the user switches away from the
first part of the application, you can create a weak reference to the
tree and destroy all strong references. If the memory load is low for
the other part of the application, then the garbage collector will not
reclaim the tree's objects. When the user switches back to the first
part of the application, the application attempts to obtain a strong
reference for the tree. If successful, the application doesn't have to
traverse the user's hard drive again. The WeakReference type offers two constructors:
WeakReference(Object target);
WeakReference(Object target, Boolean trackResurrection);
The target parameter identifies the object that the
WeakReference object should track. The trackResurrection parameter
indicates whether the WeakReference object should track the object
after it has had its Finalize method called. Usually, false is passed
for the trackResurrection parameter and the first constructor creates a
WeakReference that does not track resurrection. (For an explanation of
resurrection, see part 1 of this article at http://msdn.microsoft.com/msdnmag/issues/1100/GCI/GCI.asp.) For
convenience, a weak reference that does not track resurrection is
called a short weak reference, while a weak reference that does track
resurrection is called a long weak reference. If an object's type
doesn't offer a Finalize method, then short and long weak references
behave identically. It is strongly recommended that you avoid using
long weak references. Long weak references allow you to resurrect an object after it has been finalized, but the state of such an object is unpredictable. Once you've created a weak reference to an
object, you usually set the strong reference to the object to null. If
any strong reference remains, the garbage collector will be unable to
collect the object. To use the object again, you must turn the
weak reference into a strong reference. You accomplish this simply by
reading the WeakReference object's Target property and assigning the
result to one of your application's roots. If the Target property
returns null, then the object was collected. If the property does not
return null, then the root is a strong reference to the object and the
code may manipulate the object. As long as the strong reference exists,
the object cannot be collected.
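The code in Figure 1 is not reproduced here, but the pattern described above looks roughly like the following minimal C# sketch. The DirectoryTree type, its constructor argument, and the GetTree method are hypothetical placeholders for the expensive-to-build data structure; only WeakReference and its Target property are the actual runtime types being illustrated.
using System;

// Hypothetical stand-in for the large, easily rebuilt data structure from the text.
class DirectoryTree {
    public DirectoryTree(string root) { /* walk the disk and build the nodes here */ }
}

class DirectoryCache {
    private WeakReference wr;   // may refer to an already-collected tree

    public DirectoryTree GetTree() {
        // Reading Target yields a strong reference, or null if the tree was collected.
        DirectoryTree tree = (wr != null) ? (DirectoryTree) wr.Target : null;
        if (tree == null) {
            tree = new DirectoryTree(@"C:\");   // rebuild the expensive structure
            wr = new WeakReference(tree);       // short weak reference; no resurrection tracking
        }
        return tree;   // the caller's reference keeps the tree alive while it is in use
    }
}

class WeakReferenceDemo {
    static void Main() {
        DirectoryCache cache = new DirectoryCache();
        DirectoryTree t = cache.GetTree();   // first call builds the tree
        t = null;                            // drop the strong reference; the tree may now be collected
        t = cache.GetTree();                 // returns the cached tree, or a freshly built one
    }
}
The caller simply asks for the tree whenever it needs it; if a collection reclaimed the tree in the meantime, the tree is rebuilt transparently.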
Weak Reference Internals
From the previous discussion,
it should be obvious that WeakReference objects do not behave like
other object types. Normally, if your application has a root that
refers to an object and that object refers to another object, then both
objects are reachable and the garbage collector cannot reclaim the
memory in use by either object. However, if your application has a root
that refers to a WeakReference object, then the object referred to by
the WeakReference object is not considered reachable and may be
collected. To fully understand how weak references work, let's
look inside the managed heap again. The managed heap contains two
internal data structures whose sole purpose is to manage weak
references: the short weak reference table and the long weak reference
table. These two tables simply contain pointers to objects allocated
within the managed heap. Initially, both tables are empty.
When you create a WeakReference object, an object is not allocated from
the managed heap. Instead, an empty slot in one of the weak reference
tables is located; short weak references use the short weak reference
table and long weak references use the long weak reference table. Once
an empty slot is found, the value in the slot is set to the address of
the object you wish to track; the object's pointer is passed to the
WeakReference's constructor. The value returned from the new operator
is the address of the slot in the WeakReference table. Obviously, the
two weak reference tables are not considered part of an application's
roots or the garbage collector would not be able to reclaim the objects
pointed to by the tables. Now, here's what happens when a garbage collection (GC) runs:
- The garbage collector builds a graph of all the reachable objects. Part 1 of this article discussed how this works.
- The
garbage collector scans the short weak reference table. If a pointer in
the table refers to an object that is not part of the graph, then the
pointer identifies an unreachable object and the slot in the short weak
reference table is set to null.
- The garbage collector
scans the finalization queue. If a pointer in the queue refers to an
object that is not part of the graph, then the pointer identifies an
unreachable object and the pointer is moved from the finalization queue
to the freachable queue. At this point, the object is added to the
graph since the object is now considered reachable.
- The
garbage collector scans the long weak reference table. If a pointer in
the table refers to an object that is not part of the graph (which now
contains the objects pointed to by entries in the freachable queue),
then the pointer identifies an unreachable object and the slot is set
to null.
- The garbage collector compacts the memory, squeezing out the holes left by the unreachable objects.
Once
you understand the logic of the garbage collection process, it's easy
to understand how weak references work. Accessing the WeakReference's
Target property causes the system to return the value in the
appropriate weak reference table's slot. If null is in the slot, the
object was collected. A short weak reference doesn't track
resurrection. This means that the garbage collector sets the pointer to
null in the short weak reference table as soon as it has determined
that the object is unreachable. If the object has a Finalize method,
the method has not been called yet so the object still exists. If the
application accesses the WeakReference object's Target property, then
null will be returned even though the object actually still exists. A
long weak reference tracks resurrection. This means that the garbage
collector sets the pointer to null in the long weak reference table
when the object's storage is reclaimable. If the object has a Finalize
method, the Finalize method has been called and the object was not
resurrected.
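To make the distinction concrete, here is a small sketch that assumes a type whose Finalize method is expressed with C# destructor syntax. The comments describe the expected output under the algorithm just described; the exact results can vary with build configuration and with when the collector actually runs (a non-optimized build, for example, may keep the objects reachable longer).
using System;

class Resource {
    // C# destructor syntax compiles to a Finalize method.
    ~Resource() { Console.WriteLine("Finalize called"); }
}

class WeakRefKinds {
    static void Main() {
        WeakReference shortWr = new WeakReference(new Resource(), false); // short weak reference
        WeakReference longWr  = new WeakReference(new Resource(), true);  // long weak reference

        GC.Collect();                   // both Resource objects are found unreachable
        GC.WaitForPendingFinalizers();  // let the freachable-queue thread call Finalize

        // The short weak reference was cleared as soon as the object was deemed unreachable.
        Console.WriteLine("short: {0}", shortWr.Target == null ? "null" : "alive");   // expect null

        // The long weak reference still tracks the object after Finalize has run.
        Console.WriteLine("long:  {0}", longWr.Target == null ? "null" : "alive");    // expect alive

        GC.Collect();                   // now the finalized object's storage is reclaimed
        Console.WriteLine("long after second collect: {0}",
                          longWr.Target == null ? "null" : "alive");                  // expect null
    }
}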
Generations
When I first started working in a
garbage-collected environment, I had many concerns about performance.
After all, I've been a C/C++ programmer for more than 15 years and I
understand the overhead of allocating and freeing memory blocks from a
heap. Sure, each version of Windows® and each version of the C runtime
has tweaked the internals of the heap algorithms in order to improve
performance. Well, like the developers of Windows and the C
runtime, the GC developers are tweaking the garbage collector to
improve its performance. One feature of the garbage collector that
exists purely to improve performance is called generations. A
generational garbage collector (also known as an ephemeral garbage
collector) makes the following assumptions:
- The newer an object is, the shorter its lifetime will be.
- The older an object is, the longer its lifetime will be.
- Newer objects tend to have strong relationships to each other and are frequently accessed around the same time.
- Compacting a portion of the heap is faster than compacting the whole heap.
Of
course, many studies have demonstrated that these assumptions are valid
for a very large set of existing applications. So, let's discuss how
these assumptions have influenced the implementation of the garbage
collector. When initialized, the managed heap contains no
objects. Objects added to the heap are said to be in generation 0, as
you can see in Figure 2. Stated simply, objects in generation 0 are young objects that have never been examined by the garbage collector.
Figure 2 Generation 0
Now,
if more objects are added to the heap, the heap fills and a garbage
collection must occur. When the garbage collector analyzes the heap, it
builds the graph of garbage (shown here in purple) and non-garbage
objects. Any objects that survive the collection are compacted into the
left-most portion of the heap. These objects have survived a
collection, are older, and are now considered to be in generation 1
(see Figure 3).
Figure 3 Generations 0 and 1
As
even more objects are added to the heap, these new, young objects are
placed in generation 0. If generation 0 fills again, a GC is performed.
This time, all objects in generation 1 that survive are compacted and
considered to be in generation 2 (see Figure 4). All survivors
in generation 0 are now compacted and considered to be in generation 1.
Generation 0 currently contains no objects, but all new objects will go
into generation 0.
Figure 4 Generations 0, 1, and 2
Currently,
generation 2 is the highest generation supported by the runtime's
garbage collector. When future collections occur, any surviving objects
currently in generation 2 simply stay in generation 2.
Generational GC Performance Optimizations
As I stated
earlier, generational garbage collecting improves performance. When the
heap fills and a collection occurs, the garbage collector can choose to
examine only the objects in generation 0 and ignore the objects in any
greater generations. After all, the newer an object is, the shorter its
lifetime is expected to be. So, collecting and compacting generation 0
objects is likely to reclaim a significant amount of space from the
heap and be faster than if the collector had examined the objects in
all generations. This is the simplest optimization that can be
obtained from generational GC. A generational collector can offer more
optimizations by not traversing every object in the managed heap. If a
root or object refers to an object in an old generation, the garbage
collector can ignore any of the older objects' inner references,
decreasing the time required to build the graph of reachable objects.
Of course, it is possible that an old object refers to a new object. So
that these objects are examined, the collector can take advantage of
the system's write-watch support (provided by the Win32® GetWriteWatch
function in Kernel32.dll). This support lets the collector know which
old objects (if any) have been written to since the last collection.
These specific old objects can have their references checked to see if
they refer to any new objects. If collecting generation 0
doesn't provide the necessary amount of storage, then the collector can
attempt to collect the objects from generations 1 and 0. If all else
fails, then the collector can collect the objects from all
generations: 2, 1, and 0. The exact algorithm used by the collector to
determine which generations to collect is one of those areas that
Microsoft will be tweaking forever. Most heaps (like the C
runtime heap) allocate objects wherever they find free space.
Therefore, if I create several objects consecutively, it is quite
possible that these objects will be separated by megabytes of address
space. However, in the managed heap, allocating several objects
consecutively ensures that the objects are contiguous in memory. One
of the assumptions stated earlier was that newer objects tend to have
strong relationships to each other and are frequently accessed around
the same time. Since new objects are allocated contiguously in memory,
you gain performance from locality of reference. More specifically, it
is highly likely that all the objects can reside in the CPU's cache.
Your application will access these objects with phenomenal speed since
the CPU will be able to perform most of its manipulations without
incurring cache misses that force accesses to RAM. Microsoft's
performance tests show that managed heap allocations are faster than
standard allocations performed by the Win32 HeapAlloc function. These
tests also show that it takes less than 1 millisecond on a 200MHz
Pentium to perform a full GC of generation 0. It is Microsoft's goal to
make GCs take no more time than an ordinary page fault.
Direct Control with System.GC
The System.GC type allows
your application some direct control over the garbage collector. For
starters, you can query the maximum generation supported by the managed
heap by reading the GC.MaxGeneration property. Currently, the
GC.MaxGeneration property always returns 2. It is also possible to force the garbage collector to perform a collection by calling one of the two methods shown here:
void GC.Collect(Int32 Generation)
void GC.Collect()
The first method allows you to specify which generation to
collect. You may pass any integer from 0 to GC.MaxGeneration,
inclusive. Passing 0 causes generation 0 to be collected; passing 1
causes generations 1 and 0 to be collected; and passing 2 causes generations 2, 1, and 0 to be collected. The version of the Collect
method that takes no parameters forces a full collection of all
generations and is equivalent to calling:
GC.Collect(GC.MaxGeneration);
Under most circumstances, you should avoid calling
any of the Collect methods; it is best to just let the garbage
collector run of its own accord. However, since your application knows
more about its behavior than the runtime does, you could help matters
by explicitly forcing some collections. For example, it might make
sense for your application to force a full collection of all
generations after the user saves his data file. I imagine Internet
browsers performing a full collection when pages are unloaded. You
might also want to force a collection when your application is
performing other lengthy operations; this hides the fact that the
collection is taking processing time and prevents a collection from
occurring when the user is interacting with your application. The
GC type also offers a WaitForPendingFinalizers method. This method
simply suspends the calling thread until the thread processing the
freachable queue has emptied the queue, calling each object's Finalize
method. In most applications, it is unlikely that you will ever have to
call this method. Lastly, the garbage collector offers two methods that allow you to determine which generation an object is currently in:
Int32 GetGeneration(Object obj)
Int32 GetGeneration(WeakReference wr)
The first version of GetGeneration takes an object
reference as a parameter, and the second version takes a WeakReference
reference as a parameter. Of course, the value returned will be
somewhere between 0 and GC.MaxGeneration, inclusive. The code in Figure 5 will help you understand how generations work. It also demonstrates the use of the garbage collection methods just discussed.
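Figure 5 itself is not reprinted here; the following is a minimal sketch in the same spirit, exercising MaxGeneration, both Collect overloads, WaitForPendingFinalizers, and both GetGeneration overloads. The GenObj class is a hypothetical placeholder, and the generation numbers in the comments are the expected values under the promotion scheme described earlier; actual output depends on build configuration and on when collections occur.
using System;

class GenObj {
    public byte[] bytes = new byte[1000];   // arbitrary payload just to make the allocation visible
}

class GenerationDemo {
    static void Main() {
        Console.WriteLine("Max generation: {0}", GC.MaxGeneration);            // currently 2

        GenObj o = new GenObj();
        Console.WriteLine("After allocation:  {0}", GC.GetGeneration(o));      // expect 0

        GC.Collect();   // the surviving object is compacted and promoted
        Console.WriteLine("After 1st collect: {0}", GC.GetGeneration(o));      // expect 1

        GC.Collect();
        Console.WriteLine("After 2nd collect: {0}", GC.GetGeneration(o));      // expect 2

        GC.Collect();
        Console.WriteLine("After 3rd collect: {0}", GC.GetGeneration(o));      // stays 2

        // The WeakReference overload reports the generation of the tracked object.
        WeakReference wr = new WeakReference(o);
        Console.WriteLine("Via WeakReference: {0}", GC.GetGeneration(wr));

        o = null;                         // drop the strong reference
        GC.Collect(GC.MaxGeneration);     // equivalent to the parameterless Collect
        GC.WaitForPendingFinalizers();    // wait for any pending Finalize methods to run
        Console.WriteLine("Collected: {0}", wr.Target == null);                // expect True
    }
}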
Performance for Multithreaded Applications
In the
previous section, I explained the GC algorithm and optimizations.
However, there was a big assumption made during that discussion: only
one thread is running. In the real world, it is quite likely that
multiple threads will be accessing the managed heap or at least
manipulating objects allocated within the managed heap. When one thread
sparks a collection, other threads must not access any objects
(including object references on their own stacks) since the collector is
likely to move these objects, changing their memory locations. So,
when the garbage collector wants to start a collection, all threads
executing managed code must be suspended. The runtime has a few
different mechanisms that it uses to safely suspend threads so that a
collection may be done. The reason there are multiple mechanisms is to
keep threads running as long as possible and to reduce overhead as much
as possible. I don't want to go into all the details here, but suffice
it to say that Microsoft has done a lot of work to reduce the overhead
involved with performing a collection. Microsoft will continue to
modify these mechanisms over time to help ensure efficient garbage
collections. The following paragraphs describe a few of the
mechanisms that the garbage collector employs when applications have
multiple threads:
Fully Interruptible Code
When a collection starts, the collector suspends all application
threads. The collector then determines where each thread was suspended. Using tables produced by the just-in-time (JIT) compiler, the collector
can tell where in a method the thread stopped, what object references
the code is currently accessing, and where those references are held
(in a variable, CPU register, and so on).
Hijacking
The collector can modify a thread's stack so that the return address
points to a special function. When the currently executing method
returns, this special function will execute, suspending the thread.
Stealing the thread's execution path this way is referred to as
hijacking the thread. When the collection is complete, the thread will
resume and return to the method that originally called it.
Safe Points
As the JIT compiler compiles a method, it can insert calls to a special
function that checks if a GC is pending. If so, the thread is
suspended, the GC runs to completion, and the thread is then resumed.
The position where the compiler inserts these method calls is called a
GC safe point. Note that thread hijacking allows threads that
are executing unmanaged code to continue execution while a garbage
collection is occurring. This is not a problem since unmanaged code is
not accessing objects on the managed heap unless the objects are pinned
and don't contain object references. A pinned object is one that the
garbage collector is not allowed to move in memory. If a thread that is
currently executing unmanaged code returns to managed code, the thread
is hijacked and is suspended until the GC completes. In
addition to the mechanisms I just mentioned, the garbage collector
offers some additional improvements that enhance the performance of
object allocations and collections when applications have multiple
threads.
Synchronization-free Allocations
On a multiprocessor system, generation 0 of the managed heap is split
into multiple memory arenas, one arena per thread. This allows
multiple threads to make allocations simultaneously so that exclusive
access to the heap is not required.
Scalable Collections
On a multiprocessor system running the server version of the execution
engine (MSCorSvr.dll), the managed heap is split into several sections,
one per CPU. When a collection is initiated, the collector has one
thread per CPU; all threads collect their own sections simultaneously.
The workstation version of the execution engine (MSCorWks.dll) doesn't
support this feature.
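Nothing in application code selects between these two execution engines; the choice is made when the runtime is loaded into the process. As an aside that goes beyond the runtime version described in this article: later releases of the .NET Framework (1.1 and onward) let a standalone application request the server collector through its configuration file, roughly as shown in this sketch.
<!-- MyApp.exe.config (assumes .NET Framework 1.1 or later; not available in the version
     discussed here). Requests the scalable server collector on a multiprocessor machine. -->
<configuration>
  <runtime>
    <gcServer enabled="true"/>
  </runtime>
</configuration>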
Garbage-collecting Large Objects
There is one more
performance improvement that you might want to be aware of. Large
objects (those that are 20,000 bytes or larger) are allocated from a
special large object heap. Objects in this heap are finalized and freed
just like the small objects I've been talking about. However, large
objects are never compacted because shifting 20,000-byte blocks of
memory down in the heap would waste too much CPU time. Note
that all of these mechanisms are transparent to your application code.
To you, the developer, it looks like there is just one managed heap;
these mechanisms exist simply to improve application performance.
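If you want to see the large object heap's effect from code, one hedged observation (based on the runtimes I have tried, not something this release's documentation spells out) is that GC.GetGeneration reports large objects as belonging to the oldest generation immediately after allocation, because the large object heap is collected together with the oldest generation. The exact size threshold varies across runtime versions, so the array below is sized well above it; treat the printed generations as typical rather than guaranteed.
using System;

class LargeObjectDemo {
    static void Main() {
        byte[] small = new byte[100];       // ordinary allocation: starts in generation 0
        byte[] large = new byte[1000000];   // comfortably above the large-object threshold

        Console.WriteLine("small: generation {0}", GC.GetGeneration(small));   // typically 0
        Console.WriteLine("large: generation {0}", GC.GetGeneration(large));   // typically GC.MaxGeneration
    }
}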
Monitoring Garbage Collections
The runtime team at
Microsoft has created a set of performance counters that provide a lot
of real-time statistics about the runtime's operations. You can view
these statistics via the Windows 2000 System Monitor ActiveX® control.
The easiest way to access the System Monitor control is to run
PerfMon.exe and select the + toolbar button, causing the Add Counters
dialog box to appear (see Figure 6).
Figure 6 Adding Performance Counters
To
monitor the runtime's garbage collector, select the COM+ Memory
Performance object. Then, you can select a specific application from
the instance list box. Finally, select the set of counters that you're
interested in monitoring and press the Add button followed by the Close
button. At this point, the System Monitor will graph the selected
real-time statistics. Figure 7 describes the function of each counter.
Conclusion
So that's just about the full story on garbage
collection. Last month I provided the background on how resources are
allocated, how automatic garbage collection works, how to use the
finalization feature to allow an object to clean up after itself, and
how the resurrection feature can restore access to objects. This month
I explained how weak and strong references to objects are implemented,
how classifying objects in generations results in performance benefits,
and how you can manually control garbage collection with System.GC. I
also covered the mechanisms the garbage collector uses in multithreaded
applications to improve performance, what happens with objects that are
larger than 20,000 bytes, and finally, how you can use the Windows 2000
System Monitor to track garbage collection performance. With this
information in hand, you should be able to simplify memory management
and boost performance in your applications.
For related articles see:
http://msdn.microsoft.com/msdnmag/issues/1100/GCI/GCI.asp
For background information see:
Garbage Collection: Algorithms for Automatic Dynamic Memory Management by Richard Jones and Rafael Lins (John Wiley & Sons, 1996)
Programming Applications for Microsoft Windows by Jeffrey Richter (Microsoft Press, 1999)
Jeffrey Richter (http://www.JeffreyRichter.com) is the author of Programming Applications for Microsoft Windows (Microsoft Press, 1999), and is a co-founder of Wintellect (http://www.Wintellect.com),
a software education, debugging, and consulting firm. He specializes in
programming/design for .NET and Win32. Jeff is currently writing a
Microsoft .NET Framework programming book and offers .NET technology
seminars.
From the December 2000 issue of MSDN Magazine