-
Notifications
You must be signed in to change notification settings - Fork 140
Copying Garbage Collector
Best algo against fragmentation, extremely fast (one-pass only), but needs double memory.
Esp. Cheney's Two-Finger variant is desirable. potion/p2 uses that and has a medium gc time of 4ms for medium sized heaps; i.e. real-time. But for guaranteed real-time you'd need to pause the GC, and this is not possible with a Copying collector, only Mark & Sweep.
The following is an excerpt from Garbage Collection: Algorithms for Automatic Dynamic Memory Management by Richard Jones and Rafael Sims. For a further discussion of the copying garbage collector, see TT #616.
The basic concept of a copying collector is that the heap is divided into two equal semi-spaces, one of which contains the current data and the other obsolete data. The Garbage Collection phase starts by flipping the roles of the two semi-spaces. The collector then scans the active data in the old semi-space, Fromspace
, copying each live cell into the new semi-space, Tospace
. On completion of the collection phase, Tospace
contains all the active data and Fromspace
is effectively abandoned.
One major advantage of this is that the active data is compacted at the bottom of Tospace
. Compacting collectors can allocate space much more efficiently than collectors in which fragmentation is a problem.
Allocation costs are extremely low. The out-of-space check is a simple pointer comparison. New memory is acquired by simply incrementing the free-space pointer. Fragmentation is eliminated by simply copying the active data into the bottom of Tospace
.
The chief drawback of copying collectors is the need to divide the available memory into two semi-spaces. As the residency of a program increases the performance of the collector degrades, as less free space is recovered and collections become more frequent.
The basic copying algorithm:
init() =
Tospace = Heap_bottom
space_size = Heap_size / 2
top_of_space = Tospace + space_size
Fromspace = top_of_space + 1
free = Tospace
New(n) =
if free + n > top_of_spece
flip()
if free + n > top_of_space
abort "Memory exhausted"
newcell = free
free = free + n
return newcell
flip() =
Fromspace, Tospace = Tospace, Fromspace
top_of_space = Tospace + space_size
free = Tospace
for R in roots
R = copy(R)
-- P points to a word not a cell
copy(P) =
if atomic(P) or P == nil -- P is not a pointer
return P
if not forwarded(P)
n = size(P)
P' = free -- reserve space in Topace
free = free + n
temp = P[0] -- field 0 will hold the forwarding address
forwardingaddress(P) = P'
P'[0] = copy(temp)
for i = 1 to n-1 -- copy each field of P into P'
P'[i] = copy(P[i])
return forwarding_address(P)
First, the roles of Tospace
and Fromspace
are swapped by a flip, which resets the variables Tospace
, Fromspace
and top_of_space
. Each cell reachable from a root is then copied from Fromspace
to Tospace
. copy(P)
scavenges the fields of the cell pointed to by P.
Care has to be taken when copying data structures to ensure that the topology of the shared data structures is preserved. Failure to do this would lead to multiple copies of shared objects, which at best would increase heap residency of the program but may also break the user program (for example, if it updated one copy of a cell, but then read the value from another). Copying cyclic data structures without preserving sharing would also require a lot of room!
Copying collectors preserve sharing by leaving a forwarding address in the Fromspace
object when it is copied. The forwarding address is the address of the copy in Tospace
. When a cell in Fromspace
is visited, copy()
checks to see if it has already been copied, if it has the forwarding address is returned, otherwise memory is reserved for the copy in Tospace
. In this recursive copying algorithm, the forwarding address is set to point to this reserved memory before the constituent fields of the object are copied -- this ensures termination and that sharing is preserved.
The forwarding address might be held in its own field in the cell. More generally it can be written over the first word in the cell provided that the original value of the cell is saved beforehand. In the above it is assumed that the forwarding address field in cell P
is P[0]
, and forwarding address(P)
and P[0]
are used interchangeably.
Excerpt from:<br > Garbage Collection: Algorithms for Automatic Dynamic Memory Management<br > by Richard Jones and Rafael Sims<br > http://www.cs.kent.ac.uk/people/staff/rej/gc.html