Bug: 22170

Impact:
Informational

Product:
GemStone/S

Versions:
6.6.3.3, 6.6.3.2, 6.6.3, 6.6.2, 6.6.1, 6.6, 6.5.8, 6.5.7.5, 6.5.7, 6.5.6, 6.5.5, 6.5.4, 6.5.2, 6.5.1, 6.5, 6.3.1, 6.3, 6.2.x, 6.2, 6.1.6, 6.1.5, 6.1.x, 6.0.x, 5.1.5.1, 5.1.5

Platform:
All

Fixed In:

reclaimAll can fail immediately after markForCollection

If you execute:
        Repository>>reclaimAll
immediately after a markForCollection (for example, in a topaz script),
the reclaim can fail, either by returning false or by returning true
without actually reclaiming any pages.

This occurs because, after markForCollection runs, the following
steps must complete before any pages can be reclaimed:

1. The Gem running markForCollection sends the Stone the list of
possibly dead objects and returns. The Stone now holds the list of
possibly dead objects -- referred to as the "possible dead set" --
in RAM.

2. Every Gem currently logged in to the system must then search the
possible dead set for any objects to which it holds references.
It must then commit or abort, at which time it votes either to keep
each object in the set or to remove it (if it holds a reference).

-- NOTE --
Gems that do not commit or abort do not vote, and can delay
the process indefinitely. Until they do, the vote cannot be finalized,
garbage collection halts at this point, and commit records accumulate.

3. Gems that were logged in when garbage collection started, but have
since logged out, cannot vote for themselves. Their modified objects
are recorded in the write sets of the commit records in the commit
record backlog, which the GcGem reads in order to vote on their behalf.

Until these events occur, objects marked by markForCollection are not truly
dead, only "possibly dead", and their pages cannot be reclaimed.
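As a rough illustration of the voting steps above, the following Python
sketch models the possible dead set as a plain set of object ids. It is a
toy model only, not GemStone code: the function name, its arguments, and
the simplification that an object is voted out of the set whenever it
appears in a Gem's references or in a logged-out Gem's write set are all
assumptions made for this example.

```python
def finalize_vote(possible_dead, live_gem_refs, logged_out_write_sets):
    """Toy model of the vote on the possible dead set (hypothetical names).

    possible_dead         -- set of object ids sent to the Stone
    live_gem_refs         -- dict: Gem name -> set of ids it references,
                             or None if that Gem has not yet committed
                             or aborted (and so has not voted)
    logged_out_write_sets -- list of sets of ids taken from commit-record
                             write sets, scanned on behalf of Gems that
                             have logged out

    Returns the set of ids that remain (candidates for promotion to dead),
    or None while at least one logged-in Gem has not voted.
    """
    # A single Gem that never commits or aborts blocks the whole vote.
    if any(refs is None for refs in live_gem_refs.values()):
        return None

    survivors = set(possible_dead)

    # Each logged-in Gem votes out the objects it still references.
    for refs in live_gem_refs.values():
        survivors -= refs

    # Assumption: objects found in the write sets are likewise voted out.
    for write_set in logged_out_write_sets:
        survivors -= write_set

    return survivors
```

In this model, terminating the party that scans the write sets before it
finishes corresponds to never reaching the final return, so nothing is
ever promoted and a later reclaim has nothing to work on.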

If you call stopOtherSessions after the markForCollection, you
terminate the GcGem before it can vote on behalf of the logged-out
Gems. Therefore, objects in the possible dead set are not promoted
to DeadNotReclaimed, and the subsequent reclaimAll has nothing to
reclaim.

Workaround:

You can tell when it is time to stop other sessions (if necessary) and run
reclaimAll by using statmonitor and Visual Statistics Display (VSD) to
watch three cache statistics: PossibleDeadSize, GcPossibleDeadSize, and
DeadNotReclaimedSize.

When PossibleDeadSize and GcPossibleDeadSize fall to zero, and
DeadNotReclaimedSize goes up, you can run reclaimAll with confidence that
the operation will perform as intended.
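
The readiness condition can be written down as a small check. The Python
sketch below is only an illustration under stated assumptions: it expects
the three statistics to have been sampled by hand (for example, read off
VSD) into a chronological list of dictionaries, and it does not read
statmonitor files or call any GemStone API. The function name and the
input layout are hypothetical.

```python
def ready_for_reclaim(samples):
    """Decide whether reclaimAll is likely to find work to do.

    samples -- chronological list of dicts, each with the keys
               'PossibleDeadSize', 'GcPossibleDeadSize' and
               'DeadNotReclaimedSize' (hypothetical input layout).
               The first sample should be taken right after
               markForCollection returns.
    """
    if len(samples) < 2:
        # A rise cannot be observed from a single sample.
        return False

    first, last = samples[0], samples[-1]

    # Voting is finished when both possible-dead counters are at zero.
    voting_done = (last["PossibleDeadSize"] == 0
                   and last["GcPossibleDeadSize"] == 0)

    # Objects were promoted if DeadNotReclaimedSize went up.
    promoted = last["DeadNotReclaimedSize"] > first["DeadNotReclaimedSize"]

    return voting_done and promoted
```

For example, a series in which PossibleDeadSize drops from 5000 to 0,
GcPossibleDeadSize is 0, and DeadNotReclaimedSize rises from 0 to 4800
passes the check; a series in which PossibleDeadSize is still nonzero does
not.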