Pike v8.0 release 1738

Method Pike.count_memory()


Method count_memory

int count_memory(int|mapping(string:int) options, array|multiset|mapping|object|program|string|type|int ... things)

Description

In brief, if you call Pike.count_memory(0,x) you get back the number of bytes x occupies in memory.
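A minimal sketch of the brief form follows. The helper name bytes_of and the test array are illustrative assumptions, not part of Pike, and the figure printed depends on the Pike build and platform.

```pike
// Illustrative helper (not part of Pike): bytes occupied by x and by
// everything that is referenced only from x (lookahead 0).
int bytes_of(array|mapping|multiset|object|string x)
{
  return Pike.count_memory(0, x);
}

int main()
{
  // Built at run time so that no other part of the program holds it.
  array(int) numbers = allocate(1000, 17);
  write("numbers occupies %d bytes\n", bytes_of(numbers));
  return 0;
}
```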

The detailed story is a bit longer:

This function calculates the number of bytes that all things occupy. Put another way, it calculates the number of bytes that would be freed if all those things lost their references at the same time. That covers not only the memory in the things themselves, but also the memory in everything that is directly or indirectly referenced from those things and from nowhere else.

The memory counted is only that which is directly occupied by the things in question, including any overallocation for mappings, multisets and arrays. Other memory overhead that they give rise to is not counted. This means that if you counted the memory occupied by all Pike-accessible things, you would get a figure significantly lower than what the OS reports for the pike process.

Also, if you were to actually free the things, you should not expect the size of the pike process to drop by the number of bytes returned by this function. That is because Pike often retains the memory for later reuse.

However, what you should expect is that if you actually free the things and then later allocate some more things for which this function returns the same size, there should be essentially no increase in the size of the pike process (some increase might occur due to internal fragmentation and memory pooling, but it should be small in general and over time).

The search for things only referenced from things can handle limited cyclic structures. That is done by doing a "lookahead", i.e. searching through referenced things that apparently have other outside references. You can control how long this lookahead should be through options (see below). If the lookahead is too short to cover the cycles in a structure, then the returned value is too low. If the lookahead is made gradually longer, then the returned value will eventually become accurate and stop increasing. If the lookahead is too long, then unnecessary time might be spent searching through things that really have external references.

Objects that are known to be part of cyclic structures are encouraged to have an integer constant or variable pike_cycle_depth that specifies the lookahead needed to discover those cycles. When Pike.count_memory visits such objects, it uses that as the lookahead when going through the references emanating from them. Thus, assuming objects adhere to this convention, you should rarely have to specify a lookahead higher than zero to this function.

Note that pike_cycle_depth can also be set to zero to effectively stop the lookahead from continuing through the object. That can be useful to put in objects you know have global references, to speed up the traversal.
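The convention above can be sketched as follows. The Node class and make_list helper are hypothetical names invented for this illustration; the value 2 is chosen on the assumption that neighbouring nodes pointing at each other form cycles of length 2.

```pike
// Hypothetical doubly linked node: a node and its neighbour point at
// each other, which forms cycles of length 2.
class Node
{
  // Lookahead that Pike.count_memory should use from these objects.
  constant pike_cycle_depth = 2;

  Node prev;
  Node next;
  mixed payload;

  protected void create(mixed p) { payload = p; }
}

// Build a list of n nodes and return its head. The local variables go
// out of scope on return, so only the caller's reference remains.
Node make_list(int n)
{
  Node head = Node(0);
  Node cur = head;
  for (int i = 1; i < n; i++) {
    Node nn = Node(i);
    nn->prev = cur;
    cur->next = nn;
    cur = nn;
  }
  return head;
}

int main()
{
  Node head = make_list(100);
  // Lookahead 0 from the caller; the nodes supply their own depth.
  write("heeding pike_cycle_depth:  %d bytes\n",
        Pike.count_memory(0, head));
  // For comparison, the same call with the convention switched off.
  write("ignoring pike_cycle_depth: %d bytes\n",
        Pike.count_memory(([ "block_pike_cycle_depth": 1 ]), head));
  return 0;
}
```

If the convention works as described, the first figure should cover the whole list while the second is expected to be smaller; the exact numbers are not something this sketch can promise.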

Parameter options

If this is an integer, it specifies the maximum lookahead distance. -1 counts only the memory of the given things, without following any references. 0 extends the count to all their referenced things as long as there are no cycles (except if pike_cycle_depth is found in objects - see above). 1 makes it cover cycles of length 1 (e.g. a thing points to itself), 2 handles cycles of length 2 (e.g. where two things point at each other), and so on.

However, the lookahead is by default blocked by programs, i.e. it never follows references emanating from programs. That is because programs are seldom part of dynamic data structures, and they also typically contain numerous references to global data, which would add a lot of work to the lookahead search.
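The effect of different integer lookahead values can be explored with a small sketch like the one below. The make_root helper is an illustrative name; it builds a root array whose only element is a mapping that refers to itself, i.e. a cycle of length 1 behind the root.

```pike
// Illustrative helper: the returned array holds a mapping that has one
// reference from the array and one from itself.
array make_root()
{
  mapping inner = ([ "payload": allocate(100) ]);
  inner->self = inner;
  return ({ inner });
}

int main()
{
  array root = make_root();
  foreach (({ -1, 0, 1, 2 }), int lookahead)
    write("lookahead %2d: %d bytes\n",
          lookahead, Pike.count_memory(lookahead, root));
  return 0;
}
```

Going by the description above, -1 should report only the root array, and a lookahead of 1 or more should also include the self-referencing mapping and its payload. The sketch prints the figures rather than asserting them.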

To control the search in more detail, options can be a mapping instead:

lookahead : int

The maximum lookahead distance, as described above. Defaults to 0 if missing.

block_arrays : int
block_mappings : int
block_multisets : int
block_objects : int
block_programs : int

When any of these are given with a nonzero value, the corresponding type is blocked when lookahead references are followed. They are unblocked if the flag is given with a zero value. Only programs are blocked by default.

These blocks are only active during the lookahead, so blocked things are still recursed and memory counted if they are given as arguments or have only internal references.

block_strings : int

If positive then strings are always excluded (except any given directly in things), if negative they are always included. Otherwise they are counted if they have no other refs, but note that since strings are shared they might get refs from other unrelated parts of the program.

block_pike_cycle_depth : int

Do not heed pike_cycle_depth values found in objects. This is implicit if the lookahead is negative.

return_count : int

Return the number of things that memory was counted for, instead of the byte count. (This is the same number that the internal element contains if collect_stats is set.)

collect_internals : int

If this is nonzero then its value is replaced with an array that contains the things that memory was counted for.

collect_externals : int

If set then the value is replaced with an array containing the things that were visited but turned out to have external references (within the limited lookahead).

collect_direct_externals : int

If set then the value is replaced with an array containing the things found during the lookahead that (appear to) have direct external references. This list is a subset of the collect_externals list. It is useful if you get unexpected global references to your data structure which you want to track down.

collect_stats : int

If this is nonzero then the mapping is extended with more elements containing statistics from the search; see below.
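As a hedged illustration of the mapping form, the sketch below combines a block flag with two of the collect flags. The inspect helper and the test data are assumptions made for this example; only the option names come from the list above.

```pike
// Illustrative helper: run one count and return the options mapping,
// whose collect_* values the call replaces with arrays.
mapping inspect(array root)
{
  mapping opts = ([
    "lookahead": 1,
    "block_strings": 1,            // exclude strings found via references
    "collect_internals": 1,
    "collect_direct_externals": 1
  ]);
  Pike.count_memory(opts, root);
  return opts;
}

int main()
{
  array shared = allocate(10);     // still referenced from this variable
  array root = ({ shared, allocate(10) });
  mapping opts = inspect(root);
  write("counted: %O\n", opts->collect_internals);
  write("direct externals: %O\n", opts->collect_direct_externals);
  return 0;
}
```

Since shared is also held by a local variable, it would be expected to show up among the direct externals rather than among the counted things.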

When the collect_stats flag is set, the mapping is extended with these elements:

internal : int

Number of things that were marked internal and hence memory counted. It includes the things given as arguments.

cyclic : int

Number of things that were marked internal only after resolving cycles.

external : int

Number of things that were visited through the lookahead but were found to be external.

visits : int

Number of times things were visited in total. This figure includes visits to various internal things that aren't visible from the pike level, so it might be larger than the numbers above seem to justify.

revisits : int

Number of times the same things were revisited. This can occur in the lookahead when a thing is encountered through a shorter path than the one it first got visited through. It also occurs in resolved cycles. Like visits, this count can include things that aren't visible from pike.

rounds : int

Number of search rounds. This is usually 1 or 2. More rounds are necessary only when blocked types turn out to be (acyclic) internal, so that they need to be counted and recursed anyway.

work_queue_alloc : int

The number of elements that were allocated to store the work queue which is used to keep track of the things to visit during the lookahead. This is usually bigger than the maximum number of things the queue actually held.

size : int

The memory occupied by the internal things. This is the same as the normal return value, but it's put here too for convenience.
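A short sketch of reading the statistics back from the options mapping follows. The stats_for helper and the sample data are illustrative assumptions; the element names are the ones listed above.

```pike
// Illustrative helper: count with collect_stats set and return the
// options mapping, which the call extends with the statistics.
mapping(string:int) stats_for(array thing)
{
  mapping(string:int) opts = ([ "lookahead": 1, "collect_stats": 1 ]);
  Pike.count_memory(opts, thing);
  return opts;
}

int main()
{
  array data = ({ allocate(10), ([ "k": allocate(5) ]) });
  mapping(string:int) stats = stats_for(data);
  foreach (({ "size", "internal", "cyclic", "external", "visits",
              "revisits", "rounds", "work_queue_alloc" }), string key)
    write("%-16s %d\n", key, stats[key]);
  return 0;
}
```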

Parameter things

One or more things to count memory size for. Only things passed by reference are allowed, except for functions which are forbidden because a meaningful size calculation can't be done for them.

Integers are allowed because they are bignum objects when they become sufficiently large. However, passing an integer that is small enough to fit into the native integer type will return zero.
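The difference between native integers and bignums can be seen with a sketch like this one. The int_bytes helper is an illustrative name, and 1 << 200 is simply one value assumed to be far outside the native integer range.

```pike
// Illustrative helper: memory counted for a single integer.
int int_bytes(int i)
{
  return Pike.count_memory(0, i);
}

int main()
{
  int small = 42;        // fits in the native integer type
  int big = 1 << 200;    // too large for it, so held as a bignum object
  write("small: %d bytes\n", int_bytes(small));
  write("big:   %d bytes\n", int_bytes(big));
  return 0;
}
```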

Returns

Returns the number of bytes occupied by the counted things. If the return_count option is set then the number of things is returned instead.

Note

The result of Pike.count_memory(0,a,b) might be larger than the sum of Pike.count_memory(0,a) and Pike.count_memory(0,b) since a and b together might reference things that aren't referenced from anywhere else.
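The situation in this note can be reproduced with a sketch along these lines. The compare helper is an illustrative name; it builds two arrays that share a third array which nothing else references.

```pike
// Illustrative helper: returns ({ separately, together }) for two
// arrays that share one large array.
array(int) compare()
{
  array shared = allocate(1000);
  array a = ({ shared });
  array b = ({ shared });
  shared = 0;   // now only a and b refer to the large array

  int separately = Pike.count_memory(0, a) + Pike.count_memory(0, b);
  int together = Pike.count_memory(0, a, b);
  return ({ separately, together });
}

int main()
{
  [int separately, int together] = compare();
  write("separately: %d bytes, together: %d bytes\n",
        separately, together);
  return 0;
}
```

Counted one at a time, the shared array looks externally referenced from the other array; counted together, both of its references are internal, so it can be included.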

Note

It's possible that a string that is referenced still isn't counted, because strings are always shared in Pike and the same string may be in use in some unrelated part of the program.