
Concerning performance regression in 2.0.4 #172

@amadvance

Description


As the author of the TommyDS data-structure library (www.tommyds.it), I
regularly run comparative benchmarks across high-performance container
libraries, including google::dense_hash_map.

I have encountered a serious performance regression in the latest 2.0.4
version of the library (sourced from https://github.com/sparsehash). The
issue does NOT appear in version 2.0.3.

The test case below demonstrates a pathological slowdown in the
google::dense_hash_map implementation: runs that should take seconds
instead take minutes or hours under version 2.0.4, while version 2.0.3
is unaffected.

The core problem manifests when the number of elements approaches a power
of two: performance degrades dramatically as the count nears the boundary.
Past the worst case, increasing the element count by a single element
immediately restores the expected execution time.

Testing has confirmed that changing the underlying hash function does not
mitigate this size-dependent performance degradation.

For instance, using 65530 elements yields expected, immediate performance:

$ time ./tommybench 65530
65530
insert
change
remove

real    0m0.006s
user    0m0.004s  // Immediate
sys     0m0.002s

However, reducing the count by one element to 65529 results in a severe
slowdown:

$ time ./tommybench 65529
65529
insert
change // HERE IS SLOW
remove

real    0m5.057s
user    0m5.056s  // Approximately 5 seconds delay
sys     0m0.001s

The attached test program is intended to document and track this
size-dependent performance degradation.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stddef.h>
#include <stdbool.h>
#include <stdint.h>

/* From https://github.com/sparsehash/sparsehash version 2.0.4 */
#include <google/dense_hash_map>

struct KnuthMultiplicativeHash {
    size_t operator()(const int& key) const {
        const uint32_t C = 2654435761U;     
        uint32_t h = (uint32_t)key;
        h *= C;
        h ^= (h >> 16); 
        return (size_t)h;
    }
};

struct JenkinsMixHash {
    size_t operator()(const int& key) const {
        size_t h = (size_t)key;
        h += (h << 10);
        h ^= (h >> 6);
        h += (h << 3);
        h ^= (h >> 11);
        h += (h << 15);
        return h;
    }
};

/*
 * The hash function used here (KnuthMultiplicativeHash) is arbitrary.
 * Testing has shown that substituting it with JenkinsMixHash, other
 * custom hash functions, or even relying on the default hash function
 * does not resolve the pathological performance issue.
 */
typedef google::dense_hash_map<int, int, KnuthMultiplicativeHash> googledensehash_t;

googledensehash_t* googledensehash;

unsigned the_max; /**< Number of elements in test. */

void test_insert(void)
{
	unsigned i;

	for(i=0;i<the_max;++i) {
		/* insert */
		unsigned key = i * 2;
		unsigned value = key;
		(*googledensehash)[key] = value;
	}
}

void test_change(void)
{
	unsigned i;

	for(i=0;i<the_max;++i) {
		/* find */
		unsigned key = i * 2;
		unsigned value;
		googledensehash_t::iterator ptr;
		ptr = googledensehash->find(key);
		if (ptr == googledensehash->end())
			abort();
		value = ptr->second;
		if (value != key)
			abort();

		/* remove */
		googledensehash->erase(ptr);

		/* reinsert with a different key */
		key = key + 1;
		value = key;
		(*googledensehash)[key] = value;
	}
}

void test_remove(void)
{
	unsigned i;

	for(i=0;i<the_max;++i) {
		/* find */
		unsigned key = i * 2 + 1;
		unsigned value;
		googledensehash_t::iterator ptr = googledensehash->find(key);
		if (ptr == googledensehash->end())
			abort();
		value = ptr->second;
		if (value != key)
			abort();

		/* remove */
		googledensehash->erase(ptr);
	}
}

int main(int argc, char * argv[])
{
/*
 * The values below represent element counts (N) found to exhibit the
 * pathological performance regression in google::dense_hash_map v2.0.4.
 *
 * These counts each fall just short of a power of two (2^20 - 7,
 * 2^19 - 7, 2^18 - 7, 2^17 - 7, and 2^16 - 7), suggesting an issue with
 * resizing or load-factor calculation near power-of-two boundaries.
 *
 * the_max is intentionally set to 65529 as the default test case
 * because, unlike the others, this specific pathological size terminates
 * within a reasonable timeframe (seconds) instead of requiring manual
 * termination (minutes/hours).
 */
	the_max = 1048569;
	the_max = 524281;
	the_max = 262137;
	the_max = 131065;
	the_max = 65529;

	if (argc > 1)
		the_max = atoi(argv[1]);

	printf("%u\n", the_max);

	googledensehash = new googledensehash_t;
	googledensehash->set_empty_key(-1);
	googledensehash->set_deleted_key(-2);

	printf("insert\n");
	test_insert();

	printf("change\n");
	test_change();

	printf("remove\n");
	test_remove();

	delete googledensehash;

	return EXIT_SUCCESS;
}
