
Concerning performance regression in 2.0.4 #172

@amadvance

Description


As the author of the TommyDS data-structure library (www.tommyds.it), I
regularly run comparative benchmarks across high-performance container
libraries, including google::dense_hash_map.

I have encountered a serious performance regression in the latest 2.0.4
version of the library (sourced from https://github.com/sparsehash). The
issue does NOT appear in version 2.0.3.

The test case below demonstrates a pathological slowdown in the
google::dense_hash_map implementation: runs that should take seconds
instead take minutes or hours under version 2.0.4, while version 2.0.3
is unaffected.

The core problem manifests when the number of elements approaches a power
of two: performance degrades dramatically as the count nears the boundary.
Past the worst case, increasing the element count by a single element
immediately restores the expected execution time.

Testing has confirmed that changing the underlying hash function does not
mitigate this size-dependent performance degradation.

For instance, using 65530 elements yields expected, immediate performance:

$ time ./tommybench 65530
65530
insert
change
remove

real    0m0.006s
user    0m0.004s  // Immediate
sys     0m0.002s

However, reducing the count by one element to 65529 results in a severe
slowdown:

$ time ./tommybench 65529
65529
insert
change // HERE IS SLOW
remove

real    0m5.057s
user    0m5.056s  // Approximately 5 seconds delay
sys     0m0.001s

The attached test program is intended to document and track this
size-dependent performance degradation.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stddef.h>
#include <stdbool.h>
#include <stdint.h>

/* From https://github.com/sparsehash/sparsehash version 2.0.4 */
#include <google/dense_hash_map>

struct KnuthMultiplicativeHash {
    size_t operator()(const int& key) const {
        const uint32_t C = 2654435761U;     
        uint32_t h = (uint32_t)key;
        h *= C;
        h ^= (h >> 16); 
        return (size_t)h;
    }
};

struct JenkinsMixHash {
    size_t operator()(const int& key) const {
        size_t h = (size_t)key;
        h += (h << 10);
        h ^= (h >> 6);
        h += (h << 3);
        h ^= (h >> 11);
        h += (h << 15);
        return h;
    }
};

/*
 * The hash function used here (KnuthMultiplicativeHash) is arbitrary.
 * Testing has shown that substituting it with JenkinsMixHash, other
 * custom hash functions, or even relying on the default hash function
 * does not resolve the pathological performance issue.
 */
typedef google::dense_hash_map<int, int, KnuthMultiplicativeHash> googledensehash_t;

googledensehash_t* googledensehash;

unsigned the_max; /**< Number of elements in test. */

void test_insert(void)
{
	unsigned i;

	for(i=0;i<the_max;++i) {
		/* insert */
		unsigned key = i * 2;
		unsigned value = key;
		(*googledensehash)[key] = value;
	}
}

void test_change(void)
{
	unsigned i;

	for(i=0;i<the_max;++i) {
		/* find */
		unsigned key = i * 2;
		unsigned value;
		googledensehash_t::iterator ptr;
		ptr = googledensehash->find(key);
		if (ptr == googledensehash->end())
			abort();
		value = ptr->second;
		if (value != key)
			abort();

		/* remove */
		googledensehash->erase(ptr);

		/* reinsert with a different key */
		key = key + 1;
		value = key;
		(*googledensehash)[key] = value;
	}
}

void test_remove(void)
{
	unsigned i;

	for(i=0;i<the_max;++i) {
		/* find */
		unsigned key = i * 2 + 1;
		unsigned value;
		googledensehash_t::iterator ptr = googledensehash->find(key);
		if (ptr == googledensehash->end())
			abort();
		value = ptr->second;
		if (value != key)
			abort();

		/* remove */
		googledensehash->erase(ptr);
	}
}

int main(int argc, char * argv[])
{
/*
 * The values below represent element counts (N) found to exhibit the
 * pathological performance regression in google::dense_hash_map v2.0.4.
 *
 * These counts each fall just short of a power of two (2^20 - 7,
 * 2^19 - 7, 2^18 - 7, 2^17 - 7, and 2^16 - 7), suggesting an issue with
 * resizing or load-factor calculation near power-of-two boundaries.
 *
 * the_max is intentionally set to 65529 as the default test case
 * because, unlike the others, this specific pathological size terminates
 * within a reasonable timeframe (seconds) instead of requiring manual
 * termination (minutes/hours).
 */
	the_max = 1048569;
	the_max = 524281;
	the_max = 262137;
	the_max = 131065;
	the_max = 65529;

	if (argc > 1)
		the_max = atoi(argv[1]);

	printf("%u\n", the_max);

	googledensehash = new googledensehash_t;
	googledensehash->set_empty_key(-1);
	googledensehash->set_deleted_key(-2);

	printf("insert\n");
	test_insert();

	printf("change\n");
	test_change();

	printf("remove\n");
	test_remove();

	delete googledensehash;

	return EXIT_SUCCESS;
}
