Skip to content

Avoid HnswIndex heavy construction because of mt19937 #2398

@Beihao-Zhou

Description

@Beihao-Zhou

Search before asking

  • I had searched in the issues and found no similar issues.

Motivation

As mentioned in PR #2368 in this comment,
HnswIndex is a relatively heavy class since it includes mt19937, sizeof(mt19937) ~= 5000), we can try to optimize its construction.

Thanks @PragmaTwice for pointing it out! :)

Solution

One way I could think of to improve it is to make all instances of HnswIndex share a single mt19937 instance.

We can implement a RandomGenerator using Singleton Pattern as below:

class RandomGenerator {
public:
    static RandomGenerator& getInstance() {
        static RandomGenerator instance;
        return instance;
    }

    std::mt19937& getGenerator() {
        return generator_;
    }

private:
    RandomGenerator() {
        std::random_device rd;
        generator_ = std::mt19937(rd());
    }
    // delete copy & move constructor
    std::mt19937 generator_;
};

struct HnswIndex {
    HnswIndex(...) {
        generator_ = &RandomGenerator::getInstance().getGenerator();
    }
};

One callout is that mt19937 is not thread-safe, so I'm not quite sure if sharing it among all HnswIndex instances will cause any side effects, will further investigate if we finally use the solution.

Welcome any other ideas and thoughts!

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions