
Conversation

@kczimm (Contributor) commented on Oct 11, 2023

Enables support for vLLM. To use it, specify the model field and add "backend": "vllm" in the task parameter of the pgml.transform function. For example,

SELECT * FROM pgml.transform(
    task => '{"model":"tiiuae/falcon-7b","backend":"vllm"}'::JSONB,
    inputs => ARRAY['hello']
);

A list of supported models for vLLM can be found in the vLLM documentation.

Only one vLLM model can be loaded per client connection process due to a limitation in vLLM. The first call to pgml.transform with a given model will load the model ("cold start"), but subsequent calls will use the cached model. If you change the specified model in the same client connection, the cached model will be replaced with the new one.
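
A minimal sketch of that caching behavior on a single connection (the second model name is only an illustrative placeholder for any other supported vLLM model):

-- First call with this model on the connection: cold start, loads the model.
SELECT * FROM pgml.transform(
    task => '{"model":"tiiuae/falcon-7b","backend":"vllm"}'::JSONB,
    inputs => ARRAY['hello']
);

-- Same model on the same connection: reuses the cached model, no reload.
SELECT * FROM pgml.transform(
    task => '{"model":"tiiuae/falcon-7b","backend":"vllm"}'::JSONB,
    inputs => ARRAY['hello again']
);

-- Different model on the same connection: the cached model is replaced,
-- incurring another cold start.
SELECT * FROM pgml.transform(
    task => '{"model":"mistralai/Mistral-7B-v0.1","backend":"vllm"}'::JSONB,
    inputs => ARRAY['hello']
);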

@kczimm marked this pull request as ready for review on October 19, 2023 at 20:46
@levkk (Contributor) commented on Oct 19, 2023

Rebase on master to get #1102, which should fix the tests.

@kczimm force-pushed the kczimm-vllm-support branch from 5e20276 to aca505c on October 19, 2023 at 21:26