We are proud to launch the Genta Inference Engine version 0.1 for enterprise, on-cloud, and on-premises Large Language Model deployment, delivering 3x the throughput of vLLM and TGI and supporting up to 96 concurrent requests on a single L40S GPU running Llama 3 8B.