We are proud to launch the Genta Inference Engine version 0.1 for enterprise, on-cloud, and on-premises Large Language Model deployment, delivering 3x the throughput of vLLM and TGI and supporting up to 96 concurrent requests on a single L40S GPU running Llama 3 8B.