Tiny-vLLM

Author: Aiinity

High-performance LLM inference engine in C++ and CUDA.

About Tiny-vLLM

Tiny-vLLM is an inference engine for Large Language Models (LLMs), written in C++ and CUDA. It prioritizes high performance and low resource usage.

App Information

Version: 1.0.0

Developer: Unknown