Storage-tier-aware LLM inference scheduler for Apple Silicon.
Hypura schedules LLM inference on Apple Silicon by deciding which storage tier each piece of model data is placed in. Keeping data on the tier best suited to how it is used is intended to make inference faster and more efficient.
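To make the idea of tier-aware placement concrete, here is a minimal sketch of a greedy placement policy. It is not Hypura's actual algorithm or API: the `Tier`, `Block`, and `place` names, the generic tier labels, and the "most frequently read data goes to the fastest tier with room" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    """A storage tier with a fixed capacity (hypothetical model)."""
    name: str
    capacity_bytes: int
    used_bytes: int = 0

    def fits(self, size_bytes: int) -> bool:
        return self.used_bytes + size_bytes <= self.capacity_bytes


@dataclass(frozen=True)
class Block:
    """A unit of model data to place (hypothetical model)."""
    name: str
    size_bytes: int
    reads_per_token: float  # assumed access-frequency estimate


def place(blocks, tiers):
    """Greedily assign blocks to tiers.

    `tiers` must be ordered fastest first. Blocks are visited from most to
    least frequently read, and each one goes to the fastest tier that still
    has room. Returns a dict mapping block name to tier name.
    """
    placement = {}
    for block in sorted(blocks, key=lambda b: b.reads_per_token, reverse=True):
        for tier in tiers:
            if tier.fits(block.size_bytes):
                tier.used_bytes += block.size_bytes
                placement[block.name] = tier.name
                break
        else:
            # No tier had enough free space for this block.
            raise ValueError(f"no tier can hold block {block.name!r}")
    return placement


if __name__ == "__main__":
    # Placeholder tiers and sizes, chosen only to exercise the function.
    tiers = [Tier("fast", 10), Tier("medium", 20), Tier("slow", 100)]
    blocks = [
        Block("a", 8, 1.0),
        Block("b", 8, 5.0),
        Block("c", 15, 0.5),
        Block("d", 30, 0.1),
    ]
    print(place(blocks, tiers))
```

A real scheduler would likely also weigh per-tier bandwidth and the cost of moving data between tiers. The greedy rule here only shows the basic trade-off between access frequency and limited fast storage.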