waybarrios/vllm-mlx: an OpenAI- and Anthropic-compatible server for Apple Silicon. I use it to run mlx-community/gemma-3-12b-it-4bit on my MacBook Air. It works very well: a small shell script starts the server, and from there I'm self-sufficient. Not as convenient as Ollama, but it is built on Apple's MLX framework and so makes good use of Apple Silicon.
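My startup script is roughly the sketch below. The exact launch command and flags are assumptions (check the repo's README for the real invocation); the request itself follows the standard OpenAI `/v1/chat/completions` shape, which is what "OpenAI compatible" buys you.

```shell
#!/bin/sh
# Launch the server in the background (command name, flags, and port
# are assumptions -- consult the vllm-mlx README for the actual CLI).
# vllm-mlx serve mlx-community/gemma-3-12b-it-4bit --port 8000 &

# Once it is up, any OpenAI-compatible client works. A raw curl example:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/gemma-3-12b-it-4bit",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Because the endpoint speaks the OpenAI wire format, the same base URL can be dropped into any OpenAI SDK by setting its `base_url` to `http://localhost:8000/v1`.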