This Mac App Lets You Run AI Models Locally Without Paying for Every Token
-

Osaurus is an open-source, Apple-only LLM server that lets users run AI models locally on their Mac hardware and switch freely between local models and cloud providers like OpenAI and Anthropic. It does this through a single interface that keeps files, tools, and model memory on the user's own device rather than in the cloud.
The project grew out of a simple question from users of Dinoki, a previous AI companion app built by co-founder Terence Pae: why pay for tokens on top of the app subscription? That frustration pushed Pae, previously a software engineer at Tesla and Netflix, toward building a local-first AI infrastructure tool that anyone could use without developer expertise.
Osaurus functions as what the industry calls a harness: a control layer connecting different AI models, tools, and workflows through a unified interface. It differentiates itself from developer-oriented competitors like Ollama and LM Studio with a consumer-friendly UI and a hardware-isolated virtual sandbox that limits the AI's system access for security, addressing concerns that have dogged similar tools like OpenClaw.
The app currently supports MiniMax M2.5, Gemma 4, Qwen3, Llama, DeepSeek V4, Apple's on-device foundation models, and more than a dozen cloud providers. It ships with over 20 native plugins covering Mail, Calendar, Browser, Git, Filesystem, and others, and operates as a full MCP server for connecting compatible clients to the user's local tools.
The hardware requirements remain the main practical barrier: local models need at least 64GB of RAM, and running larger models like DeepSeek V4 comfortably requires around 128GB. Pae is candid about this but frames it as a temporary constraint given the trajectory of local AI capability. "Last year, local AI could barely finish sentences, but today it can actually run tools, write code, access your browser, and order stuff from Amazon," he told TechCrunch.
The app has been downloaded more than 112,000 times since launching roughly a year ago. The founders are now participating in the Alliance accelerator in New York while exploring enterprise applications in legal and healthcare, sectors where running AI locally addresses privacy concerns that cloud-based processing cannot resolve regardless of provider commitments.
For individuals who are frustrated with per-token costs, concerned about sending sensitive data to cloud providers, or simply want an AI assistant that works without an internet connection, Osaurus is one of the more complete and accessible local AI options currently available for Mac users.
-
Why pay for tokens on top of a subscription? Fair question.