Your device,
sentient.

An on-device intelligence layer
for your entire digital life.

A custom-built vision LLM runs on your own phone or computer while it charges overnight, privately understanding everything you've ever saved.

“What was that wine I liked?”

“Who did I wanna meet next week?”

Talk to your data

by holding down one button

And with MCP, you can let your LLM talk to your data too — your ChatGPT can finally understand you.

A hand holds an iPhone showing an auto-generated knowledge graph branching from RECIPES into Saved Recipes, Cooking Tips, and meal categories.

Knowledge Graphs

Organising your entire digital life.

Tap any bubble to find your knowledge — the stuff you wanted to save is no longer buried in 1000s of screenshots, notes, and files.

An iPhone notification from Sentient OS reminding the user about Billie Eilish concert tickets opening tomorrow.

A macOS notification from Sentient OS reminding the user that a tax return in their Downloads folder is due next week.

Ambient Intelligence

Your AI looks out for you, across every device.

If it finds anything really worth reminding you about, it'll do so at the right time.

Sentient OS is coming to your phone and PC.

Why has no one else built this yet?

You can't run AI on your entire digital life in the cloud:

  • It's a privacy nightmare.
  • It'd cost a fortune.

So it has to run on your own devices, overnight, while they charge. But an AI smart enough to understand your life is huge, and your devices are not.

Apple makes the world's best on-device inference engine; Qwen makes the world's best small AI model. Neither was built for your entire digital life, and the many other layers required weren't good enough.

So I optimised every layer of the stack from scratch. Together, they form the most optimised on-device AI stack anywhere.

The result: an AI that has no business being this smart, running at speeds it has no business hitting; one that actually sees your images, hears your audio, and understands your screen, on your own devices, in private, every night.

This is the intelligence layer for your entire digital life.

Inference Optimization
proprietary KV cache reuse + flash attention
Surgery on LLMs
vision transplanted from a 4× bigger model
Improving Apple's MLX
MLX modified for batch multimodal inference
Custom Quantization Technique
custom k-quants on MLX
Device-aware RAM optimization
quantization tuned per device
True multi-agent intelligence
AI that verifies its own outputs