I Started Building a Roguelike RPG — Powered by On-Device AI #2

Source: DEV Community
Running On-Device LLM in Unity Android: Everything That Broke (and How I Fixed It)

In my last post, I mentioned I was building a roguelike RPG powered by an on-device LLM. This time I'll cover exactly how I did it, what broke, and what the numbers look like.

The short version: I got Phi-4-mini running in Unity on a real Android device in one day. It generated valid JSON. It took 8 minutes and 43 seconds.

0. Why This Tech Stack

Before the details, here's why I made each choice.

Why Phi-4-mini (3.8B)?

Microsoft officially distributes it in ONNX format, with no conversion work needed. The INT4 quantized version fits in 4.9GB, which is manageable on a 12GB RAM device. At 3.8B parameters, it's roughly the minimum size that can reliably produce structured JSON output. Smaller models tend to fall apart on formatting tasks.

Why ONNX Runtime?

Cross-platform support across Android, iOS, Windows, and Mac. There's a Unity C# binding, and the asus4/onnxruntime-unity package makes Unity integration str