The platform lets developers run transformer models, agents, and LLMs natively on smartphones using an offline Python runtime and SDK.
Bengaluru-based deep tech startup NimbleEdge has launched DeliteAI, an open-source, on-device agentic AI platform designed to run entirely on smartphones without relying on cloud infrastructure.
The company claims the platform allows developers to build, deploy, and run fully customised AI-native experiences directly on user devices, positioning it as a privacy-focused alternative to traditional cloud-based AI systems.
NimbleEdge describes the platform as full-stack, production-ready infrastructure designed for performance, privacy, and portability.
According to NimbleEdge, DeliteAI supports deployment of transformer models, large language models (LLMs), and multimodal AI, removing the need for cloud GPUs and enabling data to remain entirely on-device.
DeliteAI comprises three core components integrated into a unified platform. The first is a production SDK featuring an optimised inference stack and, according to the company, the industry's first on-device Python runtime for orchestrating agentic workflows. The second is a dedicated Agent Marketplace for discovering and integrating pre-built AI agents into mobile apps. The third is the NimbleEdge Assistant, which the company claims is the world's first fully on-device conversational AI assistant with built-in productivity tools.
"We believe AI shouldn't sit miles away from us in data centres," said Varun Khare, founder and CEO of NimbleEdge, in a statement. "With this launch, any mobile application can scale AI to billions of users while improving data safety and user privacy."
NimbleEdge told AIM that, unlike existing tools such as Ollama, LM Studio, and AutoGPT, which primarily support desktop environments, DeliteAI is built natively for smartphones. The platform includes a mobile-optimised inference runtime for resource-constrained devices and a Python engine that runs on both Android and iOS.
In defining agentic AI, the company refers to modular, tool-using workflows that complete multi-step tasks based on user context. DeliteAI supports on-device tool calling and contextual response adaptation embedded within app interfaces, not just traditional chat formats. Its SDK captures real-time user interactions to enable intent-driven agents across various use cases, including e-commerce, entertainment, and grocery apps.
The platform can run entirely offline, handling inference, planning, memory, context management, and real-time user interactions without cloud connectivity. NimbleEdge notes that models are trained in the cloud, delivered to devices through over-the-air (OTA) updates, and executed on-device. Hybrid setups are also supported, allowing developers to configure smart query routing between on-device and cloud infrastructure depending on performance or privacy requirements.
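The hybrid routing idea can be illustrated with a minimal Python sketch. It does not use the DeliteAI SDK: the names `Query`, `RoutingPolicy`, and `route`, the `contains_personal_data` flag, and the 512-token threshold are all hypothetical, chosen only to show how a privacy-or-performance rule might decide where a request runs.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    ON_DEVICE = "on_device"
    CLOUD = "cloud"


@dataclass
class Query:
    text: str
    contains_personal_data: bool  # hypothetical privacy flag set by the app
    estimated_tokens: int         # rough size of the request


@dataclass
class RoutingPolicy:
    # Hypothetical threshold: larger requests are assumed to be too slow on-device.
    max_on_device_tokens: int = 512


def route(query: Query, policy: RoutingPolicy, online: bool) -> Target:
    """Pick an execution target from privacy and performance requirements."""
    # Privacy first: personal data never leaves the device.
    # With no connectivity, on-device is the only option anyway.
    if query.contains_personal_data or not online:
        return Target.ON_DEVICE
    # Performance: large, non-sensitive requests go to the cloud.
    if query.estimated_tokens > policy.max_on_device_tokens:
        return Target.CLOUD
    return Target.ON_DEVICE


policy = RoutingPolicy()
private_query = Query("summarise my messages", True, 2000)
heavy_query = Query("summarise this public report", False, 2000)
light_query = Query("suggest a recipe", False, 40)

for q in (private_query, heavy_query, light_query):
    print(q.text, "->", route(q, policy, online=True).value)
```

In this sketch the privacy check runs before the size check, so a sensitive request stays local even when it is large. A real deployment would likely weigh more signals, such as battery level or model availability.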
To address hardware constraints common to edge devices, the company states that it has conducted research into contextual sparsity, which it says improves generation speed by up to 2x and reduces memory consumption by 30%. DeliteAI also features custom fused C++ kernels for transformer models compatible with PyTorch and ExecuTorch. Its Python engine optimises data pipelines to enable sub-100 ms inference for models like Kokoro and LightGBM.
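For readers unfamiliar with the term, contextual sparsity generally refers to the observation that, for a given input, only some neurons in a feed-forward layer contribute to the output, so the rest can be skipped. The toy Python sketch below illustrates that general idea with a ReLU layer. It is not NimbleEdge's implementation, and every function name in it is hypothetical. It also finds the active neurons exactly, whereas practical systems typically rely on a cheap predictor to estimate them in advance.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]  # one row of weights per hidden neuron


def pre_activation(x: Vector, w_in: Vector) -> float:
    """Dot product of the input with one neuron's input weights."""
    return sum(xi * wi for xi, wi in zip(x, w_in))


def dense_ffn(x: Vector, w_up: Matrix, w_down: Matrix) -> Vector:
    """Reference feed-forward block: every hidden neuron is computed."""
    out = [0.0] * len(w_down[0])
    for w_in, w_out in zip(w_up, w_down):
        h = max(0.0, pre_activation(x, w_in))  # ReLU
        for k, w in enumerate(w_out):
            out[k] += h * w
    return out


def active_neurons(x: Vector, w_up: Matrix) -> List[int]:
    """Exact 'oracle' active set; a stand-in for a learned predictor."""
    return [j for j, w_in in enumerate(w_up) if pre_activation(x, w_in) > 0.0]


def sparse_ffn(x: Vector, w_up: Matrix, w_down: Matrix, active: List[int]) -> Vector:
    """Same block, but only the neurons listed in `active` are touched."""
    out = [0.0] * len(w_down[0])
    for j in active:
        h = max(0.0, pre_activation(x, w_up[j]))
        for k, w in enumerate(w_down[j]):
            out[k] += h * w
    return out


x = [1.0, -2.0]
w_up = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
w_down = [[1.0, 0.0], [5.0, 5.0], [7.0, 7.0], [0.0, 2.0]]

active = active_neurons(x, w_up)
dense_out = dense_ffn(x, w_up, w_down)
sparse_out = sparse_ffn(x, w_up, w_down, active)
print(active, dense_out, sparse_out)
```

Here two of the four hidden neurons are inactive for this input, and skipping them leaves the output unchanged while halving the work in the second projection. Only the weights of the active neurons need to be read, which is where savings in memory traffic would come from.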
The platform supports a wide range of smartphones across Android and iOS, with reported deployments on more than 30 million devices, including phones up to eight years old. According to the company, a 200 KB LightGBM model running on-device consumes less battery than making a standard API call to the cloud.
DeliteAI supports integration with open models such as Llama, Gemma, and Qwen while abstracting away hardware complexities through support for runtimes like ONNX, LiteRT, and ExecuTorch. Its Agent Marketplace features reusable, on-device agents for text summarisation, voice interaction (TTS/ASR), recommendations, AI-assisted search, and other productivity functions.
"For the first time, developers can bring state-of-the-art AI models to consumer devices, orchestrate them with Python, and deploy truly private AI agents at scale," said Neeraj Poddar, co-founder and CTO of NimbleEdge, in the statement. "This is the missing infrastructure layer and developer tooling we wished existed when we were building distributed systems at a global scale."
Aakrit Vaish, an early investor in the company and member of the India AI Mission, said NimbleEdge reflects the kind of innovation emerging from India's AI ecosystem. He noted that the platform's privacy-first design is well-suited to India's vast base of smartphone users.