Explained: Apple’s new method to run LLMs on iPhones – Times of India

Apple GPT launching on iPhones may soon become a reality. Artificial intelligence researchers at the Cupertino-based tech giant have made a major breakthrough in running large language models (LLMs) on iPhones and other Apple devices. Apple researchers say this could be achieved through newly invented techniques for using flash memory to run models on devices with limited RAM.
LLMs are data and memory hungry
LLM-based chatbots like ChatGPT and Claude are very data- and memory-intensive. These models usually require a large amount of RAM to run, which is a challenge for devices such as iPhones that have limited memory.
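To put the memory problem in rough numbers, a model's weights alone take parameter count times bytes per parameter of RAM. The parameter counts and precisions below are illustrative assumptions, not figures from Apple's paper:

```python
# Rough, illustrative arithmetic: RAM needed just to hold model weights.
# The 7-billion-parameter example is an assumption for illustration,
# not a number from Apple's research.

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory in GB to hold the weights (2 bytes/param = float16)."""
    return num_params * bytes_per_param / 1e9

print(weight_memory_gb(7e9))     # 7B model in float16 -> 14.0 GB
print(weight_memory_gb(7e9, 1))  # 7B model, 8-bit quantized -> 7.0 GB
```

Even with aggressive quantization, a mid-sized model can exceed the RAM of a typical phone, which is why Apple's researchers turned to the much larger flash storage.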
To solve this problem, Apple researchers have developed a new technique that uses flash memory to store AI model data. This is the same memory where apps and photos are also stored.
How Apple plans to launch LLMs on iPhones
In a new research paper titled “LLM in a flash: Efficient Large Language Model Inference with Limited Memory” (first spotted by MacRumors), the authors argue that flash storage is more abundant in mobile devices than the RAM traditionally used to run LLMs. Their method circumvents this limitation by using two key techniques that reduce data transfer and increase the effective bandwidth of flash memory. These methods are:
Windowing: This is like a recycling method. Instead of loading new data each time, the AI model reuses some of the data it has already processed. This reduces the need for constant memory fetching and makes the process faster and smoother.
Row-column packing: This technique is similar to reading a book in larger chunks instead of one word at a time. It groups related data together so that it can be read from flash memory in larger, faster sequential chunks. This also speeds up the AI's ability to understand and generate language.
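The two techniques above can be sketched in toy form. The cache class, window size, and packing helper below are illustrative assumptions, not Apple's implementation: a sliding window keeps recently used weight rows in RAM so repeated use does not re-read flash, and related rows are packed together so one contiguous read replaces two scattered ones.

```python
from collections import OrderedDict

# Toy sketch of the paper's two ideas (illustrative, not Apple's code).

class WindowedWeightCache:
    """Windowing: keep weight rows used by recent tokens in RAM,
    fetching from flash only the rows not already cached."""

    def __init__(self, flash, window_size=256):
        self.flash = flash            # stands in for flash storage: row id -> row
        self.window = OrderedDict()   # row id -> row, in least-recently-used order
        self.window_size = window_size

    def get_rows(self, row_ids):
        rows = []
        for rid in row_ids:
            if rid in self.window:
                self.window.move_to_end(rid)       # reuse: no flash read needed
            else:
                self.window[rid] = self.flash[rid]  # miss: read from "flash"
                if len(self.window) > self.window_size:
                    self.window.popitem(last=False)  # evict the oldest row
            rows.append(self.window[rid])
        return rows

def pack_rows_and_columns(up_rows, down_cols):
    """Row-column packing: store the i-th row of one matrix next to the
    i-th column of another, so a single contiguous read fetches both."""
    return [up + down for up, down in zip(up_rows, down_cols)]
```

A usage sketch: if tokens repeatedly touch the same rows, only the first access hits flash; later accesses are served from the in-RAM window, which is the data-transfer saving the paper describes.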
The paper suggests that a combination of these methods will allow iPhones to run AI models up to twice the size of the device's available memory. This method is expected to increase inference speed by 4-5 times on standard processors (CPUs) and by 20-25 times on graphics processors (GPUs).
The authors note: “This breakthrough is particularly important for deploying advanced LLMs in resource-constrained environments, thereby expanding their applicability and accessibility.”
How this method will improve AI functions on iPhones
The latest breakthrough in AI efficiency will open up new possibilities for future iPhones. This includes more advanced Siri capabilities, real-time language translation, and other AI-driven features in photography and augmented reality. The technology will also help iPhones run complex AI assistants and chatbots on the device, which Apple is already working on.
In February, Apple held an AI summit and introduced employees to its large language model. Eventually, Apple's work on generative AI could be used in the Siri voice assistant.

Apple is developing a smarter version of Siri that is deeply integrated with AI, Bloomberg reports. The company plans to revamp the way Siri interacts with the Messages app, allowing users to ask complex questions and complete sentences more efficiently. Moreover, Apple also plans to add AI to as many applications as possible.
The iPhone maker is also developing its own generative AI model called ‘Ajax’. Ajax operates on 200 billion parameters, which indicates a high level of complexity and a strong ability to understand and generate language.
Known internally as “Apple GPT,” Ajax aims to unify machine learning development across the company. This points to a broader strategy of integrating AI more deeply into Apple's ecosystem.
Rumors also suggest that Apple may include some sort of generative AI feature in iOS 18, which will be available on iPhone and iPad around the end of 2024. In October, analyst Jeff Pu said that Apple built several hundred AI servers in 2023 and is expected to add more in 2024. Apple is likely to offer a combination of cloud-based AI and on-device processing.