I ran a real AI chatbot locally on my iPhone — here’s how it worked
Free and private are just a few of the reasons to try it

AI chatbots like ChatGPT and Gemini usually need the cloud to function. But what if you could run an entire large language model (LLM) right on your iPhone, without a subscription or an internet connection, and without any data leaving your device? Thanks to a handful of apps and lightweight, compressed models, you actually can.
I tried it out, and here’s what you need to know.
Running AI locally on iPhone
You can now run open-source models like Llama and Qwen directly on iOS. These models are slimmed down using a process called quantization, which compresses them to fit into mobile memory without completely breaking performance.
The catch: performance depends heavily on your hardware. An iPhone 15 Pro or 15 Pro Max with Apple’s latest chip can load models up to 7B or 8B parameters (like Llama 3.1 8B), while older phones are better suited for smaller 1–3B parameter models.
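To get a feel for why quantization matters, here's a rough back-of-the-envelope sketch (my own illustration, not taken from any of these apps) of how much storage a model's weights need at different precisions:

```python
# Rough memory footprint of an LLM's weights at different precisions.
# This is back-of-the-envelope math; real apps also need room for the
# KV cache, activations, and the runtime itself.

def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (decimal) for a model."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

for label, bits in [("FP16 (full precision)", 16), ("Q8 (8-bit)", 8), ("Q4 (4-bit)", 4)]:
    print(f"Llama 3.1 8B at {label}: ~{weight_memory_gb(8, bits):.0f} GB")
```

At full FP16 precision, an 8B-parameter model needs roughly 16 GB just for its weights, far more than any iPhone has to spare. Quantized down to 4 bits, the same model shrinks to around 4 GB, which is why it becomes feasible on recent Pro-class hardware.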
The apps that make it possible
- LLM Farm (free): The easiest way to start. You can download a small model (like Phi-3.5 Instruct) and run it offline with just a tap. It’s surprisingly smooth for quick Q&A.
- MLC Chat (free): This is the one I used. I would have gone with LLM Farm, but for some reason the App Store wasn't giving me the option to download it. Since MLC Chat is also free, I went with it instead, and it worked just as well.
- Private LLM (community project): More of a DIY option, and not for the casual user. It has detailed guides for loading models like Llama 3.1 and Qwen on your iPhone. If you like to tinker, definitely give it a shot.
- Apollo (paid): I've heard good things but have not tried this app myself. Let me know in the comments what you think of this privacy-focused app.
How to locally run the model
Once you've downloaded your app of choice, open it. From there, browse the built-in model list and choose one (e.g., Phi-3.5 Instruct Q4 quantized). I chose Qwen 2.5 for no other reason than that I hadn't used it in a while.
Once the download finishes, the model is stored on your device (anywhere from a few hundred MB to several GB, depending on the model). From there, just start chatting.
You'll want to keep expectations realistic; this is not the time to ask for deep dives or long step-by-step plans. Keep the following in mind:
- Speed: Small models (1–3B) respond faster; big models can take seconds per token.
- Context: Don’t paste entire essays; keep prompts shorter.
- Output: Local LLMs may be less polished than ChatGPT, but they’re useful for notes, summaries, Q&A, and lightweight drafting.
I had fun trying a few prompts. Nothing fancy; my goal was just to see the kind of responses a local model would give. One thing you'll notice right away with a small model is the speed. It's impressive how quickly the LLM responds.
I tried the following prompts and overall, I was impressed.
- “Summarize the Declaration of Independence in three bullet points.”
- “Write a short bedtime story about a robot and a cat.”
- “Give me three dinner ideas using chicken, rice, and broccoli.”
Running a local LLM isn't the same as chatting with GPT-5. It feels more stripped-down and raw. If you try this, keep your prompts short, because the context windows are much smaller than in the cloud versions of these chatbots. Responses will slow down if you overload the local LLM.
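A quick way to sanity-check whether a prompt will fit a small on-device context window is the rough rule of thumb that English text averages about four characters per token. This sketch is my own illustration (the ratio and the 2,048-token window are assumptions, not values from any of these apps):

```python
# Rough token estimate (~4 characters per token for English text) to
# sanity-check a prompt against a small on-device context window.
# The 4-chars-per-token ratio is an approximation, not a real tokenizer.

def rough_token_count(text: str) -> int:
    return max(1, len(text) // 4)

def fits_context(prompt: str, context_window: int = 2048) -> bool:
    # Leave half the window free for the model's reply.
    return rough_token_count(prompt) <= context_window // 2

short = "Summarize the Declaration of Independence in three bullet points."
print(fits_context(short))          # a one-line prompt fits comfortably
print(fits_context("word " * 5000))  # a pasted essay likely won't
```

It's crude, but it captures the point: a one-line question sails through, while a pasted essay can eat the whole window before the model has room to answer.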
Why would you do this?
- No subscription fees. You are not burning credits just to experiment.
- Privacy built in. Everything stays on your device.
- Surprisingly versatile. I was blown away by how much the mini model could handle. Every time I pushed the limits, it tackled the challenge easily.
Final thoughts
If you have an iPhone 15 or later and are curious about what AI looks like 'under the hood,' go for it. LLM Farm and MLC Chat are fast, free ways to get started. For privacy hawks, Apollo is worth a look. And if you're more of a tinkerer, Private LLM lets you go deep into custom setups.
Just remember that these are not the full-power chatbots you're used to, so don't expect outputs like ChatGPT. But, it is pretty cool and feels futuristic to run your own AI on your iPhone.
Amanda Caswell is an award-winning journalist, bestselling YA author, and one of today’s leading voices in AI and technology. A celebrated contributor to various news outlets, her sharp insights and relatable storytelling have earned her a loyal readership. Amanda’s work has been recognized with prestigious honors, including outstanding contribution to media.