Thanks for your comment. That for sure is something to look out for. It is really important to know what you’re running and what possible limitations there could be. Not what the original comment said, though.
This is all very nuanced and there isn’t a clear-cut answer. It really depends on what you’re running, how long you’re running it for, your device specs, etc. The LLMs I mentioned in the post ran just fine and didn’t cause any overheating as long as they weren’t used for extended periods of time. You absolutely can run a SMALL LLM and not fry your processor if you don’t overdo it. Even then, I find it extremely unlikely that you’re going to cause permanent damage to your hardware components.
Of course that is something to be mindful of, but that’s not what the person in the original comment said. It does run, but you need to be aware of the limitations and potential consequences. That goes without saying, though.
Just don’t overdo it. Or do, but the worst thing that will happen is your phone getting hella hot and shutting down.
For me the biggest benefits are:
I am not entirely sure, to be completely honest. In my experience, it is very little, but it varies too. It really depends on how many people connect, for how long they connect, etc. If you have limited upload speeds, maybe it wouldn’t be a great idea to run it in your browser/phone. Maybe try running it directly on your computer using the `-capacity` flag?
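Something like this is what I have in mind for the standalone proxy (the repo URL and build steps are from memory, so double-check them against Snowflake’s documentation; the capacity value is just an example):

```
# Rough sketch: build and run the standalone Snowflake proxy from source.
# Verify the URL and steps against Snowflake's own documentation.
git clone https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake.git
cd snowflake/proxy
go build

# Cap the number of concurrent clients so it doesn't saturate your upload;
# 10 is just an example value, tune it to your connection.
./proxy -capacity 10
```

With a low capacity like that, the proxy should stay well within a limited upload connection.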
I haven’t been able to find any specific numbers either, but I did find a post on the Tor Forum, dated April 2023, from a user complaining about high bandwidth usage. This is not the norm in my experience, though.
There are a few. There’s Private AI. It is free (as in beer) but it’s not libre (or open source). The app is a bit sketchy too, so I would still recommend doing as the tutorial says.
Out of curiosity, why do you not want to use a terminal for that?
I don’t know that one. Is it FOSS?
Thank you for pointing that out. That was worded pretty badly. I corrected it in the post.
For further clarification:
The person connecting to your Snowflake bridge connects to it over a p2p-like connection. So that person does know what your IP address is, and your ISP likewise sees the IP address of the person connecting to your bridge.
However, to both of your ISPs it will look like the two of you are using some kind of video conferencing software, such as Zoom, because Snowflake uses WebRTC. That makes your traffic inconspicuous and obscures from both ISPs what’s actually going on.
For most people, that is not a concern. But, ultimately, it comes down to your threat model. Historically, there haven’t been any cases of people running bridges or entry and middle relays and getting in trouble with law enforcement.
So, will you get in any trouble for running a Snowflake bridge? The answer is quite probably no.
For clarification, you’re not acting as an exit node if you’re running a Snowflake proxy. Please check Tor’s documentation and Snowflake’s documentation.
Not true. If you try to load a model that exceeds your phone’s hardware capabilities, it simply won’t open. Stop spreading FUD.
Though apparently I didn’t need step 6, as it started running after I downloaded it.
Hahahha. It really is a little redundant, now that you mention it. I’ll remove it from the post. Thank you!
Good fun. Got me interested in running local LLM for the first time.
I’m very happy to hear my post motivated you to run an LLM locally for the first time! Did you manage to run any other models? How was your experience? Let us know!
What type of performance increase should I expect when I spin this up on my 3070 ti?
That really depends on the model, to be completely honest. Make sure to check the model requirements. For llama3.2:2b you can expect a significant performance increase, at least.
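If you’re running it through ollama (the model tag format suggests so, but that’s just my assumption), you can quickly check that the model actually landed on the GPU, something like:

```
# Rough sketch, assuming ollama: run the default llama3.2 tag once,
# then check where the loaded model is sitting.
ollama run llama3.2 "hello"
ollama ps   # the PROCESSOR column should read something like "100% GPU"
```

If it shows a CPU percentage instead, the gain from the 3070 Ti will be much smaller.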
Of course! I run several snowflake proxies across my devices and their browsers.
I didn’t use an LLM to make the post. I did, however, use Claude to make it clearer since English is not my first language. I hope that answers your question.
I have tried it on more or less 5 spare phones. None of them have less than 4 GB of RAM, however.
Great explanation, Max!
I would argue there would not be any noticeable differences.
The performance may feel somewhat limited, but that’s because Android devices usually have less processing power than computers. However, for smaller models like the ones I mentioned, you likely won’t notice much of a difference compared to running them on a computer.
That really depends on your threat model. The app doesn’t monitor your activity or have embedded trackers. It pulls content directly from YouTube’s CDN. All they (Google) get is your IP address, nothing else. For 99.9% of people that’s totally OK.
My favorite anime website is down; good thing FMHY has a bunch of great ones to choose from. Migrating sucks, though.
There’s a flatpak too, but it’s not good.
Really? It’s been working just fine for me.
I see. I don’t think there are many solutions on that front for Android. For PC there are a few, such as LM Studio.