

I've used RooCode on VSCodium with KoboldCpp. The problem is that most small local models don't have an ingrained ability to use Cline's tools correctly. You should do some looking around for models that specifically advertise Cline tool calling, like this one: https://ollama.com/acidtib/qwen2.5-coder-cline:7b
When connecting VS Code and Roo/Cline, make sure to use the full IP address and port, and you can put in a random string for the API key. Make sure it's connected through the OpenAI-compatible API.
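If you want to sanity-check the endpoint before wiring it into Roo/Cline, a quick script like the sketch below works against any OpenAI-compatible server. The IP, port, and model name are placeholders (I'm assuming something like KoboldCpp's OpenAI-compatible server here); swap in whatever you're actually running.

```python
# Minimal sketch: verify a local OpenAI-compatible endpoint before pointing Roo/Cline at it.
# The address, port, and model name below are placeholders for whatever server/model you run.
from openai import OpenAI

client = OpenAI(
    base_url="http://192.168.1.50:5001/v1",  # full IP and port, not just "localhost", if the server is on another box
    api_key="not-a-real-key",                # local servers generally ignore this, but the field can't be empty
)

response = client.chat.completions.create(
    model="qwen2.5-coder-cline:7b",  # placeholder; use whatever model your server actually has loaded
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)
```

If that prints a reply, Roo/Cline should connect fine with the same base URL and dummy key in its OpenAI-compatible provider settings.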

It depends on how powerful and fast you want your model. Yeah, a 500B-parameter model running at 20 tokens per second is gonna require an expensive GPU cluster.
If you don't happen to have PewDiePie levels of cash lying around but still want to get in on local AI, you need one powerful GPU inside any desktop with a reasonably fast CPU. A used 24GB 3090 was about $700 USD last I checked on eBay, and we'll say another $100 for an upgraded power supply to run it. Many people have an old desktop just lying around in the basement, but even an entry-level iBUYPOWER should be no more than $500. So realistically it's more like $1500-2000 USD to get you into comfy hobbyist status. I make my piece-of-shit 10-year-old 1070 Ti 8GB work running 8B-32B quantized models. I've heard people say 70B is a really good sweet spot, and that's totally attainable without a $15k investment (rough numbers in the sketch below).
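To put very rough numbers on what fits in how much VRAM, here's a back-of-envelope estimate. The bits-per-weight figure and fixed overhead are my own assumptions, and real usage also needs room for context/KV cache, so treat it as a lower bound.

```python
# Rough sketch: approximate VRAM needed for a quantized model's weights.
# bits_per_weight ~4.5 loosely models a Q4-ish quant; overhead_gb is a guess
# for runtime buffers. KV cache for long contexts adds more on top.
def approx_vram_gb(params_billions: float, bits_per_weight: float = 4.5, overhead_gb: float = 1.5) -> float:
    weight_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb + overhead_gb

for size in (8, 14, 32, 70):
    print(f"{size}B at ~4.5 bits/weight: ~{approx_vram_gb(size):.1f} GB")
```

That works out to roughly 6 GB for an 8B model (why it runs on my 1070 Ti), around 19-20 GB for a 32B (why a 24GB 3090 is the sweet spot card), and about 40 GB for a 70B, which is why people run those on two used 3090s or with partial CPU offloading rather than a $15k server.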