• 5 Posts
  • 121 Comments
Joined 1 year ago
Cake day: July 1st, 2023

  • This is the mid-21st century, ma’am. Even the upper-class crazy eccentric types with taxidermy blackbird goth thrones are enjoying the freedom of sweatpants and t-shirts alongside maintainable haircuts in public.

    People not feeling the need to wear goofy social-posturing clothing like corsets and three-piece suits, or go through rivers of hair oil, is truly a sign of progress. Now we all get to feel half-assed at dressing up compared to our great-grandparents.

    When we're done with lazy casual, I predict a move into the cyberpunk techno phase. You know, really match our clothing with the mid-grade dystopia vibes of our current global situation. Maybe compare social status with cybernetic augmentations and hack-job body mods.

    I bet replacing your legs with turbo tank treads, then getting them serviced yearly, will be the new iPhone. Cybernetic lungs that filter pollutants better than normal ones while extracting more oxygen will be the new Rolex. And of course they'll have a quarterly subscription fee.

    Then, when we get bored of that, the Victorian dress-up might come back. But not before.

  • Hey @brucethemoose, hope you don’t mind if I ding you one more time. Today I loaded up Qwen 14B and 32B. Yes, 32B (Q3_K_S). I didn’t do much testing with 14B, but it spoke well and fast. I was more excited to play with the 32B once I found out it would run, to be honest. It just barely makes the mark of tolerable speed, just under 2 T/s (really more like 1.7 with some context loaded in). I really do mean barely; the people who think 5 T/s is slow would eat their hearts out.

    That reasoning and coherence, though? Off the charts. I like the way it speaks more than Mistral Small too. So wow, just wow, is all I can say. Can’t believe all the good models that have come out and the leaps made in the past two months. Thank you again for recommending Qwen; I don’t think I would have tried the 32B without your input.
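    (Not part of the original comment: a rough sketch of what loading a quant like that and timing it can look like. This uses llama-cpp-python rather than whatever front end the commenter ran, and the file name, layer count, and context size below are placeholders, not values taken from the comment.)

    ```python
    # Rough sketch: load a Q3_K_S GGUF with llama-cpp-python and measure tokens/sec.
    # The model path, layer count, and context size are placeholders.
    import time
    from llama_cpp import Llama

    llm = Llama(
        model_path="Qwen2.5-32B-Instruct-Q3_K_S.gguf",  # placeholder file name
        n_gpu_layers=30,   # offload as many layers as VRAM allows
        n_ctx=4096,        # context window; larger costs more VRAM
    )

    start = time.time()
    out = llm("Write a short paragraph about quantized language models.", max_tokens=256)
    elapsed = time.time() - start

    generated = out["usage"]["completion_tokens"]
    print(f"{generated} tokens in {elapsed:.1f}s -> {generated / elapsed:.2f} T/s")
    ```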

  • Thanks for the recommendation. Today I tried out Mistral Small IQ4_XS, in combination with running kobold through a headless terminal environment to squeeze out that last bit of VRAM. With that, I was able to bump the offloaded GPU layers from 28 to 34. The token speed went up from 2.7 t/s to 3.7 t/s, which is roughly a 37% speed increase (a sketch of that kind of launch is below). I imagine going to Q3 would get things even faster or allow for a bump in context size.

    I appreciate you recommending Qwen too; I'll look into it.
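    (Again not part of the original comment: a rough sketch of the kind of headless launch described above, bumping the offloaded layers once the desktop’s VRAM is freed. The flag names are from memory of koboldcpp’s CLI and the file name is a placeholder, so check `python koboldcpp.py --help` before relying on them.)

    ```python
    # Rough sketch of a headless koboldcpp launch with more GPU layers offloaded.
    # Flag names are from memory of koboldcpp's CLI; the model file name is a placeholder.
    import subprocess

    subprocess.run([
        "python", "koboldcpp.py",
        "--model", "Mistral-Small-Instruct-IQ4_XS.gguf",  # placeholder file name
        "--gpulayers", "34",        # was 28 with a desktop environment using VRAM
        "--contextsize", "4096",
        "--usecublas",              # GPU offload via CUDA/cuBLAS
    ])

    # Observed effect from the comment above: 2.7 t/s at 28 layers vs 3.7 t/s at
    # 34 layers, i.e. roughly a 37% increase.
    ```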