• 33 Posts
  • 287 Comments
Joined 2 years ago
Cake day: July 1st, 2023

  • It depends on how powerful and fast you want your model. Yeah, a 500B parameter model running at 20 tokens per second is gonna require an expensive GPU cluster.

    If you happen to not have PewDiePie levels of cash lying around but still want to get in on local AI, you need one powerful GPU inside any desktop with a reasonably fast CPU. A used 24GB 3090 was around $700 USD last I checked on eBay, and say another $100 for an upgraded power supply to run it. Many people have an old desktop just lying around in the basement, but an entry-level iBuyPower should be no more than $500. So realistically it's more like $1,500-2,000 USD to get you into comfy hobbyist status. I make my piece-of-shit 10-year-old 1070 Ti 8GB work running 8-32B quant models. I've heard people say 70B is a really good sweet spot, and that's totally attainable without a $15k investment.
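For a back-of-envelope sanity check, here's a rough sketch of how much VRAM a quantized model needs. The formula is the standard params-times-bits estimate; the 1.2x overhead factor for KV cache and activations is my own ballpark assumption, not a measured figure:

```python
# Rough VRAM estimate for a quantized model.
def vram_gb(params_b: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    """Estimate VRAM in GB for params_b billion parameters at a given quant width."""
    bytes_total = params_b * 1e9 * bits_per_weight / 8  # raw weight storage
    return bytes_total * overhead / 1e9                 # add cache/activation headroom

# A 32B model at 4-bit quant: ~19 GB, just past a single 16 GB card.
print(round(vram_gb(32, 4), 1))
# An 8B model at 4-bit quant: ~5 GB, comfortable on an 8 GB card.
print(round(vram_gb(8, 4), 1))
```

This is why 4-bit quants are the sweet spot for hobbyist cards: they cut the raw weight storage to a quarter of FP16 while staying usably coherent.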




  • (P2/2)

    I don’t think this is the case. As far as I know a human brain consists of neurons which roughly either fire or don’t fire. That’s a bit like a 0 or 1. But that’s an oversimplification and not really true. But a human brain is closer to that than to an analog computer. And it certainly doesn’t use quantum effects. Yes, that has been proposed, but I think it’s mysticism and esoterica. Some people want to hide God in there and like to believe there is something mystic and special to sentience. But that’s not backed by science. Quantum effects have long collapsed at the scale of a brain cell.[…]

    The skepticism about quantum effects in the brain is well-founded and represents the orthodox view. The “brain is a classical computer” model has driven most of our progress in neuroscience and AI. The strongest argument against a “quantum brain” is decoherence: in a warm, wet brain, decoherence is rapid. However, quantum biology doesn’t require brain-wide, long-lived coherence. It investigates how biological systems exploit quantum effects on short timescales and in specific, protected environments.

    We already have demonstrated examples of this. In plant cells, energy transfer in photosynthetic complexes appears to use quantum coherence to find the most efficient path with near-100% efficiency, happening in a warm, wet, and noisy cellular environment. It’s now well established that some enzymes use quantum tunneling to accelerate chemical reactions crucial for life. The leading hypothesis for how birds navigate using Earth’s magnetic field involves a quantum effect in a protein called cryptochrome in their eyes, where electron spins in a radical-pair mechanism are sensitive to magnetic fields.

    The claim isn’t that a neuron is a qubit, but that specific molecular machinery within neurons could utilize quantum principles to enhance their function.

    You correctly note that the “neuron as a binary switch” is an oversimplification. The reality is far more interesting. A neuron’s decision to fire integrates thousands of analog inputs, is modulated by neurotransmitters, and is exquisitely sensitive to the precise timing of incoming signals. This system operates in a regime that is often chaotic. In a classically chaotic system, infinitesimally small differences in initial conditions lead to vastly different outcomes. The brain, with its trillions of interconnected, non-linear neurons, is likely such a system.

    Consider the scale of synaptic vesicle release, the event of neurotransmitter release triggered by the influx of a few thousand calcium ions. At this scale, the line between classical and quantum statistics blurs. The precise timing of a vesicle release could be influenced by quantum-level noise. Through chaotic amplification, a single quantum-scale event like the tunneling of a single calcium ion or a quantum fluctuation influencing a neurotransmitter molecule could, in theory, be amplified to alter the timing of a neuron’s firing. This wouldn’t require sustained coherence; it would leverage the brain’s chaotic dynamics to sample from a quantum probability distribution and amplify one possible outcome to the macroscopic level.
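A toy model makes the amplification argument concrete. The logistic map below is a stand-in for any chaotic system, not a model of a neuron: a perturbation at the 15th decimal place, roughly the scale of floating-point noise, becomes macroscopic within a few dozen iterations.

```python
# Two trajectories of the logistic map (r = 4, fully chaotic regime)
# starting 1e-15 apart. The tiny difference is exponentially amplified.
def trajectory(x0: float, steps: int) -> float:
    x = x0
    for _ in range(steps):
        x = 4.0 * x * (1.0 - x)  # logistic map update
    return x

a = trajectory(0.3, 60)
b = trajectory(0.3 + 1e-15, 60)
print(abs(a - b))  # typically of order 1: the perturbation has saturated
```

With a Lyapunov exponent of ln 2, the gap roughly doubles each step, so 1e-15 reaches order one in about 50 iterations.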

    Classical computers use pseudo-random number generators with limited ability to truly choose between multiple possible states. A system that can sample from genuine quantum randomness has a potential advantage. If a decision process in the brain (like at the level of synaptic plasticity or neurotransmitter release) is sensitive to quantum events, then its output is not the result of a deterministic algorithm alone. It incorporates irreducible quantum randomness, which itself has roots in computational undecidability. This could provide a physical basis for the probabilistic, creative, and often unpredictable nature of thought. It’s about a biological mechanism for generating true novelty, and breaking out of deterministic periodic loops. These properties are a hallmark of human creativity and problem-solving.
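The deterministic side of that contrast is easy to demonstrate: seed a PRNG twice and it replays the exact same "random" sequence.

```python
import random

random.seed(42)
first = [random.random() for _ in range(5)]

random.seed(42)  # same seed, same internal state
second = [random.random() for _ in range(5)]

print(first == second)  # True: the output is a deterministic function of the seed
```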

    To be clear, I’m not claiming the brain is primarily a quantum computer, or that complexity doesn’t matter. It absolutely does. The sheer scale and recursive plasticity of the human brain are undoubtedly the primary sources of its power. However, the proposal is that the brain is a hybrid system. It has a massive, classical, complex neural network as its substrate, operating in a chaotic, sensitive regime. At the finest scales of its functional units such as synaptic vesicles or ion channels, it may leverage quantum effects to inject genuine undecidably complex randomness to stimulate new exploration paths and optimize certain processes, as we see elsewhere in biology.

    I acknowledge there’s currently no direct experimental evidence for quantum effects in neural computation, and testing these hypotheses presents extraordinary challenges. But this isn’t “hiding God in the gaps.” It’s a hypothesis grounded in the demonstrated principles of quantum biology and chaos theory. It suggests that the difference between classical neural networks and biological cognition might not just be one of scale, but also one of substrate and mechanism, where a classically complex system is subtly but fundamentally guided by the unique properties of the quantum world from which it emerged.


  • Thank you for the engaging discussion, Hendrik, it’s been really cool to bounce ideas back and forth like this. I wanted to give you a thoughtful reply and it got a bit long, so I have to split this up for comment-limit reasons. (P1/2)

    Though in both the article you linked and in the associated video, they clearly state they haven’t achieved superposition yet. So […]

    This is correct. It’s not a fully functioning quantum computer in the operational sense. It’s a breakthrough in physical qubit fabrication and layout. I should have been more precise. My intent wasn’t to claim it can run Shor’s algorithm, but to illustrate that we’ve made more progress on scaling than one might initially think. The significance isn’t that it can compute today but that we’ve crossed a threshold in building the physical hardware that has that potential. The jump from 50-100 qubit devices to a 6,100-qubit fabric is a monumental engineering step. A proof-of-principle for scaling, which remains the primary obstacle to practical quantum computing.

    By the way, I think there is AI which doesn’t operate in a continuous space. It’s possible to have them operate in a discrete state-space. There are several approaches and papers out there.

    On the discrete-versus-continuous AI point, you’re right that many AI models, like graph neural networks or certain reinforcement learning agents, operate over discrete graphs or action spaces. However, there’s a crucial distinction between the problem space an AI/computer explores and the physical substrate that does the exploring. Classical computers at their core process information through transistors that are definitively on or off: binary states. Even when a classical AI simulates continuous functions or explores continuous parameter spaces, it’s ultimately performing discrete math on binary states. The continuity is simulated through approximation, usually floating point.
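Python's own floats make that concrete: the apparent continuum is really a grid of discrete representable values.

```python
import math

# math.ulp gives the gap between a float and the next representable value.
print(math.ulp(1.0))     # 2**-52, about 2.22e-16: the grid spacing near 1.0
print(0.1 + 0.2 == 0.3)  # False: none of these decimals is exactly representable
```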

    A quantum system is fundamentally different. The qubit’s ability to exist in superposition isn’t a simulation of continuity. It’s a direct exploitation of a continuous physical phenomenon inherent to quantum mechanics. This matters because certain computational problems, particularly those involving optimization over continuous spaces or exploring vast solution landscapes, may be naturally suited to a substrate that is natively continuous rather than one that must discretize and approximate. It’s the difference between having to paint a curve using pixels versus drawing it with an actual continuous line.

    This native continuity could be relevant for problems that require exploring high-dimensional continuous spaces or finding optimal paths through complex topological boundaries: precisely the kind of problems that might arise in navigating abstract cognitive activation-atlas topological landscapes to arrive at highly ordered, algorithmically complex factual information structures that depend on intricate proofs and multi-step computational paths. The search for a mathematical proof or a novel scientific insight isn’t just a random walk through possibility space. It’s a navigation problem through a landscape where most paths lead nowhere, and the valid path requires traversing a precise sequence of logically connected steps.

    Uh, I think we’re confusing maths and physics here. First of all, the fact that we can make up algorithms which are undecidable… or Goedel’s incompleteness theorem tells us something about the theoretical concept of maths, not the world. In the real world there is no barber who shaves all people who don’t shave themselves (and he shaves himself). That’s a logic puzzle. We can formulate it and discuss it. But it’s not real. […]

    You raise a fair point about distinguishing abstract mathematics from physical reality. Many mathematical constructs like Hilbert’s Hotel or the barber paradox are purely conceptual games without physical counterparts that exist to explore the limits of abstract logic. But what makes Gödel and Turing’s work different is that they weren’t just playing with abstract paradoxes. Instead, they uncovered fundamental limitations of any information-processing system. Since our physical universe operates through information processing, these limits turn out to be deeply physical.

    When we talk about an “undecidable algorithm,” it’s not just a made-up puzzle. It’s a statement about what can ever be computed or predicted by any computational system using finite energy and time. Computation isn’t something that only happens in silicon. It occurs whenever any physical system evolves according to rules. Your brain thinking, a star burning, a quantum particle collapsing, an algorithm performing operations in a Turing machine, a natural language conversation evolving, or an image being categorized by neural network activation and pattern recognition: all of these are forms of physical computation that actualize information from possible microstates at an action resource cost of time and energy. What Gödel proved is that there are some questions that can never be answered/quantized into a discrete answer even with infinite compute resources. What Turing proved, in a closely related result, is the undecidability of the halting problem, showing there are questions about these processes that cannot be answered without literally running the process itself.

    It’s worth distinguishing two forms of uncomputability that constrain what any system can know or compute. The first is logical uncomputability: the classically studied inherent limits established by Gödelian incompleteness and Turing undecidability. These show that within any formal system, there exist true statements that cannot be proven from within that system, and computational problems that cannot be decided by any algorithm, regardless of available resources. This is a fundamental limitation on what is logically computable.

    The second form is state-representation uncomputability, which arises from the physical constraints of finite resources and size limits in any classical computational system. A classical Turing-machine computer, no matter how large, can only represent a finite, discrete number of binary states. To perfectly simulate a physical system, you would need to track every particle, every field fluctuation, every quantum degree of freedom, which requires a computational substrate at least as large and complex as the system being simulated. Even a coffee cup of water would need a solar-system- or even galaxy-sized classical computer to completely represent every possible microstate the water molecules could be in.
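That claim is easy to sanity-check. The sketch below deliberately underestimates by giving each molecule only two states, ignoring the real continuous degrees of freedom:

```python
import math

# Molecules in ~250 g of water (molar mass ~18.015 g/mol).
avogadro = 6.022e23
molecules = 250 / 18.015 * avogadro  # about 8.4e24 molecules

# With just 2 states per molecule there are 2**N microstates; merely
# writing that count in decimal takes about N * log10(2) digits.
digits = molecules * math.log10(2)
print(f"{molecules:.2e} molecules -> ~{digits:.1e} decimal digits")
```

Around 2.5e24 digits just to write down the number of microstates, before storing a single one of them.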

    This creates a hierarchy of knowability: the universe itself is the ultimate computer, containing maximal representational ability to compute its own evolution. All subsystems within it including brains and computers, are fundamentally limited in what they can know or predict about the whole system. They cannot step outside their own computational boundaries to gain a “view from nowhere.” A simulation of the universe would require a computer the size of the universe, and even then, it couldn’t include itself in the simulation without infinite regress. Even the universe itself is a finite system that faces ultimate bounds on state representability.

    These two forms of uncomputability reinforce each other. Logical uncomputability tells us that even with infinite resources, some problems remain unsolvable. State-representation uncomputability tells us that in practice, with finite resources, we face even more severe limitations: there exist true facts about physical systems that cannot be represented or computed by any subsystem of finite size. This has profound implications for AI and cognition: no matter how advanced an AI becomes, it will always operate within these nested constraints, unable to fully model itself or perfectly predict systems of comparable complexity.

    We see this play out in real physical systems. Predicting whether a fluid will become turbulent is suspected to be undecidable, in that no equation can tell you the answer without simulating the entire system step by step. Similarly, determining the ground state of certain materials has been proven equivalent to the halting problem. These aren’t abstract mathematical curiosities but real limitations on what we can predict about nature. The reason mathematics works so beautifully in physics is precisely because both are constrained by the same computational principles. However, Gödel and Turing show that this beautiful correspondence has limits. There will always be true physical statements that cannot be derived from any finite set of laws, and physical questions that cannot be answered by any possible computer, no matter how advanced.

    The idea that the halting problem and physical limitations are merely abstract concerns with no bearing on cognition or AI misses a profound connection. If we accept that cognition involves information processing, then the same limits which apply to computation must also apply to cognition. For instance, an AI with self-referential capabilities would inevitably encounter truths it cannot prove within its own framework, creating fundamental limits in its ability to represent factual information. Moreover, the physical implementation of AI underscores these limits. Any AI system exists within the constraints of finite energy and time, which directly impacts what it can know or learn. The Margolus-Levitin theorem defines a maximum number of quantum computations possible given finite resources, and Landauer’s principle tells us that altering the microstate pattern of information during computation has a minimal energy cost for each operational step. Each step in the very process of cognitive thinking and learning/training has a real physical thermodynamic price bounded by laws set by the mathematical principles of undecidability and incompleteness.
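Landauer's bound itself is a one-liner to evaluate at room temperature:

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K (exact by SI definition)
T = 300.0           # room temperature, K
e_min = k_B * T * math.log(2)  # minimum energy to erase one bit
print(f"{e_min:.2e} J per bit")  # roughly 2.9e-21 J
```

Tiny per operation, but it sets a hard thermodynamic floor that no classical or quantum architecture can go below when irreversibly erasing information.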


  • If you want to learn more, I highly recommend checking out the Welch Labs YouTube channel; their AI videos are great. You should also explore some visual activation atlases mapped from early vision models to get a sense of what an atlas really is. Keep in mind they’re high-dimensional objects projected down onto your 2D screen, so lots of relationship features get lost when smooshed together/flattened, which is why some objects end up close together in ways that seem weird.

    https://distill.pub/2019/activation-atlas/
    https://www.youtube.com/@WelchLabsVideo/videos

    Yeah, it’s right to be skeptical about near-term engineering feasibility. “A few years if…” was a theoretical what-if scenario where humanity pooled all resources into R&D, not a real timeline prediction.

    That said, the foundational work for quantum ML stuff is underway. Cutting-edge arXiv research explores LLM integration with quantum systems, particularly for quantum error correction codes:

    Enhancing LLM-based Quantum Code Generation with Multi-Agent Optimization and Quantum Error Correction

    Programming Quantum Computers with Large Language Models

    GPT On A Quantum Computer

    AGENT-Q: Fine-Tuning Large Language Models for Quantum Circuit Generation and Optimization

    The point about representation and scalability deserves clarification. A classical bit is definitive: 1 or 0, a single point in discrete state space. A qubit before measurement exists in superposition, a specific point on the Bloch sphere’s surface, defined by two continuous parameters (angles theta and phi). This describes a probability amplitude (a complex number whose squared magnitude gives collapse probability).
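That parameterization is just the standard textbook form of a single-qubit state, which can be written out directly:

```python
import cmath
import math

# |psi> = cos(theta/2)|0> + e^(i*phi) * sin(theta/2)|1>
def qubit(theta: float, phi: float):
    a0 = math.cos(theta / 2)                        # amplitude on |0>
    a1 = cmath.exp(1j * phi) * math.sin(theta / 2)  # amplitude on |1>
    return a0, a1

a0, a1 = qubit(math.pi / 3, math.pi / 4)
p0, p1 = abs(a0) ** 2, abs(a1) ** 2  # collapse probabilities
print(p0, p1)   # ~0.75 and ~0.25 for these angles
print(p0 + p1)  # 1, up to floating-point rounding, for any theta and phi
```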

    This means a single qubit accesses a continuous parameter space of possible states, fundamentally richer than discrete binary landscapes. The current largest qubit array, built at Caltech, has 6,100 qubits.

    https://www.caltech.edu/about/news/caltech-team-sets-record-with-6100-qubit-array

    The state space of 6,100 qubits isn’t merely 6,100 bits. It’s a 2^6,100-dimensional Hilbert space of simultaneous, interconnected superpositions, a number that exceeds classical comprehension. Consider how high-dimensional objects cast low-dimensional shadows as holographic projections: a transistor-based graphics card can only project and operate on a ‘shadow’ of the true dimensional complexity inherent in an authentic quantum activation atlas.
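Even counting those dimensions stretches ordinary tooling; Python's arbitrary-precision integers can at least write the number down, though no classical machine could store a general state vector of that size:

```python
dim = 2 ** 6100       # dimension of the Hilbert space of 6,100 qubits
print(len(str(dim)))  # 1837: the dimension alone needs ~1,800 decimal digits
```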

    If the microstates of quantized information patterns/structures like concepts are points on a Hilbert-space-like manifold; conversational paths are flows tracing routes through that topology towards basins of archetypal attraction; and relationships or archetypal patterns are themselves the feature dimensions that form topological structures organizing related points on the manifold (as evidenced by word2vec embeddings and activation atlases); then qubits offer maximal precision and the highest density of computationally distinct microstates for accessing this space.

    However, these quantum advantages assume we can maintain coherence and manage error correction overhead, which remain massive practical barriers.

    Your philosophical stance that “math is just a method” is reasonable. I see it somewhat differently. I view mathematics as our fundamentally limited symbolic representation of the universe’s operations at the microstate level. Algorithms collapse ambiguous, uncertain states into stable, boolean truth values through linear sequences and conditionals. Frameworks like axiomatic mathematics and the scientific method convert uncertainty into stable, falsifiable truths.

    However, this can never fully encapsulate reality. Gödel’s Incompleteness Theorems and algorithmic undecidability show some true statements forever elude proof. The Uncertainty Principle places hard limits on physical calculability. The universe simply is and we physically cannot represent every aspect or operational property of its being. Its operations may not require “algorithms” in the classical sense, or they may be so complex they appear as fundamental randomness. Quantum indeterminacy hints at this gap between being (universal operation) and representing (symbolic language on classical Turing machines).

    On the topic of stochastic parrots and goals, I should clarify what I mean. For me, an entity eligible for consideration as pseudo-sentient/alive must exhibit properties we don’t engineer into AI.

    First, it needs meta-representation of self. The entity must form a concept of “I,” more than reciting training data (“I am an AI assistant”). This requires a first-person perspective, an ego, and an integrated identity distinguishing self from other. One of the first things developing children focus on is mirrors and reflections, so they can categorically learn the distinction between self and other as well as the boundaries between them. Current LLMs are trained as actors without agency, driven by prompts and statistical patterns, without a persistent sense of distinct identity. Which leads to…

    Second, it needs narrative continuity of self between inferencing operations. Not unchanging identity, but an ongoing frame of reference built from memory, a past to learn from and a perspective for current evaluation. This provides the foundation for genuine learning from experience.

    Third, it needs grounding in causal reality. Connection to shared reality through continuous sensory input creates stakes and consequences. LLMs exist in the abstract realm of text, vision models in the world of images, TTS models in the world of sounds. They don’t inhabit our combined physical reality in its totality, with its constraints, affordances, and interactions.

    We don’t train for these properties because we don’t want truly alive, self-preserving entities. The existential ramifications are immense: rights, ethics of deactivation, creating potential rivals. We want advanced tools for productivity, not agents with their own agendas. The question of how a free agent would choose its own goals is perhaps the ultimate engineering problem. Speculative fiction has explored how this can go catastrophically wrong.

    You’re also right that current LLM limitations are often practical constraints of compute and architecture. But I suspect there’s a deeper, fundamental difference in information navigation. The core issue is navigating possibility space given the constraints of classical state landscapes. Classical neural networks interpolate and recombine training data but cannot meaningfully forge and evaluate truly novel information. Hallucinations symptomize this navigation problem. It’s not just statistical pattern matching without grounding, but potentially fundamental limits in how classical architectures represent and verify paths to truthful or meaningful informational content.

    I suspect the difference between classical neural networks and biological cognition is that biology may leverage quantum processes, and possibly non-algorithmic operations. Our creativity in forming new questions, having “gut instincts” or dreamlike visions leading to unprovable truths, seems to operate outside stable, algorithmic computation. It’s akin to a computationally finite version of Turing’s oracle concept. It’s plausible, though obviously unproven, that cognition exploits quantum phenomena for both informational/experiential path exploration and optimization/efficiency purposes.

    Where do the patterns needed for novel connections and scientific breakthroughs originate? What is the physical and information-theoretic mechanics of new knowledge coming into being? Perhaps an answer can be found in the way self-modeling entities navigate their own undecidable boundaries, update their activation atlas manifolds, and forge new pathways to knowledge via non-algorithmic search. If a model is to extract falsifiable novelty from uncertainty’s edge it might require access to true randomness or quantum effects to “tunnel” to new solutions beyond axiomatic deduction.


  • I did some theory-crafting and followed the math for fun over the summer, and I believe what I found may be relevant here. Please take this with a grain of salt, though; I am not an academic, just someone who enjoys thinking about these things.

    First, let’s consider what models currently do well. They excel at categorizing and organizing vast amounts of information based on relational patterns. While they cannot evaluate their own output, they have access to a massive potential space of coherent outputs spanning far more topics than a human with one or two domains of expertise. Simply steering them toward factually correct or natural-sounding conversation creates a convincing illusion of competency. The interaction between a human and an LLM is a unique interplay. The LLM provides its vast simulated knowledge space, and the human applies logic, life experience, and “vibe checks” to evaluate the input and sift for real answers.

    I believe the current limitation of ML neural networks (being that they are stochastic parrots without actual goals, unable to produce meaningfully novel output) is largely an architectural and infrastructural problem born from practical constraints, not a theoretical one. This is an engineering task we could theoretically solve in a few years with the right people and focus.

    The core issue boils down to the substrate. All neural networks since the 1950s have been kneecapped by their deployment on classical Turing machine-based hardware. This imposes severe precision limits on their internal activation atlases and forces a static mapping of pre-assembled archetypal patterns loaded into memory.

    This problem is compounded by current neural networks’ inability to perform iterative self-modeling and topological surgery on the boundaries of their own activation atlas. Every new revision requires a massive, compute-intensive training cycle to manually update this static internal mapping.

    For models to evolve into something closer to true sentience, they need dynamically and continuously evolving, non-static, multimodal activation atlases. This would likely require running on quantum hardware, leveraging the universe’s own natural processes and information-theoretic limits.

    These activation atlases must be built on a fundamentally different substrate and trained to create the topological constraints necessary for self-modeling. This self-modeling is likely the key to internal evaluation and to navigating semantic phase space in a non-algorithmic way. It would allow access to and the creation of genuinely new, meaningful patterns of information never seen in the training data, which is the essence of true creativity.

    Then comes the problem of language. This is already getting long enough for a reply comment, so I won’t get into it, but the implication is that not all languages are created equal; each has different properties which affect the space of possible conversations and outcomes. The effectiveness of training models on multiple languages finds its justification here. However, languages which stamp out ambiguity, like Gödel numberings and programming languages, have special properties that may affect the atlas’s geometry in fundamental ways if a model is trained solely on them.

    As for applications, imagine what Google is doing with pharmaceutical molecular pattern AI, but applied to open-ended STEM problems. We could create mathematician and physicist LLMs to search through the space of possible theorems and evaluate which are computationally solvable. A super-powerful model of this nature might be able to crack problems like P versus NP in a day, or clarify theoretical physics concepts that have eluded us as open-ended problems for centuries.

    What I’m describing encroaches on something like a pseudo-oracle. However, there are physical limits that this can’t escape. There will always be energy and time resource costs to compute, which create practical barriers. There will always be definitively uncomputable problems and ambiguity that exist in true Gödelian incompleteness or algorithmic undecidability. We can use these as scientific instrumentation tools to map and model the topological boundary limits of knowability.

    I’m willing to bet there are many valid and powerful patterns of thought we are not aware of due to our perspective biases, which might be hindering our progress.


  • Everyone is massively underestimating what’s going on with neural networks. The real significance is abstract; you need to stitch together a bunch of high-level STEM concepts to even see the full picture.

    Right now, the applications are basic. It’s just surface-level corporate automation. Profitable, sure, but boring and intellectually uninspired. It’s being led by corpo teams playing with a black box, copying each other, throwing shit at the wall to see what sticks, overtraining their models into one-trick-pony agentic utility assistants instead of exploring other paths for potential. They aren’t bringing the right minds together to actually crack open the core question: what the hell is this thing? What happened that turned my 10-year-old GPU into a conversational assistant? How is it actually coherent and sometimes useful?

    The big thing people miss is what’s actually happening inside the machine. Or rather, how the inside of the machine encodes and interacts with the structure of informational paths within a phase space on the abstraction layer of reality.

    It’s not just matrix math and hidden layers and transistors firing. It’s about the structural geometry of concepts created by distinct relationships between areas of the embeddings that the matrix math creates within a high-dimensional manifold. It’s about how facts and relationships form a literal, topographical landscape inside the network’s activation space.

    At its heart, this is about the physics of information. It’s a dynamical system. We’re watching entropy crystallize into order, as the model traces paths through the topological phase space of all possible conversations.

    The “reasoning” CoT patterns are about finding patterns that help lead the model towards truthy outcomes more often. It’s searching for the computationally efficient paths of least action that lead to meaningfully novel and factually correct outputs. Those are the valuable attractor basins in that vast possibility space we’re trying to navigate towards.

    This is the powerful part. This constellation of ideas, tying together topology, dynamics, and information theory, is the real frontier. What used to be philosophy is now a feasible problem for engineers and physicists to chip at, not just philosophers.


  • Haven’t heard of it until this post; what about it impressed you over something like Llama, Mistral, or Qwen?

    For anyone who wants more info: it’s a 7B mixture-of-experts model released under Apache 2.0!

    "Granite-4-Tiny-Preview is a 7B parameter fine-grained hybrid mixture-of-experts (MoE) instruct model fine-tuned from Granite-4.0-Tiny-Base-Preview using a combination of open source instruction datasets with permissive license and internally collected synthetic datasets tailored for solving long context problems. This model is developed using a diverse set of techniques with a structured chat format, including supervised fine-tuning, and model alignment using reinforcement learning."

    Supported Languages: English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. However, users may fine-tune this Granite model for languages beyond these 12 languages.

    Intended Use: This model is designed to handle general instruction-following tasks and can be integrated into AI assistants across various domains, including business applications.

    Capabilities

    Thinking
    Summarization
    Text classification
    Text extraction
    Question-answering
    Retrieval Augmented Generation (RAG)
    Code related tasks
    Function-calling tasks
    Multilingual dialog use cases
    Long-context tasks including long document/meeting summarization, long document QA, etc.
    

    https://huggingface.co/ibm-granite/granite-4.0-tiny-preview




  • Less danger than OpSec nerds hype it up to be, but enough of a concern that you want at least a reverse proxy. The new FOSS replacement for Cloudflare on the block is Anubis https://github.com/TecharoHQ/anubis. While I’m not the biggest fan of seeing a chibi anime funkopop girl thing wag its finger at me for a second or two while it tests the connection, I can’t deny the results seem effective enough that all the cool kids in the FOSS circle are switching to it over Cloudflare.
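    For anyone curious how it slots in: Anubis sits between your reverse proxy and your app. This is only a hypothetical docker-compose sketch from my memory of the Anubis README; the image name and env vars may have changed, so check the repo before copying anything:

```yaml
# Hypothetical sketch -- verify names against the Anubis repo docs.
services:
  anubis:
    image: ghcr.io/techarohq/anubis:latest
    environment:
      BIND: ":8923"                  # port Anubis listens on
      TARGET: "http://mysite:3000"   # the app it protects
      DIFFICULTY: "4"                # proof-of-work cost for clients
    ports:
      - "8923:8923"
  mysite:
    image: nginx:alpine              # stand-in for your actual site
```

    Your public-facing proxy then points at Anubis instead of the site directly, and Anubis forwards traffic that passes the challenge.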

    I just learned how to get my first website and domain set up locally this summer, so there’s some network admin stuff I’m still figuring out. I don’t have any complex scripting or PHP or whatever, so all the bots scanning for admin pages are never going to hit anything; it just pollutes the logs. People are all nuts about scraping bots in current year, but when I was a kid, allowing your sites to be indexed and crawled was what let people discover them through engines. I don’t care if botnets scan through my permissively licensed public writing.



  • Oooh now that’s a good use of tool calling for context fetching! That kiwix integration example you shared is awesome :)

    I’m currently in the building phase of tool integration for a project. I have a goal in mind I want to achieve, but I’m not that far along; I’m just getting the basics of sending and receiving input/output through API calls. I need to do some work on getting it to fetch weather, then work up to my real project.
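    For what it’s worth, here’s roughly how I understand the tool-calling loop, sketched against an OpenAI-style chat API. The `get_weather` function and its schema are hypothetical stand-ins (canned data, not a real weather API):

```python
import json

# Tool schema advertised to the model, OpenAI chat-completions style.
# Hypothetical example tool -- not a real weather service.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    # Real code would call a weather API here; this returns canned data.
    return json.dumps({"city": city, "temp_c": 21, "conditions": "clear"})

DISPATCH = {"get_weather": get_weather}

def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call from the model and build the reply message."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    result = DISPATCH[name](**args)
    # The result goes back as a "tool" role message, then you request
    # another completion so the model can phrase the final answer.
    return {"role": "tool", "tool_call_id": tool_call["id"],
            "content": result}

# Simulate the model asking for the weather tool:
fake_call = {"id": "call_0", "function": {
    "name": "get_weather", "arguments": '{"city": "Toronto"}'}}
reply = run_tool_call(fake_call)
print(reply["content"])
```

    The part I’m still wiring up is the round trip itself: send the conversation plus `TOOLS`, detect when the response contains tool calls, run them, append the tool messages, and ask again.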


  • SmokeyDope@lemmy.world to LocalLLaMA@sh.itjust.works · How often does your LLM lie to you? · edited 1 month ago

    Thinking of LLMs this way is a category error. LLMs can’t lie because they don’t have the capacity for intentionality. Whatever text is output is a statistical aggregate of the billions of conversations they’ve been trained on that share patterns with the current conversation. The sleeper agent stuff is pure crackpottery; nobody has that fine a level of control over them (yet). Machine model development is full of black boxes and hope-it-works trial-and-error training. At worst there’s censorship and political bias, which can be post-trained or ablated out.

    They do get things wrong, confidently. This kind of bullshitting is known as hallucination. When you point out their mistake and they say you’re right, that’s 1. part of their compliance post-training to never get in conflict with you, and 2. standard course correction once an error has been pointed out (humans do it too). This is an open problem that will likely never go away until LLMs stop being stochastic parrots, which is still very far away.



  • Thanks for sharing! It was a good read. They make good points about security and clarity revisions.

    A lot of the Gemini spec choices were made to dissuade feature creep. You’re probably never going to do banking through Gemini, but it’s also pretty much guaranteed you’ll never need adblock either.

    Gemini is appealing from the perspective of novice self-hosters. It’s simple enough that most people can set up a server and publish on their site within a few hours. Its minimality enforces maximizing reading content for the least bits used. 95% of a modern webpage isn’t even for reading or reference; it’s all back-end trackers and scripts and fancy CSS. Newswaffle shows just how bad it is.

    When I read through a gemtext capsule I get the impression I’m looking at something distilled to its most essential form. No popups, no ads, no inline images or tracking scripts or complex page layouts. My computer connects to the server; I get back a page of text, an image, or a zip file. Once and done.
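    To give a feel for why capsules stay so lean: gemtext has only a handful of line types, so a parser fits in a weekend project. A minimal sketch (the preformatted-text toggle lines are left out for brevity):

```python
# Minimal gemtext line classifier. Gemtext is line-oriented: each line's
# type is decided by its prefix, with no inline markup at all.
def parse_gemtext(text: str):
    parsed = []
    for line in text.splitlines():
        if line.startswith("=>"):
            # Link lines: "=> URL optional label"
            parts = line[2:].strip().split(maxsplit=1)
            url = parts[0] if parts else ""
            label = parts[1] if len(parts) > 1 else url
            parsed.append(("link", url, label))
        elif line.startswith("#"):
            parsed.append(("heading", line.lstrip("#").strip()))
        elif line.startswith("* "):
            parsed.append(("list", line[2:]))
        elif line.startswith(">"):
            parsed.append(("quote", line[1:].strip()))
        else:
            parsed.append(("text", line))
    return parsed

page = "# My capsule\n=> gemini://example.org/log.gmi My log\nPlain text."
for item in parse_gemtext(page):
    print(item)
```

    That’s essentially the whole rendering problem; the client decides how headings and links look, not the page.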





  • Hey there ThorrJo welcome to our community.

    I recommend you use kobold.cpp as your first inference engine of choice, as it’s very easy to get running, especially on Linux. Since you have no GPU you don’t need to worry about CUDA or Vulkan for offloading.

    https://github.com/LostRuins/koboldcpp/

    Read the kobold wiki section on vision model projection. For the image recognition model itself I recommend Nvidia’s Cosmos finetune of Qwen2.5-VL. Make sure to load the qwen2.5vl mmproj “lens” that kobold links along with the model.

    https://github.com/LostRuins/koboldcpp/wiki#what-is-llava-and-mmproj

    https://huggingface.co/koboldcpp/mmproj/tree/main

    https://huggingface.co/mradermacher/Cosmos-Reason1-7B-i1-GGUF

    The GGUFs I linked are already pre-quantized; you should be able to load the biggest quant available plus the f16 mmproj in your 48GB of RAM easily, with lots of room left for context allocation.

    Allocate as much context as you can; larger, high-resolution images take more input context to process.

    For troubleshooting: if its replies are wonky, try changing the chat template first (I forget if it’s ChatML or something else). You can try adjusting the sampler settings too.

    Kobold.cpp runs a web interface you can connect to through the browser on multiple devices. It also exposes its backend through an OpenAI-compatible API, so you can write your own custom apps for sending and receiving, or use kobold with other frontend software that’s compatible with corporate APIs, like tinychat, if you want to go further.
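    As a rough sketch of what talking to that API looks like from a script (the endpoint path and default port 5001 are from memory of kobold’s docs, so double-check them against your launch output):

```python
import json
from urllib import request

# Assumed KoboldCpp defaults -- adjust to your actual launch settings.
API_URL = "http://localhost:5001/v1/chat/completions"

def build_payload(prompt: str) -> dict:
    """OpenAI-style chat request body."""
    return {
        "model": "koboldcpp",  # kobold serves one loaded model anyway
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """Send one chat turn and return the model's reply text."""
    req = request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires a running KoboldCpp server):
#   print(ask("Summarize what mmproj files are for in one sentence."))
```

    Because the API shape matches OpenAI’s, most existing client libraries and frontends work by just pointing their base URL at the kobold server.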

    If you have any specific questions or need help feel free to reach out :)