A lemmy nomad. Wish there was a way to migrate posts and comments from .world to .ml to here… 😪

  • 0 Posts
  • 12 Comments
Joined 2 months ago
Cake day: March 14th, 2025

  • will@lemm.ee to LocalLLaMA@sh.itjust.works · Specialize LLM
    1 point · 8 days ago

    Making your own embeddings is for RAG. Most base model providers have standardized on OpenAI's embedding API, but there are many ways to do it. Typically you embed a chunk of a few tokens' worth of data at a time and store that in your vector database. This lets your AI later do some vector math (usually a cosine similarity search) to see how similar (related) the embeddings are to each other and to whatever you asked about. There are fine-tuning schemes where you make embeddings before the tuning as well, but most people today use whatever fine-tuning service their base model provider offers, which usually adds a few layers of abstraction.
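    A minimal sketch of that chunk → embed → cosine-similarity loop, using a toy character-frequency "embedding" in place of a real embedding model and a plain list in place of a real vector database (both are placeholders, not any provider's actual API):

    ```python
    import numpy as np

    # Toy stand-in for a real embedding model: a character-frequency vector.
    # In practice you'd call your provider's embedding endpoint here and get
    # back a vector with hundreds or thousands of dimensions.
    def embed_text(text: str) -> np.ndarray:
        vec = np.zeros(26)
        for ch in text.lower():
            if "a" <= ch <= "z":
                vec[ord(ch) - ord("a")] += 1
        return vec

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        # How aligned two vectors are: 1.0 = same direction, ~0.0 = unrelated.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    # The "vector database" is just a list of (chunk, vector) pairs here.
    chunks = [
        "The dragon hoards gold beneath the mountain.",
        "The elves forged rings of power in the second age.",
    ]
    index = [(c, embed_text(c)) for c in chunks]

    def search(query: str, top_k: int = 1):
        q = embed_text(query)
        scored = sorted(((cosine_similarity(q, v), c) for c, v in index), reverse=True)
        return scored[:top_k]

    print(search("Who hoards the gold?"))
    ```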


  • will@lemm.ee to LocalLLaMA@sh.itjust.works · Specialize LLM
    7 points · 9 days ago

    The easiest option for a layperson is retrieval augmented generation, or RAG. Basically you encode your books, upload them into a special kind of database, and then tell a regular base model LLM to check that data when putting together an answer. I know ChatGPT has a built-in UI for this (and maybe Anthropic does too), but you can also build something yourself with LangChain or Open WebUI and the model of your choice.
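    The "check the data when making an answer" step boils down to: retrieve the most relevant chunks, paste them into the prompt, and ask the model to answer only from them. A minimal sketch of that step, assuming an OpenAI-compatible chat endpoint and a retrieve() helper like the similarity search sketched above (the model name and helper are placeholders for whatever stack you actually use):

    ```python
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set; any compatible endpoint works

    def answer_from_books(question: str, retrieve) -> str:
        # retrieve() is assumed to return the top-matching text chunks
        # (see the similarity-search sketch above); we just stuff them
        # into the prompt as context.
        context = "\n\n".join(retrieve(question))
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system",
                 "content": "Answer using only the provided excerpts. "
                            "If they don't cover the question, say so."},
                {"role": "user",
                 "content": f"Excerpts:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return response.choices[0].message.content
    ```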

    The next step up from there is fine-tuning, where you kinda retrain a base model on your books. This is more complex and time-consuming, but it can give more nuanced answers. It's often done in combination with RAG for particularly large bodies of information.
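    With the hosted fine-tuning services, most of the actual work is turning your material into prompt/response training examples. A rough sketch of that preparation, assuming the JSONL chat-example format that OpenAI's fine-tuning service accepts (the Q&A pairs are made-up placeholders; distilling them from your books is the hard part):

    ```python
    import json

    # Hypothetical question/answer pairs distilled from your books.
    examples = [
        ("Who forged the rings of power?", "The elven-smiths of Eregion forged them."),
        ("Where does the dragon keep its hoard?", "Beneath the Lonely Mountain."),
    ]

    # One JSON object per line, each a full chat example the tuning job learns from.
    with open("training_data.jsonl", "w") as f:
        for question, answer in examples:
            record = {
                "messages": [
                    {"role": "system", "content": "You are an expert on these books."},
                    {"role": "user", "content": question},
                    {"role": "assistant", "content": answer},
                ]
            }
            f.write(json.dumps(record) + "\n")
    ```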