• ☆ Yσɠƚԋσʂ ☆@lemmygrad.ml
    ·
    3 months ago

    It is imperative to note that the output generated by LLMs is a direct reflection of the data they are trained on. A model's outputs are unavoidably shaped by the biases inherent in the datasets fed into it. The kinds of responses produced by models trained on western mainstream media are undeniable evidence of these biases. It's hilarious how liberals are unable to recognize this, yet will inevitably moan that a model trained on a different set of data is biased. 🙃

    • loathesome dongeater@lemmygrad.ml
      ·
      3 months ago

      Biases are also coded into LLM services after the model itself has been trained. I am not sure about the exact mechanism, but I once saw a GitHub repo that contained some reverse-engineered system prompts for this.

      Even with GPTs, you can make them less lib if the prompt contains something like "you are a helpful assistant that is aware of the hegemonic biases in western media and narratives". Personalities are also baked in this way. For example, I tried reasoning with a couple of services about how laws and regulations around the financial economy mean diddly squat, seeing how there is stuff like the 2008 crash and evidence of American politicians trading on insider information. GPT 3.5 Turbo talks to me in therapy-speak like I am a crazy person, while Claude 3 Haiku ends up agreeing with me like a spineless yes-man after starting off as a lib. With GPT I am convinced that it is programmed, directly or indirectly, to uphold the status quo.
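
      Something like this is all it takes to nudge the tone (a rough sketch, assuming the official openai Python client; the system prompt wording is just my example, not anything reverse engineered):

      ```python
      # Minimal sketch: steering a chat model's persona with a system prompt.
      # Assumes the official openai client; the prompt text is only an example.
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment

      response = client.chat.completions.create(
          model="gpt-3.5-turbo",
          messages=[
              {
                  "role": "system",
                  "content": (
                      "You are a helpful assistant that is aware of the "
                      "hegemonic biases in western media and narratives."
                  ),
              },
              {
                  "role": "user",
                  "content": "Do financial regulations actually constrain anyone?",
              },
          ],
      )

      print(response.choices[0].message.content)
      ```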

      • ☆ Yσɠƚԋσʂ ☆@lemmygrad.ml
        ·
        3 months ago

        Yeah, these things are not fundamentally different from Markov chains. Basically, the model encodes a huge multidimensional graph of token relationships, and all it's doing is predicting the next likely token. So, when you introduce specific tokens into the input, they steer the prediction in a particular direction.
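
        The analogy in toy form (a plain word-level Markov chain, nowhere near how a transformer actually works, just to show how the starting tokens condition what comes next):

        ```python
        # Toy Markov chain: count which token follows each token in a tiny
        # corpus, then sample a continuation from those counts.
        # Only an analogy for LLM next-token prediction, not a real model.
        import random
        from collections import defaultdict

        corpus = "the market is free the market is rigged the rules are optional".split()

        # transition counts: token -> {next_token: count}
        transitions = defaultdict(lambda: defaultdict(int))
        for current, nxt in zip(corpus, corpus[1:]):
            transitions[current][nxt] += 1

        def next_token(token):
            """Sample the next token in proportion to how often it followed `token`."""
            candidates = transitions[token]
            if not candidates:
                return None
            tokens, counts = zip(*candidates.items())
            return random.choices(tokens, weights=counts)[0]

        # The tokens you start with shift which continuations are likely.
        token = "market"
        output = [token]
        for _ in range(5):
            token = next_token(token)
            if token is None:
                break
            output.append(token)

        print(" ".join(output))
        ```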

      • amemorablename@lemmygrad.ml
        ·
        3 months ago

        Yeah, the ones that have been designed to be sanitized "assistants" go through a lot of additional tuning. And unsurprisingly, capitalist exploitation has played a part in it before: https://www.vice.com/en/article/wxn3kw/openai-used-kenyan-workers-making-dollar2-an-hour-to-filter-traumatic-content-from-chatgpt
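
        That labeling work feeds into the safety classifiers that screen what the assistant will even engage with. A rough illustration of what that filtering layer looks like from the outside, assuming the official openai Python client and its public moderation endpoint (whether ChatGPT's internal filter works exactly this way isn't something the article spells out):

        ```python
        # Rough illustration of a standalone content-filtering layer, the kind
        # of classifier that human-labeled data is used to train. Assumes the
        # official openai client and its public moderation endpoint; this is
        # not necessarily identical to ChatGPT's internal filtering.
        from openai import OpenAI

        client = OpenAI()  # reads OPENAI_API_KEY from the environment

        result = client.moderations.create(input="some user message to screen")
        verdict = result.results[0]

        if verdict.flagged:
            # A deployed "assistant" would refuse or re-route the request here.
            print("blocked, categories:", verdict.categories)
        else:
            print("allowed")
        ```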