I've fucked around a bit with ChatGPT and while, yeah, it frequently says wrong or weird stuff, it's usually fairly subtle shit, like crap I actually had to look up to verify it was wrong.

Now I'm seeing Google telling people to put glue on pizza. That's a bit bigger than getting the name of George Washington's wife wrong or Jay Leno's birthday off by 3 years. Some of these answers seem so cartoonish in their wrongness that I half suspect some engineer at Google is fucking with it to prove a point.

Is it just that Google's AI sucks? I've seen other people say that AI is now getting info from other AIs and it's leading to responses getting increasingly bizarre and wrong so... idk.

  • JohnBrownsBussy2 [he/him]
    edit-2
    3 months ago

    The LLM is just summarizing/paraphrasing the top search results and, judging from these examples, doesn't seem to be using the LLM itself to evaluate those results. Since this is free and they're pushing it out worldwide, I'm guessing the model they're using is very lightweight, and probably couldn't reliably evaluate results even if they prompted it to.

    As for model collapse, I'd caution against buying too much into model collapse theory, since the paper that demonstrated it used a very specific case study (a model purely and repeatedly trained on its own uncurated outputs over multiple model "generations") that doesn't really occur in foundation model training.
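
To make that closed loop concrete, here's a toy sketch of it. This is not the setup from the paper: the "model" is just a Gaussian fitted to a handful of numbers, and `simulate_collapse`, its parameters, and the sample sizes are all made up for illustration. Each generation is fitted only to samples drawn from the previous generation's fit, with no fresh human data and no curation, which is exactly the condition that real foundation model training doesn't meet.

```python
import random
import statistics


def simulate_collapse(generations=500, samples_per_generation=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns the fitted standard deviation at each generation, starting with
    the fit to the original "human" data (index 0).
    """
    rng = random.Random(seed)

    # Generation 0: "human" data drawn from a standard normal distribution.
    data = [rng.gauss(0.0, 1.0) for _ in range(samples_per_generation)]
    mu, sigma = statistics.fmean(data), statistics.stdev(data)
    history = [sigma]

    for _ in range(generations):
        # Each new "model" only ever sees the previous model's outputs.
        data = [rng.gauss(mu, sigma) for _ in range(samples_per_generation)]
        mu, sigma = statistics.fmean(data), statistics.stdev(data)
        history.append(sigma)

    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for generation in (0, 10, 100, 500):
        print(f"generation {generation:>3}: fitted std = {history[generation]:.3g}")
```

With this few samples per generation, the fitted spread tends to shrink toward zero as small estimation errors compound, so the "model" forgets the tails of the original distribution. Mixing even some of the original data back in at each step would break the pure loop, which is roughly why the result doesn't transfer directly to curated training sets.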

    I'll also note that "AI" isn't homogeneous. Generally, (transformer) models are trained at different scales, with smaller models being less capable but faster and more energy efficient, while larger flagship models are (at least, marketed as) more capable despite being slow, power- and data-hungry. Almost no models are trained in real-time "online" with direct input from users or the web, but rather on vast curated "offline" datasets assembled by researchers/engineers. So, AI doesn't get information directly from other AIs. Rather, model-trainers use traditional scraping tools or partner APIs to download data, do whatever data curation and filtering they do, and then train the models. Now, the trainers may not be able to filter out AI content, or they may intentionally use AI systems to generate variations on their human-curated data (synthetic data) because they believe it will improve the robustness of the model.
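
As a rough picture of the "download, curate, filter, then train" step, here's a minimal sketch of an offline filtering pass. Everything in it is hypothetical: the `Document` type, the `curate` function, the word-count threshold, and the blocked phrase are stand-ins I made up, and real pipelines are far more elaborate. It mostly shows why AI-written text slips through, since a phrase blocklist only catches the most obvious cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    url: str
    text: str


def curate(documents, min_words=5, blocked_phrases=("as an ai language model",)):
    """Deduplicate and filter scraped documents before any training happens."""
    seen = set()
    kept = []
    for doc in documents:
        # Normalize case and whitespace so trivial variants count as duplicates.
        normalized = " ".join(doc.text.lower().split())
        if normalized in seen:
            continue  # duplicate of something already processed
        seen.add(normalized)
        if len(normalized.split()) < min_words:
            continue  # too short to be useful
        if any(phrase in normalized for phrase in blocked_phrases):
            continue  # crude heuristic for obviously AI-generated text
        kept.append(doc)
    return kept


if __name__ == "__main__":
    scraped = [
        Document("a", "Cheese slides off pizza when the sauce is too wet."),
        Document("b", "cheese slides off  pizza when the sauce is too wet."),
        Document("c", "Too short."),
        Document("d", "As an AI language model, I cannot taste pizza myself."),
    ]
    for doc in curate(scraped):
        print(doc.url, doc.text)
```

The point is that the filtering happens once, offline, on a static dump. Nothing here involves one model querying another at answer time.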

    EDIT: Another way that models get dumber is that when companies like OpenAI or Google debut their model, they'll show off the full-scale, instruct-finetuned foundation model. However, since these monsters are incredibly expensive to run, they use these foundation models to train "distilled" models. For example, if you use ChatGPT (at least before GPT-4o), then you're using either GPT3.5-Turbo (for free users), or GPT4-Turbo (for premium users). Google has recently debuted its own Gemini-Flash, which is the same concept. These distilled models are cheaper and faster, but also less capable (albeit potentially more capable than if you trained a model from scratch at that reduced scale).
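
For a sense of what "distilled" can mean, here's a minimal sketch of the classic soft-label recipe, where a small student model is trained to match the big teacher's output probabilities instead of only the single right answer. This is an assumption on my part: I don't know which distillation method OpenAI or Google actually use, and the function names, logits, and temperature below are purely illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions.

    Assumes moderate logits, so no student probability underflows to zero.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    teacher = [4.0, 1.5, 0.2]  # made-up scores for three candidate tokens
    student = [2.0, 2.0, 0.5]
    print("loss before training:", distillation_loss(teacher, student))
    print("loss if student matches teacher:", distillation_loss(teacher, teacher))
```

Training would nudge the student's weights to push this loss down. The softened teacher distribution carries more signal than a single correct label, which fits with a distilled model potentially ending up more capable than one trained from scratch at the same size.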