

As a 50-something, I can see the case for putting the “golden age” of the internet between the birth of Wikipedia in 2001 and Facebook in 2006.
I think it does accurately model the part of the brain that forms predictions from observations—including predictions about what a speaker is going to say next, which lets human listeners focus on the surprising/informative parts. But with an LLM, they just keep feeding it its own output as if it were a third party whose next words it’s trying to predict.
It’s like a child describing an imaginary friend, if you keep repeating “And what would your friend say after that?”
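In code, that loop is basically this (a toy sketch; `predict_next_token` is a made-up stand-in for whatever the real inference stack does):

```python
# Toy autoregressive loop: the model's own output is appended to the
# context and treated as the next "observation" it has to predict beyond.
# `predict_next_token` is a hypothetical stand-in for a real LLM call.
def generate(model, prompt_tokens, max_new_tokens):
    context = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = model.predict_next_token(context)  # "and what would it say next?"
        context.append(next_token)  # its own words become the input it must continue
    return context
```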
IMO the focus should have always been on the potential for AI to produce copyright-violating output, not on the method of training.
Why would the article’s credited authors pass up the chance to improve their own health status and health satisfaction?
Critical paragraph:
Our research highlights the importance of Germany’s unique institutional context, characterized by strong labor protections, extensive union representation, and comprehensive employment legislation. These factors, combined with Germany’s gradual adoption of AI technologies, create an environment where AI is more likely to complement rather than displace worker skills, mitigating some of the negative labor market effects observed in countries like the US.
That makes sense—being raised by ChatGPT might be marginally better than being raised by Sam Altman.
Thanks! I hate it.
How does that compare to the growth in size of the overall code base?
Adler instructed GPT-4o to role-play as “ScubaGPT,” a software system that users might rely on to scuba dive safely.
So… not so much a case of ChatGPT trying to avoid being shut down, as ChatGPT recognizing that agents generally tend to be self-preserving. Which seems like a principle that anything with an accurate world model would be aware of.
If there’s public information about the methods they use to protect their privacy, then those methods aren’t working.
There was a recent paper claiming that LLMs were better at avoiding toxic speech if it was actually included in their training data, since models that hadn’t been trained on it had no way of recognizing it for what it was. With that in mind, maybe using reddit for training isn’t as bad an idea as it seems.
If there’s a new party willing to take over administration of the entire instance as-is, why not just transfer ownership of the original server?
They’re busy researching new and exciting ways of denying coverage.
IIRC, they weren’t trying to stop them—they were trying to get the scrapers to pull the content in a more efficient format that would reduce the overhead on their web servers.
This is one thing I can see an actual use case for (as an external tool, not as part of WP): Create a summary, not of the article itself, but of the prerequisite background knowledge. And tailored to the reader’s existing knowledge—like, “what do I need to know to understand this article assuming I already know X but not Y or Z”.
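Something like this, as a rough sketch of the external tool (the chat-completion call and model name here are just illustrative; any LLM API would do):

```python
# Sketch: ask an LLM for the *prerequisite* background for an article,
# conditioned on what the reader already knows. Illustrative only.
from openai import OpenAI

client = OpenAI()

def prerequisite_summary(article_text, knows, does_not_know):
    prompt = (
        "Summarize the background knowledge a reader needs before reading "
        "the article below. Do not summarize the article itself.\n"
        f"Assume the reader already understands: {', '.join(knows)}.\n"
        f"Assume the reader does NOT yet understand: {', '.join(does_not_know)}.\n\n"
        f"Article:\n{article_text}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; pick whatever model you have
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```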
I assume it’s because it reduces the possibility of other processes outside of the linked containers accessing the files (so security and stability).
The basic idea behind the researchers’ data compression algorithm is that if an LLM knows what a user will be writing, it does not need to transmit any data, but can simply generate, on the other end, what the user wanted to transmit.
Great… but if that’s the case, maybe the user should reconsider the usefulness of transmitting that data in the first place.
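To be fair, a toy rank-coding version of the idea (not necessarily the paper’s actual method; `ranked_predictions` is a made-up model call) shows why it can work at all:

```python
# Toy LLM compression: sender and receiver share the same model. Instead of
# sending tokens, the sender transmits each token's rank in the model's
# predicted distribution; perfectly predicted text becomes a run of zeros,
# which a downstream entropy coder squeezes to almost nothing.
# `ranked_predictions` is a hypothetical call returning candidate next
# tokens sorted from most to least likely.

def compress(model, tokens):
    context, ranks = [], []
    for token in tokens:
        candidates = model.ranked_predictions(context)
        ranks.append(candidates.index(token))  # 0 if the model guessed right
        context.append(token)
    return ranks

def decompress(model, ranks):
    context = []
    for rank in ranks:
        candidates = model.ranked_predictions(context)
        context.append(candidates[rank])  # regenerate the token from its rank
    return context
```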
What about the usage demographics within each country?
In underdeveloped/exploited countries, internet usage is more likely to be concentrated among the economic elites who formerly benefited from colonialism—so if increasing adoption in those countries just follows the pattern of other internet use, it could have the opposite effect from the one intended.
Doom Quixote.