Judge on Meta’s AI training: “I just don’t understand how that can be fair use”
Ars Technica
—
02/05/2025
Large language models (LLMs) are trained on huge amounts of data, but companies rarely explain exactly what data they use. This makes it hard to trust these models, since bad data can lead to wrong answers. There is also a legal problem: Is it permissible to use freely available online content (such as books or articles) for training, or is that theft? This article describes the case against Meta, which is being sued by authors who say their books were used for training without permission. Meta insists that its use falls under "fair use," but if courts decide it amounts to copyright infringement, companies building LLMs might have to completely change how they collect data.