Why AI Companies Are Getting Sued Over Your Data – And What You Can Do

Every time you post a photo, write a review, or comment on a public forum, you are generating content that could end up inside an artificial intelligence model. That is not speculation—many AI systems are trained on vast amounts of text and images scraped from the open web, often without the knowledge or explicit consent of the people who created that content.

In recent months, a growing number of lawsuits have challenged this practice. The core argument is straightforward: using someone’s publicly available work to train a commercial AI product may violate copyright law, privacy rights, or both. For everyday internet users, these legal cases are more than abstract corporate battles. They signal that the way companies collect and use your data is becoming a material legal risk for those companies, and that your rights may be less protected than you assume.

What happened

The National Law Review recently published an analysis titled “From Privacy Compliance to AI Governance: Why the Source of Training Data Is Becoming a Central Legal Risk,” which explains how courts and regulators are starting to scrutinise the origins of training datasets. Several class-action lawsuits have been filed against major AI developers, alleging that they scraped personal photos, written works, and other copyrighted material from the web without permission.

These cases are not limited to celebrity authors or famous artists. Any person whose public-facing content was included in a training corpus could, in theory, be part of a class. The legal theory is still being tested—some courts may rule that scraping public data is fair use; others may decide it requires consent. The point is that the uncertainty itself is creating pressure on companies to change how they operate.

Why it matters

If you have ever uploaded a photo to a social network, posted a blog, or left a product review, there is a chance that your content was used, without your agreement, to train an AI model. Many companies do not disclose exactly which datasets they used, and even when they do, the opt-out mechanisms are often buried in privacy settings or introduced only after public backlash.

The practical consequence is that your personal information, including your name, face, and writing style, may be reproduced, paraphrased, or analysed by AI systems without your control. In some reported cases, people have discovered that posts they made on private medical forums, or messages they considered confidential, had been incorporated into public datasets. The legal pushback is partly about compensation, but it is also about the principle of informed consent.

What you can do

The legal landscape is still shifting, but there are concrete steps you can take today to reduce the risk of your data being used without your knowledge.

  • Review privacy settings on platforms you use. Some social media and content-hosting sites now offer a toggle that prevents your data from being used for AI training. Look for options labelled “data sharing,” “model training,” or “research use.”
  • Use opt-out forms where available. Several AI companies have published forms that let you request that your content be excluded from future training runs. Keep in mind that companies offer these forms voluntarily, and a request may not remove data already ingested.
  • Be cautious about what you post publicly. If you are concerned, consider limiting the visibility of your personal photos, writings, and even location check-ins. Encrypted or private channels offer more protection.
  • Stay informed about new regulations. Laws such as the EU AI Act and state-level privacy bills in the US are beginning to impose transparency requirements on training data. Knowing your rights under these rules can help you act when companies fail to comply.

None of these steps are foolproof, and the burden should not be on individuals alone. But given that the legal system is still catching up, taking a proactive approach is the most practical way to protect your privacy right now.

Sources

  • From Privacy Compliance to AI Governance: Why the Source of Training Data Is Becoming a Central Legal Risk, The National Law Review, June 2026.
  • Various class-action filings against AI companies, 2024–2026, as reported in legal trade publications.