A federal judge in New York has affirmed an order compelling OpenAI to produce 20 million anonymized ChatGPT interaction logs in a consolidated copyright infringement case, according to a Bloomberg report. The decision, issued on 5 January 2026, marks a setback for the AI company amid ongoing litigation over the use of copyrighted material to train its models.

The ruling stems from multidistrict litigation involving 16 lawsuits against OpenAI, brought by news organizations including The New York Times Co. and Chicago Tribune Co. LLC. The cases allege that OpenAI infringed copyrights by scraping and using protected content without permission.[1]
Magistrate Judge Ona T. Wang initially sided with the plaintiffs in November, directing OpenAI to hand over a sample of 20 million de-identified logs, representing 0.5% of its preserved data. OpenAI challenged the order, arguing that it failed to adequately account for user privacy, and instead proposed running targeted searches for logs that reference the plaintiffs' works.
District Judge Sidney H. Stein upheld Wang's decision, distinguishing it from a prior securities case involving illegal wiretaps. Stein noted that OpenAI legally owns the logs and that users voluntarily submitted their inputs. He emphasized that courts are not obliged to mandate the least burdensome discovery method, stating that no case law supports such a requirement.
OpenAI had initially offered the 20 million-log sample after opposing a larger 120 million-log request. In October, however, it refused to produce the full set, offering to provide search results instead. This discovery phase forms part of pretrial proceedings in the Southern District of New York, under case number 1:25-md-03143. The lawsuits represent a fraction of the wider actions against AI firms, raising novel questions about intellectual property in machine learning.
Plaintiffs, represented by firms like Susman Godfrey LLP and Loevy & Loevy, seek evidence to demonstrate how ChatGPT processes and potentially reproduces copyrighted material. OpenAI, defended by Keker, Van Nest & Peters LLP and others, maintains that the logs' relevance does not outweigh privacy risks.
In an expert comment, Dr Ilia Kolochenko, CEO of ImmuniWeb, described the ruling as a "legal debacle" for OpenAI, one likely to encourage similar demands in other cases. He cautioned that user interactions with AI systems, even with privacy settings enabled, could surface in court due to complex data retention architectures.
Kolochenko also highlighted potential criminal implications, for example for security researchers who test AI guardrails by prompting models for illicit content. He noted that while civil-law jurisdictions limit such discovery, US-based AI giants remain vulnerable, and emerging EU regulations, such as the AI Act, may further expand access to user data.
He advised users to exercise caution: "Prior to entering anything into an AI-powered system or testing its guardrails, think twice; otherwise, legal consequences may be pretty serious and long-lasting." The case illustrates the growing tension between innovation, privacy, and intellectual property in the AI era.
This article is shared at no charge for educational and informational purposes only.
Red Sky Alliance is a Cyber Threat Analysis and Intelligence Service organization. We provide indicators-of-compromise information via a notification service (RedXray) or an analysis service (CTAC). For questions, comments, or assistance, please contact the office directly at 1-844-492-7225 or feedback@redskyalliance.com
- Reporting: https://www.redskyalliance.org/
- Website: https://www.redskyalliance.com/
- LinkedIn: https://www.linkedin.com/company/64265941
Weekly Cyber Intelligence Briefings (REDSHORTS):
https://register.gotowebinar.com/register/5207428251321676122
[1] https://www.cybersecurityintelligence.com/blog/openai-ordered-to-disclose-20-million-chatgpt-logs-9008.html