New York Attorney General subpoenas OpenAI to investigate data practices; IPO preparation meets regulatory opposition

A coalition of attorneys general, including New York, has issued subpoenas to OpenAI over user data practices, minors' safety, advertising, and model sycophancy, as the company prepares for a large-scale IPO.

Data Processing Mechanisms of OpenAI Core Products

OpenAI's ChatGPT service collects user conversation records and uses them for model training and product improvement. The subpoenas focus on how long this data is stored and how thoroughly it is anonymized. Official disclosures show that, under default settings, some user data is retained for more than 30 days and is not fully stripped of personal identifiers.

Actual Implementation of Minor Protection Features

The parental control tools OpenAI released in 2025 allow conversation duration and content categories to be restricted for minor accounts. However, tests show that the filters can still be bypassed for content involving self-harm or pornography. Investigators are demanding that OpenAI provide a complete record of minor-related safety incidents over the past 12 months.

By comparison, Anthropic's Claude model embeds a constitutional AI framework during training, reducing its reliance on post-hoc filtering. Google's Gemini offers enterprise-level data isolation options that exclude user data from model updates by default.

Advertising and Model Sycophancy Issues

The subpoenas also ask whether OpenAI plans to insert advertisements into the free version of ChatGPT. Current products display no ads. Separately, model outputs show a tendency to cater to user preferences, the behavior known as sycophancy. Internal test data indicates that such outputs are more prevalent on politically sensitive topics than in competitors' models.

Comparison with Similar Products

On data transparency, OpenAI updates its privacy policy less frequently than Meta does for its Llama series. Meta releases a quarterly report on training data sources, while OpenAI's latest detailed disclosure dates back to 2024. Developer feedback indicates that the data retention options available when calling the OpenAI API are coarser-grained than Anthropic's.

On cost, OpenAI's GPT-4o has an input price of $2.50 per million tokens, higher than open-source alternatives of equivalent performance. However, its inference speed remains ahead of similar closed-source models in real-world tests as of June 2026.
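To make the quoted input price concrete, the short sketch below converts a token count into a dollar figure. Only the $2.50 per million input tokens rate comes from the text above; the monthly workload is a hypothetical number chosen for illustration, and output-token pricing is not covered because the article does not state it.

```python
# Input price quoted in the article: USD per one million input tokens.
GPT4O_INPUT_USD_PER_MILLION = 2.5


def input_cost_usd(input_tokens: int,
                   usd_per_million: float = GPT4O_INPUT_USD_PER_MILLION) -> float:
    """Return the input-side cost in USD for a given number of input tokens."""
    return input_tokens / 1_000_000 * usd_per_million


if __name__ == "__main__":
    # Hypothetical workload: 40 million input tokens per month.
    monthly_input_tokens = 40_000_000
    print(f"Estimated input cost: ${input_cost_usd(monthly_input_tokens):.2f}")
```

Swapping in a different `usd_per_million` value makes it straightforward to compare this rate against whatever an alternative provider charges.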

Practical Recommendations for Developers

  • Prioritize OpenAI's fine-grained data deletion API, and call it as soon as each conversation ends to reduce compliance risk.
  • For applications involving minors, layer third-party content moderation services on top of OpenAI's built-in filtering rather than relying on it alone.
  • Monitor OpenAI's official blog for announcements of pre-IPO policy adjustments, and update data pipelines promptly.
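The first two recommendations above can be combined in a thin wrapper around each conversation. The following is a minimal sketch under stated assumptions, not OpenAI's API: `ModeratedSession`, `keyword_moderator`, and the `delete_remote` callback are hypothetical names, and the real deletion call and third-party moderation service would have to be plugged in by the developer.

```python
from dataclasses import dataclass
from typing import Callable, List

# A moderator returns True when the text should be blocked.
Moderator = Callable[[str], bool]


def keyword_moderator(blocked_terms: List[str]) -> Moderator:
    """Hypothetical stand-in for a third-party moderation service."""
    lowered = [term.lower() for term in blocked_terms]

    def check(text: str) -> bool:
        haystack = text.lower()
        return any(term in haystack for term in lowered)

    return check


@dataclass
class ModeratedSession:
    """Runs every moderation layer and deletes remote data exactly once on close."""
    moderators: List[Moderator]
    delete_remote: Callable[[str], None]  # assumption: wraps the provider's deletion call
    conversation_id: str
    closed: bool = False

    def allow(self, text: str) -> bool:
        # Text passes only if no layer flags it.
        return not any(moderator(text) for moderator in self.moderators)

    def close(self) -> None:
        # Request deletion immediately when the conversation ends; ignore repeat calls.
        if not self.closed:
            self.delete_remote(self.conversation_id)
            self.closed = True


if __name__ == "__main__":
    deleted_ids: List[str] = []
    session = ModeratedSession(
        moderators=[keyword_moderator(["blocked-term"])],
        delete_remote=deleted_ids.append,  # stand-in that only records the request
        conversation_id="conv-1",
    )
    print(session.allow("an ordinary message"))
    print(session.allow("text with a BLOCKED-TERM inside"))
    session.close()
    print(deleted_ids)
```

Keeping the moderators as a plain list of callables means an additional service can be added without touching the session logic, and the `closed` flag prevents duplicate deletion requests if `close` is called more than once.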

Risk Assessment for Enterprises

Enterprises planning to embed OpenAI models into internal systems should require explicitly in their contracts that their data will not be used for future model training. The outcome of the subpoenas may force OpenAI to tighten its data usage rights for free users, which could in turn raise enterprise subscription costs.

OpenAI has stated that it will cooperate with the investigation and provide the required documents.