OpenAI Just Released a Privacy Filter – Here’s How It Works and How to Use It

On April 22, 2026, OpenAI released an open-source tool called the OpenAI Privacy Filter. It automatically detects and masks personally identifiable information (PII) in documents—things like names, email addresses, phone numbers, and other sensitive data. The tool is designed to help people clean up documents before sharing them or using them with AI services. It’s available on GitHub, can run locally on your own machine, and doesn’t require sending your data to OpenAI’s servers.

What the Privacy Filter Does

The filter works by running documents through a fine-tuned language model that identifies pieces of text that look like personal information. When it finds something—say, an email address or a full name—it replaces that text with a placeholder like [NAME] or [EMAIL]. The output is a redacted version of the document. You can use it on plain text, but OpenAI also documented ways to process files like PDFs or Word documents by extracting text first.
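To make the placeholder idea concrete, here is a minimal regex-based sketch of the masking step. This is not the actual tool — the real filter uses a fine-tuned language model — and the two patterns below only catch simply formatted email addresses and US-style phone numbers. Notice that the plain name "Jane" slips through, which is exactly the kind of miss a model-based approach is meant to reduce.

```python
import re

# Illustrative patterns only: the real filter relies on a language model,
# not regexes, and recognizes many more PII categories.
PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "[PHONE]": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matched PII spans with bracketed placeholders."""
    for placeholder, pattern in PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text

print(redact("Contact Jane at jane.doe@example.com or 555-123-4567."))
# -> Contact Jane at [EMAIL] or [PHONE].
```

The sketch keeps the document readable after redaction because each placeholder names the category it replaced, which matches the [NAME]/[EMAIL] convention the tool uses.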

The model itself is small enough to run locally and is open source, meaning anyone can inspect, modify, or adapt it. This is a notable shift from the usual “API only” approach. You don’t need to connect to the internet or pay for usage to apply the filter.

Why It Matters

Privacy concerns around AI tools have grown over the past few years. People often paste sensitive documents into chatbots or upload files to cloud-based services without knowing exactly how that data is handled. Even if a company promises not to train on your data, there’s still the risk of exposure through bugs, leaks, or shared screens. The Privacy Filter gives users a way to remove PII before the document ever reaches an external service. Small business owners handling customer data, researchers sharing patient notes, or anyone writing contracts can use it to reduce the blast radius if something goes wrong.

This isn’t a replacement for proper data governance, but it’s a practical layer of protection that puts control back in the user’s hands. And because it’s open source, it can be audited by security researchers and adapted for specific needs.

How to Use It

The simplest way to use the Privacy Filter is to download it from OpenAI’s GitHub repository. You’ll need Python and a few dependencies, but the instructions are straightforward. Once installed, you run it as a command-line tool:

openai-privacy-filter input.txt --output redacted.txt

The tool also supports chunking long documents and batch processing multiple files. If you’re a developer, you can integrate the filter into your own application via the provided Python library or a local API—no cloud calls required. OpenAI has also published example integrations for common workflows like processing CSV files and emails.
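The chunking step can be sketched in a few lines: split a long document into overlapping pieces so a model with a limited context window can process each one, with overlap so PII that straddles a boundary still appears whole in at least one chunk. The sizes below are illustrative choices of mine, not the tool's actual defaults.

```python
def chunk(text: str, max_chars: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping fixed-size chunks.

    Sizes are illustrative, not the tool's documented defaults.
    """
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks

pieces = chunk("x" * 2500, max_chars=1000, overlap=100)
print(len(pieces))  # -> 3
```

Batch processing is then just applying the filter to each chunk (or each file) and stitching the redacted pieces back together.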

For non-technical users, third-party developers may package the filter into graphical applications in the coming weeks; for now, basic command-line familiarity is required.


Limitations

Like any automated tool, the Privacy Filter isn’t perfect. It may miss some forms of PII, particularly unusual formats or PII appearing in contexts the model wasn’t trained on. It can also over-redact, flagging text that isn’t actually personal information. OpenAI itself advises reviewing the redacted output before sharing it. The tool is a helper, not a guarantee.
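One lightweight way to do that review is to diff the original document against the redacted output, so each change is visible at a glance. This sketch uses Python's standard difflib; it is a suggestion of mine, not a feature of the filter itself.

```python
import difflib

# Stand-ins for the original file and the filter's redacted output.
original = "Contact jane.doe@example.com for details."
redacted = "Contact [EMAIL] for details."

# Print only the changed lines so a reviewer can verify each redaction
# (and spot anything the filter missed or over-redacted).
for line in difflib.unified_diff(
    original.splitlines(), redacted.splitlines(), lineterm=""
):
    print(line)
```

In practice you would read both files from disk; the point is that every redaction shows up as a paired removed/added line a human can check.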

Additionally, the filter only handles text-based documents. Images, scanned PDFs, or handwriting are out of scope unless the text is first extracted. And because it runs locally, its speed depends on your machine’s hardware.

Sources

This article draws on OpenAI’s official announcement (April 22, 2026) and coverage by MSN and GIGAZINE. The full code and documentation are available on OpenAI’s GitHub page. If you’re interested in the technical details, the model weights and a paper describing the filtering approach are also published there.