OpenAI’s company knowledge wants access to all of your internal data


OpenAI on Thursday rolled out its latest offering, a comprehensive data collection and analysis capability called “company knowledge”. And although vendors have been granted access to a wide range of enterprise data for decades — think of malware detection that reviews all messages and downloads — analysts and industry observers see this OpenAI effort as being meaningfully different.

Part of that difference is the extreme depth of access that OpenAI is proposing, along with a lack of assurances about how that sensitive enterprise data would be used and protected. But an even greater factor is OpenAI itself and how comfortable enterprise IT executives are about trusting a relatively young company with this intense level of access.

That trust is made harder to grant by the lack of clarity around OpenAI’s ultimate business model, specifically how much OpenAI will leverage sensitive enterprise data, whether by selling it, even with varying degrees of anonymization, or by using it to train future models.

Jeff Pollard, VP/principal analyst for Forrester, said trust, or the lack of trust, is the most critical background factor in this announcement.

“Whether it’s Microsoft Copilot M365, Gemini Enterprise, Anthropic Claude Enterprise Access, and now OpenAI company knowledge, the choice is really between the devil you know — the vendor you already work with — and who do you trust?” Pollard said. “The capabilities across all these solutions are similar, and benefits exist: Context and intelligence when using AI, more efficiency for employees, and better knowledge management.”

But Pollard said the risks of such an offering are equally important. “Data privacy, security, regulatory, compliance, vendor lock-in, and, of course, AI accuracy and trust issues. But for many organizations, the benefits of maximizing the value of AI outweigh the risks.”

The ROI debate

This ROI debate is intrinsic to all AI strategy decisions and goes well beyond this one OpenAI product/service rollout.

Enterprise IT executives “need to remember one important fact: Enterprise AI is shifting from isolated applications to connected agents and agentic systems that integrate with technologies already deployed to maximize value for users,” Pollard said. “These are high risk, high reward integrations that are unavoidable. These solutions exacerbate existing challenges related to identity and access management [such as] entitlements, data security including labeling/categorization, compliance, and governance, but the perceived productivity and efficiency gains make it too tempting for businesses to pass up.”

OpenAI’s statement gave perhaps the best illustration of how extensively it wants to integrate with all manner of enterprise data.

“With company knowledge, the information in your connected apps—like Slack, SharePoint, Google Drive and GitHub—becomes more useful and accessible. It’s powered by a version of GPT‑5 that’s trained to look across multiple sources to give more comprehensive and accurate answers. Every response includes clear citations so you can see where the information came from and trust the results,” OpenAI said. “For example, if you have an upcoming client call, ChatGPT can create a briefing for you based on recent messages from your account channel in Slack, key details from emails with your client, the last call notes in Google Docs, and any escalations from Intercom support tickets since your last meeting.”

How will the data be used?

The only data-use restriction in OpenAI’s statement does not address how OpenAI itself will use the data; it says only that ChatGPT won’t access information that an individual end user lacks system permission to view.

“Anyone on ChatGPT Business, Enterprise, and Edu can use company knowledge. Company knowledge respects your existing company permissions, so ChatGPT only has access to what each user is already authorized to view,” the OpenAI statement said.

Some industry officials said enterprises will have to rely on legal contracts, including service level agreements, to control what a vendor can do with their data. The problem is that the data that OpenAI would access would also be available to a very large number of employees, contractors, and third parties. If some of this sensitive data were later discovered on the dark web or in the possession of a data broker, it would be all but impossible to prove where it had been accessed from.

“That data would be difficult to track” and that would make it easy “to find ways to avoid the repercussions” and to potentially deny that the data came from OpenAI, said Brady Lewis, the senior director of AI Innovation at the Marketri marketing consulting firm.

“This is one of those announcements that sounds great on paper, until you start thinking about what it actually means for your organization. The productivity promise is legit. Instead of toggling between Slack to find assignments, tabbing to Google Drive for specific files, or hunting for names and numbers, ChatGPT can deliver all that information directly into your chat session,” Lewis said. “Here’s where my 25+ years in tech makes me pause. Employees are submitting company data to ChatGPT and other tools with little to no oversight, often including PII or PCI data. So while OpenAI is building enterprise-grade controls, the real question is whether organizations are prepared to govern their employees properly.”

Lewis said that much of this comes down to OpenAI’s perception within enterprise IT. He said that OpenAI is seen as “overpromising and underdelivering. They haven’t proven their credibility, their trustworthiness.”

Risks too serious to ignore

Andrew Gamino-Cheong, CTO at Trustible, said his biggest concern with company knowledge is the unintentional loss of data control.

“The main risk for this set up is accidental data leakage of privileged information. This may be exacerbated if systems like ChatGPT don’t leave a breadcrumb [audit trail] that they accessed that information,” said Gamino-Cheong, who does not recommend deploying the feature yet. “There needs to be a very clear understanding of all of the access controls in place.”

Gary Longsine, CEO at IllumineX, said that he still sees OpenAI as a startup that doesn’t precisely know what it wants to be when it grows up — and that includes knowing specifically how it will make its money.

Longsine said that the risks of company knowledge are too serious to ignore. “No company in their right mind would ever [deploy this],” he said, unless they desperately needed this kind of data integration and analysis. And even if that enterprise did need such things, they should demand “their own instance of the LLM to prevent data leakage, and that would require funding a data center. That’s the only way I know how to do this and still protect enterprise data.”

Another cybersecurity executive, Bobby Kuzma, the director of offensive cyber operations at cybersecurity consulting firm ProCircular, added, “For companies that have solid data classification controls, there might be some benefit here. Unfortunately, that’s a very tiny fraction of the universe of organizations.”

But there are important questions to be answered. “OpenAI’s integration is leveraging the individual user’s access. That’s nice,” he said. “[But] how long is that access maintained? Does it only stay valid for that specific interaction with the user or is there longer term storage of tokens that could be leveraged if OpenAI is compromised? This has me more than a bit leery.”

And even if the vendor can be trusted to not be co-opted by other third parties, Kuzma asked what would happen if the US government hit OpenAI with a national security letter and demanded full access.

But he was mostly worried about the financial incentives for OpenAI to use that data in a wide range of ways. “Think about an anonymized dataset of top manufacturing companies worldwide. Can you imagine the economic value of that, of monetizing access to that data?”

Asked what advice he would offer enterprise IT executives about using company knowledge, Kuzma was direct. “Please don’t,” he said, adding, “we don’t have a long track record of seeing how OpenAI deals with data access. They are subject to the same pressures as every other startup: ‘First become cashflow positive and then maybe we can think about security’.”