Data is the new gold, and AI is the future. The question that arises for many companies is how to protect that gold while still evolving into the future.
One way to do both is by adopting Microsoft 365 Copilot. In this article, we’ll outline the basic concepts Microsoft uses to secure Microsoft 365 Copilot. These concepts ensure that your data stays safe, even when you’re using the full power of this AI-driven assistant.
The following graphic shows the basic building blocks of Microsoft 365 Copilot. We’ll elaborate on the various parts to clarify how Copilot works under the hood. Finally, we’ll examine the whole flow and how these building blocks work together.
The LLM used by Microsoft 365 Copilot
The LLM is the foundation of Microsoft 365 Copilot: it is what lets you interact with Copilot conversationally and draw on the knowledge captured in its training data.
It’s important to understand that an LLM is quite a static service. Once trained, it does not learn anything further from the queries it receives. If it is to include more recent data, the model must be retrained, which usually requires very specialized hardware and plenty of time.
At the time of writing, the underlying model of Microsoft 365 Copilot is GPT-4o. This doesn’t mean that your data is transferred to OpenAI for processing. In fact, Microsoft hosts its own instance of that model to ensure your data stays within the Microsoft 365 service boundary. Given this setup, the data you provide to the LLM is guaranteed not to be used to train future generations of the model.
Microsoft Graph
The Microsoft Graph API is an interface that has been around for some time. It’s the main API for interacting with Microsoft 365 and its data. As such, it has a strong security focus and ensures that a user can only query data that this specific user is allowed to access. If you’re more tech-savvy and want to see how it works, you can also query the Graph API directly via Graph Explorer (Use Graph Explorer to try Microsoft Graph APIs – Microsoft Graph | Microsoft Learn).
Microsoft 365 Copilot uses this API to collect relevant data based on your prompt. While doing so, it accesses the Graph API in the context of the current user. The Graph API therefore ensures Copilot never retrieves information the user doesn’t have access to, which is one of the core foundations of Microsoft 365 Copilot’s security.
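To make this concrete, here is a minimal sketch of a delegated Graph API call in Python. It assumes you have already acquired an OAuth 2.0 access token for the signed-in user (for example, via the MSAL library); the token’s delegated permissions, not the code, determine what data comes back.

```python
import requests

# Assumption: ACCESS_TOKEN is a delegated OAuth 2.0 token for the signed-in
# user, e.g. acquired with the MSAL library. Graph evaluates every request
# against this user's permissions.
ACCESS_TOKEN = "<delegated-user-token>"

# List the user's recently used files. Graph only returns items
# the token's owner can actually access.
resp = requests.get(
    "https://graph.microsoft.com/v1.0/me/drive/recent",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
)
resp.raise_for_status()

for item in resp.json().get("value", []):
    print(item.get("name"))
```

Because the token is bound to the user, the same request issued for two different users can return different results, which is exactly the property Copilot relies on.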
However, a word of caution on this matter: it’s vital to ensure proper permissions on your data. Until now, some data may have stayed effectively hidden simply because users didn’t know where it was located. With Copilot in use, this is no longer the case, as Copilot will surface all relevant information the user can access. That’s great when you’re looking for information you’ve stored “somewhere”, but not so great if it surfaces data that should be kept secret.
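If you want to spot-check your sharing situation before rolling out Copilot, the Graph API can help there too. Below is a minimal sketch, assuming a delegated token with a suitable scope such as Files.Read.All, that lists who each top-level OneDrive item is shared with:

```python
import requests

ACCESS_TOKEN = "<delegated-user-token>"  # assumed to carry Files.Read.All
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
GRAPH = "https://graph.microsoft.com/v1.0"

# Walk the top level of the user's OneDrive and list the permissions on
# each item, to surface unintentionally broad sharing.
items = requests.get(f"{GRAPH}/me/drive/root/children", headers=HEADERS)
items.raise_for_status()

for item in items.json().get("value", []):
    perms = requests.get(
        f"{GRAPH}/me/drive/items/{item['id']}/permissions", headers=HEADERS
    )
    perms.raise_for_status()
    for perm in perms.json().get("value", []):
        # Permission shapes vary by sharing type; roles is the quick signal.
        print(item["name"], perm.get("roles"))
```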
Semantic Index
With the Graph API as the access point to your tenant’s data, the question arises of how Copilot knows which data to surface for a specific prompt.
At this point, another existing solution comes into play. With Microsoft Search, your users were already able to get tailored search results across your tenant’s data. Its capabilities are now enhanced with an index that takes semantic similarity into account to deliver better search results. Microsoft 365 Copilot uses this index to identify information relevant to your prompt. The indexing itself can be controlled via the Microsoft 365 admin center.
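You can get a feel for this permission-trimmed retrieval yourself through the Microsoft Search API. Here is a minimal sketch, again assuming a delegated user token (the Search API requires user context). Note that Copilot’s semantic, embedding-based ranking is internal; this query only demonstrates the permission-trimmed search layer it builds on:

```python
import requests

ACCESS_TOKEN = "<delegated-user-token>"

# Query the Microsoft Search API for files matching a keyword.
# Results are trimmed to what the signed-in user may see.
body = {
    "requests": [
        {
            "entityTypes": ["driveItem"],
            "query": {"queryString": "quarterly report"},
        }
    ]
}
resp = requests.post(
    "https://graph.microsoft.com/v1.0/search/query",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json=body,
)
resp.raise_for_status()

for container in resp.json().get("value", []):
    for hits in container.get("hitsContainers", []):
        for hit in hits.get("hits", []):
            print(hit.get("resource", {}).get("name"))
```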
External Information for Microsoft 365 Copilot
Microsoft 365 Copilot can draw on information from its model and from your tenant. Beyond that, you can allow it to use current web data, powered by Bing search, to enhance your prompt. Furthermore, you can connect various third-party systems via plugins to enrich the available information even further. This increases the benefits of using Microsoft 365 Copilot and extends its usefulness beyond the boundaries of your tenant data.
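As a conceptual illustration only (this is not Microsoft’s actual plugin interface), an orchestrator with plugin support might merge tenant context with data from a registered external endpoint before building the final prompt. Everything below, including fetch_crm_tickets and the endpoint URL, is hypothetical:

```python
import requests

def fetch_crm_tickets(customer: str) -> list[str]:
    """Hypothetical plugin: pull open tickets from a third-party CRM."""
    resp = requests.get(
        "https://crm.example.com/api/tickets",  # placeholder endpoint
        params={"customer": customer, "status": "open"},
    )
    resp.raise_for_status()
    return [t["title"] for t in resp.json()]

def build_context(tenant_snippets: list[str], customer: str) -> str:
    # Combine tenant data (e.g. from Graph) with external plugin data
    # into one grounding context for the prompt.
    return "\n".join(tenant_snippets + fetch_crm_tickets(customer))
```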
Microsoft 365 Copilot and its flow
The Copilot component itself is the central coordinator of all these building blocks. It takes your prompt and controls its processing.
As you can see in the graphic I’ve included above, it processes the data in several stages to do so; a conceptual code sketch of these stages follows the list below.
- First, it takes your prompt and triggers a process that Microsoft calls grounding. Essentially, it queries all available data sources, such as the Graph API and external data, for information relevant to your prompt. This data is added to your prompt as context to provide the LLM with current information.
- Once grounding is done, Microsoft 365 Copilot forwards the prompt, including the extended context, to the Microsoft-hosted LLM. The LLM produces a response based on its model, subject to responsible AI checks.
- The response from the LLM is then verified against your tenant’s security, compliance, and privacy settings before it’s sent back to the user.
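Here is the purely illustrative sketch of those three stages mentioned above. All function names and stub implementations are ours, not Microsoft’s:

```python
# Conceptual sketch of the Copilot flow described above; all stubs are
# stand-ins, purely for illustration.

def query_graph_as_user(user: str, prompt: str) -> list[str]:
    # Stand-in for permission-trimmed Graph/semantic-index retrieval.
    return [f"[tenant doc relevant to '{prompt}' visible to {user}]"]

def query_web(prompt: str) -> list[str]:
    # Stand-in for optional Bing-powered web grounding.
    return [f"[current web result for '{prompt}']"]

def call_hosted_llm(grounded_prompt: str) -> str:
    # Stand-in for the Microsoft-hosted LLM; the context is input only
    # and is never fed back into training.
    return f"Answer based on: {grounded_prompt}"

def policy_check(user: str, response: str) -> str:
    # Stand-in for the security/compliance/privacy post-processing.
    return response  # pass-through in this sketch

def handle_prompt(user: str, prompt: str, allow_web: bool = True) -> str:
    # 1. Grounding: gather context the user is allowed to see.
    context = query_graph_as_user(user, prompt)
    if allow_web:
        context += query_web(prompt)
    grounded_prompt = prompt + "\n\nContext:\n" + "\n".join(context)
    # 2. Generation by the hosted LLM.
    response = call_hosted_llm(grounded_prompt)
    # 3. Tenant policy checks before returning to the user.
    return policy_check(user, response)

print(handle_prompt("alice@contoso.com", "Summarize project X status"))
```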
Conclusion on Microsoft 365 Copilot Security
Microsoft 365 Copilot builds many of its features on existing solutions. These solutions already have built-in mechanisms to ensure users only see data they are allowed to see, and this remains true for your users’ Copilot prompts.
Furthermore, Microsoft hosts its own LLM and doesn’t share your company data outside the Microsoft 365 boundary. This ensures your data will not end up in the training data of the underlying LLM. All the company information used in the responses is provided purely as context to the LLM and, as such, remains outside the model.
If you’re now curious and want to start working with Microsoft 365 Copilot, let us know and book a meeting with us. We’ll help you experience the benefits of Microsoft 365 Copilot from the very start of your journey.