What it is
A Fabric Data Agent is a configurable, governed Q&A surface powered by Azure OpenAI Assistants. A user types a question in natural language; the agent picks the right data source, generates the appropriate query (SQL, DAX, or KQL), executes it, and returns a grounded answer.
Critically: the agent runs read-only, under the user's own identity, and respects Microsoft Purview data loss prevention (DLP) policies, sensitivity labels, and tenant policies.
How it works
Three layers:
- Question parsing. The LLM understands the question and consults the configured instructions and example queries.
- Source selection & tool invocation. The agent decides which configured source can answer (semantic model? KQL DB? ontology?) and generates the right query.
- Execution & grounding. The query runs read-only under the user's identity. The answer is returned in plain language alongside the data that supports it.
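The three layers above can be sketched as a small pipeline. This is an illustrative sketch only, not the product's implementation: the Source class, select_source, answer, and the stub sources are all hypothetical names, and simple keyword matching stands in for the LLM's routing and query generation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Rows = List[Dict[str, object]]


@dataclass
class Source:
    name: str                    # hypothetical source label
    language: str                # query language: "SQL", "DAX", or "KQL"
    keywords: Tuple[str, ...]    # keyword routing stands in for the LLM's choice
    run: Callable[[str], Rows]   # executes a read-only query and returns rows


def select_source(question: str, sources: List[Source]) -> Source:
    """Pick the source whose keywords best match the question."""
    text = question.lower()
    scored = [(sum(k in text for k in s.keywords), s) for s in sources]
    best_score, best = max(scored, key=lambda pair: pair[0])
    if best_score == 0:
        raise LookupError("no configured source matches the question")
    return best


def answer(question: str, sources: List[Source],
           generate_query: Callable[[str, Source], str]) -> dict:
    source = select_source(question, sources)   # source selection
    query = generate_query(question, source)    # query generation (an LLM in practice)
    rows = source.run(query)                    # execution
    # Grounding: return the query and rows together with the source name.
    return {"source": source.name, "query": query, "rows": rows}


# Stub sources and a stub query generator, for illustration only.
sales = Source("sales_model", "DAX", ("revenue", "margin"),
               lambda q: [{"revenue": 1200}])
logs = Source("telemetry_db", "KQL", ("error", "latency"),
              lambda q: [{"errors": 3}])
result = answer("What was revenue last month?", [sales, logs],
                lambda q, s: f"{s.language} query for: {q}")
print(result["source"], result["rows"])
```

Returning the query and rows next to the answer is what makes the response checkable: a reader can see which source was used and what was actually executed.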
Data sources
One Data Agent can connect to up to five data sources, in any combination of the following types:
- Lakehouses — for raw and curated table access
- Warehouses — for T-SQL-shaped data
- Power BI semantic models — best for already-modeled metrics
- KQL databases (including Eventhouse-backed) — for time-series and high-cardinality data
- Ontologies (Fabric IQ) — for semantically rich, agentic reasoning
- Microsoft Graph — for people, calendar, and Teams context
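A quick pre-flight check can catch a source list that exceeds the limit before anyone tries to configure it. The sketch below is an assumption-laden illustration: the type labels and the validate_sources helper are hypothetical and are not part of any Fabric API; only the five-source cap and the list of source types come from the text above.

```python
from collections import Counter
from typing import List, Tuple

MAX_SOURCES = 5  # per-agent limit stated in the text

# Hypothetical labels for the source types listed above.
SOURCE_TYPES = {
    "lakehouse", "warehouse", "semantic_model",
    "kql_database", "ontology", "microsoft_graph",
}


def validate_sources(sources: List[Tuple[str, str]]) -> Counter:
    """Check (name, type) pairs and return how many sources of each type are used."""
    if len(sources) > MAX_SOURCES:
        raise ValueError(
            f"{len(sources)} sources configured; the limit is {MAX_SOURCES}")
    unknown = sorted({kind for _, kind in sources if kind not in SOURCE_TYPES})
    if unknown:
        raise ValueError(f"unknown source types: {unknown}")
    return Counter(kind for _, kind in sources)


# Example plan: two lakehouses, one semantic model, one KQL database.
plan = [
    ("sales_raw", "lakehouse"),
    ("sales_curated", "lakehouse"),
    ("finance_metrics", "semantic_model"),
    ("device_telemetry", "kql_database"),
]
counts = validate_sources(plan)
print(dict(counts))
```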
Configuration that matters
- Instructions (up to 15,000 chars). Tell the agent which source to use for which question type, define organizational terminology, and constrain off-topic responses.
- Example queries (few-shot pairs). The single biggest accuracy lever. Provide 5–20 question/SQL or question/KQL pairs per source.
- Table selection per source. Don't expose every table — pick the small set that answers most questions, in the right shape.
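The limits above are easy to check mechanically before publishing a change. The following is a minimal sketch, assuming the configuration is kept as a plain instructions string plus a dictionary of example pairs per source; lint_config and that layout are hypothetical, and only the 15,000-character cap and the 5 to 20 range come from the text.

```python
from typing import Dict, List, Tuple

MAX_INSTRUCTION_CHARS = 15_000     # instruction limit stated in the text
MIN_EXAMPLES, MAX_EXAMPLES = 5, 20  # recommended example pairs per source


def lint_config(instructions: str,
                examples_by_source: Dict[str, List[Tuple[str, str]]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    if len(instructions) > MAX_INSTRUCTION_CHARS:
        problems.append(
            f"instructions are {len(instructions)} chars; "
            f"limit is {MAX_INSTRUCTION_CHARS}")
    for source, pairs in examples_by_source.items():
        if not MIN_EXAMPLES <= len(pairs) <= MAX_EXAMPLES:
            problems.append(
                f"{source}: {len(pairs)} example pairs; "
                f"aim for {MIN_EXAMPLES} to {MAX_EXAMPLES}")
    return problems


# Illustrative input: one source has enough examples, the other has too few.
examples = {
    "sales_warehouse": [(f"question {i}", "SELECT 1") for i in range(6)],
    "device_telemetry": [("How many errors today?", "Events | count")],
}
findings = lint_config("Use sales_warehouse for revenue questions.", examples)
print(findings)
```

Running a check like this in the same workflow that edits the instructions keeps the configuration inside its limits as it grows.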
Deployment surfaces
- Inside Fabric — chat surface in the workspace for power users.
- Microsoft 365 Copilot — surfaces in Outlook, Teams, and Excel.
- Copilot Studio — embed in custom apps as a custom skill.
- Foundry agents — orchestrate with other Azure AI Foundry agents.
Best practices
- Curate before you connect. Build the semantic model and ontology first; an agent on raw data fails the trust test.
- Evaluate continuously. Maintain a fixed Q&A test set; measure pass rate after every change to instructions.
- Limit scope per agent. A "Finance" agent and an "Operations" agent, each narrowly focused, will generally beat one omnibus agent.
- Log everything. Conversation logs are the raw material for your next revision of the instructions and example queries.
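The "evaluate continuously" practice can be as simple as a fixed list of questions with an expected fragment for each answer, re-run after every instruction change. The harness below is a sketch under stated assumptions: evaluate and the stub agent are hypothetical, the substring check is a deliberately crude pass criterion, and a real setup would call the deployed agent in place of the stub.

```python
from typing import Callable, List, Tuple


def evaluate(test_set: List[Tuple[str, str]],
             ask: Callable[[str], str]) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Run every question through `ask`; return (pass rate, list of failures)."""
    if not test_set:
        raise ValueError("test set is empty")
    failures = []
    for question, expected in test_set:
        reply = ask(question)
        # Pass criterion: the expected fragment appears in the reply.
        if expected.lower() not in reply.lower():
            failures.append((question, expected, reply))
    passed = len(test_set) - len(failures)
    return passed / len(test_set), failures


# Stub agent with canned replies, for illustration only.
canned = {
    "What was revenue last month?": "Revenue last month was 1200.",
    "How many errors occurred today?": "There were 3 errors today.",
}


def stub_agent(question: str) -> str:
    return canned.get(question, "I don't know.")


test_set = [
    ("What was revenue last month?", "1200"),
    ("How many errors occurred today?", "5"),
]
rate, failures = evaluate(test_set, stub_agent)
print(f"pass rate: {rate:.0%}, failures: {len(failures)}")
```

Keeping the failures list, not just the rate, shows which questions regressed after a change, which is the same information the conversation logs provide in production.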