MuleSoft Direct for Data Cloud
Use case 1 - Unstructured data ingestion
Ingest unstructured data into Salesforce Data Cloud and make it accessible to Agentforce
Overview
Glossary
Term | Definition |
---|---|
SDC | Salesforce Data Cloud |
UDLO | Unstructured Data Lake Object |
Solution overview
The purpose of this solution is to automate the ingestion of unstructured data from various source systems into Salesforce Data Cloud. Integration applications built on the MuleSoft platform implement the content notification and download logic, and are made available to Data Cloud by deploying them via MuleSoft Direct.
Goals
- Support ingestion of unstructured content metadata for initial load.
- Support ingestion of unstructured content metadata on-demand, either for full refresh scenarios or incremental updates.
- Support publication of change notifications for event-driven data sources.
- Support the retrieval of content for individual items.
Use case considerations
- Ingest metadata for unstructured data from a source system to Salesforce Data Cloud upon request, such as when a UDLO is first created.
- Send change notifications when an event is generated for content of interest, such as when new items are added to a container or item contents are updated.
- Support the retrieval of incremental updates to resource metadata.
- Retrieve the content for the resource that is requested from Salesforce Data Cloud to support a variety of Agentforce use cases.
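As a rough illustration of the full refresh and incremental update considerations above, the following Python sketch filters resource metadata by modification time. The `ResourceMetadata` type, its fields, and the `changed_since` function are hypothetical names chosen for this example; they are not part of the MuleSoft or Data Cloud APIs, and the actual ingestion applications may track changes differently.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ResourceMetadata:
    """Hypothetical metadata record for one unstructured item."""
    resource_id: str
    name: str
    modified_at: datetime


def changed_since(items: Iterable[ResourceMetadata],
                  since: Optional[datetime]) -> List[ResourceMetadata]:
    """Select the metadata to send for a request.

    With since=None every item is returned (full refresh); otherwise only
    items modified strictly after `since` are returned (incremental update).
    """
    if since is None:
        return list(items)
    return [item for item in items if item.modified_at > since]


# Illustrative data only.
catalog = [
    ResourceMetadata("doc-1", "Onboarding guide",
                     datetime(2024, 1, 10, tzinfo=timezone.utc)),
    ResourceMetadata("doc-2", "Release notes",
                     datetime(2024, 3, 5, tzinfo=timezone.utc)),
]

full_refresh = changed_since(catalog, None)
incremental = changed_since(catalog, datetime(2024, 2, 1, tzinfo=timezone.utc))
```

Here `full_refresh` contains both items, while `incremental` contains only the item modified after the given timestamp.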
Technical considerations
- Each Mule application must be configured at the container level, such as a site, drive, or domain, and can include additional separation by content type where appropriate.
- Content will be made available as HTML, PDF, or plain text, as supported by the source system. Binary content will be provided as-is by default but can also be encoded as Base64 text upon request.
- Authentication to the source system must be supported, including OAuth 2.0 flows that require user interaction.
- Mule applications have been designed to be as stateless as possible.
- Health check endpoints will also validate authentication to the source system upon request.
- MuleSoft Direct supports deployment of Mule applications to both CloudHub and CloudHub 2.0.
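The content encoding consideration above (binary as-is by default, Base64 text upon request) can be sketched in Python as follows. This is a minimal illustration using only the standard library; the `render_content` function and its `as_base64` flag are assumptions made for this example and do not reflect the actual request parameters of the ingestion applications.

```python
import base64
from typing import Union


def render_content(raw: bytes, as_base64: bool = False) -> Union[bytes, str]:
    """Return binary content unchanged by default, or as Base64 text on request."""
    if as_base64:
        # b64encode returns bytes; decode to ASCII so the result is plain text.
        return base64.b64encode(raw).decode("ascii")
    return raw


# Illustrative payload: the first bytes of a PDF file.
pdf_header = b"%PDF-1.7"

as_is = render_content(pdf_header)                    # bytes, unchanged
encoded = render_content(pdf_header, as_base64=True)  # str, Base64 text
```

A consumer that receives the Base64 form can recover the original bytes with `base64.b64decode(encoded)`.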
High-level architecture
The following diagram describes the general scenarios that may be implemented by the ingestion applications. Note that not all applications implement all scenarios.
Activity diagrams
The following diagrams illustrate common processing sequences for the ingestion applications.
Push notifications
Poll notifications
Get content
Source systems
Ingestion applications are available for the following source systems as of the current release:
- Confluence Cloud
- Google Drive
- Microsoft SharePoint Online
- Sitemaps
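To give a sense of what content metadata from a sitemap source might look like, the Python sketch below reads the `loc` and `lastmod` elements defined by the standard sitemap protocol. The `parse_sitemap` function and the shape of its output are hypothetical; how the Sitemaps ingestion application actually processes sitemaps is not described here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text: str) -> List[Dict[str, str]]:
    """Extract the URL and optional last-modified value of each sitemap entry."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS)
        entries.append({"loc": loc.strip(), "lastmod": lastmod.strip()})
    return entries


# Illustrative sitemap; example.com URLs are placeholders.
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/docs/a.html</loc>"
    "<lastmod>2024-03-05</lastmod></url>"
    "<url><loc>https://example.com/docs/b.html</loc></url>"
    "</urlset>"
)

entries = parse_sitemap(sample)
```

Entries without a `lastmod` element get an empty string, so a caller can decide whether to treat them as always changed or to skip them during incremental updates.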
Downloadable assets
Process APIs
- SDC Ingestion Process API | API Specification
- SDC Ingestion from Confluence Process API | Implementation Template
- SDC Ingestion from Google Process API | Implementation Template
- SDC Ingestion from SharePoint Process API | Implementation Template
- SDC Ingestion from Sitemaps Process API | Implementation Template
- SDC Ingestion Template Process API | Implementation Template
Custom components
- Accelerator Common Core | Source
- Accelerator POM Parent | Source
- SDC Ingestion Common Library | Source
See the Data Cloud documentation for more details about implementing the use case and making use of the ingested unstructured data in Data Cloud.