Salesforce Data Cloud Ingestion from Sitemaps - Implementation Template

Application details

Technical considerations

An instance of the Mule application is deployed per domain
Discovery of Sitemaps from the organization’s robots.txt file is supported
Processing Sitemap index files is out of scope
Content from a Sitemap should generally be provided as "text/html"
Authentication is neither required nor supported
Both synchronous and asynchronous scans ingest the full load
The Mule application is designed to be stateless
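
As an illustration of the robots.txt discovery mentioned above, the following is a minimal Python sketch, not the template's actual Mule implementation. It assumes the robots.txt body has already been downloaded as text; the function name `discover_sitemaps` and the example URLs are hypothetical.

```python
def discover_sitemaps(robots_txt: str) -> list[str]:
    """Return the Sitemap URLs declared in a robots.txt body, in file order."""
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Split on the first colon only, so the "https://" in the value
        # stays intact. The field name is matched case-insensitively.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


# Hypothetical robots.txt content used only for this example.
robots = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
sitemap: https://www.example.com/news-sitemap.xml  # secondary
"""
found = discover_sitemaps(robots)
print(found)
```

Only `Sitemap` lines are read; all other robots.txt directives are ignored in this sketch.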

Activity diagrams

The following activity diagrams illustrate the processing sequence used to ingest unstructured metadata and its content on demand.

Initial Load/Full Refresh Synchronous

Initial Load/Full Refresh Asynchronous

Get Content

Processing logic

The primary handling and orchestration of unstructured metadata ingestion is implemented in the Salesforce Data Cloud Ingestion from Sitemaps Process API. This process is described in more detail in the following sections.

Initial Load/Full Refresh Synchronous

This flow is triggered by the end user.

A user clicks the Refresh Now button on the UDLO page to initiate the request for a full refresh of resource metadata
Data Cloud invokes the Mule application without a continuation token to start the process
The Mule application receives the request and will:
- Retrieve the content metadata from all the configured organizations' Sitemaps
- Transform the results into the Data Cloud format and return the results
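
The retrieve-and-transform steps above can be sketched in Python as follows. This is an illustrative sketch only: the output field names (`resourceUrl`, `lastModified`, `contentType`) are hypothetical placeholders, since the actual Data Cloud format is not given here. The sketch rejects Sitemap index files, in line with the technical considerations.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol, in ElementTree's {uri} form.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_to_records(sitemap_xml: str) -> list[dict]:
    """Parse a <urlset> Sitemap and return one metadata record per <url>."""
    root = ET.fromstring(sitemap_xml)
    if root.tag == SITEMAP_NS + "sitemapindex":
        raise ValueError("Sitemap index files are out of scope")
    if root.tag != SITEMAP_NS + "urlset":
        raise ValueError(f"Unexpected root element: {root.tag}")

    records = []
    for url in root.iter(SITEMAP_NS + "url"):
        loc = url.findtext(SITEMAP_NS + "loc", default="").strip()
        if not loc:
            continue  # skip entries without a location
        lastmod = url.findtext(SITEMAP_NS + "lastmod")
        records.append({
            # Hypothetical field names, not the real Data Cloud schema.
            "resourceUrl": loc,
            "lastModified": lastmod.strip() if lastmod else None,
            "contentType": "text/html",
        })
    return records


# Hypothetical Sitemap used only for this example.
sample = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc><lastmod>2025-01-21</lastmod></url>
  <url><loc>https://www.example.com/about</loc></url>
</urlset>
"""
records = sitemap_to_records(sample)
print(records)
```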

Initial Load/Full Refresh Asynchronous

This flow is triggered by an external application, such as Postman.

The Mule application receives a request to perform an asynchronous refresh of all metadata and will:
- Retrieve the content metadata from all the configured organizations' Sitemaps
- Transform the results into the required format for the ingestion API
- Send the transformed data to the ingestion endpoint
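
A rough Python sketch of the three asynchronous steps above is shown below. It assumes the work runs on a background worker so the caller does not wait for completion; how the real Mule application schedules this work is not described here. The `fetch`, `transform`, and `send` callables, and the stand-in data, are hypothetical. In particular, `send` stands in for a call to the ingestion endpoint, whose URL and payload format are not specified.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def refresh(
    fetch: Callable[[], list[dict]],
    transform: Callable[[dict], dict],
    send: Callable[[list[dict]], None],
) -> int:
    """Run retrieve, transform, and send in order; return the record count."""
    entries = fetch()                                   # retrieve metadata
    payload = [transform(entry) for entry in entries]   # reshape each entry
    send(payload)                                       # hand off for ingestion
    return len(payload)


# Hypothetical stand-ins so the sketch runs without any network access.
def fake_fetch() -> list[dict]:
    return [
        {"loc": "https://www.example.com/"},
        {"loc": "https://www.example.com/about"},
    ]


def fake_transform(entry: dict) -> dict:
    return {"resourceUrl": entry["loc"]}


sent_payloads: list[list[dict]] = []

with ThreadPoolExecutor(max_workers=1) as pool:
    # submit() returns immediately; the refresh runs on the worker thread.
    future = pool.submit(refresh, fake_fetch, fake_transform, sent_payloads.append)
    # A real caller could respond to the request here. This example waits
    # for the result only so that it can be printed.
    count = future.result(timeout=5)

print(count, sent_payloads)
```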

Get Content

This flow is triggered by Data Cloud.

Data Cloud initiates the request to retrieve the content
The Mule application receives the request, then retrieves and streams the page content referenced by a Sitemap
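
To illustrate the streaming step, here is a small Python sketch that reads a page body in fixed-size chunks rather than loading it whole, and checks that the declared content type is "text/html" as the technical considerations expect. The function names and the 8192-byte default chunk size are assumptions made for this example, not details of the template.

```python
import io
from typing import BinaryIO, Iterator


def is_html(content_type: str) -> bool:
    """True when a Content-Type header value denotes text/html."""
    # Ignore parameters such as "; charset=utf-8".
    return content_type.split(";", 1)[0].strip().lower() == "text/html"


def stream_content(body: BinaryIO, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield the body in chunks of at most chunk_size bytes."""
    while True:
        chunk = body.read(chunk_size)
        if not chunk:
            break  # end of stream
        yield chunk


# An in-memory stand-in for an HTTP response body.
page = b"<html><body><h1>Example page</h1></body></html>"
chunks = list(stream_content(io.BytesIO(page), chunk_size=16))
print(len(chunks), is_html("text/html; charset=utf-8"))
```

Any object with a `read(n)` method can be passed as `body`, including the response returned by `urllib.request.urlopen`.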

Success conditions

Upon successful completion, the following conditions will be met:

All metadata associated with unstructured content in the organization's Sitemaps is retrieved and processed.
The full load of metadata is retrieved on demand.
Retrieval of content is supported.

Type	Template
Organization	MuleSoft
Published by	MuleSoft Solutions
Published on	Jan 21, 2025

Versions	1.0.12, 1.0.11, 1.0.10, 1.0.9
