Salesforce Data Cloud Ingestion from Sitemaps - Implementation Template
Setup instructions
The instructions on this page show how to configure and deploy the provided Sitemaps integration application to MuleSoft Anypoint Platform. They cover the essential steps for two deployment scenarios and include a section on troubleshooting common issues.
Data source configurations
None required.
Deploying via MuleSoft Direct
To enable the ingestion of unstructured data into Data Cloud, this application must first be deployed using MuleSoft Direct. When enabling the integration, configure the following properties to connect with the data source and target system.
Property Name | Description |
---|---|
Sitemaps Hostname | The address or domain name used to access a Sitemaps host instance, such as example.com. |
Sitemaps List | The list of Sitemaps from an organization. For a complete list, leave robots.txt as the default value. |
Salesforce Hostname | The address or domain name used to access a Salesforce instance, such as login.salesforce.com. |
Salesforce Consumer Key | The consumer key of a connected app in Salesforce. The consumer key is used in conjunction with the consumer secret to authenticate and authorize API requests. |
Salesforce Consumer Secret | The consumer secret associated with the consumer key of the connected app. |
Mule Configuration Environment | Target deployment environment configuration selector. |
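To illustrate what the robots.txt default for Sitemaps List implies, the following minimal Python sketch extracts the `Sitemap:` directives that a site may publish in its robots.txt file. It is not the application's own discovery logic, which this page does not describe. The function name `sitemaps_from_robots` and the sample content are assumptions made for illustration.

```python
from __future__ import annotations


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs named by `Sitemap:` directives, in order, without duplicates."""
    found: list[str] = []
    for raw_line in robots_txt.splitlines():
        # Anything after '#' is a comment in robots.txt.
        line = raw_line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        # The directive name is matched case-insensitively.
        if sep and key.strip().lower() == "sitemap":
            url = value.strip()
            if url and url not in found:
                found.append(url)
    return found


# Hypothetical robots.txt content for example.com.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
sitemap: https://example.com/news/sitemap.xml  # news section
"""

if __name__ == "__main__":
    for url in sitemaps_from_robots(SAMPLE_ROBOTS):
        print(url)
```

Running a check like this against a site's robots.txt before deployment can help decide whether the default value is sufficient or whether specific sitemap URLs should be listed instead.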
Additional configuration steps
After the application has been deployed, some additional configuration steps should be completed in Anypoint Runtime Manager. The following instructions are for CloudHub 2.0 deployments, but the steps are similar for CloudHub deployments.
- Find the deployed application by entering the name in the Search Applications field.
- Click the application name, or select the entry and click the Manage Application button on the right. For CloudHub 2.0 deployments, a screen appears with multiple tabs to configure the application (for CloudHub deployments, click the Settings item on the left navigation bar).
- Update the Runtime Version to the latest patch release and select the Use Object Store V2 checkbox under the Runtime Options section.
- Consider increasing the number and size of the replicas (workers) if the source container being monitored has a large number of resources and/or a high volume of changes.
- Click Apply Changes to deploy the new configuration.
Deploying via Anypoint Platform
To support the ingestion of unstructured data into Data Cloud, the application must first be deployed from MuleSoft Direct to create the required connector in Data Cloud. Once deployed, the application can be updated in Anypoint Runtime Manager as needed.
Getting started
The Getting Started with MuleSoft Accelerators guide provides general information on getting started with the accelerator components. This includes instructions on setting up your local workstation for configuring and deploying the applications.
Deployment
Each Accelerator implementation template in Exchange includes Bash and Windows scripts for building and deploying the APIs to CloudHub. These scripts depend on repositories, global settings, deployment profiles, and associated properties configured in the Maven settings.xml file. In particular, make sure the common properties for your environment have been provided in the CloudHub-DEV profile, like Anypoint Platform client ID and secret.
For additional details, please refer to the Application Deployment section of the Getting Started Guide.
Required property overrides
Many templates can also be run from Anypoint Studio without having to customize the Run/Debug profiles. However, some templates make use of hidden deployment properties to protect sensitive information, like passwords and secret keys. These properties must be supplied to the runtime by updating the configuration profile and adding them as VM arguments. At a minimum, the following properties must be customized to reflect the target deployment environment.
Property Name | Description |
---|---|
api.autodiscoveryID | Required if using API Manager to secure this API |
app.container-id | Sitemaps Hostname |
sitemap.list | List of Sitemap URLs. Use robots.txt for the complete list |
ingest-common.salesforce.host | Salesforce Hostname |
ingest-common.salesforce.client-id | Salesforce consumer key |
ingest-common.salesforce.client-secret | Salesforce consumer secret |
mule.env | Target deployment environment configuration selector |
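As a rough illustration of supplying these overrides as VM arguments, the Python sketch below turns a set of property values into `-Dname=value` entries, one per line, that could be pasted into the VM arguments field of a Run/Debug configuration. The property names come from the table above. Everything else is an assumption: the helper name `vm_arguments`, the placeholder values, and the choice to treat every property except api.autodiscoveryID as mandatory. Values containing spaces would need quoting, which this sketch does not handle.

```python
from __future__ import annotations

# Properties from the table above. api.autodiscoveryID is left out because
# it is only required when API Manager secures the API.
REQUIRED = [
    "app.container-id",
    "sitemap.list",
    "ingest-common.salesforce.host",
    "ingest-common.salesforce.client-id",
    "ingest-common.salesforce.client-secret",
    "mule.env",
]


def vm_arguments(props: dict[str, str]) -> list[str]:
    """Return one -Dname=value argument per property, failing on missing values."""
    missing = [name for name in REQUIRED if not props.get(name)]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    return [f"-D{name}={value}" for name, value in props.items()]


# Placeholder values only; substitute the real values for your environment.
EXAMPLE_PROPS = {
    "app.container-id": "example.com",
    "sitemap.list": "robots.txt",
    "ingest-common.salesforce.host": "login.salesforce.com",
    "ingest-common.salesforce.client-id": "YOUR_CONSUMER_KEY",
    "ingest-common.salesforce.client-secret": "YOUR_CONSUMER_SECRET",
    "mule.env": "dev",
}

if __name__ == "__main__":
    print("\n".join(vm_arguments(EXAMPLE_PROPS)))
```

Generating the arguments from a single dictionary keeps the secret values out of the project files and makes a missing property fail fast, before the application starts.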
Troubleshooting
- Sitemap index files are currently skipped during resource processing. To process the sitemap files that an index references, specify each of them explicitly in the list of sitemaps to include.
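One way to find the sitemap files that need to be listed explicitly is to inspect the index yourself. The Python sketch below is an illustrative helper, not part of the application. It checks whether an XML document is a sitemap index under the sitemaps.org 0.9 schema and, if so, returns the child sitemap URLs it references. The function name `child_sitemaps` and the sample documents are assumptions.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol, in ElementTree notation.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def child_sitemaps(xml_text: str) -> list[str]:
    """Return the <loc> of each child sitemap if xml_text is a sitemap index, else []."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{NS}sitemapindex":
        return []  # A regular <urlset> sitemap has no child sitemaps.
    locs: list[str] = []
    for entry in root.findall(f"{NS}sitemap"):
        loc = entry.findtext(f"{NS}loc")
        if loc and loc.strip():
            locs.append(loc.strip())
    return locs


# Hypothetical documents for example.com.
SAMPLE_INDEX = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-pages.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-blog.xml</loc></sitemap>
</sitemapindex>"""

SAMPLE_URLSET = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in child_sitemaps(SAMPLE_INDEX):
        print(url)
```

The URLs returned for an index are the candidates to add to the list of sitemaps. An empty result means the document is not a sitemap index.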