Salesforce Large File Downloads: Chunked Integration Strategies

Handling Large Salesforce File Downloads in Integrations

When developing Salesforce integrations, especially for offline-first mobile applications, downloading large files presents a significant challenge. Standard "one big GET" requests are prone to failure on unstable networks, leading to lost progress and user frustration. This article outlines strategies for implementing robust, chunked, and resumable download mechanisms for Salesforce Files.

Salesforce Files, managed via ContentDocument and ContentVersion objects, can store files up to 2 GB. This capability necessitates a download strategy that accounts for unreliable network conditions common on mobile devices, such as hotel Wi-Fi or roaming cellular data.

Two Primary Download Methods

Salesforce offers two main approaches for retrieving file content:

Connect Files API: This REST API provides access to Files through endpoints like /connect/files/.... It typically involves fetching file metadata first, then making a separate call to retrieve the binary content. While suitable for backend integrations on stable networks, its default streaming pattern makes implementing chunking and resumability for mobile apps complex.
Shepherd Servlet Download Endpoint: The /sfc/servlet.shepherd/version/download/{ContentVersionId} and /sfc/servlet.shepherd/document/download/{ContentDocumentId} endpoints offer a more mobile-friendly approach. Crucially, these endpoints support the HTTP Range header, which is essential for building chunked downloaders.

Alternatively, the ContentVersion/{Id}/VersionData blob-retrieve URL can serve as the byte endpoint; it requires the same client-side logic for chunking and resumption.
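As a rough illustration of how a client might build the two byte endpoints, the sketch below assembles both URLs from a ContentVersion Id. The function names, the instance_url argument, and the default API version are assumptions made for this example; the /services/data/{version}/sobjects/ prefix is the usual REST API base path and should be checked against your org.

```python
def shepherd_version_url(instance_url: str, content_version_id: str) -> str:
    """Build the Shepherd Servlet download URL for one file version."""
    base = instance_url.rstrip("/")
    return f"{base}/sfc/servlet.shepherd/version/download/{content_version_id}"


def version_data_url(
    instance_url: str,
    content_version_id: str,
    api_version: str = "v60.0",  # assumption: use the version your org supports
) -> str:
    """Build the ContentVersion VersionData blob-retrieve URL."""
    base = instance_url.rstrip("/")
    return (
        f"{base}/services/data/{api_version}"
        f"/sobjects/ContentVersion/{content_version_id}/VersionData"
    )
```

Either URL can then be stored as the remote URL for the download, so the rest of the downloader does not need to know which endpoint is in use.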

Implementing a Chunked Download Strategy

The core of a resilient download mechanism lies in breaking large files into smaller, manageable chunks and implementing a resumable process.

Step 1: Prepare File Context

Before initiating a download, gather essential metadata into a compact file context object. This object should include:

contentDocumentId and contentVersionId: Identifiers for the file.
totalSize: The total size of the file in bytes.
checksum: An MD5 or similar hash for final file validation.
fileName/extension: For local file naming.
remoteUrl: The URL to download from (e.g., Shepherd Servlet or VersionData endpoint).
downloadingURL: The local temporary path for the downloaded file.
startByte: The byte offset to resume from (0 for a new download).
chunkSize: The size of each chunk (e.g., 2–10 MB).
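One possible shape for this context object is sketched below as a Python dataclass with JSON persistence. The field names mirror the list above in snake_case; the status field, the 4 MB default chunk size, and the save_context/load_context helpers are assumptions for illustration, not part of any Salesforce SDK.

```python
import json
import os
from dataclasses import asdict, dataclass


@dataclass
class FileDownloadContext:
    content_document_id: str
    content_version_id: str
    total_size: int            # total file size in bytes
    checksum: str              # expected hash as a hex string
    file_name: str
    remote_url: str
    downloading_path: str      # local temporary path for the partial file
    start_byte: int = 0        # resume offset; 0 for a new download
    chunk_size: int = 4 * 1024 * 1024
    status: str = "pending"    # hypothetical status field


def save_context(ctx: FileDownloadContext, path: str) -> None:
    """Write the context to a temp file, then swap it into place."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(asdict(ctx), f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # rename over the old copy in one step


def load_context(path: str) -> FileDownloadContext:
    """Read a previously saved context back from disk."""
    with open(path, encoding="utf-8") as f:
        return FileDownloadContext(**json.load(f))
```

Writing to a temporary file and renaming it means a crash during the save leaves the previous context intact rather than a half-written one.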

Step 2: Download in Chunks with HTTP Range

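To download in chunks, use the HTTP Range header. For each chunk, send a GET request to the remoteUrl with an Authorization header and a Range header specifying the byte range (e.g., bytes=0-4194303 for the first 4 MB). Range offsets are inclusive at both ends, and a server that honors the header answers with 206 Partial Content rather than 200 OK.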

offset = context.startByte // 0 for new download, >0 if resuming
chunkSize = context.chunkSize
totalSize = context.totalSize

// Open for read/write, not append: in append mode every write
// lands at the end of the file regardless of seek
open file at context.downloadingURL for read/write (create if missing)
truncate(file, offset) // drop bytes written after the last saved offset
seek(file, offset)

while offset < totalSize:
    end = min(offset + chunkSize - 1, totalSize - 1)
    response = HTTP GET context.remoteUrl with headers:
    - Authorization: Bearer <token>
    - Range: "bytes=" + offset + "-" + end

    if request failed or response.status != 206 (timeout, 5xx, 429, etc):
        // Network problem: keep the partial file
        // and remember how far we got.
        // A 200 means the server ignored Range and sent the whole
        // file, so its body must not be written at this offset
        context.startByte = offset
        save context (status = "paused")
        close file
        return "resumable error"

    data = response.body
    write data to file
    flush file
    offset += data.length
    context.startByte = offset

    // Persist progress so we can resume from here next time
    save context (status = "inProgress")
    report progress = offset / totalSize

close file

After each successful chunk, persist the startByte to allow resumption. The checksum calculation is deferred until the entire file is downloaded.
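The loop can be sketched in Python as follows. To keep the example self-contained, the HTTP call is injected as a fetch callable that returns a status code and body; that signature, the download_chunks name, and the on_progress hook are assumptions for illustration. The sketch also assumes the caller has already checked that the partial file holds at least start_byte bytes.

```python
import os
from typing import Callable, Dict, Optional, Tuple

# Hypothetical transport: (url, headers) -> (status_code, body_bytes).
Fetch = Callable[[str, Dict[str, str]], Tuple[int, bytes]]


def download_chunks(
    url: str,
    path: str,
    total_size: int,
    start_byte: int,
    chunk_size: int,
    token: str,
    fetch: Fetch,
    on_progress: Optional[Callable[[int], None]] = None,
) -> int:
    """Download from start_byte onward; return the offset reached.

    The result equals total_size when the file is complete. Any smaller
    value is the offset to persist and resume from later.
    """
    offset = start_byte
    # "r+b" keeps existing bytes; append mode would ignore seek on write.
    mode = "r+b" if start_byte > 0 else "wb"
    with open(path, mode) as f:
        f.truncate(offset)  # discard bytes past the last recorded offset
        f.seek(offset)
        while offset < total_size:
            end = min(offset + chunk_size - 1, total_size - 1)
            headers = {
                "Authorization": f"Bearer {token}",
                "Range": f"bytes={offset}-{end}",  # inclusive byte range
            }
            try:
                status, body = fetch(url, headers)
            except OSError:
                break  # network failure: keep the partial file
            # Accept only a partial-content reply that fits the request.
            if status != 206 or not body or len(body) > end - offset + 1:
                break
            f.write(body)
            f.flush()
            offset += len(body)
            if on_progress:
                on_progress(offset)  # caller persists start_byte here
    return offset
```

Returning the offset instead of raising keeps the caller's logic simple: save the value as startByte, and call again later with the same arguments to continue.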

Step 3: Resume After Network Failures

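Assume network failures will occur. Instead of discarding partial downloads, treat most failures as recoverable. Persist the startByte and mark the context as paused or failed_resumable. When resuming, load the FileDownloadContext, and if startByte > 0 and the partial file exists, open it for writing (not in append mode, where writes ignore the seek position), truncate it to startByte so no unrecorded bytes remain, and continue the download from that offset. If the partial file is missing or shorter than startByte, restart from byte 0.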

This approach ensures that progress is not lost due to network interruptions or application backgrounding.
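A minimal sketch of the resume check, under the assumption that the persisted offset and the partial file can disagree after a crash: the helper below (prepare_resume is a hypothetical name) returns the offset that is actually safe to continue from and leaves the file at exactly that length.

```python
import os


def prepare_resume(path: str, start_byte: int) -> int:
    """Return a safe resume offset and trim the partial file to match it."""
    file_size = os.path.getsize(path) if os.path.exists(path) else 0

    if start_byte <= 0 or file_size < start_byte:
        # Nothing usable on disk: the file is missing, or it holds fewer
        # bytes than the context claims. Start over with an empty file.
        with open(path, "wb"):
            pass
        return 0

    # The file may contain bytes written after the last saved offset.
    # Cut it back so the next chunk lands exactly at start_byte.
    with open(path, "r+b") as f:
        f.truncate(start_byte)
    return start_byte
```

Falling back to 0 when the file is shorter than the recorded offset costs a re-download, but it avoids a file with a gap that would only be caught later by the checksum.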

Step 4: Validate the File With MD5

Once the download loop completes (startByte equals totalSize), validate the integrity of the downloaded file using its MD5 checksum.

if context.startByte != context.totalSize:
    return "download not complete"

open file at context.downloadingURL for read
init MD5 calculator

while there is data to read:
    chunk = readNextBlock(file)
    if chunk is empty:
        break
    update MD5 with chunk

computed = finalize MD5

if computed == context.checksum:
    mark context as "completed"
    return success(context.downloadingURL)
else:
    delete file at context.downloadingURL
    delete context entry
    return "checksum mismatch"

If the checksums match, mark the file as completed and move it to its final location. If they do not match, delete the corrupted file and its context to trigger a fresh download on the next attempt.
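The validation step might look like the following in Python, using the standard hashlib module. The function names and the 1 MB read block are assumptions for this sketch, and it assumes the expected checksum is stored as a hex string.

```python
import hashlib


def md5_of_file(path: str, block_size: int = 1024 * 1024) -> str:
    """Hash the file in fixed-size blocks so it never sits fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_checksum(path: str, expected_hex: str) -> bool:
    """Compare hashes case-insensitively, ignoring surrounding whitespace."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Reading in blocks matters here because the file may approach 2 GB, which is more than a mobile process can usually afford to load at once.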

Connect Files vs. Servlet: Optimal Use Cases

Connect Files API: Best suited for backend services and integrations on stable networks where complex resume behavior is not a primary concern. It offers a higher-level REST interface.
Servlet (or VersionData blob) with HTTP Range: Ideal for offline-first mobile applications where robust resumption and dependable behavior on unreliable networks are critical. This approach requires the client to manage metadata such as size and checksum.

Many solutions combine both: the Connect Files API for simpler server-side operations, and the Shepherd Servlet or VersionData endpoint with a custom FilesDownloader for mobile scenarios.

Key Takeaways

Large file downloads on unreliable networks require chunking and resumability.
The Shepherd Servlet or ContentVersion.VersionData endpoints, supporting HTTP Range, are ideal for building mobile downloaders.
Maintain a FileDownloadContext to track progress (startByte) and essential file metadata.
Treat network failures as recoverable, persisting progress to allow seamless resumption.
Validate downloaded files using checksums (e.g., MD5) after completion to ensure data integrity.
