Data Cloud Insights: Calculated vs. Streaming Engine Selection
In Salesforce Data Cloud, the choice of computation engine, Calculated Insights (CI) or Streaming Insights (SI), significantly affects data freshness, action latency, and operational cost. The two engines solve distinct architectural problems, and mismatching the engine to the requirement leads to stale segment definitions or excessive credit consumption.
1. Core Philosophical Differences: Timing and Scope
The fundamental divergence between CI and SI lies in when the computation executes and the scope of the data available during evaluation.
Calculated Insights (CI): Durable, Historical Aggregation
Calculated Insights operate across the entire Data Cloud estate, joining Data Model Objects (DMOs), Data Lake Objects (DLOs), Unified Profiles, and historical engagement data. CIs are designed for relational, multi-object aggregation over long time horizons, resulting in durable, business-critical attributes.
Typical CI Use Cases (Stable Metrics):
- Engagement Consistency Score: Rolling measure aggregated over weeks/months.
- Cross-Channel Preference Index: Composite score derived from historical interaction records.
- Purchase Cadence Profile: Analyzing purchase intervals across years of data.
CI computations are scheduled (often daily), not real-time, targeting stable attributes required for consistent segmentation and scoring models.
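To make the "long time horizon" idea concrete, here is a minimal plain-Python sketch of one of the use cases above, the Purchase Cadence Profile. It is not Data Cloud code: the `ORDERS` sample data and the `purchase_cadence_days` helper are hypothetical names introduced for illustration, and a real CI would compute this inside Data Cloud on its schedule.

```python
from datetime import date
from statistics import mean

# Hypothetical order history as (profile_id, order_date) pairs.
# This stands in for years of historical data; it is not a Data Cloud API.
ORDERS = [
    ("p1", date(2023, 1, 1)),
    ("p1", date(2023, 1, 31)),
    ("p1", date(2023, 3, 2)),
    ("p2", date(2023, 5, 10)),
]


def purchase_cadence_days(orders):
    """Return {profile_id: mean days between consecutive orders}.

    Profiles with fewer than two orders map to None, since no interval exists.
    """
    by_profile = {}
    for profile_id, order_date in orders:
        by_profile.setdefault(profile_id, []).append(order_date)

    cadence = {}
    for profile_id, dates in by_profile.items():
        dates.sort()
        # Gap in days between each order and the next one.
        gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
        cadence[profile_id] = mean(gaps) if gaps else None
    return cadence


if __name__ == "__main__":
    print(purchase_cadence_days(ORDERS))
```

The point of the sketch is that the metric needs every historical order for a profile, which is why it fits a scheduled, full-history engine rather than a windowed one.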
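CI Emulation Example (Apex-Style Pseudo-Code):
This pseudo-code mirrors the shape of a durable CI calculation by aggregating historical order data per Unified Profile. It is illustrative only: the object and field names are placeholders, and an actual CI is defined inside Data Cloud rather than in Apex.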
// Step 1: Aggregate data from child object (Order__dlm)
List<AggregateResult> orderAggregates = [
    SELECT ProfileId__c,
           SUM(TotalAmount__c) LifetimeValue,
           COUNT(Id) TotalOrders,
           MAX(OrderDate__c) LastPurchaseDate
    FROM Order__dlm
    GROUP BY ProfileId__c
];

// Step 2: Map aggregates by ProfileId
Map<Id, AggregateResult> aggregatesByProfile = new Map<Id, AggregateResult>();
for (AggregateResult ar : orderAggregates) {
    aggregatesByProfile.put((Id) ar.get('ProfileId__c'), ar);
}

// Step 3: Query parent Unified Profiles
List<UnifiedProfile__dlm> profiles = [
    SELECT Id, Name
    FROM UnifiedProfile__dlm
    WHERE Id IN :aggregatesByProfile.keySet()
];

// Step 4: Combine and output results (example)
for (UnifiedProfile__dlm profile : profiles) {
    AggregateResult ar = aggregatesByProfile.get(profile.Id);
    System.debug('Profile: ' + profile.Name);
    System.debug('Lifetime Value: ' + ar.get('LifetimeValue'));
    System.debug('Total Orders: ' + ar.get('TotalOrders'));
    System.debug('Last Purchase Date: ' + ar.get('LastPurchaseDate'));
}
Notes on CI Emulation: Order__dlm stands in for a child Data Model Object (the __dlm suffix) that relates to UnifiedProfile__dlm through ProfileId__c. The SUM, COUNT, and MAX aggregates reflect the kind of durable metrics a CI produces, typically on a defined schedule.
Streaming Insights (SI): Event-Driven, Windowed Evaluation
Streaming Insights evaluate events immediately as they arrive, using rolling or tumbling time windows. Crucially, SIs do not scan full historical data; their scope is restricted to recent activity.
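The difference between tumbling and rolling windows is easy to blur, so here is a minimal plain-Python sketch of both. It is a simulation, not Data Cloud syntax: the helper names (`tumbling_window_start`, `tumbling_counts`, `rolling_count`) and the 15-minute `WINDOW` size are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)  # assumed window size for this sketch
EPOCH = datetime(1970, 1, 1)


def tumbling_window_start(ts, size=WINDOW):
    """Align ts down to the start of its fixed, non-overlapping window."""
    buckets = (ts - EPOCH) // size  # whole windows elapsed since EPOCH
    return EPOCH + buckets * size


def tumbling_counts(timestamps, size=WINDOW):
    """Count events per tumbling window, keyed by window start."""
    counts = {}
    for ts in timestamps:
        start = tumbling_window_start(ts, size)
        counts[start] = counts.get(start, 0) + 1
    return counts


def rolling_count(timestamps, now, size=WINDOW):
    """Count events in the trailing window (now - size, now]."""
    return sum(1 for ts in timestamps if now - size < ts <= now)


if __name__ == "__main__":
    events = [
        datetime(2024, 1, 1, 10, 5),
        datetime(2024, 1, 1, 10, 14),
        datetime(2024, 1, 1, 10, 16),
    ]
    print(tumbling_counts(events))
    print(rolling_count(events, datetime(2024, 1, 1, 10, 17)))
```

With these three events, the tumbling windows starting at 10:00 and 10:15 hold two events and one event respectively, while a rolling window evaluated at 10:17 sees all three. A threshold such as "at least 3 views" can therefore fire under one window type and not the other, which is worth deciding deliberately.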
Typical SI Use Cases (Real-Time Signals):
- Digital Behavior Events: Immediate calculation on web session activity (e.g., page views, session duration).
- Mobile Interaction Streams: In-app usage events requiring low latency response.
- External Telemetry: Near-real-time signals ingested from external transactional systems.
SI Example: Detecting High-Intent Users (Trailing 15-Minute Window):
This pseudo-code expresses the window logic as a query over streaming events (__e object) within a precise time constraint, identifying users with significant recent activity. It is illustrative only: platform events cannot actually be queried with SOQL, and because the window is anchored to the current time, it is a trailing (rolling) window rather than a tumbling one.
// Step 1: Define the time window (trailing 15 minutes, ending now)
Datetime windowEnd = System.now();
Datetime windowStart = windowEnd.addMinutes(-15);

// Step 2: Query Web Engagement Events in the window
List<AggregateResult> recentEvents = [
    SELECT UserId,
           COUNT(Id) eventCount
    FROM WebEngagementEvent__e
    WHERE EventType__c = 'ProductView'
      AND CreatedDate >= :windowStart
      AND CreatedDate <= :windowEnd
    GROUP BY UserId
    HAVING COUNT(Id) >= 3
];

// Step 3: Process results
for (AggregateResult ar : recentEvents) {
    Id userId = (Id) ar.get('UserId');
    Integer views = (Integer) ar.get('eventCount');
    System.debug('User ' + userId + ' viewed products ' + views + ' times in the last 15 minutes.');
    // Trigger real-time actions based on this high-intent signal
}
Notes on SI: The query relies strictly on CreatedDate falling within the defined time boundaries (windowStart to windowEnd). The HAVING clause keeps only users whose event count within that window is at least 3.
2. Side-by-Side Comparison
| Feature | Calculated Insights (CI) | Streaming Insights (SI) |
|---|---|---|
| Data Scope | Full Data Cloud (Profiles, DMOs, DLOs, historical) | Incoming engagement events only |
| Latency | Scheduled (e.g., hourly, daily) | Seconds to minutes (event-driven) |
| Computation Style | Relational, multi-object, historical joins | Window-based (tumbling/rolling), event-driven |
| Profile Joins | Supported natively | Not supported |
| Primary Usage | Segmentation, durable scoring, baseline metrics | Real-time triggers, immediate alerts, time-sensitive actions |
| Cost Behavior | Predictable, scheduled credit usage | Scales directly with event velocity (continuous consumption) |
Critical Constraint: The inability of Streaming Insights to directly join Unified Profiles necessitates architectural design around this limitation. Real-time logic requiring static attributes (like loyalty tier) must use a pattern where SI detects the signal, and a subsequent Data Action or Flow enriches this signal with pre-calculated CI attributes.
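The detect-then-enrich pattern can be sketched in plain Python as follows. This is a simulation under stated assumptions, not a Data Action or Flow implementation: `CI_ATTRIBUTES`, `enrich_signal`, `choose_action`, and the `loyalty_tier` values are hypothetical names used only to show the shape of the hand-off.

```python
# Hypothetical store of pre-calculated CI attributes, keyed by profile id.
# In a real implementation these would come from Data Cloud, not a dict.
CI_ATTRIBUTES = {
    "p1": {"loyalty_tier": "Gold", "lifetime_value": 1250.0},
}


def enrich_signal(signal, ci_attributes):
    """Merge a streaming signal with durable CI attributes for its profile.

    Signal fields win on key collisions. The 'enriched' flag records whether
    any CI attributes were found for the profile.
    """
    attrs = ci_attributes.get(signal["profile_id"])
    enriched = dict(attrs or {})
    enriched.update(signal)
    enriched["enriched"] = attrs is not None
    return enriched


def choose_action(enriched):
    """Pick a response using both the real-time signal and the CI attribute."""
    if enriched.get("loyalty_tier") == "Gold":
        return "priority_offer"
    return "standard_offer"


if __name__ == "__main__":
    signal = {"profile_id": "p1", "event": "high_intent", "views": 4}
    print(choose_action(enrich_signal(signal, CI_ATTRIBUTES)))
```

The streaming side only produces the signal; the join to profile state happens afterwards as a keyed lookup, so an unknown profile degrades to the default action instead of blocking the response.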
3. Selection Criteria: Segment vs. Trigger
The decision matrix should be driven by the metric's purpose:
Choose Calculated Insights for Segmentation and Definition
Use CI when the requirement is to define stable, reusable metrics that represent long-term customer state or value. These metrics form the foundation of the customer model.
- Rule: If the metric needs to be referenced repeatedly across audience definitions, reporting, or long-term propensity models, it belongs in CI due to its durability and ability to handle complex relational joins across historical data.
Choose Streaming Insights for Immediate Action
Use SI when timeliness is paramount, and the data's value decays rapidly (e.g., within minutes).
- Rule: If the metric drives immediate, time-sensitive engagement or automated decisioning (e.g., abandoned cart), SI is the correct engine, despite its higher, velocity-dependent credit cost.
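The two rules above can be condensed into a small decision helper. This is a sketch, not an official selection algorithm: the `choose_engine` function, its parameters, and the 15-minute `REALTIME_THRESHOLD_SECONDS` cutoff are assumptions introduced here, and the cutoff should be tuned to the actual use case.

```python
# Assumed cutoff: a staleness tolerance at or below this counts as time-sensitive.
REALTIME_THRESHOLD_SECONDS = 15 * 60


def choose_engine(
    max_staleness_seconds,
    needs_profile_join=False,
    needs_full_history=False,
    realtime_threshold=REALTIME_THRESHOLD_SECONDS,
):
    """Suggest an engine for a metric based on freshness and data-scope needs.

    Returns "SI", "CI", or "SI + CI enrichment" (streaming detection followed
    by a lookup of pre-calculated CI attributes).
    """
    time_sensitive = max_staleness_seconds <= realtime_threshold
    needs_batch_scope = needs_profile_join or needs_full_history

    if time_sensitive and needs_batch_scope:
        # SI cannot join profiles or scan history, so pair it with CI output.
        return "SI + CI enrichment"
    if time_sensitive:
        return "SI"
    return "CI"


if __name__ == "__main__":
    print(choose_engine(60))                              # abandoned-cart style trigger
    print(choose_engine(86400, needs_full_history=True))  # lifetime-value style metric
    print(choose_engine(60, needs_profile_join=True))     # real-time offer by loyalty tier
```

The third case is the one that tends to be missed: a requirement that is both time-sensitive and profile-dependent does not force a choice between engines, it calls for combining them.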
Key Takeaways
- CI (Batch): Handles durability, full data context, relational joins, and long-term metrics (e.g., CLV, tenure). Cost is predictable.
- SI (Stream): Handles immediacy, event velocity, and short time windows. It cannot perform profile joins.
- Architectural Synergy: Effective Data Cloud implementations assign CI the role of calculating durable profile attributes, which Streaming Insights (via subsequent Data Actions/Flows) can consume to enrich time-critical, event-driven responses.