What is “Data Skew” in Salesforce?

Understanding Data Skew in Salesforce

Data skew in Salesforce refers to situations where a disproportionate distribution of records (ownership or relationships) creates performance, sharing, or locking problems. In practice, skew happens when a large number of records are related to a single user or a single parent record, and routine DML or sharing recalculation operations cause contention, slowdowns, or errors such as UNABLE_TO_LOCK_ROW.

Why this matters

As orgs grow, uneven data distribution can cause bulk operations, integrations, and user actions to fail or become slow. Understanding, detecting, and mitigating data skew is essential for keeping large Salesforce implementations stable and performant.

Common types of Data Skew

There are three common skew patterns to watch for:

1. Ownership skew

When many records are owned by a single user or queue. This causes heavy sharing recalculations and locking when those records are updated or when the owning user is moved within the role hierarchy.

2. Parent–child skew (account data skew)

When a single parent record has a very large number of child records (for example, one Account with millions of Contacts). Updating the parent or performing operations on many children in parallel can cause record locking on the parent.

3. Lookup skew

When many records point to the same record through a lookup field. Inserting or updating those records in parallel can lock the referenced record and lead to lock contention.

Symptoms of Data Skew

Look for these signs:

Frequent errors like UNABLE_TO_LOCK_ROW or Deadlock.
Slow or timed-out bulk jobs and integration failures.
Long sharing recalculation times after ownership changes.
High queue lengths or CPU usage during batch updates.

How to detect Data Skew

Use aggregate queries and reports to find hotspots. Example SOQL queries:

SELECT OwnerId, COUNT(Id) cnt FROM Account GROUP BY OwnerId HAVING COUNT(Id) > 10000

To detect parent-child skew (example for Account->Contact):

SELECT AccountId, COUNT(Id) cnt FROM Contact GROUP BY AccountId HAVING COUNT(Id) > 10000

Adjust thresholds (e.g., 10k) to match your org size — any unexpectedly large counts warrant investigation.
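Lookup skew can be checked the same way. The anonymous Apex below is a minimal sketch that assumes the lookup of interest is the standard Case-to-Contact relationship; swap in whichever object and lookup field you suspect. On very large objects an aggregate query like this may be slow or hit limits, so treat it as a starting point rather than a finished tool.

```apex
// Anonymous Apex: list Contacts referenced by more than 10,000 Cases.
// Case and ContactId are only an example pairing; adjust to your own lookup.
for (AggregateResult ar : [
    SELECT ContactId, COUNT(Id) cnt
    FROM Case
    WHERE ContactId != null
    GROUP BY ContactId
    HAVING COUNT(Id) > 10000
]) {
    System.debug('Contact ' + ar.get('ContactId')
        + ' is referenced by ' + ar.get('cnt') + ' Cases');
}
```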

Mitigation and best practices

Strategies to prevent or reduce the impact of data skew:

Design & ownership

Distribute record ownership: avoid single-user ownership for large datasets. Use multiple service users or queues where supported (Cases, Leads).
Use public sharing (OWD = Public Read/Write) where business rules allow — this reduces sharing recalculations.

Data model

Consider splitting very large parent records into multiple parent records to distribute child records.
Avoid design patterns that force many children to reference the same parent/lookup if possible.
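One way to split a large parent is to create several "bucket" parents and spread new children across them. The sketch below is illustrative only: ParentBucketAssigner and its assign method are hypothetical names, and it assumes round-robin assignment is acceptable for your business rules and that the bucket Accounts already exist.

```apex
public class ParentBucketAssigner {
    // Assigns each contact to one of the bucket Accounts in round-robin order,
    // so no single Account accumulates all of the children.
    public static void assign(List<Contact> children, List<Id> bucketAccountIds) {
        if (bucketAccountIds == null || bucketAccountIds.isEmpty()) {
            throw new IllegalArgumentException('At least one bucket Account Id is required');
        }
        for (Integer i = 0; i < children.size(); i++) {
            children[i].AccountId = bucketAccountIds[Math.mod(i, bucketAccountIds.size())];
        }
    }
}
```

Called before insert, this only sets AccountId in memory; the caller still performs the DML.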

Data load & integrations

When loading data, use Bulk API in serial mode (not parallel) to avoid simultaneous operations that target the same parent/owner.
Batch updates into smaller sizes (e.g., 200 or fewer) to reduce lock contention.
During large loads, temporarily disable non-essential triggers, workflows, and synchronous automations (use a controlled switch or a metadata flag).
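Batching helps most when records that share a parent travel together, so that fewer batches compete for the same parent lock. The following is a rough sketch of that idea for Contacts, not a prescribed loader: ParentGroupedBatcher and toBatches are hypothetical names, and grouping by AccountId is an assumption about where the contention comes from.

```apex
public class ParentGroupedBatcher {
    // Groups contacts by AccountId, then cuts the grouped sequence into
    // chunks of at most batchSize records.
    public static List<List<Contact>> toBatches(List<Contact> records, Integer batchSize) {
        Map<Id, List<Contact>> byParent = new Map<Id, List<Contact>>();
        for (Contact c : records) {
            if (!byParent.containsKey(c.AccountId)) {
                byParent.put(c.AccountId, new List<Contact>());
            }
            byParent.get(c.AccountId).add(c);
        }

        List<List<Contact>> batches = new List<List<Contact>>();
        List<Contact> current = new List<Contact>();
        for (List<Contact> siblings : byParent.values()) {
            for (Contact c : siblings) {
                current.add(c);
                if (current.size() == batchSize) {
                    batches.add(current);
                    current = new List<Contact>();
                }
            }
        }
        if (!current.isEmpty()) {
            batches.add(current); // final partial batch
        }
        return batches;
    }
}
```

Each returned batch can then be processed one after another (for example, one per asynchronous job) instead of in parallel.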

Application & code

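Catch UNABLE_TO_LOCK_ROW and retry a limited number of times, ideally with exponential backoff. Apex has no sleep call, so a synchronous loop can only retry immediately; a real backoff needs an asynchronous re-enqueue. Example Apex retry pattern:

    // 'records' is assumed to be a list of sObjects already in scope.
    Integer attempts = 0;
    Boolean success = false;
    while (attempts < 3 && !success) {
        try {
            update records; // perform DML
            success = true;
        } catch (DmlException e) {
            // Retry only on lock contention; rethrow anything else.
            if (!e.getMessage().contains('UNABLE_TO_LOCK_ROW')) {
                throw e;
            }
            attempts++;
            if (attempts == 3) {
                throw e; // give up instead of failing silently
            }
        }
    }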
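For an actual backoff between attempts, one option is a Queueable that re-enqueues itself with a growing delay. This is a sketch under assumptions: LockRetryJob, MAX_ATTEMPTS, and delayMinutes are hypothetical names, and it relies on the two-argument System.enqueueJob(queueable, delay) form, which takes a delay in minutes (0 to 10) and is only available in more recent API versions.

```apex
public class LockRetryJob implements Queueable {
    private static final Integer MAX_ATTEMPTS = 4; // illustrative limit
    private final List<SObject> records;
    private final Integer attempt;

    public LockRetryJob(List<SObject> records) {
        this(records, 0);
    }

    private LockRetryJob(List<SObject> records, Integer attempt) {
        this.records = records;
        this.attempt = attempt;
    }

    // Doubles the delay per attempt (1, 2, 4, 8 ...) and caps it at 10 minutes.
    public static Integer delayMinutes(Integer attempt) {
        Integer delay = 1;
        for (Integer i = 0; i < attempt; i++) {
            delay *= 2;
        }
        return Math.min(delay, 10);
    }

    public void execute(QueueableContext ctx) {
        try {
            update records;
        } catch (DmlException e) {
            Boolean lockError = e.getDmlType(0) == StatusCode.UNABLE_TO_LOCK_ROW;
            if (!lockError || attempt + 1 >= MAX_ATTEMPTS) {
                throw e; // not a lock problem, or out of attempts
            }
            // Try again later in a fresh transaction.
            System.enqueueJob(new LockRetryJob(records, attempt + 1), delayMinutes(attempt));
        }
    }
}
```

A caller would start it with System.enqueueJob(new LockRetryJob(recordsToUpdate)); each retry then runs in its own transaction after the computed delay.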

Sharing & performance

Avoid frequent ownership changes for large data sets; each change can trigger sharing recalculation.
Use asynchronous sharing recalculation when possible and schedule large sharing changes during off-peak windows.

Monitoring

Schedule regular queries/reports to detect growing skews.
Monitor Apex logs, async job failures, and integration error patterns for lock-related exceptions.
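The regular check could be automated with a scheduled Apex class. The sketch below is one possible shape, assuming the Account-to-Contact relationship and a 10,000-record threshold; SkewMonitor is a hypothetical name, and writing to the debug log stands in for whatever alerting your org actually uses. On very large objects the aggregate query may be slow or hit limits.

```apex
public class SkewMonitor implements Schedulable {
    public void execute(SchedulableContext ctx) {
        // Report the 50 Accounts with the most Contacts above the threshold.
        for (AggregateResult ar : [
            SELECT AccountId, COUNT(Id) cnt
            FROM Contact
            WHERE AccountId != null
            GROUP BY AccountId
            HAVING COUNT(Id) > 10000
            ORDER BY COUNT(Id) DESC
            LIMIT 50
        ]) {
            System.debug(LoggingLevel.WARN, 'Possible parent-child skew: Account '
                + ar.get('AccountId') + ' has ' + ar.get('cnt') + ' Contacts');
        }
    }
}
```

It could be scheduled from anonymous Apex, for example daily at 2 a.m.: System.schedule('Skew monitor', '0 0 2 * * ?', new SkewMonitor());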

Summary

Data skew is a common scalability issue in Salesforce where uneven data distribution causes locking, sharing recalculations, and performance problems. Prevention begins with a thoughtful data model, distributed ownership, and careful design of bulk processes and integrations. When it occurs, detect it with aggregate queries and address it using redistribution, batching, asynchronous processing, and retry logic.
