Introduction
When standard Salesforce tools like Data Loader or Import Wizard fall short—perhaps due to complex multi-object relationships, custom validation logic, or the need for a seamless end-user UI—building a custom CSV import engine becomes necessary. In this guide, we’ll architect a resilient solution using Apex, Custom Metadata Types, and Batch processing to handle large datasets while respecting governor limits.
1. Defining the Mapping Layer
Hardcoding column-to-field mappings is a recipe for technical debt. Instead, we use Custom Metadata Types (CMT) to store mapping configurations. This allows admins to update field mappings without a code deployment.
First, define a custom metadata type named CSV_Field_Mapping__mdt with fields: Source_Column__c, Target_Field__c, Target_Object__c, and Import_Context__c (used to group mappings by import type).
```apex
public class MappingService {
    public static Map<String, CSV_Field_Mapping__mdt> getMappings(String importType) {
        Map<String, CSV_Field_Mapping__mdt> mappingMap = new Map<String, CSV_Field_Mapping__mdt>();
        for (CSV_Field_Mapping__mdt mdt : [SELECT Source_Column__c, Target_Field__c, Target_Object__c
                                           FROM CSV_Field_Mapping__mdt
                                           WHERE Import_Context__c = :importType]) {
            mappingMap.put(mdt.Source_Column__c, mdt);
        }
        return mappingMap;
    }
}
```
2. Chunking Data with Batch Apex
For large files, loading the entire CSV into memory will result in a Heap Size Limit Exceeded exception. We must process the file in chunks. If the file is uploaded via a Lightning Web Component, store the file in a ContentVersion record and pass the ContentDocumentId to a Database.Batchable class.
```apex
public class CSVImportBatch implements Database.Batchable<String>, Database.Stateful {
    private Id contentDocumentId;

    public CSVImportBatch(Id contentDocumentId) {
        this.contentDocumentId = contentDocumentId;
    }

    public Iterable<String> start(Database.BatchableContext bc) {
        ContentVersion version = [SELECT VersionData FROM ContentVersion
                                  WHERE ContentDocumentId = :contentDocumentId
                                  AND IsLatest = true LIMIT 1];
        // Split by line (handling Windows line endings), then drop the header row
        List<String> lines = version.VersionData.toString().split('\r?\n');
        lines.remove(0);
        return lines;
    }

    public void execute(Database.BatchableContext bc, List<String> lines) {
        List<SObject> recordsToInsert = new List<SObject>();
        Map<String, CSV_Field_Mapping__mdt> mappings = MappingService.getMappings('MyImport');
        for (String line : lines) {
            // Note: a naive split does not handle quoted fields containing commas
            List<String> columns = line.split(',');
            // Transform the columns into an SObject based on the metadata mappings
            recordsToInsert.add(transformRow(columns, mappings));
        }
        Database.insert(recordsToInsert, false); // Partial success allowed
    }

    public void finish(Database.BatchableContext bc) {
        // Notify user via Platform Event or Email
    }
}
```
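The transformRow helper called in execute is where the metadata mappings are applied. A minimal sketch, assuming the header row has been captured into an ordered list (so headers[i] lines up with columns[i]) and that every mapping in one import context targets the same object:

```apex
public class RowTransformer {
    // Sketch only: headers is the ordered header row captured before batching,
    // so headers[i] corresponds to columns[i] for every data row.
    public static SObject transformRow(List<String> headers, List<String> columns,
                                       Map<String, CSV_Field_Mapping__mdt> mappings) {
        // Assumes all mappings in this import context share one Target_Object__c
        String targetObject = mappings.values()[0].Target_Object__c;
        SObject record = Schema.getGlobalDescribe().get(targetObject).newSObject();
        for (Integer i = 0; i < headers.size() && i < columns.size(); i++) {
            CSV_Field_Mapping__mdt mapping = mappings.get(headers[i].trim());
            if (mapping != null) {
                record.put(mapping.Target_Field__c, columns[i].trim());
            }
        }
        return record;
    }
}
```

Note that record.put with a raw String only works for text-like fields; numeric and date columns need explicit conversion (e.g. Decimal.valueOf, Date.valueOf) before assignment.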
3. Implementing Multi-Object Strategy
Often, a single CSV row maps to multiple objects (e.g., an Account and its related Contact). To maintain data integrity, use the Unit of Work pattern or a parent-child processing sequence. If you insert parent records first, capture the returned IDs to map to child records in the same transaction block.
- Parent-First Approach: Identify the parent record in the row, create it, and add it to a map using a unique external ID from the CSV.
- Relational Mapping: Use the captured external IDs to link child records during the final DML operation.
- Validation: Use try-catch blocks within the execute method to handle individual row failures without rolling back the entire batch.
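As a sketch of the parent-first sequence, assuming each parsed row carries a hypothetical Account_External_Id__c value at index 0, an account name at index 1, and a contact last name at index 2:

```apex
// parsedRows: List<List<String>> produced by the CSV parsing step
Map<String, Account> accountsByExtId = new Map<String, Account>();
for (List<String> row : parsedRows) {
    String extId = row[0];
    if (!accountsByExtId.containsKey(extId)) {
        accountsByExtId.put(extId, new Account(
            Name = row[1],
            Account_External_Id__c = extId // hypothetical external ID field
        ));
    }
}
insert accountsByExtId.values(); // IDs are now populated on the in-memory records

List<Contact> contacts = new List<Contact>();
for (List<String> row : parsedRows) {
    contacts.add(new Contact(
        LastName = row[2],
        AccountId = accountsByExtId.get(row[0]).Id // link via the captured parent ID
    ));
}
insert contacts;
```

Because both DML statements run in the same execute invocation, a surrounding Savepoint can roll back the pair together if needed.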
4. Validation and Error Handling
Custom imports require robust validation before hitting the database. Implement a ValidationEngine class that evaluates business rules dynamically.
```apex
public class ValidationEngine {
    public static Boolean isValid(SObject record, Map<String, Object> data) {
        // Execute custom cross-field validation rules
        Object revenue = data.get('AnnualRevenue');
        if (revenue != null && (Decimal) revenue < 0) {
            return false;
        }
        return true;
    }
}
```
When a row fails, capture the error in a custom Import_Error_Log__c object. Provide the user with a downloadable CSV containing only the failed rows and their associated error messages.
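A sketch of that logging step, assuming Import_Error_Log__c has Row_Number__c, Error_Message__c, and Row_Data__c fields, and that lines[i] produced recordsToInsert[i]:

```apex
List<Database.SaveResult> results = Database.insert(recordsToInsert, false);
List<Import_Error_Log__c> errorLogs = new List<Import_Error_Log__c>();
List<String> failedRows = new List<String>{'Row,Error'}; // header of the downloadable CSV
for (Integer i = 0; i < results.size(); i++) {
    if (!results[i].isSuccess()) {
        String message = results[i].getErrors()[0].getMessage();
        errorLogs.add(new Import_Error_Log__c(
            Row_Number__c = i + 1,
            Error_Message__c = message,
            Row_Data__c = lines[i]
        ));
        failedRows.add(lines[i] + ',' + message.escapeCsv());
    }
}
insert errorLogs;
// Serve the failed-row CSV back to the user, e.g. as a ContentVersion
String errorCsv = String.join(failedRows, '\n');
```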
5. Performance Considerations
- DML Optimization: Always perform bulk DML operations. Avoid DML inside loops.
- Governor Limits: If processing complex logic, be mindful of CPU time. If CPU limits are hit, consider moving logic to a Queueable Apex job or breaking the batch into smaller scope sizes.
- Asynchronous Processing: Always run large imports in Database.Batchable to ensure the user isn't stuck waiting for the process to complete.
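A minimal Queueable sketch for offloading the expensive portion of a chunk (the HeavyTransformJob name and its insert-only body are placeholders):

```apex
public class HeavyTransformJob implements Queueable {
    private List<SObject> records;

    public HeavyTransformJob(List<SObject> records) {
        this.records = records;
    }

    public void execute(QueueableContext context) {
        // CPU-intensive enrichment runs in its own transaction with fresh limits
        Database.insert(records, false);
    }
}
// Enqueued from the batch's execute method:
// System.enqueueJob(new HeavyTransformJob(recordsToInsert));
```

Scope size is controlled when the batch is launched, e.g. Database.executeBatch(new CSVImportBatch(docId), 50);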
Key Takeaways
- Use CMTs: Never hardcode field mappings; use Custom Metadata Types for flexibility.
- Batching is Mandatory: Use Database.Batchable to manage memory and avoid heap size issues with large CSVs.
- Partial Success: Use Database.insert(records, false) to allow for partial imports, logging failures for later review.
- Decouple Validation: Separate your parsing, mapping, and validation logic into distinct service classes for maintainability and testability.
- User Feedback: Implement a status tracking mechanism (e.g., via Platform Events) to keep users updated on the import progress.
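For the status-tracking point above, a sketch using an assumed Import_Status__e Platform Event with Status__c and Records_Processed__c fields:

```apex
// Published from the batch's finish method; processedCount would be
// accumulated across chunks via Database.Stateful.
Import_Status__e statusEvent = new Import_Status__e(
    Status__c = 'Completed',
    Records_Processed__c = processedCount
);
Database.SaveResult result = EventBus.publish(statusEvent);
// A Lightning Web Component can subscribe to /event/Import_Status__e via lightning/empApi
```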