SFDC Developers

Custom CSV Import Architecture for Salesforce Developers

Vinay Vernekar · 4 min read

Introduction

When standard Salesforce tools like Data Loader or Import Wizard fall short—perhaps due to complex multi-object relationships, custom validation logic, or the need for a seamless end-user UI—building a custom CSV import engine becomes necessary. In this guide, we’ll architect a resilient solution using Apex, Custom Metadata Types, and Batch processing to handle large datasets while respecting governor limits.

1. Defining the Mapping Layer

Hardcoding column-to-field mappings is a recipe for technical debt. Instead, we use Custom Metadata Types (CMT) to store mapping configurations. This allows admins to update field mappings without a code deployment.

First, define a custom metadata type named CSV_Field_Mapping__mdt with fields: Source_Column__c, Target_Field__c, and Target_Object__c.

public class MappingService {
    public static Map<String, CSV_Field_Mapping__mdt> getMappings(String importType) {
        Map<String, CSV_Field_Mapping__mdt> mappingMap = new Map<String, CSV_Field_Mapping__mdt>();
        for (CSV_Field_Mapping__mdt mdt : [SELECT Source_Column__c, Target_Field__c, Target_Object__c 
                                         FROM CSV_Field_Mapping__mdt 
                                         WHERE Import_Context__c = :importType]) {
            mappingMap.put(mdt.Source_Column__c, mdt);
        }
        return mappingMap;
    }
}
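
A caller consumes the service by keying into the returned map with a source column header. The `'AccountImport'` context value and `'Company Name'` column below are illustrative, not part of the service itself:

```apex
// Illustrative usage — 'AccountImport' and 'Company Name' are example values
Map<String, CSV_Field_Mapping__mdt> mappings = MappingService.getMappings('AccountImport');
CSV_Field_Mapping__mdt companyMapping = mappings.get('Company Name');
// An admin-configured row might resolve to:
//   companyMapping.Target_Object__c == 'Account'
//   companyMapping.Target_Field__c  == 'Name'
```

Because CMT queries do not count against the SOQL query limit, calling this service from each batch `execute` is safe.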

2. Chunking Data with Batch Apex

For large files, loading the entire CSV into memory will result in a heap size limit exception. We must process the file in chunks. If the file is uploaded via a Lightning Web Component, store it in a ContentVersion record, read its body server-side, and pass the CSV content to a Database.Batchable class.

public class CSVImportBatch implements Database.Batchable<String>, Database.Stateful {
    private String content;
    
    public CSVImportBatch(String content) {
        this.content = content;
    }

    public Iterable<String> start(Database.BatchableContext bc) {
        // Split by line and drop the header row before batching data lines
        List<String> lines = content.split('\n');
        lines.remove(0);
        return lines;
    }

    public void execute(Database.BatchableContext bc, List<String> lines) {
        List<SObject> recordsToInsert = new List<SObject>();
        Map<String, CSV_Field_Mapping__mdt> mappings = MappingService.getMappings('MyImport');
        
        for (String line : lines) {
            // Note: a plain split does not handle quoted values that contain
            // commas; use a proper CSV parser if your data may include them
            List<String> columns = line.split(',');
            // Transform the row into an SObject based on the CMT mappings
            recordsToInsert.add(transformRow(columns, mappings));
        }
        
        Database.insert(recordsToInsert, false); // Partial success allowed
    }
    
    public void finish(Database.BatchableContext bc) {
        // Notify user via Platform Event or Email
    }
}
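
The `transformRow` helper referenced in `execute` is not shown above. One possible sketch, assuming the header line is parsed and kept on the batch class (e.g. as a `headers` instance variable) so column indexes can be paired with source column names:

```apex
// Sketch of transformRow — assumes a 'headers' instance variable holding the
// parsed header row, e.g. populated in the constructor before the batch runs.
private SObject transformRow(List<String> columns, Map<String, CSV_Field_Mapping__mdt> mappings) {
    // All rows in one import target the same object, so read it from any mapping entry
    String targetObject = mappings.values()[0].Target_Object__c;
    SObject record = Schema.getGlobalDescribe().get(targetObject).newSObject();

    for (Integer i = 0; i < headers.size(); i++) {
        CSV_Field_Mapping__mdt mapping = mappings.get(headers[i].trim());
        if (mapping != null && i < columns.size()) {
            record.put(mapping.Target_Field__c, columns[i].trim());
        }
    }
    return record;
}
```

Note that `put` with raw string values only works for text fields; number, date, and checkbox targets need explicit type coercion based on the field's describe metadata.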

3. Implementing Multi-Object Strategy

Often, a single CSV row maps to multiple objects (e.g., an Account and its related Contact). To maintain data integrity, use the Unit of Work pattern or a parent-child processing sequence. If you insert parent records first, capture the returned IDs to map to child records in the same transaction block.

  • Parent-First Approach: Identify the parent record in the row, create it, and add it to a map using a unique external ID from the CSV.
  • Relational Mapping: Use the captured external IDs to link child records during the final DML operation.
  • Validation: Use try-catch blocks within the execute method to handle individual row failures without rolling back the entire batch.
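
The parent-first approach can be sketched as follows. The `parsedRows` variable and the column positions are illustrative, and the external key here is simply the CSV's unique identifier column:

```apex
// Parent-first sketch — Account rows keyed by a unique identifier column (row[0])
Map<String, Account> parentsByKey = new Map<String, Account>();
for (List<String> row : parsedRows) {
    parentsByKey.put(row[0], new Account(Name = row[1]));
}
insert parentsByKey.values(); // IDs are now populated on the map values

// Children link back to their parent via the captured IDs
List<Contact> children = new List<Contact>();
for (List<String> row : parsedRows) {
    children.add(new Contact(LastName = row[2], AccountId = parentsByKey.get(row[0]).Id));
}
Database.insert(children, false); // partial success for child rows
```

Deduplicating the map by key also means repeated parent values in the CSV produce a single parent record.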

4. Validation and Error Handling

Custom imports require robust validation before hitting the database. Implement a ValidationEngine class that evaluates business rules dynamically.

public class ValidationEngine {
    public static Boolean isValid(SObject record, Map<String, Object> data) {
        // Execute custom cross-field validation rules.
        // Cast before comparing: Object values cannot be compared numerically in Apex.
        Decimal revenue = (Decimal) data.get('AnnualRevenue');
        if (revenue != null && revenue < 0) {
            return false;
        }
        return true;
    }
}

When a row fails, capture the error in a custom Import_Error_Log__c object. Provide the user with a downloadable CSV containing only the failed rows and their associated error messages.
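
Capturing those failures from a partial-success insert might look like the sketch below. The `Import_Error_Log__c` object is from the text above, but its fields (`Row_Number__c`, `Error_Message__c`) are assumed names:

```apex
// Sketch — log each failed row from a partial-success insert
List<Database.SaveResult> results = Database.insert(recordsToInsert, false);
List<Import_Error_Log__c> errorLogs = new List<Import_Error_Log__c>();
for (Integer i = 0; i < results.size(); i++) {
    if (!results[i].isSuccess()) {
        errorLogs.add(new Import_Error_Log__c(
            Row_Number__c = i + 1, // hypothetical field: 1-based row index
            Error_Message__c = results[i].getErrors()[0].getMessage() // hypothetical field
        ));
    }
}
if (!errorLogs.isEmpty()) {
    insert errorLogs;
}
```

Because `Database.SaveResult` order matches the input list, the index reliably identifies which CSV row failed.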

5. Performance Considerations

  • DML Optimization: Always perform bulk DML operations. Avoid DML inside loops.
  • Governor Limits: If processing complex logic, be mindful of CPU time. If CPU limits are hit, consider moving logic to a Queueable Apex job or breaking the batch into smaller scope sizes.
  • Asynchronous Processing: Always run large imports in Database.Batchable to ensure the user isn't stuck waiting for the process to complete.
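
Tying the pieces together, a launcher that reads the uploaded file and starts the batch with a reduced scope size might look like this (the method name and the scope size of 500 are illustrative):

```apex
// Launcher sketch — reads the uploaded CSV from ContentVersion and starts the batch
public static Id startImport(Id contentDocumentId) {
    ContentVersion cv = [SELECT VersionData FROM ContentVersion
                         WHERE ContentDocumentId = :contentDocumentId
                         AND IsLatest = true LIMIT 1];
    String csvBody = cv.VersionData.toString();
    // A smaller scope (e.g. 500 lines per execute) leaves CPU headroom per transaction
    return Database.executeBatch(new CSVImportBatch(csvBody), 500);
}
```

The second argument to `Database.executeBatch` controls how many lines each `execute` invocation receives, which is the lever for the CPU-time concerns above.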

Key Takeaways

  • Use CMTs: Never hardcode field mappings; use Custom Metadata Types for flexibility.
  • Batching is Mandatory: Use Database.Batchable to manage memory and avoid heap size issues with large CSVs.
  • Partial Success: Use Database.insert(records, false) to allow for partial imports, logging failures for later review.
  • Decouple Validation: Separate your parsing, mapping, and validation logic into distinct service classes for maintainability and testability.
  • User Feedback: Implement a status tracking mechanism (e.g., via Platform Events) to keep users updated on the import progress.

Vinay Vernekar

Salesforce Developer & Founder

Vinay is a seasoned Salesforce developer with over a decade of experience building enterprise solutions on the Salesforce platform. He founded SFDCDevelopers.com to share practical tutorials, best practices, and career guidance with the global Salesforce community.
