Executing Batch Apex in Sequence - Archive of IC Blog

Fellow Salesforce developers out there, I am going to fill you in on the best Force.com coding tip I heard at Dreamforce this year. It was one of the key points in the Apex Design Patterns and Best Practices breakout sessions, but the question (and solution) was brought up in many different sessions throughout the rest of the week.

How do you execute multiple Batch Apex jobs in a particular sequence?

Suppose you need to run a nightly batch to update a field on all your Accounts. Once that has been completed, you have to run a separate batch on all your Contacts using the newly updated Account information. What do you do?

Well, first, you need to create your two batch classes. They would probably look something like this:

global class AccountBatch implements Database.Batchable<sObject>{

    global Database.QueryLocator start(Database.BatchableContext BC){
        return Database.getQueryLocator('SELECT id FROM Account');
    }

    global void execute(Database.BatchableContext BC, List<sObject> scope){
        // do your Account updating here
    }

    global void finish(Database.BatchableContext BC){

    }

}

global class ContactBatch implements Database.Batchable<sObject>{

    global Database.QueryLocator start(Database.BatchableContext BC){
        return Database.getQueryLocator('SELECT id FROM Contact');
    }

    global void execute(Database.BatchableContext BC, List<sObject> scope){
        // do your Contact updating here
    }

    global void finish(Database.BatchableContext BC){

    }

}

And then to fire off the batches, you would run this schedule every night:

global class BatchScheduler implements Schedulable {
    global void execute(SchedulableContext sc){
        Database.executeBatch(new AccountBatch());
        Database.executeBatch(new ContactBatch());
    }
}
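
For completeness, here is a minimal sketch of how the nightly schedule itself might be registered from Execute Anonymous. The job name and the 1:00 AM run time are illustrative assumptions, not values from the session.

```apex
// Cron fields: seconds minutes hours day-of-month month day-of-week
// '0 0 1 * * ?' means every day at 1:00 AM (assumed time, adjust as needed).
String nightlyCron = '0 0 1 * * ?';

// 'Nightly Batch Scheduler' is a hypothetical job name.
String cronJobId = System.schedule('Nightly Batch Scheduler', nightlyCron, new BatchScheduler());
System.debug('Scheduled job id: ' + cronJobId);
```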

Seems easy enough. You are calling the Account batch before the Contact batch, so you would think that would be the order they execute in, right?

Well, not necessarily.

Batch Apex, like any asynchronous call in Salesforce, is put on a queue to be run at a future time. That future time could be milliseconds after you call the executeBatch() method, or it could be an hour later. You really do not have any control over when your batch starts.

Furthermore, despite what you learned in your Comp Sci 101 class, this queue is not first in, first out. Salesforce determines the order of Batch Apex execution based on a number of factors, such as available resources or the size of your code. This means that no matter what order you put them in the queue, either batch could start before the other one. In fact, in most cases they will run side by side.
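
If you want to see this for yourself, one option is to keep the job IDs that Database.executeBatch() returns and look the jobs up in AsyncApexJob. This is only a sketch for poking around in Execute Anonymous; the variable names are my own.

```apex
// Queue both jobs and keep the IDs that Database.executeBatch returns.
Id accountJobId = Database.executeBatch(new AccountBatch());
Id contactJobId = Database.executeBatch(new ContactBatch());
List<Id> jobIds = new List<Id>{ accountJobId, contactJobId };

// Queried right away, the jobs will usually still be waiting to run.
// Re-run the query later with the same IDs to compare when each one completed.
for (AsyncApexJob job : [
        SELECT Id, ApexClass.Name, Status, CreatedDate, CompletedDate
        FROM AsyncApexJob
        WHERE Id IN :jobIds]) {
    System.debug(job.ApexClass.Name + ' -> ' + job.Status
        + ' (completed: ' + job.CompletedDate + ')');
}
```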

So, if you can’t control when either batch begins or what order they will run in, how exactly do we solve this problem?

We only execute the first batch, then we utilize its finish() method to schedule the second batch.

To accomplish this, we need to move the second executeBatch() call to a brand new schedulable class, making our calls now look like this:

global class BatchScheduler implements Schedulable {
    global void execute(SchedulableContext sc){
        Database.executeBatch(new AccountBatch());
    }
}

global class BatchScheduler2 implements Schedulable {
    global void execute(SchedulableContext sc){
        Database.executeBatch(new ContactBatch());
    }
}

Then we have to set up a way of storing the Job_Id of that schedulable class (more on what this means later). We could store that ID in a custom field somewhere, but I recommend using a Custom Setting to store it.

Create a Hierarchy custom setting called BatchSchedule__c with just one field, a text field called scheduled_id__c with a default value of ‘0’. Set the Default Organization Level Value by clicking “Manage” on the BatchSchedule__c detail page, then click the top “New” button (the one above the Default Organization Level Value section), and then save the default values. Without a default value, you will get a null pointer exception when you first try to run the batch.
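
If you would rather not click through the setup pages, the organization-level default can also be created from Execute Anonymous. This is a sketch that assumes the BatchSchedule__c setting and its scheduled_id__c field already exist as described above.

```apex
// Create the org-level default record only if it does not exist yet.
// For a hierarchy custom setting, the org default is the record whose
// SetupOwnerId is the organization Id.
BatchSchedule__c orgDefault = BatchSchedule__c.getOrgDefaults();
if (orgDefault == null || orgDefault.Id == null) {
    insert new BatchSchedule__c(
        SetupOwnerId = UserInfo.getOrganizationId(),
        scheduled_id__c = '0'
    );
}
```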

Now that you have a place to store your Job_Id, let’s go back to the first batch we want to fire off, the AccountBatch, and schedule the second batch from within its finish() method. I recommend scheduling it two minutes into the future. Check out the Salesforce documentation for more on how to schedule within Apex by building up a cron string.

global class AccountBatch implements Database.Batchable<sObject>{

    global Database.QueryLocator start(Database.BatchableContext BC){
        return Database.getQueryLocator('SELECT id FROM Account');
    }

    global void execute(Database.BatchableContext BC, List<sObject> scope){
        // do your updating here
    }

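    global void finish(Database.BatchableContext BC){
        // Load the org-level custom setting that stores the scheduled job's Id
        BatchSchedule__c b = BatchSchedule__c.getOrgDefaults();

        // Build a one-time cron string for two minutes from now:
        // seconds minutes hours day-of-month month day-of-week year
        Datetime n = Datetime.now().addMinutes(2);
        String cron = '';

        cron += n.second();
        cron += ' ' + n.minute();
        cron += ' ' + n.hour();
        cron += ' ' + n.day();
        cron += ' ' + n.month();
        cron += ' ' + '?';
        cron += ' ' + n.year();

        // Schedule the second batch and keep the job Id so it can be aborted later
        b.scheduled_id__c = System.schedule('Batch 2', cron, new BatchScheduler2());

        update b;
    }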

}

Now whenever the first batch finishes, it will schedule the second batch, thus ensuring that the second batch will never fire off before the first batch is done.
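
If you end up chaining more than two batches, the cron-building lines in finish() can be pulled into a small helper. CronHelper and toCron are names I made up for this sketch; they are not part of the platform.

```apex
public class CronHelper {
    // Builds a one-time cron expression for the given Datetime, in the form
    // seconds minutes hours day-of-month month ? year
    public static String toCron(Datetime dt) {
        return String.join(new List<String>{
            String.valueOf(dt.second()),
            String.valueOf(dt.minute()),
            String.valueOf(dt.hour()),
            String.valueOf(dt.day()),
            String.valueOf(dt.month()),
            '?',
            String.valueOf(dt.year())
        }, ' ');
    }
}
```

The finish() method could then call CronHelper.toCron(Datetime.now().addMinutes(2)) and pass the result straight to System.schedule().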

However, what are we going to do about that second job we scheduled?

If we leave it as-is, the second job will attempt to run again at the same time the next night, whether or not the first batch is done.* Furthermore, the next time the first batch finishes, you are going to get an error for attempting to schedule a job that is already there, which is why we saved the Job_Id of the scheduled job. We will use that Id to delete the job as soon as we’re done with it in the second batch.

global class ContactBatch implements Database.Batchable<sObject>{

    global Database.QueryLocator start(Database.BatchableContext BC){
        return Database.getQueryLocator('SELECT id FROM Contact');
    }

    global void execute(Database.BatchableContext BC, List<sObject> scope){
        // do your Contact updating here
    }

    global void finish(Database.BatchableContext BC){
        // Abort the one-time scheduled job so 'Batch 2' can be scheduled again next time
        BatchSchedule__c b = BatchSchedule__c.getOrgDefaults();
        System.abortJob(b.scheduled_id__c);
    }

}
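
If a 'Batch 2' job is ever left behind (say the stored Id was overwritten or the abort never ran), the next System.schedule() call with that name will fail. One possible safeguard, sketched here on the assumption that your org's API version exposes CronJobDetail (I believe 29.0 and later), is to look the job up by name and abort it before scheduling again.

```apex
// Abort any leftover scheduled job named 'Batch 2'.
// The name must match the one passed to System.schedule().
for (CronTrigger ct : [
        SELECT Id
        FROM CronTrigger
        WHERE CronJobDetail.Name = 'Batch 2']) {
    System.abortJob(ct.Id);
}
```

This could run at the top of AccountBatch's finish() method, just before the System.schedule() call.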

And that is how you run batches in a particular sequence.

Have questions or comments on how to execute multiple Batch Apex jobs in a particular sequence? Leave a response below, or reply to us on Twitter: @icsfdc

* Correction: Because we clearly define a month, day, and year in our cron string, the job will not attempt to run again the following day. However, the job will still remain active in your system, and so you will get the error message about scheduling a job that is already there.