Load Balancing for the Win!
Exago BI’s standard scheduler service takes a round-robin approach to load balancing. Suppose users create scheduled jobs 1 through 6. Exago BI deals them out to scheduler servers A through C like a croupier passing out cards: Job 1 to Scheduler A, Job 2 to Scheduler B, and so on until every job has been assigned. Once a job has been sent to a scheduler server, it is stored in a file directory on that machine.
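To make the dealing pattern concrete, here is a minimal Python sketch of round-robin assignment. It is an illustration only, not Exago BI's actual code: the function name `assign_round_robin` and the data shapes are assumptions made for this example.

```python
from itertools import cycle


def assign_round_robin(jobs, servers):
    """Deal jobs out to servers in a fixed rotation, ignoring job size.

    Hypothetical helper for illustration; not part of any Exago API.
    """
    assignments = {server: [] for server in servers}
    # cycle() repeats the server list endlessly, so zip() pairs each job
    # with the next server in the rotation.
    for job, server in zip(jobs, cycle(servers)):
        assignments[server].append(job)
    return assignments


jobs = [f"Job {n}" for n in range(1, 7)]
servers = ["Scheduler A", "Scheduler B", "Scheduler C"]
assignments = assign_round_robin(jobs, servers)

for server, owned in assignments.items():
    print(server, "->", owned)
```

Each server ends up owning two whole jobs. Nothing in the rotation looks at how large a job is or how often it runs, which is where the trouble starts.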
The catch is: scheduler jobs aren’t uniform like playing cards.
A scheduler job is essentially a set of instructions on what report to execute, where to send it, when to send it, and how many times the execution pattern should be repeated. Some jobs are set to execute only once while others may be set to execute every hour, indefinitely. The reports themselves also vary in length. A transaction report returning ten records one day might return thousands of records another day, requiring more time to execute.
To see how scheduler job variation can overload a system like the one illustrated above, let’s pretend Job 1 is a two-page sales summary that goes out to executives every Monday at 9 AM and Job 2 is a daily 5 PM transaction report containing thousands of pages of detail. If Server B gets assigned Job 2, as illustrated above, it will have far less spare capacity than Server A, even though each server owns only one job, because Job 2’s executions require more processing power and happen more frequently than Job 1’s. Because Job 2 takes several minutes to run, Server B’s performance will begin to suffer as soon as it is assigned another job scheduled for 5 PM.
Additionally, if any server goes offline for a period of time, the jobs set to run during that timeframe won’t get executed until that server has been brought back up, resulting in further delays.
For Exago clients who either use a single scheduler server or use their multiple scheduler servers infrequently, the round-robin approach works well. For everyone else, there’s the Scheduler Queue.
Instead of pushing jobs to the scheduler servers, the queue lets the servers pull work from a central repository when, and only if, they have the capacity to take it on.
In this configuration, the Scheduler Queue application can run as a web service or be installed on a local server; it need only be accessible to the other servers in the network. All scheduled jobs created by users of Exago BI get sent to the queue for processing and storage (typically in a database). The queue code calculates each job’s next execution time and status using the helper classes built into the Exago API.
Now, instead of being given ownership of whole jobs, the scheduler servers notify the queue when they’re ready to take on work, and the queue sends them individual executions to process. Job 1’s first execution may happen on Server A, and its second may happen on Server B. It all depends on which server announces its availability at the appointed time.
Since job files stay with the queue, a scheduler node can go offline without interrupting service.
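The pull model can be sketched in a few lines of Python. This is a simplified, in-memory illustration under stated assumptions, not SofterWare's implementation or the Exago API: the names `SchedulerQueue`, `Execution`, `add_job`, and `request_work` are hypothetical, and a real queue would typically persist its state in a database.

```python
import heapq
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(order=True)
class Execution:
    """One run of a job; ordered by due time only."""
    due: datetime
    job_name: str = field(compare=False)


class SchedulerQueue:
    """Hypothetical central queue that hands out single executions."""

    def __init__(self) -> None:
        self._pending: List[Execution] = []  # min-heap keyed on due time

    def add_job(self, job_name: str, first_run: datetime,
                interval: timedelta, repeats: int) -> None:
        # Expand the job into individual executions held by the queue.
        for i in range(repeats):
            heapq.heappush(self._pending,
                           Execution(first_run + i * interval, job_name))

    def request_work(self, now: datetime) -> Optional[Execution]:
        # A server calls this when it has capacity; it receives the
        # earliest execution that is already due, or None.
        if self._pending and self._pending[0].due <= now:
            return heapq.heappop(self._pending)
        return None


queue = SchedulerQueue()
start = datetime(2024, 1, 1, 9, 0)
queue.add_job("Job 1", first_run=start, interval=timedelta(days=1), repeats=3)

# Server A is offline on the second day, so only Server B polls then.
online = {0: ["Server A", "Server B"], 1: ["Server B"],
          2: ["Server A", "Server B"]}
log = []
for day, servers in online.items():
    now = start + timedelta(days=day)
    for server in servers:
        execution = queue.request_work(now)
        if execution is not None:
            log.append((execution.job_name, execution.due.day, server))

for entry in log:
    print(entry)
```

In this run, the first execution goes to Server A, the second to Server B while Server A is offline, and the third to Server A again. Because the executions live in the queue rather than on any one node, whichever server asks first simply picks up the next one that is due.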
Senior Software Developer Dave Killough implemented the scheduler queue for SofterWare, producers of management solutions for nonprofits and educational sectors. “I encourage anyone planning a production capability to consider implementing a scheduler queue for maximum manageability, flexibility, and performance,” says Killough. “SofterWare has over 45,000 schedules licensed for its production environment and needs an implementation that can process 40 schedules per minute during off hours. Through extensive load testing, we've reached an environment that can scale up to 200 reports per minute with one central database. That would not have been possible without the scheduler queue.”
To learn more about the scheduler queue and view SofterWare's production-ready sample code implementation, visit our Knowledge Base.