Distributed optimization is now part of standard CPLEX releases; in fact, it has been there since release 12.5. I decided to run some experiments and write this post after seeing Daniel Junglas' talk at the German OR Conference (GOR) in September, though the write-up was delayed until recently. Below I summarize the results of my experiments and compare them with what was discussed in that talk. In today's post, I cover tests I ran with CPLEX's distributed MIP algorithm; in the past months I ran similar experiments with Gurobi 6.0.
Distributed CPLEX in a nutshell
Over the last several years, the CPLEX development team has achieved huge successes in adapting its mixed-integer programming (MIP) algorithms to the multi-core architectures that are now prevalent in personal and office computers. It is now possible – and in fact quite easy – to use multiple machines to work on solving any particular problem. Two algorithmic approaches are provided:
- In concurrent optimization, the problem is run independently on each machine, but with different parameter settings or random seeds;
- In distributed mode, the search is divided into two phases. In the first phase, also called the ramp-up phase, the problem is started on each machine with different settings. At the end of this phase, the concurrent execution is stopped on all machines, and the MIP search tree from the machine that has made the most progress is divided among the machines, which then work on it collaboratively.
Note that as of CPLEX 12.6.1, unless instructed otherwise, the concurrent optimization mode (also called “infinite ramp-up”) is used by default.
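If you want the tree-splitting distributed mode rather than infinite ramp-up, the ramp-up duration parameter (CPXPARAM_DistMIP_Rampup_Duration in recent versions) controls this behavior. A sketch in the interactive optimizer might look like the following; the exact parameter path and values should be checked against your version's documentation:

```
CPLEX> set distmip rampup duration 1
```

Here a value of 1 selects dynamic ramp-up (CPLEX decides when to switch to the distributed tree search), while 2 keeps infinite ramp-up, i.e. pure concurrent mode.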
How to do it?
Setting up a distributed MIP environment isn’t very complex. You need two or more machines, ideally of similar performance. The following steps are required:
- Install CPLEX on each machine;
- Start a cplex worker process on each machine;
- Make sure the worker processes can communicate through the firewall;
- Create a configuration file on the master machine defining the pool of workers;
- Start CPLEX on the master machine, load a model, then solve!
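To make the configuration step above concrete, here is a sketch of a virtual machine configuration (VMC) file describing a pool of two workers. The host names and port are placeholders, and the exact schema may vary between CPLEX versions, so treat this as illustrative rather than copy-paste ready:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical hosts and port; adjust to your own machines. -->
<vmc>
  <machine name="worker1">
    <transport type="TCP/IP">
      <address host="192.168.1.101" port="12345"/>
    </transport>
  </machine>
  <machine name="worker2">
    <transport type="TCP/IP">
      <address host="192.168.1.102" port="12345"/>
    </transport>
  </machine>
</vmc>
```

On my setup, each worker was started with something like `cplex -worker=tcpip -address=192.168.1.101:12345` so that it listens on the address declared in the VMC file; on the master, you read this configuration file along with the model before optimizing.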
How well does it perform?
Short answer: it depends. In his GOR 2014 talk, Daniel Junglas reported significant speed-ups, from 11% to 82%, on a wide range of models that take more than 10 seconds to solve on a single machine. I have tested the distributed MIP on two types of problems. The first model I ran was a large deterministic supply chain design problem, based on data provided by one of our partner companies at FORAC, the research center where I do most of my research. Running this difficult model on four machines reduced the wall-clock time by only about 3% compared to the single-computer version.
I also tested a group of large lot-sizing models (set P256-T64-C2, which you can download here). These require processing a large number of nodes, which is typically a sweet spot for distributed algorithms. As it turns out, the single-computer algorithm is faster on average: out of the 10 models, 6 are solved faster on a single computer than on four machines, and the geometric mean of solve times is 12.7% higher for the 4-machine distributed MIP than for the single-computer version.
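For readers unfamiliar with this way of reporting results, the geometric mean is the standard way to average run-time ratios across a benchmark set, since it treats a 2x slowdown and a 2x speed-up symmetrically. A minimal sketch of the computation (the solve times below are made-up placeholders, not my actual results):

```python
import math

def geometric_mean(values):
    # Geometric mean = exp(mean of logs); the usual way to average ratios.
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical solve times (seconds) for the same models on 1 vs 4 machines.
single = [120.0, 95.0, 300.0, 45.0]
distributed = [130.0, 110.0, 280.0, 60.0]

ratios = [d / s for d, s in zip(distributed, single)]
print(f"geometric mean ratio: {geometric_mean(ratios):.3f}")
# → geometric mean ratio: 1.118 (distributed ~12% slower on these made-up numbers)
```

A ratio above 1 means the distributed runs were slower in the geometric-mean sense, which is how the 12.7% figure above should be read.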
This is by no means a thorough comparison, but it shows that distributed algorithms may or may not benefit the type of models you are working on. There is no substitute for testing it yourself to see how it works out.
Many thanks to Daniel Junglas, Ed Klotz and Jean-François Puget for their help in setting up the distributed MIP environment.