Distributed optimization is now part of the Gurobi 6.0 release. In today’s post, I talk about some tests I ran with the distributed MIP algorithm. The associated primal post discusses results presented by Gurobi at the INFORMS 2014 Annual conference. Don’t worry, the next set of posts will cover the same topic, this time with CPLEX.
Overall, it works. Three conclusions stand out from my early experiments:
- Use similar machines
- Don’t put slower machines in the worker pool
- Test your assumptions
The importance of similar machines
The first problem I ran was a large-sized deterministic supply chain design problem. It is based on the data provided by one of our partner companies. It is solvable by Gurobi 6 (and also by CPLEX), but it is difficult, as it takes about 17,5 hours (63,840 secs)on a 4-core machine. I set up a worker pool of 5 available machines : four recent 4-core machines and an older 8-core machine. After a few hours of computation, the 4 recent machines were idle while the slower machine was still working. I finally killed the process 24 hours later: the algorithm was still in the ramp-up phase, as the four fast machines were waiting for the slow counterpart to finish. I removed the old machine from the worker pool, and took the solving time down to 14,5 hours (52,300 secs), a decrease of about 3 hours. I ran some other tests on easy models, but the conclusion holds: you are better off with a single fast machine than with a fast machine teamed up with a slow machine. If you feel you absolutely have to setup this worker pool, then use concurrent MIP so your faster machine does not get idle waiting for the turtle to finish. To be fair, Gurobi strongly advises in their distributed MIP documentation to use similar machines, but the drag on performance is stronger than what I anticipated by reading the documentation.
Test your assumptions
The supply chain design I tested requires the exploration of less than 5000 nodes. That is not much, so I tested a group of large lot-sizing models (set P256-T64-C2, which you can download here). These typically require processing a large number of nodes to solve, which makes them good candidates for distributed MIP. My hypothesis was that the speedup would be greater than with the supply chain model.
As it turns out, the opposite is true. 7 of the 10 models solved faster on a single machine with default settings than on the pool of 4 machines (including the aforementioned machine). Geometric mean is also smaller (1793 seconds) for a single machine than for the distributed 4 machines (2338 seconds), which is 18,4% slower. It seems the models are solved while the distributed algorithm is still in the ramp-up phase. It could be good if there was a way to tell the algorithm when to swicth from ramp-up to distributed solving mode. Maybe changing the high-level settings such as MIP Emphasis towards improving bounds help in this regard, but I haven’t tried yet.
If you are like me and you ever implemented some distributed optimization algorithm on your own before, using the Gurobi distributed MIP feels like a walk at the park. If you use Windows machines, installing Gurobi on the worker machines and making sure Gurobi has access through the firewall is what will take the most time, which is a few minutes per machine. If you ghost the machines, it is even easier to do. Just remember that the algorithm seems to be designed for clusters of identical machines, so now matter how geek it looks, it might not be a good idea to use your laptop, your gaming desktop and your lab’s server to run Gurobi 6 in distributed mode.