Distributed MIP solving reaches Gurobi [primal]

Distributed optimization has now reached the point where it is incorporated into general-purpose mixed-integer programming (MIP) solvers such as CPLEX and Gurobi. In today’s post, I summarize the implications of some results presented by Dr. Ed Rothberg at the INFORMS 2014 annual conference. You can see the full slide deck here. The associated dual post presents some of my own experiments with Gurobi 6.0. Don’t worry: an upcoming set of posts will cover the same topic, this time with CPLEX.

From parallel to distributed

Parallel MIP solvers have been around for several years. In my opinion, the ability to adapt mixed-integer programming algorithms to multi-core processor architectures is one of the industry’s major successes, leading to huge average speedups [1][2] without requiring any parametrization on the user’s part. Starting from release 6.0, Gurobi now offers distributed algorithms, allowing you to pool resources from multiple computers to solve difficult MIPs. Gurobi provides two types of strategies:

  • In concurrent optimization, the problem is run independently on each machine, but with different parameter settings;
  • In distributed mode, after a ramp-up phase, the search tree is divided among the machines.
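Both strategies are selected through solver parameters rather than code changes. As a rough sketch, and assuming the parameter names from Gurobi's documentation (`WorkerPool`, `DistributedMIPJobs`, `ConcurrentJobs`; the machine names and port below are placeholders), a parameter (`.prm`) file for a distributed run might look like this:

```text
# Hypothetical worker machines running Gurobi's worker service
WorkerPool machine1:61000,machine2:61000,machine3:61000,machine4:61000
# Split the search tree across 4 distributed worker jobs
DistributedMIPJobs 4
```

Replacing `DistributedMIPJobs` with `ConcurrentJobs 4` would instead run the concurrent strategy, with each machine attacking the whole model under different parameter settings. Check the parameter reference for your release before relying on these names.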

What is now possible?

Up to now, solving very difficult MIPs required either writing your own distributed algorithms or running the problem on a big machine with multiple processors and large amounts of memory. Unfortunately, high-performance workstations are quite expensive these days: a machine with 16 cores and 128 GB of RAM will cost you much more (about $12,000) than four 4-core machines with 32 GB of RAM each ($6,000 total). It is also much easier to scale up by adding low-cost machines than by adding processors to a single machine. For commercial users, the distributed MIP licence is more expensive than a multi-core single-machine licence, but it allows for up to 100 worker machines (which do not need a licence of their own). Chances are you won’t have more than 100 machines available anyway.
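The per-core arithmetic behind that price comparison is simple. Using the ballpark figures above (assumptions taken from this post, not vendor quotes):

```python
# Ballpark hardware prices from the text above (assumptions, not quotes).
big_machine_cost = 12_000        # one 16-core, 128 GB workstation
small_machine_cost = 6_000 / 4   # each of four 4-core, 32 GB machines

big_cost_per_core = big_machine_cost / 16     # $750 per core
small_cost_per_core = small_machine_cost / 4  # $375 per core

print(big_cost_per_core / small_cost_per_core)  # → 2.0
```

In other words, under these assumed prices, the single big workstation costs roughly twice as much per core as the cluster of small machines.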

Which strategy works best?

Short answer: it depends. Concurrent optimization works best on some problems, while distributed optimization works better on others. Four 4-core machines and a single 16-core machine provide comparable average speedups, both being approximately 2 times faster on average than a single 4-core machine for models that take longer than 100 seconds to solve. Experiencing longer solving times with this configuration is also unlikely (it happens only 2 times out of 27 on their testbed for models that take longer than 100 seconds). Using concurrent optimization on 4 machines yields a similar picture: it is on average 1.5 times faster than a single 4-core machine, and faster 17 times out of 20. In each case, adding more machines leads to faster solving on average, albeit with diminishing returns. If you have a 16-core machine, the results indicate that you are better off, on average, running the one-machine algorithm on 16 cores than running the distributed code on 4 four-core virtual machines.

If you really need to solve a difficult MIP from time to time, using distributed algorithms on several machines is one more option in your arsenal. It is very easy to use, and the Gurobi team has done a good job of explaining what it is and how to use it. If you don’t own machines like these, you can also rent some time on Amazon’s or Google’s cloud platforms; it is relatively cheap and allows you to scale up.

In the associated dual post, I will share my experiences with using Gurobi’s distributed algorithms on my own instances. Stay tuned!

