Seeds and Parallel MIP solvers, part I

Over recent years, mixed-integer programming (MIP) solver developers have worked hard to provide parallel codes that are both fast and stable. A few years ago, using a parallel code resulted in huge variations in run times: successive runs of the same model on the same machine could take drastically different amounts of time. This behavior is now quite rare, as the codes have improved tremendously. This post compares the effect of randomness when solving MIP models with parallel commercial codes. To that end, I use a set of 30 lot-sizing models (with 40 products and 14 periods) that are available here. [Continue reading]
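For readers who want to try this kind of experiment themselves, here is a minimal sketch using the CPLEX Python API; the file name, seed range, and thread count are illustrative assumptions of mine, not taken from the post:

```python
import time
import cplex

# Hypothetical sketch: measure run-time variability across random seeds.
times = []
for seed in range(10):
    c = cplex.Cplex("lotsizing.lp")    # reload so every run starts fresh
    c.parameters.randomseed.set(seed)  # vary only the random seed
    c.parameters.threads.set(4)        # fix the thread count
    t0 = time.time()
    c.solve()
    times.append(time.time() - t0)

print("min %.1f s  max %.1f s" % (min(times), max(times)))
```

Reloading the model inside the loop keeps each run independent, so the only thing that changes between runs is the seed.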

Building the perfect MIPping machine [dual]

Yesterday’s post explained why I prefer running solvers on fast machines rather than on big machines, and gave a short example of why. In today’s post, I provide some very personal guidelines on how to determine what kind of machine you might need to solve MIPs.  Beware: this post is based more on personal experience than on hard data.

How much RAM do you need?

A small problem with 2,000 variables can require more memory than another problem that is significantly larger. You only need a lot of RAM if the kinds of models you solve require storing a large MIP search tree. Two examples:

  • absH6high_5n30, an inventory-routing problem instance from Coelho & Laporte, has about 31,700 binary variables and around 250,000 nonzeros. The size of the search tree grows very quickly; let this problem run for about one hour and the tree will consume 32 GB of RAM.
  • S19_02_1999 is a supply chain design instance from M’Barek, Martel & D’Amours. It has 150,000 variables and around 500,000 nonzeros. Although the problem takes more than 10 hours to solve using CPLEX 12.6, the MIP tree never grows larger than 50 MB, so the whole optimization process never consumes more than 400 MB of RAM.

When the solver exceeds the allowed RAM on your machine (or sometimes even before), it will start storing the nodes in files on your hard drive. Disks are several orders of magnitude slower than memory, so this is likely to slow down the execution by a substantial margin.
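Most solvers expose parameters that control when and where this spilling to disk happens. As an illustration, here is a hedged sketch with the CPLEX Python API; the threshold, directory, and file name are arbitrary examples:

```python
import cplex

c = cplex.Cplex("model.lp")            # illustrative instance file
c.parameters.workmem.set(8192)         # start spilling past ~8 GB (value in MB)
c.parameters.mip.strategy.file.set(3)  # overflow nodes go to disk, compressed
c.parameters.workdir.set("/tmp")       # directory where node files are written
c.solve()
```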

Whether an instance consumes a lot of memory or not depends on many factors. Structure matters more than problem size in this regard. I am not aware of any method that can accurately predict the amount of memory needed to solve a MIP. The best you can do is run the problem in the interactive solver and monitor tree size and memory usage. (If you’re using a CPLEX API, you can export the model through the ExportModel method.) The good news is that the pattern is usually quite similar across instances of the same model, assuming much of the data is similar. That is, if you do production planning for a company, much of the information won’t change between daily runs, and chances are you will encounter similar behavior over and over.
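For example, with the CPLEX Python API the equivalent of ExportModel is the write method; a minimal hedged sketch, where the placeholder variable and file name are my own:

```python
import cplex

c = cplex.Cplex()
c.variables.add(names=["x"], lb=[0.0], ub=[10.0])  # placeholder model content
c.objective.set_linear([("x", 1.0)])
c.write("model.lp")  # dump the model built by your application
```

Once exported, you can read the file in the interactive optimizer and watch the node log and memory figures directly.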

Again, read the log to understand which resources are consumed at which point in the search.

What types of processors are best?

I’m not a microprocessor expert, but I have found that the choice of processor also significantly impacts performance. Gone are the days when CPU clock speed meant everything: don’t rely too much on it. Also important are proper math accelerators and the amount of internal cache. Based on my experience, here is what I suggest:

  • Intel Xeon processors are the best choice for long optimization runs. As of today, the Ivy Bridge generation from the EP line (labeled Xeon E5-xxxx) is a good choice. The EX line (Xeon E7-xxxx) is probably even better, but at more than $2,000 apiece, those chips are a bit too costly for my budget. I am also not sure whether they have enough cache to make good use of their 12 cores. If anyone has any data on that, I’d be glad to update this post.
  • If you can’t get your hands on a Xeon, then Core i7 processors are the next best choice. I’ve been surprised at the good performance of the new i7-3770-equipped workstations we have. They are also very inexpensive. Finally, while not all models make good use of 12 cores, 4 cores are likely to be kept busy throughout the run.
  • Processors designed for laptops and mobile systems are not suited to the extreme amount of number-crunching done by solvers. Most of these machines sacrifice proper cooling to reduce weight and size, and their processors compensate with sensors that throttle energy consumption and performance to avoid overheating, progressively slowing the computer during the optimization.

If you absolutely have to run solvers on your laptop or tablet, check the power settings of your computer to make sure it is working at full speed.

How many cores?

While allowing the solver to use more cores generally reduces the average wall-clock time required to solve MIPs, this is not always the case; in fact, adding cores sometimes increases solution times. I have studied this phenomenon before – see the data for CPLEX and Gurobi. Between 4 and 8 cores seems to be the sweet spot for efficiency, as not all models will make good use of 12, 16, or 24 cores. A quick experiment like the sketch below can tell you where your own models stand.
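A hedged sketch of such a scaling test with the CPLEX Python API; the file name and thread counts are illustrative:

```python
import time
import cplex

# Solve the same instance with increasing thread counts and compare times.
for n in (1, 2, 4, 8, 16):
    c = cplex.Cplex("model.lp")    # reload so every run starts fresh
    c.parameters.threads.set(n)    # cap the number of cores used
    t0 = time.time()
    c.solve()
    print("%2d threads: %.1f s" % (n, time.time() - t0))
```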

These are my personal tastes. Feel free to comment on my recommendations!

Do you need a bigger computer to solve this MIP? [primal]

When I have a hard time solving a mixed-integer optimization problem (MIP), one of the most common reflexes is to run the problem on a bigger or faster machine.  However, when solving MIPs, throwing more processor power at a problem will not … [Continue reading]

Things you learn in a Ph.D. that are not written in textbooks

These are things I learned during the first phase of my experimentation over the course of my Ph.D. (year 2 of 4). There is a story behind each of these statements; I may blog about some of them if there is interest and I find the time to write … [Continue reading]

There and back again

If your academic work is related to computation, chances are that once in a while you have to scrap a set of experiments and start over. It just happened to me again about two weeks ago. While a colleague and I were doing what is mostly a computation … [Continue reading]

Supply chain network design in 500 words

I've spent more time working on supply chain network (SCN) design problems than on any other. This post summarizes the topic. What is supply chain network design? SCN design is a strategic problem arising in logistics and supply chain management. … [Continue reading]

LP file format: uses and intricacies

LP is one of the most popular formats for explicitly describing an optimization model (the other main contender being MPS). It uses an algebraic format in which you enter the problem's objective function and constraints line by line. For the last few years … [Continue reading]
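To give a flavor of the format, here is a tiny example driven from Python; the toy model and file name are made up for illustration, and the LP text follows the standard syntax that both CPLEX and Gurobi read:

```python
import cplex

LP_TEXT = """\
Maximize
 obj: 3 x + 2 y
Subject To
 c1: x + y <= 4
 c2: x + 3 y <= 6
Bounds
 0 <= x <= 10
End
"""

with open("tiny.lp", "w") as f:
    f.write(LP_TEXT)               # write the model to disk in LP format

c = cplex.Cplex("tiny.lp")         # any LP-capable solver can read this file
c.solve()
print(c.solution.get_objective_value())   # prints 12.0 for this toy model
```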

Open question: tuning for hard instances

This is a rather short post on an open question, one I've been thinking about for some time. Automated tuning tools for MIP solvers have been around for the last few years. During that time, I have used these tools - and sometimes seen them used - to … [Continue reading]

Keeping a horse in the race

(This post was originally published on the INFORMS 2013 Blog). While attending this year's INFORMS Annual Conference, this phrase caught my attention: we need to keep a horse in the race.  It simply means that for a typical paper to be accepted in … [Continue reading]

Solving sets of similar instances, Part I


In an industrial application, one often has to solve similar instances of the same mixed-integer linear programming (MIP) model. Furthermore, from one instance to the next, a large proportion of the data is similar. Because of the heuristic nature of MIP … [Continue reading]