How problematic are your Big M constraints? An experiment [primal]

This post investigates whether it is still relevant to be careful about the coefficients used in so-called Big M constraints when formulating mixed-integer programming (MIP) models. I run experiments on two-echelon supply chain design models to show that using overly large values often makes the models harder to solve. The impact is much greater on lower-performing solvers than on commercial ones. In the associated dual post, I explain what Big M constraints are and why they tend to cause problems when solving MIPs.

A bit of context

When I started learning about integer programming in the early 2000s, I often got the following advice regarding modelling with Big M constraints:

  1. Use as few Big M constraints as possible;
  2. If you need to use them, make the coefficients (the Ms) as small as possible.

I have always followed this advice, but I never really tested these assumptions in practice. It is a fact that solvers are several orders of magnitude faster than they were when I learned modelling. So is this advice still relevant? I wanted to test it out.

The experiment

I put together three supply chain design instances. More precisely, they are Two-Echelon (hierarchical) Uncapacitated Facility Location models. They are of size 30 x 30 x 50 and are identical in structure to the models shown here, except for the single-source constraints, which I did not use. Each instance thus has 60 Big M constraints, one for each facility, ensuring that a facility can process products only if it is built. I used three different levels of coefficients: (i) the smallest (tightest) possible value, (ii) a larger value of the same order of magnitude as the smallest, and (iii) a huge value, at least 100 times larger than the tightest one. The instances are identical except for these 60 coefficients. I ran the models through three solvers, CPLEX, CBC and GLPK, with a time limit of 2 hours, and computed the geometric mean of run times (in seconds).
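
To make the setup concrete, here is a minimal sketch of one such linking constraint in PuLP; all data, names and multipliers below are hypothetical, not the actual instance values.

```python
import pulp

# Hypothetical data: one facility j and the three customers it could serve.
demand = {"c1": 40, "c2": 25, "c3": 35}

model = pulp.LpProblem("big_m_demo", pulp.LpMinimize)
y = pulp.LpVariable("open_j", cat="Binary")  # 1 if facility j is built
x = {c: pulp.LpVariable(f"flow_j_{c}", lowBound=0) for c in demand}

# The model is uncapacitated, so facility j can never process more than
# the total demand; that total is the tightest valid coefficient.
M_tight = sum(demand.values())  # option (i): 100
M_loose = 5 * M_tight           # option (ii): larger, same order of magnitude
M_huge = 1000 * M_tight         # option (iii): at least 100x the tightest value

# The Big M linking constraint: products flow through j only if it is built.
model += pulp.lpSum(x.values()) <= M_tight * y, "link_j"
```

All three choices of M allow exactly the same integer solutions; they differ only in how much they weaken the linear relaxation.
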
Keep in mind that these are the same supply chains! CBC is strongly affected by the change, needing on average 6.5 times more time for option (ii) and a whopping 15.5 times more for option (iii). CPLEX is also affected, but to a much lesser extent: by factors of about 2 and 4, respectively. GLPK struggles with these models even when they are tightly formulated. With the higher coefficient values, it simply can't solve them within the allotted time, finishing with a gap between 5 and 18% depending on the instance.

The commercial solver is not only faster, it is also less affected by the unnecessarily large coefficients. I didn't post results for Gurobi, but its performance is quite comparable to CPLEX's. This also shows the canyon between free and commercial solvers in terms of performance.

If you found this post useful or insightful, please share it with your colleagues, friends or students!

Extra: Why choose this formulation?

For full disclosure, I mentioned in the dual post that the model I use is not the most efficient one for solving this problem. I used it anyway for two reasons (a sketch of a tighter alternative follows the list):

  • The model is easy to understand compared to more complex variants of supply chain design, which have many types of binary variables.
  • The best (tightest) values of the Big M coefficients are very straightforward to obtain in this model.
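
For contrast, here is a sketch of one standard tighter alternative (my illustration, not necessarily the formulation the dual post has in mind): disaggregating the single Big M row into one constraint per customer, so that each coefficient is just that customer's demand.

```python
import pulp

# Same hypothetical data as the earlier sketch.
demand = {"c1": 40, "c2": 25, "c3": 35}

model = pulp.LpProblem("disaggregated_demo", pulp.LpMinimize)
y = pulp.LpVariable("open_j", cat="Binary")
x = {c: pulp.LpVariable(f"flow_j_{c}", lowBound=0) for c in demand}

# One small linking constraint per customer: a tighter LP relaxation,
# at the cost of many more rows than a single Big M constraint.
for c in demand:
    model += x[c] <= demand[c] * y, f"link_j_{c}"
```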


Comments

  1. Speed is part of the picture — the big M constraints tend to result in somewhat diluted node bounds. The other part of the picture is numerical stability. If M is too big, Strange Things Happen (including infeasible solutions looking feasible to the solver). I think in some cases numerical instability also affects solver speed, because the solver spends time doing fancy things to the basis matrix to get it to factor properly.

    • … and immediately after making that comment I found the “dual” post and discovered you’d said pretty much the same thing …

      • As I revised my first draft of the dual post, I looked for a publicly available text providing a better explanation of the implications of Big M than what I had written, and I found just that on your blog. I almost deleted my draft, since anyone who has read your post doesn't need to read my dual. The reason I posted it anyway is that it was part of a primal-dual pair and I had no idea for a suitable replacement.
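
To make the numerical stability point raised in this thread concrete, here is a quick sketch with hypothetical numbers: solvers accept a binary variable as integral when it is within a small tolerance of 0 or 1, and a huge M can amplify that tolerance into real flow.

```python
# Hypothetical values: most solvers treat y as integral if it lies within
# a small integrality tolerance of 0 or 1.
int_tol = 1e-5      # a typical default integer feasibility tolerance
M_tight = 1e2       # an M close to the true upper bound on the flow
M_huge = 1e7        # a needlessly large M

y = int_tol         # accepted by the solver as y = 0
print(M_tight * y)  # 0.001 -> the flow is still effectively forced to 0
print(M_huge * y)   # 100.0 -> 100 units can flow through a "closed" facility
```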

  2. Peter Cacioppi says:

    What about indicator variables? Can’t you nearly always use those in lieu of big M’s? If so, can you post those results?

    • It is a good alternative, but as far as I know, indicator variables/constraints are offered by top-of-the-class commercial solvers (CPLEX, Gurobi and maybe FICO?) but are not available in free solvers such as CBC and GLPK.
      It is definitely something worth trying, thank you for the suggestion! It should make a nice follow-up to the post; a quick sketch of the idea follows.
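
A minimal sketch of that idea in gurobipy syntax (my example, not from the post; CPLEX exposes an equivalent feature, and all names and sizes are hypothetical):

```python
import gurobipy as gp
from gurobipy import GRB

m = gp.Model("indicator_demo")
y = m.addVar(vtype=GRB.BINARY, name="open_j")  # 1 if facility j is built
x = m.addVars(3, lb=0.0, name="flow_j")        # flows through facility j

# Indicator constraint: if y == 0, the total flow must be 0. No Big M
# coefficient appears anywhere, so there is nothing to tune or to get wrong.
m.addConstr((y == 0) >> (x.sum() <= 0), name="link_j")
```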

Trackbacks

  1. […] on Twitter and the blogosphere in the past week. For example, see Marc-Andre Carle‘s posts (primal and dual) and followup tweets and also some older posts from Paul Rubin and Bob […]
