This post investigates whether it is still relevant to be careful about the coefficients used in so-called Big M constraints when formulating mixed-integer programming (MIP) models. I run experiments on two-echelon supply chain design models to show that using overly large values often makes the models harder to solve. The impact is much greater on lower-performing solvers than on commercial solvers. In the associated dual post, I explain what Big M constraints are and why they tend to cause problems when solving MIPs.
A bit of context
When I started learning about integer programming in the early 2000s, I often got the following advice regarding modelling using Big M constraints:
- Use as few Big M constraints as possible;
- If you need to use them, make the coefficients (the Ms) as small as possible.
I always followed this advice, but I never really tested its assumptions in practice. Solvers are now several orders of magnitude faster than they were when I learned modelling, so is this advice still relevant? I wanted to find out.
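To see why the second piece of advice matters, consider the classic fixed-charge constraint x <= M * y, where y is a binary "pay the fixed cost" variable. The following sketch (with hypothetical numbers, not taken from the instances below) shows that in the LP relaxation, where y may be fractional, the constraint only forces y >= x / M, so the larger M is, the less of the fixed cost the relaxation accounts for:

```python
# Hedged sketch of why a smaller M gives a tighter LP relaxation of the
# fixed-charge constraint  x <= M * y  (y binary, relaxed to 0 <= y <= 1).
# The relaxed constraint only forces y >= x / M, so the fraction of the
# fixed cost "paid" in the LP bound shrinks as M grows.
x = 10.0  # some positive activity level (hypothetical)
for M in (10.0, 100.0, 10_000.0):
    y_min = x / M  # smallest y the relaxed constraint allows
    print(f"M = {M:>8}: relaxation only forces y >= {y_min}")
```

With the tight value M = 10, the relaxation already forces y = 1; with M = 10 000, it settles for y = 0.001, and the branch-and-bound tree has to do the work the formulation did not.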
I put together three supply chain design instances. More precisely, they are Two-Echelon (hierarchical) Uncapacitated Facility Location models. They are of size 30 x 30 x 50 and identical in structure to the models shown here, except for the single-source constraints, which I did not use. Each instance thus has 60 Big M constraints, one for each facility, ensuring that the facility can process products only if it is built. I used three different levels of coefficients: (i) the smallest (tightest) possible value, (ii) a larger value but of the same order of magnitude as the smallest, and (iii) a huge value at least 100 times larger than the tightest value. The instances are identical except for these 60 coefficients. I ran the models through three solvers: CPLEX, CBC and GLPK, with a time limit of 2 hours, and computed the geometric mean of run times (in seconds).
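The three coefficient levels can be sketched as follows. The numbers are illustrative, not the actual instance data: in an uncapacitated model, the tightest valid M for a facility's linking constraint (total flow through facility i <= M_i * open_i) is simply the total demand the facility could ever route.

```python
# Hedged sketch (hypothetical demands, not the actual 30 x 30 x 50 instances)
# of the three Big M levels tested for one facility's linking constraint:
#     sum of flows through facility i  <=  M_i * open_i
demands = [12, 7, 30, 5, 16]   # hypothetical customer demands

tight_M = sum(demands)         # (i) smallest (tightest) valid value
same_order_M = 2 * tight_M     # (ii) larger, but same order of magnitude
huge_M = 100 * tight_M         # (iii) at least 100x the tightest value

print(tight_M, same_order_M, huge_M)
```

All three values are valid upper bounds, so the three instances have the same feasible integer solutions; only the quality of the LP relaxation changes.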
Keep in mind that these are the same supply chains! CBC is strongly affected by the change, needing on average 6.5 times more time for option (ii) and a whopping 15.5 times more for (iii). CPLEX is also affected, but to a much lesser extent: by factors of about 2 and 4. GLPK struggles with these models even when they are tightly formulated; when higher values are used in the constraints, GLPK simply can't solve them within the allotted time, finishing with a gap between 5 and 18% depending on the instance.
The commercial solver is not only faster, it is also less affected by the unnecessarily large coefficients. I didn't post results with Gurobi, but its performance is quite comparable to CPLEX's. This also shows the canyon between free and commercial solvers in terms of performance.
If you found this post useful or insightful, please share it with your colleagues, friends or students!
Extra: Why choose this formulation?
In full disclosure, I mentioned in the dual post that the model I use is not the most efficient one for solving this problem. I used it anyway for two reasons:
- The model is easy to understand, compared to more complex variants of supply chain design which have many types of binary variables.
- The best (tightest) values of the Big M coefficients are very straightforward to obtain in this model.