Creation date: 2017-09-04
This is by far the longest series of blog articles about one topic. It's interesting to write while thinking, learning and programming just one step before I write it down for you here. I think there isn't enough easy to read information on the internet about the Simplex method.
What can I do with this algorithm? What types of problems can I solve? How to make it fast? How to make it accurate?
I tried to explain in the last couple of blog posts: How does it work and how can I implement it?
Again this post will give you a new part of answering that question. In the next article I want to face the question: How to make it accurate and will give you a hint why the current approach isn't the best in the end of this article.
What is the simplex algorithm and how to solve the easiest problems with it? Start here
How to transform other problems into the standard formulation to solve more problems? Convert to standard form
Transforming the problem in a different way using the dual simplex method. Making it faster this way. Dual problem
My goal is to understand the simplex method, linear programming and mixed integer programming by programming it on my own. I want to help you guys to understand this method and learn what's going on in the background if you use a real solver. It's hard to build a solver which is at least in some parts as good and fast as a solver where several of hundreds of people are working on for years. Anyway it's still a goal to come as close as I can get. It should still be simple to follow me on my journey and using Python here I think is a reasonable approach even if C++ would be faster.
This article is about adding constraints to a solved tableau.
Why should we do it?
Let's assume we have our diet problem again and we solved it for several hundreds or thousands of ingredients and some nutrition values and then there is a new constraint because a new study came out: We should eat at least an apple a day. Well we forgot to add that constraint in the first place. At the moment we would need to add this constraint and solve everything again. This might be time consuming.
Another use case is the one I used for the diet problem and will use here: We have a list of ingredients and therfore constraints and we have some constraints telling us that we shouldn't eat more than a specified amount of one ingredient. Using this approaches we have to add a lot of constraints.
If we have 100 ingredients and only three minimal requirements we would normally have three constraints but because of this specified maximum amount we would add another 100 constraints. This sounds a lot of overhead. Why overhead? We have to do it right? Well we remember that we have a basic feasible solution where we use only three variables out of the 100 if we have only three constraints.
Remember: We can't set more variables to a value greater than zero than we have constraints.
If we solve our problem without those extra 100 constraints we can only have three ingredients which might not fulfill the maxmimum constraint. It might be reasonable to add those at maximum three constraints afterwards if necessary.
Let's assume we have our solved tableau for the diet problem again (without those extra constraints)
\[ \begin{array}{rrrrrrrrrr|r} x_1 &x_2 & x_3 & x_4 & x_5 & x_6 & s_1 & s_2 & s_3 & b \\ \hline 1 & 9.9 & 3.4 & 0 & 0 & 3.3 & 0 & -0.3 & 0 & 6.47 \\ -0 & 0.1 & 0.2 & 1 & 0 & 0.3 & 0 & -0 & -0 & 2.60 \\ 0 & -2.2 & -0.6 & 0 & 1 & -0.3 & -0 & 0.1 & -0 & 2.08 \\ \hline 0 & 77.8 & 41.3 & 0 & 0 & 29.2 & 0.2 & 0.2 & 0.1 & -541.1 \end{array} \]This meant that we should eat 6.47 portions of oats, drinking 2.60 portions of milk and eating a bit mor than two portions of cake. We have an objective of ~541.
Let's have a look at our extra constraints:
# oats
m.add_constraint(x1 <= 4)
# chicken
m.add_constraint(x2 <= 3)
# egg
m.add_constraint(x3 <= 2)
# milk
m.add_constraint(x4 <= 8)
# cake
m.add_constraint(x5 <= 1)
# bean
m.add_constraint(x6 <= 2)
We dissatisfy the oats constraints and the cake constraint, but not the milk constraint. In the best solution without this constraint drinking milk isn't even close to the maximum constraint. Therefore it was probably a good strategy to not use it up-front.
Let's start with the oats constraint. How would the tableau look like if we add that one:
Remember \(x_1\) is the variable for oats.
\[ \begin{array}{rrrrrrrrrr|r} x_1 &x_2 & x_3 & x_4 & x_5 & x_6 & s_1 & s_2 & s_3 & s_4 & b \\ \hline 1 & 9.9 & 3.4 & 0 & 0 & 3.3 & 0 & -0.3 & 0 & 0 & 6.47 \\ -0 & 0.1 & 0.2 & 1 & 0 & 0.3 & 0 & -0 & -0 & 0 & 2.60 \\ 0 & -2.2 & -0.6 & 0 & 1 & -0.3 & -0 & 0.1 & -0 & 0 & 2.08 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 4 \\ \hline 0 & 77.8 & 41.3 & 0 & 0 & 29.2 & 0.2 & 0.2 & 0.1 & 0 & -541.1 \end{array} \]We just added a new row and a new column for the new slack variable. It looks weird to just add a constraint there, right? I mean we don't have a basic feasible solution anymore but we could simply use gaussian elimination to get a zero in the oats column on the last row.
Let's just do it:
\[ \begin{array}{rrrrrrrrrr|r} x_1 &x_2 & x_3 & x_4 & x_5 & x_6 & s_1 & s_2 & s_3 & s_4 & b \\ \hline 1 & 9.9 & 3.4 & 0 & 0 & 3.3 & 0 & -0.3 & 0 & 0 & 6.47 \\ -0 & 0.1 & 0.2 & 1 & 0 & 0.3 & 0 & -0 & -0 & 0 & 2.60 \\ 0 & -2.2 & -0.6 & 0 & 1 & -0.3 & -0 & 0.1 & -0 & 0 & 2.08 \\ 0 & -9.9 & 3.4 & 0 & 0 & -3.3 & 0 & 0.3 & 0 & 1 & -2.47 \\ \hline 0 & 77.8 & 41.3 & 0 & 0 & 29.2 & 0.2 & 0.2 & 0.1 & 0 & -541.1 \end{array} \]Now we have a problem. We have kind of a basic feasible solution but there is a negative value in our b column. This is an absolutely no go. Anyway it makes absolutely sense. This means that we set oats still to 6.47 as we can see in the first row but we have to set a slack variable to -2.47 to reduce it by 2.47 oat portions to fulfill our constraint. This obviously doesn't solve our problem.
We can use the so called dual simplex pivot here. Which is basically the same thing as our normal pivot we just do it somehow on the transposed matrix. Instead of looking for a negative value in our objective row we look for a negative value on our right hand side. Using this as our leaving row and the entering column is the minimal absolute quotient of the objective row with the negative values on our leaving row. Here the negative values would be -9.9 in row \(x_2\) and -3.3 in row \(x_6\). The absolute quotients would be:
\[ \begin{array}{ccc} \frac{77.8}{9.9} & \approx & 7.86 \\ \frac{29.2}{3.3} & \approx & 8.85 \\ \end{array} \]So we try to get a 1 where -9.9 is and have \(x_2\) as in our basis. We get this tableau:
\[ \begin{array}{rrrrrrrrrr|r} x_1 &x_2 & x_3 & x_4 & x_5 & x_6 & s_1 & s_2 & s_3 & s_4 & b \\ \hline 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 4.0 \\ 0 & 0 & 0.16 & 1 & 0 & 0.24 & 0 & 0 & 0 & 0.01 & 2.56 \\ 0 & 0 & 0.14 & 0 & 1 & 0.37 & 0 & 0.02 & 0 & -0.22 & 2.61 \\ 0 & 1 & 0.35 & 0 & 0 & 0.33 & 0 & -0.03 & 0 & -0.10 & 0.25 \\ \hline 0 & 0 & 14.29 & 0 & 0 & 3.35 & 0.20 & 2.78 & 0.06 & 7.88 & -560.57 \end{array} \]We have a basic feasible solution and as we can see on our objective row... It's optimal. We only have 4 portions of oats so it looks like we achieved our goal.
Unfortunately we have to eat more cake now and we aren't allowed to eat more than one portion of it. Therefore we have to add a new line and a new slack column to our tableau.
Whatever I want to think about something else now. Last time we achieved some huge speed ups with the dual problem. Do we have the option to add a new constraint there as well?
This would be our starting tableau:
\[ \begin{array}{rrrrrrrrr|r} c_1 & c_2 & c_3 & s_1 & s_2 & s_3 & s_4 & s_5 & s_6 & b \\ \hline 110 & 4 & 2 & 1 & 0 & 0 & 0 & 0 & 0 & 25 \\ 205 & 32 & 12 & 0 & 1 & 0 & 0 & 0 & 0 & 130 \\ 160 & 13 & 54 & 0 & 0 & 1 & 0 & 0 & 0 & 85 \\ 160 & 8 & 285 & 0 & 0 & 0 & 1 & 0 & 0 & 70 \\ 420 & 4 & 22 & 0 & 0 & 0 & 0 & 1 & 0 & 95 \\ 260 & 14 & 80 & 0 & 0 & 0 & 0 & 0 & 1 & 98 \\ \hline -2000 & -55 & -800 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ \end{array} \]I used the column description \(c_1,c_2,c_3\) for our three constraints here. If we want to add a constraint to this one we normally add a new column and not a new row. After solving this tableau it would look like this:
\[ \begin{array}{rrrrrrrrr|r} c_1 & c_2 & c_3 & s_1 & s_2 & s_3 & s_4 & s_5 & s_6 & b \\ \hline 0 & 1 & 0 & 0.33 & 0 & 0 & 0.00 & -0.09 & 0 & 0.19\\ 0 & 0 & 0 & -9.87 & 1 & 0 & -0.14 & 2.15 & 0 & 77.76\\ 0 & 0 & 0 & -3.42 & 0 & 1 & -0.21 & 0.60 & 0 & 41.25\\ 0 & 0 & 1 & -0.01 & 0 & 0 & 0.00 & 0.00 & 0 & 0.12\\ 1 & 0 & 0 & -0.00 & 0 & 0 & -0.00 & 0.00 & 0 & 0.22\\ 0 & 0 & 0 & -3.28 & 0 & 0 & -0.28 & 0.35 & 1 & 29.18\\ \hline 0 & 0 & 0 & 6.47 & 0 & 0 & 2.60 & 2.08 & 0 & 541.10\\ \end{array} \]As we can see all lines changed and it doesn't really make sense to just normally add a new column here because the new added column would have to change in the same way the other columns changed during each gauss step. We could keep track of every step we did and apply it to the new column.
This obviously wouldn't give us any speed up.
There is also another more mathematical approach:
Mathematics isn't the friend of everyone but you're reading this post so far, right? ;)
\[ \begin{array}{cccc|c} x_1 & x_2 & \dots & x_n & b \\ \hline a_{11} & a_{12} & \dots & a_{13} & b_1 \\ a_{21} & a_{22} & \dots & a_{23} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} & b_m \\ \hline -c_1 & -c_2 & \dots & -c_n & z \end{array} \]This is the start of a maximum problem, where z would be 0 in the beginning. If we write it as short as possible it would look like this:
\[ \begin{array}{c|c} x & b \\ \hline A & b \\ \hline -c & z \end{array} \]Here A is a matrix as described in the last tableau, b and c are vectors and z is just a number.
Let's assume A includes the slack variables already and therefore we have a basic feasible solution. We still write it as the matrix above. Anyway we can also write it in a different way:
\[ \begin{array}{cc|c} x_B & x_N & b \\ \hline B & N & b \\ \hline -c_B & -c_N & z \end{array} \]Here \(B\) is the matrix for the slack variables or later on for the basic variables. \(x_N\) are the variables we are interested in (Non basic yet). \(x_B\) are the basic variables (at the moment the slack variables). I'll use \(B\) on the most left part as they do it here. It's a good course about the simplex algorithm and you'll find a lot of stuff which I'll not mention here. Anyway I mention some stuff here which I find interesting which aren't mention there and my version is more compact in some way. I recommend to read both versions ;)
Now during the simplex method we do some gaussian elimination stuff and therefore we get another basic feasible solution and so on but we never really change the tableau. We only use elementary row operations.
If we look at our tableau in a different stage it would still look like this:
\[ \begin{array}{cc|c} x_B & x_N & b \\ \hline I & \tilde{N} & \tilde{b} \\ \hline 0 & \tilde{-c_N} & \tilde{z} \end{array} \]\(\tilde{x}\) means that this is the new value of the corresponding variable. We want to describe this new values in form of the values before to have a step from one BFS to another.
We start with \(Ax=b\). This is the most basic form we can describe the tableau normally. Then we added \(B\) and \(N\). Therfore it looks like this now:
\[ Ax = Bx_B+Nx_N = b \]In the other form:
\[ x_B + \tilde{N}x_N = \tilde{b} \]To get to this form we can multiply the first equation by \(B^{-1}\).
\[ B^{-1}Ax = x_B+B^{-1}Nx_N = B^{-1}b \]Therefore:
\[ \tilde{N} = B^{-1}N \\ \tilde{b} = B^{-1}b \]What about \(\tilde{-c_N}\) and \(\tilde{z}\) ?
To compute z we have to remember that we always start our last row (the objective row) using \(-c\) and \(z=0\). Why are we doing this or how is this really working. Actually the tableau doesn't look like this:
\[ \begin{array}{c|c} x & b \\ \hline A & b \\ \hline -c & z \end{array} \] \[ \begin{array}{cc|c} z & x & b \\ \hline 0 & A & b \\ \hline 1 & -c & z \end{array} \]it more looks like this. That's the reason why z is 0 at the beginning to have the "constraint":
\[ 1-c = 0 \]where \(c\) is the objective function. Therefore it holds that:
\[ z = c_Bx_B+c_Nx_N \]Now we had \(x_B + \tilde{N}x_N = \tilde{b}\) so \(x_B = -\tilde{N}x_N + \tilde{b}\). Consequently:
\[ \begin{array}{ccl} z & = & c_B(-\tilde{N}x_N + \tilde{b})+c_Nx_N \\ & = & c_B(-B^{-1}Nx_N + B^{-1}b)+c_Nx_N \\ & = & c_BB^{-1}b+(c_N-B^{-1}N)x_N \\ \end{array} \]\(x_N = 0\) so that \(z=c_BB^{-1}b\) but \(c_N-B^{-1}N\) is \(\tilde{c_N}\).
This gives us the tableau:
\[ \begin{array}{ccc|c} z & x_B & x_N & b \\ \hline 0 & I & B^{-1}N & B^{-1}b \\ \hline 1 & 0 & -c_N+c_BB^{-1}N & c_BB^{-1}b \end{array} \]Furthermore \(A=IN\) therfore \(A=N\) so that it's simply:
\[ \begin{array}{cc|c} z & x & b \\ \hline 0 & B^{-1}A & B^{-1}b \\ \hline 1 & -c+c_BB^{-1}A & c_BB^{-1}b \end{array} \]We use the matrix \(B^{-1}\) quite often as well as \(c_BB^{-1}\). Therefore we set:
\[ \tilde{y} = c_BB^{-1} \tilde{S} = B^{-1} \]to get
\[ \begin{array}{cc|c} z & x & b \\ \hline 0 & \tilde{S}A & \tilde{S}b \\ \hline 1 & -c+\tilde{y}A & \tilde{y}b \end{array} \]We can also split it into the form of our real variables and the slack variables to get:
\[ \begin{array}{ccc|c} z & x & s & b \\ \hline 0 & \tilde{S}A & \tilde{S} & \tilde{S}b \\ \hline 1 & -c+\tilde{y}A & \tilde{y} & \tilde{y}b \end{array} \]Maybe while you were totally suck into the mathematical part... You forgot why we actually did this. We still want to be able to add a column to our solved tableau.
We can describe our mathematical part using these equations:
\[ \begin{array}{ccccc} \tilde{b} &= & \tilde{S}b & \quad & (1) \\ \tilde{A} &= & \tilde{S}A & \quad & (2) \\ \tilde{c} &= & c-\tilde{y}A & \quad & (3) \\ \tilde{z} &= & \tilde{y}b & \quad & (4) \\ \end{array} \]This is our solved tableau:
\[ \begin{array}{rrrrrrrrr|r} c_1 & c_2 & c_3 & s_1 & s_2 & s_3 & s_4 & s_5 & s_6 & b \\ \hline 0 & 1 & 0 & 0.33 & 0 & 0 & 0.00 & -0.09 & 0 & 0.19\\ 0 & 0 & 0 & -9.87 & 1 & 0 & -0.14 & 2.15 & 0 & 77.76\\ 0 & 0 & 0 & -3.42 & 0 & 1 & -0.21 & 0.60 & 0 & 41.25\\ 0 & 0 & 1 & -0.01 & 0 & 0 & 0.00 & 0.00 & 0 & 0.12\\ 1 & 0 & 0 & -0.00 & 0 & 0 & -0.00 & 0.00 & 0 & 0.22\\ 0 & 0 & 0 & -3.28 & 0 & 0 & -0.28 & 0.35 & 1 & 29.18\\ \hline 0 & 0 & 0 & 6.47 & 0 & 0 & 2.60 & 2.08 & 0 & 541.10\\ \end{array} \]Again we want to add the oats constraint into it. Don't eat more than 4 portions of oats!
\[ \begin{array}{rrrrrrrrrr|r} c_1 & c_2 & c_3 & c_4 & s_1 & s_2 & s_3 & s_4 & s_5 & s_6 & b \\ \hline 0 & 1 & 0 & -1 & 0.33 & 0 & 0 & 0.00 & -0.09 & 0 & 0.19\\ 0 & 0 & 0 & 0 & -9.87 & 1 & 0 & -0.14 & 2.15 & 0 & 77.76\\ 0 & 0 & 0 & 0 & -3.42 & 0 & 1 & -0.21 & 0.60 & 0 & 41.25\\ 0 & 0 & 1 & 0 & -0.01 & 0 & 0 & 0.00 & 0.00 & 0 & 0.12\\ 1 & 0 & 0 & 0 & -0.00 & 0 & 0 & -0.00 & 0.00 & 0 & 0.22\\ 0 & 0 & 0 & 0 & -3.28 & 0 & 0 & -0.28 & 0.35 & 1 & 29.18\\ \hline 0 & 0 & 0 & 4 & 6.47 & 0 & 0 & 2.60 & 2.08 & 0 & 541.10\\ \end{array} \]We just added the column \(c_4\) here. Now we have to update our tableau it a way that it still fulfills our constraints of the diet problem. By adding a new column we kind of destroyed everything we achieved before.
We changed \(A\) and we changed \(c\) therefore we have to apply the equations \((2)\) and \((3)\).
\[ \begin{array}{ccccc} \tilde{A} &= & \tilde{S}A & \quad & (2) \\ \tilde{c} &= & c-\tilde{y}A & \quad & (3) \\ \end{array} \]The changed part of \(A\) is:
\[ \begin{pmatrix} -1 \\ 0 \\ 0 \\ 0 \\ 0 \\ \end{pmatrix} \]and \(\tilde{y}\) is:
\[ \begin{pmatrix} 6.47 & 0 & 0 & 2.60 & 2.08 & 0 \end{pmatrix} \]and \(\tilde{S}\) is:
\[ \begin{pmatrix} 0.33 & 0 & 0 & 0.00 & -0.09 & 0\\ -9.87 & 1 & 0 & -0.14 & 2.15 & 0 \\ -3.42 & 0 & 1 & -0.21 & 0.60 & 0 \\ -0.01 & 0 & 0 & 0.00 & 0.00 & 0 \\ -0.00 & 0 & 0 & -0.00 & 0.00 & 0 \\ -3.28 & 0 & 0 & -0.28 & 0.35 & 1 \\ \end{pmatrix} \]To compute the new column we have to compute:
\[ \begin{pmatrix} 0.33 & 0 & 0 & 0.00 & -0.09 & 0\\ -9.87 & 1 & 0 & -0.14 & 2.15 & 0 \\ -3.42 & 0 & 1 & -0.21 & 0.60 & 0 \\ -0.01 & 0 & 0 & 0.00 & 0.00 & 0 \\ -0.00 & 0 & 0 & -0.00 & 0.00 & 0 \\ -3.28 & 0 & 0 & -0.28 & 0.35 & 1 \\ \end{pmatrix} \begin{pmatrix} -1 \\ 0 \\ 0 \\ 0 \\ 0 \\ \end{pmatrix} = \begin{pmatrix} -0.33 \\ 9.87 \\ 3.42 \\ 0.01 \\ 0.00 \\ 3.28 \end{pmatrix} \]We also compute the new \(c\) value using \(c-\tilde{y}A\).
\[ -4- \begin{pmatrix} 6.47 & 0 & 0 & 2.60 & 2.08 & 0 \end{pmatrix} \begin{pmatrix} -1 \\ 0 \\ 0 \\ 0 \\ 0 \\ \end{pmatrix} = -4+6.47 = 2.47 \]Therefore our new value is \(-2.47\) as we always have \(-c\) in our tableau.
The new tableau is:
\[ \begin{array}{rrrrrrrrrr|r} c_1 & c_2 & c_3 & c_4 & s_1 & s_2 & s_3 & s_4 & s_5 & s_6 & b \\ \hline 0 & 1 & 0 & -0.33 & 0.33 & 0 & 0 & 0.00 & -0.09 & 0 & 0.19\\ 0 & 0 & 0 & 9.87 & -9.87 & 1 & 0 & -0.14 & 2.15 & 0 & 77.76\\ 0 & 0 & 0 & 3.42 & -3.42 & 0 & 1 & -0.21 & 0.60 & 0 & 41.25\\ 0 & 0 & 1 & 0.01 & -0.01 & 0 & 0 & 0.00 & 0.00 & 0 & 0.12\\ 1 & 0 & 0 & 0.00 & -0.00 & 0 & 0 & -0.00 & 0.00 & 0 & 0.22\\ 0 & 0 & 0 & 3.28 & -3.28 & 0 & 0 & -0.28 & 0.35 & 1 & 29.18\\ \hline 0 & 0 & 0 & -2.47 & 6.47 & 0 & 0 & 2.60 & 2.08 & 0 & 541.10\\ \end{array} \]We can see in our objective row that this tableau isn't solved yet therefore we have to make at least one more pivot step.
After we run our pivot algorithm until it's solved the tableau looks like the following:
\[ \begin{array}{rrrrrrrrrr|r} c_1 & c_2 & c_3 & c_4 & s_1 & s_2 & s_3 & s_4 & s_5 & s_6 & b \\ \hline 0 & 1 & 0 & 0 & 0 & 0.03 & 0 & -0.00 & -0.02 & 0 & 2.78\\ 0 & 0 & 0 & 1 & -1 & 0.10 & 0 & -0.01 & 0.22 & 0 & 7.88\\ 0 & 0 & 0 & 0 & 0 & -0.35 & 1 & -0.16 & -0.15 & 0 & 14.29\\ 0 & 0 & 1 & 0 & 0 & -0.00 & 0 & 0.00 & -0.00 & 0 & 0.06\\ 1 & 0 & 0 & 0 & 0 & -0.00 & 0 & -0.00 & 0.00 & 0 & 0.20\\ 0 & 0 & 0 & 0 & 0 & -0.33 & 0 & -0.24 & -0.37 & 1 & 3.35\\ \hline 0 & 0 & 0 & 0 & 4 & 0.25 & 0 & 2.57 & 2.61 & 0 & 560.57\\ \end{array} \]We can do the same stuff for our cake constraint. I hope you enjoyed the mathematical part or at least think that you got a better understanding of the whole structure. I'm currently working on making the Simplex algorithm (ok my implementation of it :D) more accurate.
You might think: What? Why is it not accurate? Well computers aren't accurate.
Open your console and type python
to get into the python console. Now type round(2.675,2)
. This should round \(2.675\) to two decimal places.
Give it a go! I'll talk about it next time.
You can have a look at how I implemented everything at my github repository. Yes a lot of stuff have changed since last time. I'm sorry about that.
Unfortunately this series came to an end as I weren't able to solve some of the problems maybe I should give it a go another time.
If you're interested in another series check out my posts on constraint programming: Solving Sudokus again but also much more, faster and in depth: Constraint Programming Solver from Scratch.
Thanks for reading.
If you've enjoyed this and some other posts on my blog I would appreciate a small donation either via PayPal or Patreon whereas the latter is a monthly donation you can choose PayPal for a one time thing. For the monthly donation you'll be able to read a new post earlier than everyone else and it's easier to stay in contact.