Creation date: 2017-03-28
Last time we solved Sudokus. You missed it? Definitely check it out before you read this one. It's here!
Was it simple? Not really, because we had to build everything up. At least for me it was a lot of fun and I learnt much stuff during the coding part and even more while I wrote it in my blog.
Therefore, I decided to give it another try and solve Str8ts this time. I don't really know whether they are harder than Sudokus for humans but they are definitely harder for my constraint programming model. I thought it will be easy but it wasn't that easy. Let's start: What is a Str8ts?
First of all it is quite similar to a Sudoku. It is a 9x9 grid and you should fill the white spaces with the digits 1-9. Now there are black squares which divides the Str8ts into several parts. Between two black squares or a black square and the border (in a row or column) there should be a straight. Each row and each column has a alldifferent
constraint which is the same as for the Sudoku.
In the given example above we have a lot of digits already given and it is quite an easy Str8ts which is nice for explaining. One of the straights is marked here in blue.
In this three fields there should be a straight because it is blocked by one black square on the left and one on the right. We can construct a straight here by using one of the digits 6 or 9. In the row there is already a 6 so we can add a 9 there. This would result in an 8 on top of it to build the next straight and so on.
You can't wait to dive deep into the code right? We can simply find all the straights and add a new constraint of the type straight
which we define later on. Oh no wait... How do we define our grid now? We have to know where the black squares are.
# parse the input file
grid, grid_c = parse_input(sys.argv[1])
grid = np.array(grid)
grid_c = np.array(grid_c,dtype=bool)
This time we save all the Str8ts in files to make our code structure a bit easier. The Str8ts shown above will be represented as this in our file:
0! 5! 0. 0. 0. 0! 3. 0. 0!
0. 7. 0. 0. 6. 4. 2. 0. 0!
0. 9. 0. 0! 0! 0. 4. 5. 0.
0! 8. 0. 0! 4. 0. 1! 0. 3.
0! 0! 4. 0. 5. 0. 0. 0! 0!
0. 0. 0! 4. 0. 0! 0. 0. 0!
1. 0. 0. 0. 0! 8! 0. 0. 0.
9! 0. 0. 0. 0. 0. 0. 7. 0.
0! 0. 0. 0! 0. 7. 8. 0! 6!
So as before we have the digit 0 for empty fields. Now we also have a .
or a !
after each digit. The .
marks that it's a white field each black field has a !
.
To parse the file we use:
def parse_input(filename):
# input file line looks like
# 0! 5! 0. 0. 0. 0! 3. 0. 0!
# ! means black field . means white field
# 0 means blank
grid = []
grid_c = []
with open('./data/'+filename) as f:
for line in f:
split_line = line.split(' ')
grid_line = []
grid_c_line = []
for part in split_line:
grid_line.append(int(part[0]))
# ! means black => add a 1 in the color array
grid_c_line.append(1 if part[1] == "!" else 0)
grid.append(grid_line)
grid_c.append(grid_c_line)
return grid, grid_c
The idea is to have two grids. The grid
is the same as before and grid_c
is the color grid which is boolean. 1 if black 0 if white.
Okay now we can find all the straights and add new constraints:
# find all straights (between black boxes)
# per row
for r in range(len(grid_c)):
straights = []
c_straight = []
for c in range(len(grid_c[r])):
if grid_c[r][c]:
if len(c_straight):
straights.append(c_straight)
c_straight = []
else:
c_straight.append(c)
if len(c_straight):
straights.append(c_straight)
So for each row we define a list of straights and the current straight as empty. If the current field is black we check whether we have a straight if yes we add the current straight to the list of all straights. If the current field is white we can simply add the index to the array c_straight
. At the end we have to check if there is a straight missing (if the last field is not black) Then we add it to straights
. Now for every straight we subscribe to the indices of the straight and add the constraint straight
.
for straight in straights:
idx = np.full(grid.shape, False, dtype=bool)
idx[r,straight] = True
model.subscribe({'idx':idx},model.check_constraint,{'idx':idx},"straight")
We do the same stuff for all the columns. That part is just copy and paste. Before we define the straight
constraint I want to mention that I changed check_constraint
function a little bit so that every constraint type has it's own function.
def check_constraint(self,opts,operator):
ss_idx = opts['idx']
values = self.search_space[ss_idx]
breaked = False
for v in values:
if 'values' in v:
breaked = True
break
if not breaked:
return
try:
if operator == "notInSet":
new_knowledge, new_possible = self.not_in_set(ss_idx,values,opts['against'])
elif operator == "alldifferent":
new_knowledge, new_possible = self.alldifferent(ss_idx,values)
elif operator == "straight":
new_knowledge, new_possible = self.straight(ss_idx,values,opts)
except InfeasibleError as e:
raise e
# update changed and search space
old_changed = self.changed.copy()
self.changed[ss_idx] = new_knowledge
self.changed = np.logical_or(self.changed,old_changed)
self.search_space[ss_idx] = new_possible
At the beginning we check if there are unknown values. If not we don't have to do anything. Then we have a switch
for each operator. At the moment I use the operators notInSet
,alldifferent
and straight
. Stop here for a moment to think about our straight
constraint... It's nice to know what the smallest number of the straight can be and what the biggest number. Let's do that first before we think about more stuff.
def straight(self, ss_idx, values, opts):
"""
the straight constraint
ss_idx: the indices where there has to be a straight
values: the actual values on these indices
opts: all options
"""
fixed_values = []
min_val = self.highest_value
max_val = self.lowest_value
len_values = len(values)
In our example the highest value
is set to 9 and the lowest value
is set to 1. This is done in build_search_space
if you want to have a look. I can't explain every line in this blog because it's just to massive but everything is on GitHub and explain all of the more complicated stuff here.
# build and array of already fixed values
# and get the highest and lowest fixed value
for i in range(len(values)):
if 'value' in values[i]:
fixed_values.append(values[i]['value'])
if values[i]['value'] > fixed_max_val:
fixed_max_val = values[i]['value']
if values[i]['value'] < fixed_min_val:
fixed_min_val = values[i]['value']
We just get all values which are already fixed if any and the maximum and minimum fixed value. If there is no fixed value:
min_val = self.lowest_value
max_val = self.highest_value
for i in range(len(values)):
if 'values' in values[i]:
c_min = min(values[i]['values'])
c_max = max(values[i]['values'])
else:
c_min = values[i]['value']
c_max = values[i]['value']
if c_max+len_values-1 < max_val:
max_val = c_max+len_values-1
if c_min-len_values+1 > min_val:
min_val = c_min-len_values+1
lowest_possible = min_val
highest_possible = max_val
The lowest possible value is defined as the overall maximum of the minimal value in each position minus the length of the straight plus 1. Let's assume we have a straight of length 3 and the possible values are [[1,2,3],[2,3,4],[5,7]]
. The maximum of the minimal values is 5 and the minimum of the maximum values is 3. Therefore we have a possible range between 5-3+1=3
and 3+3-1=5
. That would define the possible values to [[3],[3,4],[5]]
which looks pretty promising. If we already have a fixed value it looks like the following:
fixed_lowest_possible = max(fixed_max_val-len_values+1,self.lowest_value)
lowest_possible = max(lowest_possible,fixed_lowest_possible)
fixed_highest_possible = min(fixed_min_val+len_values-1,self.highest_value)
highest_possible = min(highest_possible,fixed_highest_possible)
Now let's remove all the values which aren't possible anymore:
new_possible = [False]*len_values
new_knowledge = [False]*len_values
arr_of_values = []
i = 0
for e in values:
if 'value' in e:
arr_of_values.append(np.array([e['value']]))
else:
c_new_values = [x for x in e['values'] if lowest_possible <= x <= highest_possible]
arr_of_values.append(np.array(c_new_values))
if len(c_new_values) < len(e['values']):
new_knowledge[i] = True
i += 1
We define an array of values arr_of_values
where we add the possible values and update new_knowledge
if something changed. Actually this is enough for the easy Str8ts mentioned above. This is the solution:
Okay I told you at the beginning that Str8ts isn't that easy... right? Let's have a look at a really hard one:
Here we have really not that many digits given and a lot of really long straights. That's what we get with our current approach:
This reminds us a bit about the hard Sudoku. Let's think about it for a moment. Maybe we can try to eliminate some more stuff by simply generating all possible straights and check if we can remove some more values. Unfortunately this is exactly what we have to do. At least that was the best option I found. It is quite cost expensive...
straights, new_values, def_used = self.get_straights(arr_of_values)
We want to have all straights, the new possible values and the definitely used values. Actually the definitely used values are the only important ones on this problem but I didn't find a way to get them faster. Hold on for a second... Why do we need the digits that are definitely used? Well because in one row or column there might be two straights and they are linked by the alldifferent
constraint so if a digit is definitely used in one straight (even if we don't know at which position) we can't use it anywhere else in that row or column.
Let's get all straights:
def get_straights(self,arr_of_values,len_straight=False,possible=False,used=False,
used_pattern=False,all_straights=False,def_used=False):
"""
get all straights for an array of values
eg. [[4], [1, 2, 3, 5, 7, 8, 9], [1, 2, 5, 6, 8, 9], [1, 3, 5, 6, 8, 9], [2, 3, 5, 6, 7, 8, 9]]
return all_straights, new_values and definitely used values
all_straights: [[4, 1, 2, 3, 5], [4, 1, 2, 5, 3], ... [4, 8, 5, 6, 7], [4, 8, 6, 5, 7]]
new_values: [{4}, {1, 2, 3, 5, 7, 8}, {8, 1, 2, 5, 6}, {8, 1, 3, 5, 6}, {2, 3, 5, 6, 7, 8}]
def_used: {1: False, 2: False, 3: False, 4: True, 5: True, 6: False, 7: False, 8: False, 9: False}
"""
This will be our function. There are a lot of parameters and at the beginning we only use a single one. The other ones are simply used during recursion. len_straight
is the length of the straight, in possible
we save all the values which are possible, used
consists of all used or not possible values in the current straight, used_pattern
is the current straight we are building. all_straights
is just a list for all possible straights and the definitely used digits are saved in def_used
.
We want to return straights, new_values, def_used
. straights
is the list of straights, new_values
is the same as possible
and def_used
is as above.
Let's assume we have the following value for arr_of_values
:
[[4], [1, 2, 3, 5, 7, 8, 9], [1, 2, 5, 6, 8, 9], [1, 3, 5, 6, 8, 9], [2, 3, 5, 6, 7, 8, 9]]
Then all_straights
will look like this:
[[4, 1, 2, 3, 5], [4, 1, 2, 5, 3], ... [4, 8, 5, 6, 7], [4, 8, 6, 5, 7]]
new_values
:
[{4}, {1, 2, 3, 5, 7, 8}, {8, 1, 2, 5, 6}, {8, 1, 3, 5, 6}, {2, 3, 5, 6, 7, 8}]
and def_used
:
{1: False, 2: False, 3: False, 4: True, 5: True, 6: False, 7: False, 8: False, 9: False}
The last one just means that 4 and 5 are used in every straight.
We define the parameters mentioned above for the first call:
if not used:
used = self.array_to_obj(self.possible_range,False)
if not used_pattern:
used_pattern = []
if not all_straights:
all_straights = []
if not len_straight:
len_straight = len(arr_of_values)
if not possible:
possible = [set() for x in range(len(arr_of_values))]
if not def_used:
# assume every digit has to be used
def_used = self.array_to_obj(self.possible_range,True)
used
will be defined to an object similar to def_used
just that every value is set to False
. We have no straights and no first straight yet. The length of the straight is defined as the length of arr_of_values
. The new possible values are defined as an empty set at the beginning. At the end we assume that every digit has to be used and we will update that later on.
# take the first entry and iterate over the possible values
# for each build a new straight (recursive)
for value in arr_of_values[0]:
# only use this value if it's not already part of the straight
if not used[value]:
copy_used = dict(used)
copy_used[value] = True
That's mostly explained in the comments. We also say that values which aren't possible are already used in our straight, that isn't true but makes it easier to use.
# set all values which aren't possible anymore to used
# for example if the straight has a length of 2 and one value is 2
# only 1 and 3 are possible but not 2 and not 4-...
for i in range(self.lowest_value,max(self.lowest_value,value-len_straight+1)):
copy_used[i] = True
for i in range(min(self.highest_value+1,value+len_straight),self.highest_value+1):
copy_used[i] = True
Then we add the current value to our pattern and use a recursive call:
copy_used_pattern = used_pattern[:]
copy_used_pattern.append(value)
# if there are 2 or more values left
if len(arr_of_values) > 1:
all_straights,possible,def_used = self.get_straights(arr_of_values[1:],len_straight,possible,copy_used,copy_used_pattern,all_straights,def_used)
else:
If we don't need a recursive call because we only have one value left:
# add the used pattern to all the straights
all_straights.append(copy_used_pattern)
# check if there are values which are not used in the straight
not_used = [x for x in self.possible_range if x not in copy_used_pattern]
for nu in not_used:
def_used[nu] = False
xi = 0
# every digit which is used is possible for the defined position
for x in copy_used_pattern:
possible[xi].add(x)
xi += 1
If a value is not used in a straight we know that it will not be used in every straight so we delete it from def_used
and if a value is used we add it to possible
.
Afterwards we just have to return it:
return all_straights, possible, def_used
Let's go back to our call:
straights, new_values, def_used = self.get_straights(arr_of_values)
if len(straights) == 0:
raise InfeasibleError("No straight possible")
If we try to solve our Str8ts now it actually doesn't do something else so we weren't able to reduce the new_values
array in a reasonable fashion.
Okay, I talked about linking our two constraints together, right? When do we have to link two constraints? We can link two straights constraints if they are in the same row or column. More abstract we can link them together when they are both part of an alldifferent
constraint.
Before we solve the model we simply add the following line:
model.link()
Unfortunately we have to code that function as well :D
def link(self):
"""
Links different constraint types together
"""
for si in range(len(self.subscribe_list_on)):
on = self.subscribe_list_on[si]
args = self.subscribe_list_func[si][1]
func = args[1]
# if a part is a straight and part of alldifferent
# get all other straights and link them together
# => the straight constraints considers what to do with it
if func == "straight":
temp_links = self.get_linked_idx(si,"alldifferent")
links = []
for t in temp_links:
links.extend(self.get_linked_idx(t,"straight"))
real_links = []
for l in links:
if not np.any(np.logical_and(self.subscribe_list_on[si],self.subscribe_list_on[l])):
real_links.append(l)
What this function does is iterating over every constraint, check if the operator is straight
and if it is we get a list of all alldifferent
constraint to which it is linked:
temp_links = self.get_linked_idx(si,"alldifferent")
After that we do basically the same by iterating of the alldifferent
functions and check to which straight
constraint they are linked. There the current straight
constraint is obviously part of it so we want to get only the constraints where the linked constraint is totally different from the current one. At the end we just add it to the argument options of the current straights
constraint.
self.subscribe_list_func[si][1][0]['links'] = real_links
We called the function get_linked_idx
several times so let's have a short look on that function as well.
def get_linked_idx(self,linked_to,constraint_type):
on = self.subscribe_list_on[linked_to]
nof_true_values = np.sum(on)
links = []
for si in range(len(self.subscribe_list_on)):
args = self.subscribe_list_func[si][1]
func = args[1]
if func == constraint_type:
sum_of_true = np.sum(np.logical_and(self.subscribe_list_on[si],on))
if sum_of_true == nof_true_values or sum_of_true == np.sum(self.subscribe_list_on[si]):
links.append(si)
return links
We iterate again over the list of constraints and wait for the correct constraint type. Then we sum up the number of indices were both constraints are relevant and if that sum is the sum of all true values in either the current constraint or the possible linked on then we link them together.
In the straights
function we add the following:
if 'links' in opts and len(opts['links']):
true_keys = get_true_keys_in_dict(def_used)
if len(true_keys):
for l in opts['links']:
self.check_constraint({'idx':self.subscribe_list_on[l],'against':set(true_keys)},"notInSet")
Let's solve:
At least we found a total number of 24 values. Juhuuuuu! Unfortunately it doesn't even look nearly solved, right? Can you as a human solve it? Actually I wasn't able to.
It is time for the last option: BACKTRACKING!
This is our solve
function:
def solve(self,idx,no_val=0):
try:
self.fire(np.full(self.search_space.shape, True, dtype=bool))
if not self.solved(idx):
print("Need to backtrack")
print("found %d values" % self.nof_found)
print("%d values are missing" % self.nof_missing)
feasible,counter = self.backtrack(idx)
print("Number of backtracks", counter)
if not feasible:
raise InfeasibleError("Infeasible checked backtracking")
except InfeasibleError as e:
print(e)
exit(2)
Most of it are print
statements so the real stuff is just
feasible,counter = self.backtrack(idx)
We basically want to know whether it's feasible and just for the statistics how many backtracks we needed. I'll actually count how deep we have to backtrack...
The backtrack function looks like this:
def backtrack(self,solved_idx,counter_bt=0):
counter_bt += 1
# save first search_space
copy_search_space = np.copy(self.search_space)
# get candidate list
candidate_list = [self.get_backtrack_candidates()]
# while candidates in list
while candidate_list:
candidate_entry = candidate_list.pop(0)
cands = candidate_entry['cands']
row, col = candidate_entry['row'],candidate_entry['col']
for cand in cands:
self.search_space = np.copy(copy_search_space)
self.search_space[row][col] = {'value':cand}
try:
idx = np.full(self.search_space.shape, False, dtype=bool)
idx[row,col] = True
self.fire(idx)
if not self.solved(solved_idx):
feasible, counter = self.backtrack(solved_idx,counter_bt)
if feasible:
return True,counter
else:
return True,counter_bt
except InfeasibleError as e:
continue
return False, counter_bt
We never want to loose what our current search space is therefore we copy it. Then we have a look what a backtrack candidate is. This is simply a field and a list of values this field can have. Maybe it's worth to pick a nice candidate but at the moment it's simply the first one with the lowest number of possible values.
While we have a candidate in our list we just check in which row and column this entry is and afterwards we pick the first value and just set it as fixed.
self.search_space = np.copy(copy_search_space)
self.search_space[row][col] = {'value':cand}
It's as simple as that. We just have to tell our constraints that a value has changed and fire everything which is affected by that. If it is solved now we can return that it is feasible and how many backtracks we had to do to find the solution.
Okay normally it isn't solved directly so we just call the function again:
feasible, counter = self.backtrack(solved_idx,counter_bt)
if feasible:
return True,counter
Sometimes it is also possible that we picked the wrong value and therefore it is now infeasible which doesn't mean that the whole problem is infeasible so we can continue.
At the beginning we had to get a backtrack candidate and this is done by this function:
def get_backtrack_candidates(self):
nof_values = sys.maxsize
best_id = [0,0]
for r in range(len(self.search_space)):
for c in range(len(self.search_space[r])):
if 'values' in self.search_space[r][c] and self.backtracking_space[r][c] == False:
l_nof_values = len(self.search_space[r][c]['values'])
if l_nof_values < nof_values:
nof_values = l_nof_values
best_id = [r,c]
cands = self.search_space[best_id[0]][best_id[1]]['values']
self.backtracking_space[best_id[0]][best_id[1]] = True
return {'row':best_id[0],'col':best_id[1],'cands':cands}
For every field in our search space we check if the value isn't fixed and if we didn't use this candidate already for backtracking:
if 'values' in self.search_space[r][c] and self.backtracking_space[r][c] == False:
Then we get the field with the minimum number of values.
We did it!
It took the program about 15 seconds. What the... Yeah exactly it is slow. It should be faster with a computer to solve this, right?
Okay why is it so slow? It takes an enormous amount of time to produce all these possible straights. Can we do better?
Well if we have a straight of length 9 it is just unnecessary to get all the definitely used values because we use every value and it's also not reasonable because we have no linking connection.
At the beginning we can just reduce the number of function calls by just saying that we only want to compute all straights if there aren't that many possible. Therefore we add this to our function straight
before we call the get_straights
function.
estimated_nof_straights = np.prod(np.array([len(x) for x in arr_of_values]))
if estimated_nof_straights > 10000:
return new_knowledge, self.list2values_structure(arr_of_values)
Then we change the link
part to
if 'links' in opts and len(opts['links']):
straights, new_values, def_used = self.get_straights(arr_of_values)
if len(straights) == 0:
raise InfeasibleError("No straight possible")
true_keys = get_true_keys_in_dict(def_used)
if len(true_keys):
for l in opts['links']:
self.check_constraint({'idx':self.subscribe_list_on[l],'against':set(true_keys)},"notInSet")
else:
return new_knowledge, self.list2values_structure(arr_of_values)
This means that we only compute the straights if we have a linking constraint otherwise we use the current best values we have. list2values_structure
is defined as:
def list2values_structure(self,list_of_values):
values_struct = []
for l in list_of_values:
if len(l) == 1:
values_struct.append({'value':l[0]})
else:
values_struct.append({'values':l})
return values_struct
That's it. This solves the puzzle in about 3 seconds. There are probably ways to speed things up by not generating all the straights and I have some ideas for it but this entry is long enough. I'll write an update if I am able to improve the speed.
And as always feel free to see the full code on GitHub
Of course don't hesitate to make comments or donate ;)
If you've enjoyed this and some other posts on my blog I would appreciate a small donation either via PayPal or Patreon whereas the latter is a monthly donation you can choose PayPal for a one time thing. For the monthly donation you'll be able to read a new post earlier than everyone else and it's easier to stay in contact.
Reading this post and hopefully the one about Sudoku I assume you might be interested in:
Solving the Megaron-Cube
The Simplex method series about linear programming