I’m sure most of my readers have heard of Dr. Fill, the Matt Ginsberg solving computer program that competed in this past ACPT. Now Matt is an artificial intelligence expert, so I’m sure Dr. Fill does about as well as a computer program could at solving crosswords. My question is: how hard would it be to write a computer program that would give about 80% of the solving capability of Dr. Fill? And to make it especially easy on ourselves, let’s presume we already have a large database of crossword clues and entries, and a relatively fast, effective way of ranking entries with a clue and a letter pattern (e.g. given [Melodic passages] and the letter pattern ?R???? it would return ARIOSI with score 19 and ARIOSO/ARIOSE with score 10). Since I have this setup already, I thought this might be a good starting point. 🙂
What would the algorithm look like from this point? Here are my initial thoughts.
I think it would need to be a recursive/iterative algorithm. At each step we have a partially filled-in grid. The algorithm would:
- Go to each position in the grid and apply the above process to get possibilities for that position. If no possibilities are returned, simply select all possibilities that fit and assign them a low score.
- Take the highest-scoring match and place it in the grid.
- Repeat with the new grid.
If we get to a point where there are no possibilities we have to backtrack. That subroutine would go to (I guess) the most recently filled-in answer, remove it and go to the next possibility. Meanwhile, once the grid is completed we stop, unless there’s some way to “check” the grid.
Is there a better way to do this, or some tweaks to the above I can add? I kind of want to do this to see how well it would perform.
March 25, 2012 at 11:31 pm
Your algorithm is reasonable. I assume that you would include in the “no-possibilities” subroutine a mechanism that, under certain conditions, allows non-recognized letter strings (theme entries and other database holes) rather than backtracking.
I admire the algorithm design of Dr. Fill and other clue-and-answer-solvers, but I’m even more impressed by the amount of data mining required to make such a computer program viable in, say, the ACPT environment. I’m excited to see how your data list fares in crossword solving.
Pingback: FillBot Jr.: Initial Results | Box Relatives