While I’m at it, I might as well write up the problem that inspired the last post. Here it is:
Flip a coin until you get two consecutive heads, then stop. Let X be the number of flips it took. Then flip another coin until you get three consecutive heads, and let Y be the total number of flips there. Find Pr(X>Y).
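For anyone who wants a numerical sanity check before working the problem by hand, here is a minimal sketch. It assumes a fair coin, that the first stopping rule means two heads in a row, and that the two experiments are independent. The function names and the cutoff `n_max` are made up for illustration; the cutoff just truncates tails that are negligibly small.

```python
def run_length_pmf(k, n_max, p=0.5):
    """pmf[n] = Pr(the first run of k heads is completed on flip n), n = 0..n_max."""
    # state[j] = probability of having j trailing heads so far without having stopped
    state = [0.0] * k
    state[0] = 1.0
    pmf = [0.0] * (n_max + 1)
    for n in range(1, n_max + 1):
        new_state = [0.0] * k
        for j, mass in enumerate(state):
            new_state[0] += mass * (1 - p)      # tails resets the run
            if j + 1 == k:
                pmf[n] += mass * p              # heads completes the run
            else:
                new_state[j + 1] += mass * p    # heads extends the run
        state = new_state
    return pmf


def prob_x_greater_than_y(kx=2, ky=3, n_max=400):
    """Approximate Pr(X > Y) for independent X (run of kx heads) and Y (run of ky heads)."""
    px = run_length_pmf(kx, n_max)
    py = run_length_pmf(ky, n_max)
    total = 0.0
    tail_x = 1.0                                # becomes Pr(X > y) inside the loop
    for y in range(n_max + 1):
        tail_x -= px[y]
        total += py[y] * tail_x
    return total


if __name__ == "__main__":
    print(prob_x_greater_than_y())
```

This only gives a floating-point approximation, not a closed form, but it is handy for checking whatever exact answer you derive.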
I stumbled across a result yesterday that I thought was really cool. I went to check the googles to find out if it was a very well-known result, but I couldn’t find it proved anywhere, so I thought I’d show it here.
If you are afraid of math, this is your cue to exit. See you at my next post!
The Crosswords LA 2012 puzzles are here! Get them while they’re hot!
CROSSWORDS LA 2012
Puzzles should be e-mailed to you almost immediately. If you don’t get them right away, check your spam folder. If you still don’t see them, e-mail me and I’ll send them to you myself (once I verify your purchase).
If you want to help promote Crosswords LA (and why wouldn’t you? All proceeds go to a very worthy cause) please consider adding the PayPal button to your own site. Here’s the code:
As I mentioned in the comments of the last post, FillBot Jr. has successfully solved a crossword! Now admittedly, it was low-hanging fruit — a Monday Newsday by Gail Grabowski — but I’m happy to have that result as a sort of “proof of concept.” So far the algorithm is ultra-simple: it looks for the answer it has the most confidence in and fills it in, ignoring any potential problems it may have down the road.
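To make the greedy idea concrete, here is a minimal sketch of "fill in the most confident answer first". It is not FillBot Jr.'s actual code. The slot layout, the candidate lists, and the scores are all hypothetical, and the scoring step is assumed to have happened already.

```python
def pattern_for(cells, grid):
    """Build a letter pattern such as 'NO?E' from the letters already in the grid."""
    return "".join(grid.get(cell, "?") for cell in cells)


def fits(word, pattern):
    """True if word has the right length and agrees with every known letter."""
    return len(word) == len(pattern) and all(
        p in ("?", w) for w, p in zip(word, pattern)
    )


def greedy_fill(slots, candidates):
    """Repeatedly fill the slot whose best fitting candidate has the highest score."""
    grid = {}
    unfilled = sorted(slots)
    while unfilled:
        best = None  # (score, slot name, word)
        for name in unfilled:
            pattern = pattern_for(slots[name], grid)
            for word, score in candidates.get(name, []):
                if fits(word, pattern) and (best is None or score > best[0]):
                    best = (score, name, word)
        if best is None:
            break  # nothing fits any remaining slot, so leave the blanks
        _, name, word = best
        for cell, letter in zip(slots[name], word):
            grid[cell] = letter
        unfilled.remove(name)
    return grid


# Hypothetical toy grid: two four-letter slots crossing at their third letters.
slots = {
    "across": [(2, 0), (2, 1), (2, 2), (2, 3)],
    "down": [(0, 2), (1, 2), (2, 2), (3, 2)],
}
# Hypothetical scores; a higher score means more confidence.
candidates = {
    "across": [("NOTE", 9), ("NONE", 8)],
    "down": [("MINE", 7)],
}
grid = greedy_fill(slots, candidates)
print(pattern_for(slots["across"], grid))  # NOTE
print(pattern_for(slots["down"], grid))    # ??T?
```

The toy data is rigged to show the weakness of this approach: the most confident answer goes in first, it is never revisited, and the crossing entry is left with no candidate that fits.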
Buoyed by this result, I decided to tackle the following day’s puzzle. And it did great, filling in the entire puzzle except for one blank at the crossing of:
NO_E [Not any] and
MI_E [Not yours]
Wait, really? That’s what tripped up my bot? What gives? This is especially infuriating because I have both of those EXACT CLUES in my database.
As you may recall, I am leaning heavily on MySQL’s full-text search, which, for a given clue, quickly scours my database for similar clues and even gives each one a numerical score according to how well it matches. It turns out that this has a big limitation in the form of stopwords: words that MySQL doesn’t index. And, you guessed it, “not”, “any”, and “yours” are on that list. I’m not really sure how to get around this; like I said, I am relying very heavily on this capability of MySQL. Maybe once I add some logic for “checking the crossings” this problem will be mitigated.
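Here is a tiny illustration of why those two clues come back empty. It is not MySQL's tokenizer, just a sketch under simple assumptions. The stopword set holds only the three words named above, and `indexable_tokens` is a made-up helper.

```python
import re

# Assumption: only the three stopwords mentioned in the post, not the full MySQL list.
STOPWORDS = {"not", "any", "yours"}


def indexable_tokens(clue):
    """Return the words of a clue that would survive stopword filtering."""
    words = re.findall(r"[a-z']+", clue.lower())
    return [word for word in words if word not in STOPWORDS]


print(indexable_tokens("Not any"))           # []
print(indexable_tokens("Not yours"))         # []
print(indexable_tokens("Melodic passages"))  # ['melodic', 'passages']
```

With nothing left to search on, a full-text query for either clue has no terms to match, even though the exact clue is sitting in the database.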
In any case, I’m happy with the results so far. Once I add some cross-checking logic, I think I might just be left with optimization improvements. And if it turns out to be decent, I’d love to add it to Crossword Nexus to allow users to upload .puz files to see how the bot handles them.
If you’re a regular on this blog, you may remember that I floated the idea of making a crossword-filling algorithm just to see how hard it would be. I wasn’t sure I’d ever get around to making it, but today I was sick and bedridden and bored (yay?), so I spent a few hours coding something up. If you’d like to see more or less what I’ve done and how it performs on a sample themeless puzzle, follow me to the rest of this post.
I’m sure most of my readers have heard of Dr. Fill, Matt Ginsberg’s crossword-solving computer program that competed in this past ACPT. Now Matt is an artificial intelligence expert, so I’m sure Dr. Fill does about as well as a computer program could at solving crosswords. My question is: how hard would it be to write a computer program with about 80% of Dr. Fill’s solving capability? And to make it especially easy on ourselves, let’s presume we already have a large database of crossword clues and entries, and a relatively fast, effective way of ranking entries given a clue and a letter pattern (e.g. given [Melodic passages] and the letter pattern ?R???? it would return ARIOSI with score 19 and ARIOSO/ARIOSE with score 10). Since I have this setup already, I thought this might be a good starting point.
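As a rough sketch of what that ranking step could look like, the snippet below filters a list of already-scored entries by letter pattern and sorts by score. The clue-matching itself is assumed to have happened elsewhere. The function names are invented here, and ARIAS with its score is a hypothetical extra entry added only to show the length filter.

```python
import re


def pattern_to_regex(pattern):
    """Turn a crossword pattern like '?R????' into a compiled regular expression."""
    return re.compile(
        "".join("[A-Z]" if ch == "?" else re.escape(ch) for ch in pattern.upper())
    )


def rank_entries(scored_entries, pattern):
    """Keep entries that fit the pattern, highest score first (ties alphabetical)."""
    regex = pattern_to_regex(pattern)
    matches = [(entry, score) for entry, score in scored_entries if regex.fullmatch(entry)]
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


# Scores for [Melodic passages]; the ARIAS row is hypothetical.
scored = [("ARIOSO", 10), ("ARIOSI", 19), ("ARIOSE", 10), ("ARIAS", 12)]
print(rank_entries(scored, "?R????"))
# [('ARIOSI', 19), ('ARIOSE', 10), ('ARIOSO', 10)]
```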
What would the algorithm look like from this point? Here are my initial thoughts.
I wanted to see my permissions on SQL Server and Googling it was proving very hard. So I’m adding this here in case someone might find it useful. This will look at all the databases on the current server and list all of your permissions for each. As a bonus, it will list the current server as well.
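DECLARE @myserver sysname;
-- server_id 0 is the local server
SET @myserver = (
    SELECT s.name
    FROM sys.servers s
    WHERE s.server_id = 0
);
-- One row per database permission the current login holds
SELECT
    @myserver AS 'Server'
    , d.name AS 'Database'
    , fbp.permission_name AS Permission
FROM (
    SELECT name, 'DATABASE' AS mytype
    FROM sys.databases
) d
JOIN sys.fn_builtin_permissions(NULL) fbp
    ON d.mytype = fbp.class_desc
WHERE HAS_PERMS_BY_NAME(QUOTENAME(d.name), 'database', fbp.permission_name) = 1;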
All right, the new Wikipedia sort is live on CrosswordNexus.com and this time I am allowing users to download the original list to play with offline. Now that I’ve played with it a bit, I have some ideas for next time that I’m going to gather here. If you have some ideas too, feel free to chip in.
First observation: I don’t like the distribution of scores. I actually made a histogram of the distribution which you can see here:
The most interesting part of my Crossword Nexus website is the Wikipedia Regex search, and the most interesting part of that is the ordering of results. I didn’t want it to return results alphabetically — I wanted it to return results based on relevance. But how can you automatically determine if a Wikipedia article is relevant? Well, the method I implemented was ordering by inlinks. The more links there were to an article, the more interesting it should be, right?
For the most part, this works pretty well. But let’s ask the site for the best results of the form ??E?L??.
Before we go on, try to think of some good crossword entries that would fit this pattern, preferably ones with Wikipedia pages.
This past Wednesday my two-year-old son was diagnosed with leukemia. At this age the chances of a cure are surprisingly high and the doctors have been extremely pleased with his progress so far. Still, it is an incredibly scary process which is draining in so many ways.
I have been hesitant to write about this, but I now think it is necessary to do so. It doesn’t hurt to raise a little awareness about cancer. And writing this now will allow me to write more about it in the coming weeks, months and years as his treatment progresses. I’m sure I will have lots to say on the subject.
Many people have asked how they can help. I would ask you to consider giving blood or (especially) platelets. A timely transfusion of platelets early on in the process may have saved my son’s life. If you are eligible and can find the time, please consider donating.
Also: I don’t know what this will do to my schedule quite yet, but I am guessing that a lot of non-essential activities will fall by the wayside. Don’t expect much from me on Twitter or here for a little while. I also expect @FakeWillShortz to slow down his rate of tweeting, though he’ll be taking on some new writers soon.