Here’s this past week’s NPR puzzle:
Name a well-known movie of the past — two words, seven letters in total. These seven letters can be rearranged to spell the name of an animal plus the sound it makes. What animal is it?
Brute-forcing this puzzle is easy but ineffective. We can get a list of movies from IMDB and see which of the seven-letter ones can anagram into two words. But you don’t want to do this, because it will force you to sift through a ton of results. Can we do better? (Yes, we can.)
We don’t want to use *all* movies, and *all* words if we can avoid it. Can we get a list of “well-known” movies, animals, or animal sounds? The first is the easiest; instead of downloading the movies.list file from the IMDB link above, download the ratings.list file and only consider movies that have above a certain threshold of votes. I actually rated the movies on a scale of 1-100 using this method, so that the cutoffs make more sense. Now, how about the list of animals and animal sounds? Well, thanks to Python/NLTK and a terrific hint from Stack Overflow, we can actually get the members of any category. Check this out:
from nltk.corpus import wordnet as wn def get_category_members(name): members = set() synsets = wn.synsets(name) for synset in synsets: members = members.union(set([w for s in synset.closure(lambda s:s.hyponyms(),depth=10) for w in s.lemma_names])) return members
Easy, right? If you type get_category_members('animal')
it will give you everything that Wordnet thinks is an animal.
Animal sounds are a little harder to come by. Typing get_category_members('animal_sounds')
gets you the empty set, so we have to be a little more creative. Let’s check out the hypernyms for a few animal sounds to give us an idea of what to do.
sounds = ('meow','moo','baa') for s in sounds: syn = wn.synsets(s) for x in syn: print x, x.hypernyms()
Synset('meow.n.01') [Synset('cry.n.05')] Synset('meow.v.01') [Synset('utter.v.02')] Synset('moo.n.01') [Synset('cry.n.05')] Synset('moo.v.01') [Synset('utter.v.02')] Synset('baa.n.01') [Synset('cry.n.05')] Synset('bleat.v.02') [Synset('utter.v.02')]
Now we’re on to something! Each of these of course can be a noun or a verb, but they all fall under the umbrella of “cry” or “utter.” So if we take category members of those two words — well, it’ll give us too much, but that’s okay. At least we should have all the relevant animal sounds in there.
Put it all together and you get something like this:
import re from nltk.corpus import wordnet as wn def is_good_movie(m): ''' Count a movie as "good" if it only has 7 letters and one space ''' m2 = re.sub(r'[^A-Za-z]+','',m) if len(m2) == 7 and m.count(' ') == 1: return True else: return False def get_category_members(name): ''' Use NLTK to get members of a category ''' members = set() synsets = wn.synsets(name) for synset in synsets: members = members.union(set([w for s in synset.closure(lambda s:s.hyponyms(),depth=10) for w in s.lemma_names])) return members def w1_in_w2(w1,w2): ''' Return leftover letters if all the letters from w1 are in w2 Return False otherwise ''' for c in w1: if c not in w2: return False else: pos = w2.find(c) w2 = w2[:pos] + w2[pos+1:] return w2 # Get a set of movies movies = [] with open('RankedMovies.txt','rb') as fid: for x in fid.readlines(): x = x.rstrip('\r\n') movie, score = x.split('\t') score = int(score) if score >= 90: movie = movie[:-7] # Remove the year if is_good_movie(movie): movies.append(movie) # Let's get a list of things that might be sounds sounds_tmp = set() for x in ('utter','cry'): sounds_tmp = sounds_tmp.union(get_category_members(x)) sounds = frozenset([z for z in sounds_tmp if len(z) < 7]) # Dictionary to help find anagrams sorted_to_word = dict() for word in sounds: s = ''.join(sorted(word)) try: sorted_to_word[s].append(word) except KeyError: sorted_to_word[s] = [word] # Get a set of animals animals_tmp = get_category_members('animal') animals = frozenset([y for y in animals_tmp if len(y) < 7]) # Go through and give results for movie in movies: for animal in animals: leftover = w1_in_w2(animal,movie.lower()) if leftover: l = leftover.replace(' ','') s = ''.join(sorted(l)) try: new_words = sorted_to_word[s] except KeyError: new_words = [] for n in new_words: print "%s => %s %s" % (movie, animal, n)
Which produces the following result:
Das Boot => toad sob Red Dawn => wren add Red Dawn => wren add Big Stan => bat sign Big Stan => bat sing La Bamba => lamb baa Rio Lobo => loir boo Dear God => dog read Rapa Nui => arui pan Grey Owl => grey low
So there you have it. In about a second you get only 10 results and it’s easy to pick out the right one. I’m stoked about this new method to find members of a category; I sure wish I had it when I was working on the challenge to name a tree whose letters can be rearranged to spell two herbs or spices. It took me forever to get good lists for trees and herbs and spices, but with this approach the problem is completely trivialized.
September 3, 2014 at 10:40 am
What does the fox say? http://notalwaysrelated.com/viral-side-effects/28000
I mean really you’re looking for an animal and a noise that it makes in seven letters that also make a movie title. On the other hand, four of those eight movie titles, including the winner, aren’t in English.
But there’s one sound that no one knows. What if it’s “Eirf”.
October 29, 2014 at 2:12 am
What a cool tip! Never would have thought to consider “hyponyms.” I just started learning Py (very basic C++ background is not super helpful) and have found it intriguing.
Pingback: NPR Puzzle for 2015-05-17: Solving With Python | Box Relatives
May 12, 2018 at 6:36 am
I have noticed you don’t monetize your website, don’t waste your traffic,
you can earn additional bucks every month because you’ve got high quality content.
If you want to know how to make extra money, search for:
Ercannou’s essential adsense alternative
August 15, 2018 at 1:35 pm
I see you don’t monetize your site, don’t waste your traffic, you can earn additional bucks every month.
You can use the best adsense alternative for any type of website (they approve all websites), for more details
simply search in gooogle: boorfe’s tips monetize your website
October 21, 2018 at 12:06 pm
I have noticed you don’t monetize your page, don’t waste your traffic, you can earn additional cash every month.
You can use the best adsense alternative for any type of
website (they approve all websites), for more info
simply search in gooogle: boorfe’s tips monetize your website
July 23, 2019 at 4:05 am
I see you don’t monetize alexboisvert.com, don’t waste your traffic,
you can earn extra bucks every month with new monetization method.
This is the best adsense alternative for any type
of website (they approve all sites), for more details simply search in gooogle:
murgrabia’s tools
September 2, 2019 at 1:40 am
Understanding the fundamentals will probably pay off in the end, and will give you a
definitely better knowledge.
August 20, 2024 at 11:06 am
Po rankų ir pažastų depiliacijos jūsų oda ilgai išlieka švelni ir be plaukelių. Labai patiko procedūra ir profesionalus požiūris, labai rekomenduoju! Registruokis dabar.