Box Relatives

Thoughts about puzzles, math, coding, and miscellaneous

NPR Puzzle for August 10, 2014: solving with Python!

| 8 Comments

Here’s this past week’s NPR puzzle:

Name a well-known movie of the past — two words, seven letters in total. These seven letters can be rearranged to spell the name of an animal plus the sound it makes. What animal is it?

Brute-forcing this puzzle is easy but ineffective. We can get a list of movies from IMDB and see which of the seven-letter ones can anagram into two words. But you don’t want to do this, because it will force you to sift through a ton of results. Can we do better? (Yes, we can.)

We don’t want to use *all* movies, and *all* words if we can avoid it. Can we get a list of “well-known” movies, animals, or animal sounds? The first is the easiest; instead of downloading the movies.list file from the IMDB link above, download the ratings.list file and only consider movies that have above a certain threshold of votes. I actually rated the movies on a scale of 1-100 using this method, so that the cutoffs make more sense. Now, how about the list of animals and animal sounds? Well, thanks to Python/NLTK and a terrific hint from Stack Overflow, we can actually get the members of any category. Check this out:

from nltk.corpus import wordnet as wn
def get_category_members(name):
    members = set()
    synsets = wn.synsets(name)
    for synset in synsets:
        members = members.union(set([w for s in synset.closure(lambda s:s.hyponyms(),depth=10) for w in s.lemma_names]))
    return members

Easy, right? If you type get_category_members('animal') it will give you everything that Wordnet thinks is an animal.

Animal sounds are a little harder to come by. Typing get_category_members('animal_sounds') gets you the empty set, so we have to be a little more creative. Let’s check out the hypernyms for a few animal sounds to give us an idea of what to do.

sounds = ('meow','moo','baa')
for s in sounds:
    syn = wn.synsets(s)
    for x in syn:
        print x, x.hypernyms()
Synset('meow.n.01') [Synset('cry.n.05')]
Synset('meow.v.01') [Synset('utter.v.02')]
Synset('moo.n.01') [Synset('cry.n.05')]
Synset('moo.v.01') [Synset('utter.v.02')]
Synset('baa.n.01') [Synset('cry.n.05')]
Synset('bleat.v.02') [Synset('utter.v.02')]

Now we’re on to something! Each of these of course can be a noun or a verb, but they all fall under the umbrella of “cry” or “utter.” So if we take category members of those two words — well, it’ll give us too much, but that’s okay. At least we should have all the relevant animal sounds in there.

Put it all together and you get something like this:

import re
from nltk.corpus import wordnet as wn

def is_good_movie(m):
    '''
    Count a movie as "good" if it only has 7 letters and one space
    '''
    m2 = re.sub(r'[^A-Za-z]+','',m)
    if len(m2) == 7 and m.count(' ') == 1:
        return True
    else:
        return False
        
def get_category_members(name):
    '''
    Use NLTK to get members of a category
    '''
    members = set()
    synsets = wn.synsets(name)
    for synset in synsets:
        members = members.union(set([w for s in synset.closure(lambda s:s.hyponyms(),depth=10) for w in s.lemma_names]))
    return members

def w1_in_w2(w1,w2):
    '''
    Return leftover letters if all the letters from w1 are in w2
    Return False otherwise
    '''
    for c in w1:
        if c not in w2:
            return False
        else:
            pos = w2.find(c)
            w2 = w2[:pos] + w2[pos+1:]
    return w2

# Get a set of movies
movies = []
with open('RankedMovies.txt','rb') as fid:
    for x in fid.readlines():
        x = x.rstrip('\r\n')
        movie, score = x.split('\t')
        score = int(score)
        if score >= 90:
            movie = movie[:-7] # Remove the year
            if is_good_movie(movie):
                movies.append(movie)

# Let's get a list of things that might be sounds
sounds_tmp = set()
for x in ('utter','cry'):
    sounds_tmp = sounds_tmp.union(get_category_members(x))
sounds = frozenset([z for z in sounds_tmp if len(z) < 7])
# Dictionary to help find anagrams
sorted_to_word = dict()
for word in sounds:
    s = ''.join(sorted(word))
    try:
        sorted_to_word[s].append(word)
    except KeyError:
        sorted_to_word[s] = [word]

# Get a set of animals
animals_tmp = get_category_members('animal')
animals = frozenset([y for y in animals_tmp if len(y) < 7])

# Go through and give results
for movie in movies:
    for animal in animals:
        leftover = w1_in_w2(animal,movie.lower())
        if leftover:
            l = leftover.replace(' ','')
            s = ''.join(sorted(l))    
            try:
                new_words = sorted_to_word[s]
            except KeyError:
                new_words = [] 
            for n in new_words:
                print "%s => %s %s" % (movie, animal, n)
                        

Which produces the following result:

Das Boot => toad sob
Red Dawn => wren add
Red Dawn => wren add
Big Stan => bat sign
Big Stan => bat sing
La Bamba => lamb baa
Rio Lobo => loir boo
Dear God => dog read
Rapa Nui => arui pan
Grey Owl => grey low

So there you have it. In about a second you get only 10 results and it’s easy to pick out the right one. I’m stoked about this new method to find members of a category; I sure wish I had it when I was working on the challenge to name a tree whose letters can be rearranged to spell two herbs or spices. It took me forever to get good lists for trees and herbs and spices, but with this approach the problem is completely trivialized.

8 Comments

  1. What does the fox say? http://notalwaysrelated.com/viral-side-effects/28000

    I mean really you’re looking for an animal and a noise that it makes in seven letters that also make a movie title. On the other hand, four of those eight movie titles, including the winner, aren’t in English.

    But there’s one sound that no one knows. What if it’s “Eirf”.

  2. What a cool tip! Never would have thought to consider “hyponyms.” I just started learning Py (very basic C++ background is not super helpful) and have found it intriguing.

  3. Pingback: NPR Puzzle for 2015-05-17: Solving With Python | Box Relatives

  4. I have noticed you don’t monetize your website, don’t waste your traffic,
    you can earn additional bucks every month because you’ve got high quality content.
    If you want to know how to make extra money, search for:
    Ercannou’s essential adsense alternative

  5. I see you don’t monetize your site, don’t waste your traffic, you can earn additional bucks every month.

    You can use the best adsense alternative for any type of website (they approve all websites), for more details
    simply search in gooogle: boorfe’s tips monetize your website

  6. I have noticed you don’t monetize your page, don’t waste your traffic, you can earn additional cash every month.

    You can use the best adsense alternative for any type of
    website (they approve all websites), for more info
    simply search in gooogle: boorfe’s tips monetize your website

  7. I see you don’t monetize alexboisvert.com, don’t waste your traffic,
    you can earn extra bucks every month with new monetization method.
    This is the best adsense alternative for any type
    of website (they approve all sites), for more details simply search in gooogle:
    murgrabia’s tools

  8. Understanding the fundamentals will probably pay off in the end, and will give you a
    definitely better knowledge.

Leave a Reply to Jeff Chen Cancel reply

Required fields are marked *.


This site uses Akismet to reduce spam. Learn how your comment data is processed.