Joon recently posted the following on Twitter:
wordplay #puzzler: think of two unrelated phrases, both with enumeration 3, 4. the 3-letter words are synonyms; so are the 4-letter words.
Well, I had just recently downloaded the unbelievably awesome Natural Language Toolkit (NLTK) for Python, so I thought I’d test it out on this problem. Here’s what I came up with:
#!/usr/bin/python
from nltk.corpus import wordnet as wn
import re
import sys

myfile = sys.argv[1]
word1_length = int(sys.argv[2])
word2_length = int(sys.argv[3])

def get_synonyms(word, length):
    '''
    Gets all synonyms of `word` of length `length`
    '''
    syns = list()
    synsets = wn.synsets(word)
    for synset in synsets:
        # lemma_names is a method in NLTK 3; older releases exposed it as an attribute
        for w in synset.lemma_names():
            if len(w) == length and w != word and w not in syns:
                syns.append(w)
    return syns

# Read the phrase list, one phrase per line
with open(myfile, 'r') as fid:
    phrases = [x.rstrip('\r\n') for x in fid]

# Slim this down a bit
my_pattern = r'^[a-z]{%i}_[a-z]{%i}$' % (word1_length, word2_length)
my_phrases = [x for x in phrases if re.match(my_pattern, x)]
# A set makes the membership test in the inner loop fast
phrase_set = set(my_phrases)

# Go through and check
for p in my_phrases:
    w1 = p[:word1_length]
    w2 = p[word1_length+1:]
    # Get synonyms of the appropriate length
    l1 = get_synonyms(w1, word1_length)
    l2 = get_synonyms(w2, word2_length)
    if len(l1) != 0 and len(l2) != 0:
        for t1 in l1:
            for t2 in l2:
                test_phrase = t1 + '_' + t2
                if test_phrase in phrase_set:
                    print(p + ' -> ' + test_phrase)
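To use it you will need a fairly comprehensive list of phrases, one per line, in lowercase with an underscore between the words (the script only matches entries that look like big_deal). The data dump from Wiktionary is what I used here. Just run it as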
puzzler.py enwiki.txt 3 4
and it will spit out:
bad_lots -> big_deal
bad_lots -> big_band
bad_mind -> big_head
big_head -> bad_mind
bum_rush -> rat_race
had_best -> get_well
hit_home -> off_base
off_base -> hit_home
rat_race -> bum_rush
(Yes, WordNet thinks that “big” and “bad” are synonyms.)
You may notice that Joon’s intended answer isn’t among the results — one of those phrases wasn’t on Wiktionary … until I added it (it will probably be in the next data dump).
Anyway! You may notice that the code offers you room to work with other word lengths. Is there anything interesting there? Well, I kind of like “light switch” and “short-change”. “Well liquors” and “good spirits” is a good one too. Anything else? Feel free to experiment with the code and let me know in the comments.
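If you want to poke at the matching step before installing WordNet or grabbing a data dump, here is a minimal sketch of the same idea. It swaps WordNet for a tiny hand-written synonym table, so everything in it is made up for illustration: `TOY_SYNONYMS`, `TOY_PHRASES`, `toy_synonyms` and `find_pairs` are hypothetical names, not part of NLTK or of the script above.

```python
import re

# Hypothetical stand-in for WordNet: each word maps to a few "synonyms".
TOY_SYNONYMS = {
    "big": ["bad", "large"],
    "bad": ["big"],
    "head": ["mind", "chief"],
    "mind": ["head"],
}

# Hypothetical phrase list in the same word_word format the script expects.
TOY_PHRASES = ["big_head", "bad_mind", "big_deal", "hot_dog", "rat_race"]


def toy_synonyms(word, length):
    """Return the toy synonyms of `word` that have exactly `length` letters."""
    return [s for s in TOY_SYNONYMS.get(word, [])
            if len(s) == length and s != word]


def find_pairs(phrases, len1, len2):
    """Find (phrase, other_phrase) pairs whose words are synonyms position by position."""
    pattern = re.compile(r'^[a-z]{%d}_[a-z]{%d}$' % (len1, len2))
    # Keep only phrases with the requested enumeration.
    candidates = [p for p in phrases if pattern.match(p)]
    known = set(candidates)
    pairs = []
    for p in candidates:
        w1, w2 = p.split('_')
        for t1 in toy_synonyms(w1, len1):
            for t2 in toy_synonyms(w2, len2):
                test_phrase = t1 + '_' + t2
                if test_phrase in known:
                    pairs.append((p, test_phrase))
    return pairs


for source, match in find_pairs(TOY_PHRASES, 3, 4):
    print(source + ' -> ' + match)
```

With this toy data it prints big_head -> bad_mind and then bad_mind -> big_head. Changing the two numbers passed to `find_pairs` is the equivalent of changing the word lengths on the command line.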