{"id":250,"date":"2012-11-20T17:48:50","date_gmt":"2012-11-20T17:48:50","guid":{"rendered":"http:\/\/alexboisvert.com\/musings\/?p=250"},"modified":"2014-02-16T11:04:51","modified_gmt":"2014-02-16T19:04:51","slug":"wikipedia-regex-search-updated","status":"publish","type":"post","link":"https:\/\/alexboisvert.com\/musings\/2012\/11\/20\/wikipedia-regex-search-updated\/","title":{"rendered":"Wikipedia Regex Search Updated"},"content":{"rendered":"<p>The news first: I&#8217;ve updated the <a href=\"http:\/\/crosswordnexus.com\/wiki.php\">Wikipedia Regex Search<\/a> to include Wiktionary in its results.  The Wikipedia results have also been updated to be current as of November 1st.<\/p>\n<p>Now the problem: to test it out, I attempted to solve the most recent <a href=\"http:\/\/www.crosswordfiend.com\/blog\/2012\/11\/20\/mgwcc-233\/\">Matt Gaffney Contest<\/a> using the search, but it <a href=\"http:\/\/crosswordnexus.com\/wiki.php?regex=o*+of+o*&#038;searchtype=regular&#038;searchin=Both&#038;first=0&#038;atleast=any&#038;atmost=any\">didn&#8217;t turn anything up<\/a>.  Why?  Because &#8220;Oracle of Omaha&#8221; isn&#8217;t a full-fledged Wikipedia page, just a redirect, and I exclude redirects from my results.<\/p>\n<p>So what&#8217;s the fix here?  The obvious one is to include redirects in my results, but I can&#8217;t just include all of them wholesale.  Just look at all <a href=\"http:\/\/toolserver.org\/~dispenser\/cgi-bin\/rdcheck.py?page=Condoleezza_Rice\">the pages that redirect to &#8220;Condoleezza Rice&#8221;<\/a> to see why.  No thanks.<\/p>\n<p>So is there a way to be more judicious about choosing which redirects to use?  There must be; after all, Onelook seems to handle it just fine.  For now, I&#8217;m thinking of comparing each redirect to a list of known &#8220;good&#8221; results, maybe from my clue database or <a href=\"http:\/\/alexboisvert.com\/xwordlist\">the collaborative word list<\/a>.  
If a redirect appears in one of those lists, I could include it and give it the same score as the page it redirects to.  (Incidentally, &#8220;Oracle of Omaha&#8221; is in my clue database but not in the collaborative word list, so I&#8217;ll have to add it.)<\/p>\n<p>Is there another way to determine which redirects to use?  I&#8217;d love to hear suggestions.  Anything I can do to improve my tool would be great.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The news first: I&#8217;ve updated the Wikipedia Regex Search to include Wiktionary in its results. The Wikipedia results have also been updated to be current as of November 1st. Now the problem: to test it out, I attempted to solve &hellip; <a href=\"https:\/\/alexboisvert.com\/musings\/2012\/11\/20\/wikipedia-regex-search-updated\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-250","post","type-post","status-publish","format-standard","hentry","category-coding"],"_links":{"self":[{"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/wp\/v2\/posts\/250","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/wp\/v2\/comments?post=250"}],"version-history":[{"count":2,"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/wp\/v2\/posts\/250\/revisions"}],"predecessor-version":[{"id":345,"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/wp\/v2\/posts\/250\/revisions\/345"}],"wp:attachment":[{"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/
wp\/v2\/media?parent=250"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/wp\/v2\/categories?post=250"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/alexboisvert.com\/musings\/wp-json\/wp\/v2\/tags?post=250"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}