Fenux.Net - The Life of a Geek
Working Backwards
code.NET
Posted on 11/18/2006 5:20 pm in code.NET

It's come to my attention that the possible combinations of letters possible with 7 letter tiles is just shy of a million. That's just with those seven letters. It doesn't take into account the combinations with what's already on the board. Running a spell check against all of those combinations is way too costly in terms of CPU cycles. I'll have to work it another way.

Instead of starting from a list of over a million possible combinations of letters, I can start from a list of valid words. There's good news here.

From http://personal.riverusers.com/~thegrendel/software.html

Updated millennial edition of ENABLE (ENABLE2K): The Enhanced North American Benchmark LEexicon package provides a 173,000+ word standardized tournament-level master list, more carefully researched than "official" and proprietary lists. It is not restricted to words of an arbitrary length and it features all "official" updates. The ENABLE list has been placed in the Public Domain and is both free and freely distributable. This is the list used by the WORDY word study system, above, and by various online Scrabble® servers and software developers (at least two commercial word games use it as its dictionary). It is highly recommended that the (updated) ENABLE SUPPLEMENT [785K], and the ABLE SUPPLEMENT [559K], additional lists and documentation packages for ENABLE be obtained. Freeware.

...

The YAWL LIST (updated Y2K version) is a combination of the ENABLE list and all the above supplementary lists, effectively a superset of the international SOWPODS list. It contains over 264,000 words and is probably the largest and most comprehensive list of its kind. This free Public Domain list is targeted toward Linux, BSD, Be OS, and other 'nix systems and is in standard UNIX ASCII format. It may require conversion before it can be viewed, edited, or accessed by Windows or Mac users. [704K]

That shortens our starting list from over a million to just over a quarter of a million. My idea, which is currently untested, is to go through this list in a series of iterations. Each iteration through the list would delete words we can't use from the list. This could get complicated, but it still looks like it will be faster than working at it from the other direction.

©2009, Jason Burgess
All Rights Reserved