September 04, 2006Sudoku Generator
It is a nice example of the website fun you can have with 250 lines of Python over a Labor day weekend; it also makes a handy command-line Sudoku solver... What is Sudoku? Have you ever played Sudoku? A few months ago my dad raved about a new game in the newspaper that he liked solving better than the daily crossword, and so I picked up one of the five hundred or so Sudoku books at the bookstore. I was quickly hooked. The rules of Sudoku are simple: finish filling in the squares of a 9x9 grid so that the digits 1-9 appear exactly once in each of the nine rows, columns, and 3x3 blocks. Puzzles are designed so that there is only one correct way to fill in the 81 squares, and they tend to be just hard enough to be satisfying: not too easy, not impossible. Ways to Play Sudoku is a good solo game; if you are having trouble getting into it, you can read about various solution strategies on the web. But, like crosswords, it can be even more fun to work Sudoku puzzles together with somebody else. Heidi and I sometimes race each other on adjacent puzzles in a Sudoku book - it can be nerve-wracking to try to catch up when you're behind, while the other person is rapidly filling in squares. Either player always has a chance to win a race, because backtracking in Sudoku is hard. You can be dealt a blow at the finish line when you encounter a contradiction and then realize you made a mistake long ago. (The best strategy in Sudoku is not to guess, yet it's still good to play with an erasable pencil.) It can also be fun to cooperate on a Sudoku puzzle - the entertainment there is to be stymied together, and then question the other person's clever move to get out of a logjam. There are many Sudoku variants, for example versions of the game where the 9-square blocks come in different shapes, or smaller versions of the game that are designed for children. I realized Sudoku was a major phenomenon when I saw the Dora Sudoku books for preschoolers. Since they are for the preliterate set, you fill in simplified 4x4 puzzles using stickers instead of letters or numbers. The amazing thing is that my pre-K daughter Piper was completely enthralled by the books. We brought a Dora Sudoku book on our trip to Taiwan, and solving Sudokus became the most reliably fun activity for Piper whenever she was waiting around, remarkably beating out her Nintendo DS. I guess it's no wonder that they recently came out with Sudoku for the Nintendo DS too. Counting Sudoku Boards But if you read the advertisements for Sudoku Gridmaster for DS, you will notice that it is said to come with "over 400" Sudoku puzzles. With crosswords, I can understand why your software might have only a few hundred puzzles, since each hint needs to be done by a human editor. But with Sudokus, I wonder why they don't just let the software generate millions of puzzles? Imagine how many millions of puzzles "included" they could claim. Solved Sudokus are a special kind of Latin square, which were studied by Euler. Euler discovered that there did not seem to be very many Latin squares of a certain type, and in 1782 he proposed that the 36 Officers Problem had no solution for 6x6, 10x10, 14x14 squares, and so on. His conjecture stood until the 20th century. But in 1959, a set of 10x10 "Euler's Spoiler" squares were discovered by Parker, Bose, and Shrikhande (in work popularized by Martin Gardner) disproving Euler's famous conjecture. Mathematicians continue to play with and theorize about Latin squares today. How many different ways are there to completely fill in the 81 squares of a Sudoku board? In May 2005, Felgenhauer and Jarvis counted the Sudoku boards carefully and concluded that there are exactly 6,670,903,752,021,072,936,960 solved boards. These thousands of billions of billions are so large that there would really be no hope of using a supercomputer to enumerate all the boards: the only way to count them all is to understand the symmetries of the game. If you have one solved board, you can get another equivalent one by relabeling digits, swapping rows within a 3-row-band, swapping bands, swapping columns within a 3-column-stack, swapping stacks, or by rotating the whole board. How many "really different" boards are there which cannot be created from each other by just swapping things around this way? Jarvis and Russell concluded that the kernel of really distinct boards is still pretty large, numbering 5,472,730,538. Yet this number is small enough for a computer, and it is possible (although tricky) to write a program to enumerate all these boards quickly. Counting Sudoku Hints For puzzlers, a solved board is not as interesting as an unsolved one! And among unsolved puzzles, the interesting ones are the minimal ones: the puzzles which uniquely determine a single solution, and for which you cannot erase any hint without permitting extra solutions. How many minimal Sudoku puzzles are there? Certainly the number must be far more than the number of solved Sudoku boards, but the total count is unknown. In fact, it is not even known for sure how small a minimal puzzle can be! My simple program for generating minimal puzzles typically generates Sudoku puzzles with 24 or more hints. I have never seen it generate a puzzle with fewer than 22 hints. But there are some puzzles that are much smaller. Gordon Royle mainains a list of (currently) 36628 nontrivially distinct and fully constrained Sudoku puzzles with 17 hints. Some mathematicians suspect that the smallest number of hints you can give in a Sudoku puzzle that uniquely determines a fully solved board is 17, because concerted searches for a fully constrained puzzle with only 16 hints have come up dry. In fact, it is hard to find 17-hint puzzles too, and Gordon's list of 17-hint puzzles might be nearly complete. If this could be proved, it would also prove that there are no 16-hint puzzles. So if you find a 17-hint puzzle, you should check to see if Gordon already has it. And remember that there is no proof that fully determined 16-hint puzzles don't exist. If you get a 16-hint puzzle out of the automatic puzzle generator, don't throw it out. It would be publication-worthy. Sudoku Generation Algorithm How can you generate Sudoku puzzles? I do not know what the 'best' algorithm is, but here is what I have done in my simple little python program. First, we need a Sudoku solver. My solver uses three simple strategies to solve a board; they are simple to implement and seem fast enough.
The last bit about trying all the guesses will require some backtracking. When we are selecting a place to guess, we choose one randomly among the most-constrained places, and we shuffle the choices to try them in a random order. The result is if we tell the solver to solve an empty board, we get a nice random fully-solved Sudoku board. To generate Sudoku puzzles, we start with a solved board, and we choose some minimal hints to reveal as follows.
Notice that generating a minimal puzzle this way requires us to do the work of solving a Sudoku board about twenty times, so it takes a lot more work to generate a minimal Sudoku puzzle than it does to solve one. If there are ways of generating Sudoku puzzles more quickly, I'd love to know. Rating Puzzles Any Sudoku fan will tell you that there are hard puzzles and easy puzzles. The newspapers like to save the really diabolical ones for the Friday edition, so you can stew over them all weekend. The hardest puzzles require you to look several moves ahead, or make lots of good guesses to find a solution. Unfortunately, the algorithm here does not always create hard puzzles - I don't know how to create a more difficult puzzle on purpose. So in the links below, you may get a hard or an easy puzzle, randomly. The best we can do is tell you what kind of puzzle you've got. The ratings that go along with the puzzles estimate the amount of guessing (or thinking ahead more than one move) that you might need to do to solve a puzzle. This is done by counting the depth of the search stack when the computer finds a solution. Since, depending on order of the guesses, the search might be deeper or shallower, we try a few different paths and average the search depth to get the rating. A rating of zero means that you can completely solve the puzzle without any looking ahead or guessing. Ratings of three or more tend to be pretty hard. Here's the Source Luke Python code for the Sudoku solver and generator is here. The python script works as both a commandline tool and as a cgi script. Every time you visit the links below, you should get a fresh, new, never-seen-before Sudoku puzzle. We also print the difficulty of each puzzle so you know what you're getting into before you send it to the printer. Enjoy. Get a generated Sudoku PDF here. Get a generated Sudoku in HTML here. Get a generated Sudoku in plaintext here. Get a generated Sudoku in Postscript here. Update: for more printable puzzle fun - I've also put together a 3d printable maze generator here. Posted by David at September 4, 2006 03:36 AMComments
This isn't actually how you solve Sudokus when you're doing them by hand, is it? I have one Sudoku book where the instructions say that the "hard" puzzles require guessing and backtracking, and another one which says "you should never need to guess". The puzzles in the "no guessing" book are still very hard, but I find them much more fun to solve... I suppose the answer is to extend your simple solver to have some more sophisticated strategies before it starts to guess. (and then perhaps the rating could be based on some subjective ranking of the difficulties of the strategies the solver ends up needing to use?) Posted by: David Maymudes at September 4, 2006 01:12 PMYou are right. When you play Sudoku by hand, you basically never guess - instead you try to look ahead and then just write numbers when you are absolutely sure. Maybe it's like a breadthfirst versus a depthfirst search. Even though the computer's depthfirst strategy is plenty fast for solving puzzles, I'm not sure it's the best way to rate the difficulty of a puzzle. There are pretty simple Sudoku strategies that my algorithm does not know. But I am also running into the opposite problem: apparently the algorithm finds it easy to deduce squares (without guessing) that I find it very difficult to locate. I have been stuck on a particular "Level 0" board all morning, wondering where the heck the computer saw a simple deduction. I am not sure why it is so hard for me to see what I am missing. Any ideas on the right way to rate difficulty? Posted by: David at September 4, 2006 01:27 PMI also did a small development in Excel to solve the first part of a puzzle, which is the "guessing" of all possible figures in each cell, following the three basic rules: Nine different figures on each row, column and 3x3 matrix. After that, the real "guessing" process becomes very enjoying and takes much less of our relaxing time per puzzle (5 minutes for an easy puzzle is the average for an average IQ user). The next step is to get an automatic figures generator with difficulty choice inside the worksheet using script, to avoid typing every new puzzle. After reading you article I realized that possibly you have the right solution for my intent. If you accept doing me such please, I would like to send you my worksheet to enable a better understanding of my idea. (Some writings are in Portuguese but I think this will not be a concern). I believe that the worksheet engine (formulas) will be automatically translated to English inside your Excel. Thank you Valerio Posted by: Valerio Deo at September 10, 2006 09:17 PMYou should never need to guess. You can create a true solver in Excel, but as it is an iterative process (which is not Excel's strong suit) it becomes a very intensive application. Better is to set it up in MATLAB. As long as you watch out for the possiblity of an endless loop you can create a solver pretty quickly. Either way, the algorithm goes like this: Repeat these steps enough times and any solvable sudoku will be solved. That is to say, I have not yet found a published puzzle that could not be solved in this manner. Posted by: cosmicfish at November 7, 2006 01:32 PMi didn't see the steps algorithm of saving sudoku thank you Posted by: moon at March 23, 2007 06:41 PMValerio Deo's solution is not quite right. Whey you say "You don't have to guess" you really mean you should not write down guesses. Thinking about what number would fit is nothing but making a lot of guesses in your head and looking whether they will lead to a conflict... Posted by: sky at April 27, 2007 09:57 PMI know a better way to generator: 1. get a valid one like: 2. then randomly switch two lines(row) in a group or switch two big groups. 3. randomly switch all positions for two numbers, like switch all 1 and 2 Now you have a valid solution, you just need to get rid of number from it to create a sudoku. Posted by: Shawn at May 14, 2007 09:23 PMNice programming that you have here. Posted by: Ian Kree at July 2, 2007 10:34 AMThere is another strategy I have found, which I use for hard sudoku's on paper. Once you have worked out all the possibilities for each uncertain cell, look for a (row/column/block) that contains (2/3/4/5/6/7) different possibilities for the same number of cells in that row/column/block. Eliminate ALL of these options from all OTHER cells in that row/column/block. For example, if you have 3 cells in the same row, empty, which have the possibilities of: Hi, can somebody give me an efficient algorithm to develop a Sudoku Software where in a problem is generated automatically and asks player solve it.. Posted by: Ratnesh at November 29, 2007 03:22 AMHi, i need help coding a sudoku generator and solution checker, which also provides a gui that allows the user to input numbers into the puzzle, check their input, and generate a new puzzle at any time, using a neural network and random number generators. Posted by: Zuckerman at December 18, 2007 09:26 PMGood algorithm, David. I wrote a sudoku solver some time back that employs a few techniques: looking for cells with unique candidates, looking for the only position for a value along a row, column, or block. I also added another technique that looks at subcolumns within a block and comparing to other subcolumns. These techniques seemed to be able to solve all the puzzles from local newspapers, but that's only because thos puzzles were pretty simple. I found Stephen Ostermiller's source code (http://ostermiller.org) and realzed there are more techniques. In addition to the ones he implemented, there are others such as X-wing and sword fish, and probably many more. I believe it can probably be proven (or maybe it is already proven) that as long as the solution to a sudoku puzzle is guaranteed to be unique, there must be a logical series of steps that one can take to solve the puzzle. These steps which may include really complex technqiues -- even ones that have not even been discovered. Using algorithms that make guesses is probably just a shortcut for not having to develop and implement complex techniques. But if you implement as many of the known techniques as possible, it would help make the generator more likely to generate a harder puzzle. Post a comment
|
| Copyright 2006 © David Bau. All Rights Reserved. |