Problem Correctness

How sure are we that the problems are correct? Often I don’t see why my opponent would make a specific move, even after considering the other defense options, or the moves listed in the extended defense don’t make sense themselves.
At other times the position is wide open, with many moves available, and only one is marked as valid. That doesn’t seem like a solvable problem, since it takes some extra steps to reach the intended problem.

Would it be possible to let users mark these problem solutions as ‘needs another check’? Some further analysis could then confirm them as a set solution.

The problem generator has a lot of moving parts and I have more confidence in some parts than others.

At the bottom of the stack is Ed Gilbert’s KingsRow engine. The engine basically always plays the objectively best move, but that move is often not the best human try. For instance, it will immediately give up two checkers to delay the loss by one move, whereas a human would keep the material and make their opponent prove they see the win. The engine doesn’t understand that it’s effectively resigning.

Besides playing the ‘best’ move, the engine also (usually) outputs a ‘principal variation’, which is shown in the ‘details’ panel. This is the engine’s best attempt at predicting what should happen in the future, but the engine is not being run at every step of the way, so the variation is sometimes wrong, especially toward the end. Even when it is technically correct, it may be repeating the above weirdness.

My program guides the engine and interprets its output to build puzzles. I have less faith in this part: it’s not as well tested or as long running as the engine, but it seems to be pretty good. It chooses the defense based on either the engine’s best move or the defending player’s own move, if it is available.
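As a rough illustration of that defense choice, here is a minimal sketch. The function name, the move strings, and the rule that the defending player's move wins out whenever it is available are all assumptions made for the example, not the generator's actual code.

```python
from typing import Optional


def choose_defense(engine_best: str, player_move: Optional[str]) -> str:
    """Pick the defending move for one step of a puzzle line.

    Hypothetical helper: `engine_best` is the engine's best move and
    `player_move` is the defending player's move, or None if unavailable.
    """
    # Assumption: the defending player's move takes priority when we have it.
    if player_move is not None:
        return player_move
    # Otherwise fall back to the engine's best move.
    return engine_best
```

For example, `choose_defense("11-15", None)` falls back to the engine move, while `choose_defense("11-15", "9-14")` keeps the player's move.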

I am the least confident in things like which moves get awarded ‘that’s a good move’ and which don’t, which positions get turned into puzzles and which are rejected, where puzzles end, etc. For instance, there are still puzzles that are impossible to get wrong: ones where there are only two legal moves, one of which wins while the other gets ‘that’s a good move but the engine claims there is better.’
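A check for those impossible-to-fail puzzles could look something like the sketch below. The `MoveVerdict` record and its fields are hypothetical names invented for the example; the real generator may represent move feedback quite differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MoveVerdict:
    """Hypothetical record of how one legal move would be graded."""
    move: str
    wins: bool       # the move is the marked solution
    good_move: bool  # the move gets "that's a good move" instead of a fail


def is_impossible_to_fail(verdicts: List[MoveVerdict]) -> bool:
    """True when every legal move either wins or is accepted as 'good'."""
    if not verdicts:
        return False
    return all(v.wins or v.good_move for v in verdicts)


# The two-legal-move case described above: one wins, the other is 'good'.
example = [
    MoveVerdict("22-18", wins=True, good_move=False),
    MoveVerdict("22-17", wins=False, good_move=True),
]
```

A position like `example` would be flagged, so it could be rejected or routed to manual review instead of being published as a puzzle.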

I’m planning on adding problem comments and tags: comments to let users ask for and receive help on specific problems while they are on the problem page, and tags to make the problems more searchable and reportable. This remains relatively low on the priority list, so for now the forum is the best place for these kinds of questions and bug reports. Still, thanks for the suggestion; that is how the priority list gets arranged.

If you go to account -> attempt history and remember which problem was weird, or if you bump into another one, please post it here. Maybe you found a bug, maybe it’s a bad problem, or maybe one of the stronger players can explain the confusion. If it is a bad puzzle, I can easily disable or shorten it and add it to my test cases to help improve the generator.

Thanks! I hope this helps.

Are you running Kingsrow with the 10 piece end-game database?

No, I’m using the Cake 8 piece database, which is already more than a gigabyte in size. I never found a download link for the 10 piece database, and I’m not sure it would even fit on my laptop.

http://edgilbert.org/EnglishCheckers/KingsRowEnglish.htm

I’m going to do a new install of Checkerboard and Kingsrow on a faster computer soon and I’m going to add the 10 piece end-game database.

Thanks for the link. 100 GB is smaller than I expected. But the major limitation for me is that KingsRow only outputs ‘draw’ when it finds a database draw; it doesn’t give any principal variation info at that point. With the 10 piece database, more moves in more puzzles would have no analysis at all. I think these variations may be a better learning tool than the increased playing strength of the engine.

In the future I could modify the problem generator to construct these variations “manually”. Once that works, then the 10 piece database makes more sense. I’ll add both of these ideas to my to-do list.