Post by ***@gmail.com
Has it ever happened that advances in bot technology have invalidated
(or at least thrown into doubt) official solutions to Othello quizzes?
Or are the equity differences large enough that this never happens?
I'm pretty sure that no Othello quiz answer has been overturned.
It's not just that the equity differences in Othello quiz problems are
large. He skillfully chooses positions that are "easy" for bots. There
are no wild backgames or containment positions where you might suspect
that the bot has gone astray. Othello also shies away from positions
in which a 3-ply evaluation differs from a rollout verdict. The
chance that a verdict will be overturned by future bots is very slim
in my opinion.
I went through all the questions at some point, and the closest I got
to an overturned verdict was Problem 10 in 2016. The official rollout
gave an equity difference of 0.079 between the 1st and 2nd plays, but
I did a longer rollout with stronger settings and the equity difference
dropped to 0.037. One reason that I singled this one out for an
extended rollout was that I noticed that the XG 3-ply evaluation of
8/5 7/5 was actually 0.009 ahead of that of 10/8 6/3. As I said,
normally Othello avoids such positions, but in this case he took the
risk of including it.
XGID=-b--B-DBAAA-bB----bc-bbbB-:0:0:1:32:2:3:0:7:10
Score is X:2 O:3 7 pt.(s) match.
+13-14-15-16-17-18------19-20-21-22-23-24-+
| X              O |   | O     O  O  O  X |
| X              O |   | O     O  O  O  X |
|                  |   | O                |
|                  |   |                  |
|                  |   |                  |
|                  |BAR|                  |
|                  |   |                  |
|                  |   | X                |
|                  |   | X                |
| O              X |   | X     X        O |
| O     X  X  X  X |   | X     X        O |
+12-11-10--9--8--7-------6--5--4--3--2--1-+
Pip count X: 147 O: 124 X-O: 2-3/7
Cube: 1
X to play 32
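
For anyone who wants to check the diagram against the XGID, here is a minimal Python sketch that decodes the position field and recomputes the pip counts. The encoding is my assumption about the XGID layout, not something taken from XG itself: 26 characters, with the first and last being the bar slots, the middle 24 being points 1 to 24 from X's side, uppercase letters for X's checkers and lowercase for O's ('A'/'a' = 1 checker, 'B'/'b' = 2, and so on). The function name `pip_counts` is made up for this example.

```python
def pip_counts(xgid: str) -> tuple[int, int]:
    """Return (X pips, O pips) from an XGID string, under the assumed encoding."""
    # Keep only the position field: drop the "XGID=" prefix and the
    # colon-separated match/cube fields that follow it.
    position = xgid.split("=", 1)[-1].split(":", 1)[0]
    if len(position) != 26:
        raise ValueError("expected 26 position characters")

    x_pips = 0
    o_pips = 0
    for index, ch in enumerate(position):
        if ch == "-":
            continue  # empty point
        if ch.isupper():
            # X (bottom player): index is the distance left to travel.
            x_pips += (ord(ch) - ord("A") + 1) * index
        else:
            # O (top player): distances are mirrored, so use 25 - index.
            o_pips += (ord(ch) - ord("a") + 1) * (25 - index)
    return x_pips, o_pips


x_pips, o_pips = pip_counts("XGID=-b--B-DBAAA-bB----bc-bbbB-:0:0:1:32:2:3:0:7:10")
print(x_pips, o_pips)  # 147 124
```

The result agrees with the "Pip count X: 147 O: 124" line above, which is at least some evidence that the assumed encoding is right for this position.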
1. Rollout¹    10/8 6/3    eq:-0.274
   Player:   46.01% (G:12.03% B:0.73%)
   Opponent: 53.99% (G:24.33% B:5.12%)
   Confidence: ±0.007 (-0.281..-0.267) - [100.0%]
2. Rollout¹    8/5 7/5     eq:-0.311 (-0.037)
   Player:   45.04% (G:12.51% B:0.64%)
   Opponent: 54.96% (G:25.41% B:5.14%)
   Confidence: ±0.006 (-0.317..-0.305) - [0.0%]
¹ 5184 Games rolled with Variance Reduction.
Dice Seed: 271828
Moves: 4-ply, cube decisions: XG Roller+
Search interval: Large
eXtreme Gammon Version: 2.19.211.pre-release, MET: Kazaross XG2
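
The arithmetic behind "the difference dropped to 0.037" can be sketched in a few lines of Python. This is only a rough check under my own assumptions: the `RolloutResult` class and helper names are invented for illustration, and comparing the reported ± intervals for overlap is a crude heuristic, not the calculation XG uses to produce its bracketed percentages.

```python
from dataclasses import dataclass


@dataclass
class RolloutResult:
    play: str
    equity: float
    half_width: float  # the "±" figure on the Confidence line

    def interval(self) -> tuple[float, float]:
        """Lower and upper bounds implied by equity ± half_width."""
        return (self.equity - self.half_width, self.equity + self.half_width)


def equity_gap(best: RolloutResult, other: RolloutResult) -> float:
    """Equity lost by choosing `other` instead of `best`, to 3 decimals."""
    return round(best.equity - other.equity, 3)


def intervals_overlap(a: RolloutResult, b: RolloutResult) -> bool:
    """True if the two ± intervals share any point."""
    a_lo, a_hi = a.interval()
    b_lo, b_hi = b.interval()
    return a_lo <= b_hi and b_lo <= a_hi


# Figures copied from the rollout above.
best = RolloutResult("10/8 6/3", -0.274, 0.007)
second = RolloutResult("8/5 7/5", -0.311, 0.006)

print(equity_gap(best, second))         # 0.037
print(intervals_overlap(best, second))  # False
```

The gap is smaller than the 0.079 from the official rollout, but the two intervals still do not overlap, which fits the conclusion that the verdict came close without actually being overturned.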
---
Tim Chow