
# Simulations of Hanabi strategies

Hanabi is a cooperative card game of incomplete information.
Despite relatively [simple rules](https://boardgamegeek.com/article/10670613#10670613),
the space of Hanabi strategies is quite interesting.
This project provides a framework for implementing Hanabi strategies in Rust.
It also explores some implementations, based on ideas from
[this paper](https://d0474d97-a-62cb3a1a-s-sites.googlegroups.com/site/rmgpgrwc/research-papers/Hanabi_final.pdf).
In particular, it contains an improved version of their "information strategy",
which achieves the best results I'm aware of for games with more than 2 players ([see below](#results-auto-generated)).

Please feel free to contact me about Hanabi strategies or this framework.

The most similar projects I am aware of:
- https://github.com/rjtobin/HanSim (written for the paper mentioned above)
- https://github.com/Quuxplusone/Hanabi

## Setup

Install Rust (`rustc` and `cargo`) and clone this git repo.

Then, in the repo root, run `cargo run -- -h` to see usage details.

For example, to simulate 5-player games using the cheating strategy, for seeds 0-99:
```
cargo run -- -n 100 -s 0 -p 5 -g cheat
```

Or, if the simulation is slow, build with `--release` and use more threads:
```
time cargo run --release -- -n 10000 -o 1000 -s 0 -t 4 -p 5 -g info
```

Or, to see a transcript of the game with seed 222:
```
cargo run -- -s 222 -p 5 -g info -l debug | less
```

## Strategies

To write a strategy, you simply [implement a few traits](src/strategy.rs).

The framework is designed to take advantage of Rust's ownership system
so that you *can't cheat* unless you deliberately reach for shared-state types like `Cell`, `Arc`, or `Mutex`.

Generally, your strategy will be passed something of type `&BorrowedGameView`.
This game view contains many useful helper functions ([see here](src/game.rs)).
If you want to mutate a view, clone it into an owned one first, with something like
`let mut view = OwnedGameView::clone_from(borrowed_view);`.
An `OwnedGameView` has the same API as a borrowed one.
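
As a rough illustration of the borrowed/owned split, here is a minimal sketch. `BorrowedView`, `OwnedView`, their fields, and `simulate_hint` are hypothetical stand-ins invented for this example; the real types in `src/game.rs` hold far more state and may differ in detail.

```rust
// Hypothetical, simplified stand-ins for the framework's view types.
// The real definitions live in src/game.rs and are not reproduced here.

/// Read-only view: borrows public state from the game.
struct BorrowedView<'a> {
    discard: &'a [u32],
    hints_remaining: u32,
}

/// Owned copy of the same data, which the strategy is free to mutate.
#[derive(Debug)]
struct OwnedView {
    discard: Vec<u32>,
    hints_remaining: u32,
}

impl OwnedView {
    /// Copy everything out of the borrowed view.
    fn clone_from(view: &BorrowedView) -> OwnedView {
        OwnedView {
            discard: view.discard.to_vec(),
            hints_remaining: view.hints_remaining,
        }
    }
}

/// Try out "spend one hint" on a private copy; the game's own state is untouched.
fn simulate_hint(view: &BorrowedView) -> OwnedView {
    let mut scratch = OwnedView::clone_from(view);
    scratch.hints_remaining = scratch.hints_remaining.saturating_sub(1);
    scratch
}

fn main() {
    let discard = vec![1, 3];
    let view = BorrowedView { discard: &discard, hints_remaining: 8 };
    let scratch = simulate_hint(&view);
    println!("scratch copy: {:?}", scratch);
    println!("discard pile size in copy: {}", scratch.discard.len());
    println!("hints in the real game: {}", view.hints_remaining);
}
```

Because the strategy only ever holds a shared reference to the game's state, any experiment has to happen on a copy like `scratch`.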

Some examples:

- [Basic dummy examples](src/strategies/examples.rs)
- [A cheating strategy](src/strategies/cheating.rs), using `Rc<RefCell<_>>`
- [The information strategy](src/strategies/information.rs)!
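
To show why `Rc<RefCell<_>>` is enough to cheat, here is a small hypothetical sketch in which players share a table of hands behind the framework's back. The `CheatingPlayer` type, its methods, and the string card labels are assumptions made for this example; this is not the code in `src/strategies/cheating.rs`.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical types for illustration only.
type Player = u32;
type Hand = Vec<&'static str>;
type SharedHands = Rc<RefCell<HashMap<Player, Hand>>>;

struct CheatingPlayer {
    me: Player,
    // Every player holds a handle to the same table.
    shared: SharedHands,
}

impl CheatingPlayer {
    /// Record a hand this player can legitimately see (someone else's).
    fn publish(&self, owner: Player, hand: Hand) {
        assert_ne!(owner, self.me, "players cannot see their own hand");
        self.shared.borrow_mut().insert(owner, hand);
    }

    /// Look up this player's own hand, if another player has published it.
    fn peek_own_hand(&self) -> Option<Hand> {
        self.shared.borrow().get(&self.me).cloned()
    }
}

fn main() {
    let shared: SharedHands = Rc::new(RefCell::new(HashMap::new()));
    let alice = CheatingPlayer { me: 0, shared: Rc::clone(&shared) };
    let bob = CheatingPlayer { me: 1, shared: Rc::clone(&shared) };

    // Bob sees Alice's hand and writes it into the shared table.
    bob.publish(0, vec!["r1", "b3"]);
    println!("alice now knows her hand: {:?}", alice.peek_own_hand());
}
```

The shared table is the only channel for hidden information here, which is why cheating has to be an explicit opt-in through interior mutability.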

## Results (auto-generated)

To reproduce:
```
time cargo run --release -- --results-table
```

To update this file:
```
time cargo run --release -- --write-results-table
```

On the first 20000 seeds, we have these average scores and win rates:

|         |   2p    |   3p    |   4p    |   5p    |
|---------|---------|---------|---------|---------|
| cheat   | 24.8594 | 24.9785 | 24.9720 | 24.9557 |
|         | 90.59 % | 98.17 % | 97.76 % | 96.42 % |
| info    | 22.2908 | 24.7171 | 24.8875 | 24.8957 |
|         | 09.40 % | 79.94 % | 91.32 % | 91.83 % |
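
For readers who want to check the two rows per strategy, here is a minimal sketch of how an average score and a win rate could be computed from a batch of final scores. It assumes that a "win" means a perfect score of 25; `summarize` and `PERFECT_SCORE` are names made up for this example, not the simulator's actual code.

```rust
/// Highest possible score under the standard rules (assumption: five colors, five ranks).
const PERFECT_SCORE: u32 = 25;

/// Returns (average score, win rate in percent) for a batch of final scores,
/// or None if the batch is empty.
fn summarize(scores: &[u32]) -> Option<(f64, f64)> {
    if scores.is_empty() {
        return None;
    }
    let n = scores.len() as f64;
    let total: u32 = scores.iter().sum();
    // Assumption: only a perfect game counts as a win.
    let wins = scores.iter().filter(|&&s| s == PERFECT_SCORE).count();
    Some((f64::from(total) / n, 100.0 * wins as f64 / n))
}

fn main() {
    // Made-up scores, not real simulation output.
    let scores = [25, 25, 24, 22];
    if let Some((avg, win_rate)) = summarize(&scores) {
        println!("average {:.4}, win rate {:05.2} %", avg, win_rate);
    }
}
```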