# Simulations of Hanabi strategies

Hanabi is a cooperative card game of incomplete information.
Despite relatively [simple rules](https://boardgamegeek.com/article/10670613#10670613),
the space of Hanabi strategies is quite interesting.
This project provides a framework for implementing Hanabi strategies in Rust, and also implements extremely strong strategies.

The best strategy is based on the "information strategy" from
[this paper](https://d0474d97-a-62cb3a1a-s-sites.googlegroups.com/site/rmgpgrwc/research-papers/Hanabi_final.pdf). See the results [below](#results-auto-generated).
It held state-of-the-art results (from March 2016) until December 2019, when [researchers at Facebook](https://arxiv.org/abs/1912.02318) surpassed it by extending the idea further with explicit search.

Please feel free to contact me about Hanabi strategies, or this framework.

## Setup

Install Rust (rustc and cargo), and clone this git repo.

Then, in the repo root, run `cargo run -- -h` to see usage details.

For example, to simulate 5-player games using the cheating strategy, for seeds 0-99:
```
cargo run -- -n 100 -s 0 -p 5 -g cheat
```

Or, if the simulation is slow, build with `--release` and use more threads:
```
time cargo run --release -- -n 10000 -o 1000 -s 0 -t 4 -p 5 -g info
```

Or, to see a transcript of the game with seed 222:
```
cargo run -- -s 222 -p 5 -g info -l debug | less
```

## Strategies

To write a strategy, you simply [implement a few traits](src/strategy.rs).

The framework is designed to take advantage of Rust's ownership system
so that you *can't cheat* without resorting to shared-state types like `Cell`, `Arc`, or `Mutex`.

Generally, your strategy will be passed something of type `&BorrowedGameView`.
This game view contains many useful helper functions ([see here](src/game.rs)).
If you want to mutate a view, clone it into an owned one and store that, with something like
`self.view = OwnedGameView::clone_from(borrowed_view);`.
An `OwnedGameView` has the same API as a borrowed one.
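
To illustrate the idea, here is a minimal sketch of the borrowed/owned view pattern. All types here (`Card`, `GameState`, `BorrowedView`, `OwnedView`, `view_for`) are hypothetical stand-ins invented for this example, not the definitions in `src/game.rs`; the real views carry much more information.

```rust
// Hypothetical, simplified types for illustration only.
// They are NOT the framework's actual definitions.
#[derive(Clone, Debug, PartialEq)]
struct Card {
    color: char,
    value: u32,
}

/// The full game state, owned by the simulator.
struct GameState {
    hands: Vec<Vec<Card>>,
}

/// Read-only view for one player. It borrows the other players' hands
/// and simply has no field through which the player's own hand is reachable.
struct BorrowedView<'a> {
    player: usize,
    other_hands: Vec<(usize, &'a [Card])>,
}

/// Owned copy of a view, which a strategy may store and mutate freely.
struct OwnedView {
    player: usize,
    other_hands: Vec<(usize, Vec<Card>)>,
}

impl GameState {
    /// Build the view handed to `player`, skipping that player's own hand.
    fn view_for(&self, player: usize) -> BorrowedView<'_> {
        let other_hands = self
            .hands
            .iter()
            .enumerate()
            .filter(|(i, _)| *i != player)
            .map(|(i, hand)| (i, hand.as_slice()))
            .collect();
        BorrowedView { player, other_hands }
    }
}

impl OwnedView {
    /// Deep-copy a borrowed view so it can outlive the borrow and be mutated.
    fn clone_from(view: &BorrowedView<'_>) -> OwnedView {
        OwnedView {
            player: view.player,
            other_hands: view
                .other_hands
                .iter()
                .map(|(i, hand)| (*i, hand.to_vec()))
                .collect(),
        }
    }
}
```

In this sketch, a strategy holding a `BorrowedView` can read other hands but cannot modify the game state, and mutating an `OwnedView` only changes the strategy's private copy.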

Some examples:

- [Basic dummy examples](src/strategies/examples.rs)
- [A cheating strategy](src/strategies/cheating.rs), using `Rc<RefCell<_>>`
- [The information strategy](src/strategies/information.rs)!

## Results (auto-generated)

To reproduce:
```
time cargo run --release -- --results-table
```

To update this file:
```
time cargo run --release -- --write-results-table
```

On the first 20000 seeds, we have these scores and win rates (average ± standard error):

|         |   2p    |   3p    |   4p    |   5p    |
|---------|------------------|------------------|------------------|------------------|
| cheat   | 24.8594 ± 0.0036 | 24.9785 ± 0.0012 | 24.9720 ± 0.0014 | 24.9557 ± 0.0018 |
|         | 90.59 ± 0.21 % | 98.17 ± 0.09 % | 97.76 ± 0.10 % | 96.42 ± 0.13 % |
| info    | 22.5218 ± 0.0125 | 24.7942 ± 0.0039 | 24.9352 ± 0.0022 | 24.9224 ± 0.0024 |
|         | 12.55 ± 0.23 % | 84.44 ± 0.26 % | 94.99 ± 0.15 % | 94.04 ± 0.17 % |
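
For readers who want to check the "± standard error" columns, the sketch below shows one common way to compute them. It is an assumption about the method, not a copy of the code behind `--results-table`: the function names are hypothetical, and the use of the sample variance (dividing by n - 1) is a choice made for this example. The win-rate errors in the table are consistent with the binomial formula: 90.59 % over 20000 games gives about 0.21 %.

```rust
/// Sample mean and standard error of the mean for a set of scores.
/// Returns None when there are fewer than two samples.
fn mean_and_std_error(scores: &[f64]) -> Option<(f64, f64)> {
    let n = scores.len();
    if n < 2 {
        return None;
    }
    let n_f = n as f64;
    let mean = scores.iter().sum::<f64>() / n_f;
    // Unbiased sample variance (divides by n - 1).
    let variance = scores
        .iter()
        .map(|s| (s - mean).powi(2))
        .sum::<f64>()
        / (n_f - 1.0);
    // Standard error of the mean: sqrt(variance / n).
    Some((mean, (variance / n_f).sqrt()))
}

/// Win rate and its binomial standard error, sqrt(p * (1 - p) / n).
/// Returns None when no games were played.
fn win_rate_and_std_error(wins: u64, games: u64) -> Option<(f64, f64)> {
    if games == 0 {
        return None;
    }
    let n_f = games as f64;
    let p = wins as f64 / n_f;
    Some((p, (p * (1.0 - p) / n_f).sqrt()))
}
```

Calling `win_rate_and_std_error(18118, 20000)` corresponds to the 2p `cheat` entry: a win rate of 0.9059 with a standard error of roughly 0.0021.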

## Other work

Most similar projects I am aware of:
- https://github.com/rjtobin/HanSim (written for the paper mentioned above which introduces the information strategy)
- https://github.com/Quuxplusone/Hanabi

Some researchers are trying to solve Hanabi using machine learning techniques:
- [Initial paper](https://arxiv.org/abs/1902.00506) from DeepMind and Google Brain researchers. See [this Wall Street Journal coverage](https://www.wsj.com/articles/why-the-card-game-hanabi-is-the-next-big-hurdle-for-artificial-intelligence-11553875351)
- [This paper](https://arxiv.org/abs/1912.02318) from Facebook, with code at https://github.com/facebookresearch/Hanabi_SPARTA, which includes their machine-learned agent