diff --git a/README.md b/README.md
index e452100..175f699 100644
--- a/README.md
+++ b/README.md
@@ -3,73 +3,68 @@
 Hanabi is a cooperative card game of incomplete information. Despite relatively [simple rules](https://boardgamegeek.com/article/10670613#10670613), the space of Hanabi strategies is quite interesting.
-
-This repository provides a framework for implementing Hanabi strategies.
+This project provides a framework for implementing Hanabi strategies in Rust.
 It also explores some implementations, based on ideas from [this paper](https://d0474d97-a-62cb3a1a-s-sites.googlegroups.com/site/rmgpgrwc/research-papers/Hanabi_final.pdf).
+In particular, it contains an improved version of their "information strategy",
+which achieves the best results I'm aware of for games with more than 2 players ([see below](#results)).
-In particular, it contains a variant of their "information strategy", with some improvements.
-This strategy achieves the best results I am aware of for n > 2 (see below).
+Please feel free to contact me about Hanabi strategies, or this framework.
-Please contact me if:
-- You know of other interesting/good strategy ideas!
-- Have questions about the framework or existing strategies
-
-Some similar projects I am aware of:
+Most similar projects I am aware of:
 - https://github.com/rjtobin/HanSim (written for the paper mentioned above)
 - https://github.com/Quuxplusone/Hanabi
 ## Setup
-Install rust/rustc and cargo. Then,
+Install rust (rustc and cargo), and clone this git repo.
-`cargo run -- -h`
+Then, in the repo root, run `cargo run -- -h` to see usage details.
+For example, to simulate a 5 player game using the cheating strategy, for seeds 0-99:
 ```
-Usage: target/debug/rust_hanabi [options]
-
-Options:
-    -l, --loglevel LOGLEVEL
-                        Log level, one of 'trace', 'debug', 'info', 'warn',
-                        and 'error'
-    -n, --ntrials NTRIALS
-                        Number of games to simulate (default 1)
-    -t, --nthreads NTHREADS
-                        Number of threads to use for simulation (default 1)
-    -s, --seed SEED     Seed for PRNG (default random)
-    -p, --nplayers NPLAYERS
-                        Number of players
-    -g, --strategy STRATEGY
-                        Which strategy to use. One of 'random', 'cheat', and
-                        'info'
-    -h, --help          Print this help menu
+cargo run -- -n 100 -s 0 -p 5 -g cheat
 ```
-For example,
-
-```
-cargo run -- -n 10000 -s 0 -p 5 -g cheat
-```
-
-Or, if the simulation is slow (as the info strategy is),
-
+Or, if the simulation is slow, build with `--release` and use more threads:
 ```
 time cargo run --release -- -n 10000 -o 1000 -s 0 -t 4 -p 5 -g info
 ```
-Or, to see a transcript of a single game:
+Or, to see a transcript of the game with seed 222:
 ```
-cargo run -- -s 2222 -p 5 -g info -l debug | less
+cargo run -- -s 222 -p 5 -g info -l debug | less
 ```
+## Strategies
+
+To write a strategy, you simply [implement a few traits](src/simulator.rs).
+
+The framework is designed to take advantage of Rust's ownership system
+so that you *can't cheat*, without using stuff like `Cell` or `Arc` or `Mutex`.
+
+Generally, your strategy will be passed something of type `&BorrowedGameView`.
+This game view contains many useful helper functions ([see here](src/game.rs)).
+If you want to mutate a view, you'll want to do something like
+`let mut self.view = OwnedGameView::clone_from(borrowed_view);`.
+An OwnedGameView will have the same API as a borrowed one.
+
+Some examples:
+
+- [Basic dummy examples](src/strategies/examples.rs)
+- [A cheating strategy](src/strategies/cheating.rs), using `Rc>`
+- [The information strategy](src/strategies/information.rs)!
+
 ## Results
-On seeds 0-9999, we have:
+On seeds 0-9999, we have these average scores and win rates:
-          |   2p    |   3p    |   4p    |   5p    |
-----------|---------|---------|---------|---------|
-cheating  | 24.8600 | 24.9781 | 24.9715 | 24.9583 |
-info      | 18.5909 | 24.1655 | 24.7922 | 24.8784 |
+       |   2p    |   3p    |   4p    |   5p    |
+-------|---------|---------|---------|---------|
+cheat  | 24.8600 | 24.9781 | 24.9715 | 24.9570 |
+       | 90.52 % | 98.12 % | 97.74 % | 96.57 % |
+info   | 18.5915 | 24.1672 | 24.7924 | 24.8783 |
+       | 00.03 % | 48.59 % | 84.37 % | 90.33 % |
 To reproduce:
diff --git a/src/strategies/information.rs b/src/strategies/information.rs
index b480423..7831c00 100644
--- a/src/strategies/information.rs
+++ b/src/strategies/information.rs
@@ -766,6 +766,7 @@ impl PlayerStrategy for InformationPlayerStrategy {
             - (view.board.num_players * view.board.hand_size);
 
         // make a possibly risky play
+        // TODO: consider removing this, if we improve information transfer
         if view.board.lives_remaining > 1 &&
             view.board.discard_size() <= discard_threshold
         {
@@ -800,6 +801,8 @@ impl PlayerStrategy for InformationPlayerStrategy {
                 let info = self.get_hint_sum_info(public_useless_indices.len() as u32, view);
                 return TurnChoice::Discard(public_useless_indices[info.value as usize]);
             } else if useless_indices.len() > 0 {
+                // TODO: have opponents infer that i knew a card was useless
+                // TODO: after that, potentially prefer useless indices that arent public
                 return TurnChoice::Discard(useless_indices[0]);
             }
         }
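
The README text added in this diff says a strategy receives a `&BorrowedGameView` and should clone it into an `OwnedGameView` before mutating anything. Below is a minimal sketch of that pattern. Every type here is a hypothetical stand-in written for illustration, not the repository's actual definitions; only the names `BorrowedGameView`, `OwnedGameView`, `clone_from`, and `lives_remaining` are borrowed from the diff, and `lives_after_misplay` is invented.

```rust
// Hypothetical stand-ins: these are NOT the repository's real types.
#[derive(Clone, Debug, PartialEq)]
struct Board {
    lives_remaining: u32,
}

/// Read-only view handed to a strategy; it only borrows the board.
struct BorrowedGameView<'a> {
    board: &'a Board,
}

/// Owned copy that a strategy may mutate freely, e.g. to simulate ahead.
struct OwnedGameView {
    board: Board,
}

impl OwnedGameView {
    /// Copies the borrowed state, so later mutation cannot touch the real game.
    fn clone_from(view: &BorrowedGameView<'_>) -> OwnedGameView {
        OwnedGameView {
            board: view.board.clone(),
        }
    }
}

/// Simulates losing one life on a private copy and reports the lives left.
/// (Invented helper, used only to show the clone-then-mutate step.)
fn lives_after_misplay(view: &BorrowedGameView<'_>) -> u32 {
    let mut scratch = OwnedGameView::clone_from(view);
    scratch.board.lives_remaining = scratch.board.lives_remaining.saturating_sub(1);
    scratch.board.lives_remaining
}

fn main() {
    let board = Board { lives_remaining: 3 };
    let view = BorrowedGameView { board: &board };
    let simulated = lives_after_misplay(&view);
    println!(
        "simulated lives: {}, real lives: {}",
        simulated, board.lives_remaining
    );
}
```

Because the strategy only ever holds a shared reference to the real board, the compiler rejects any attempt to write through it. Mutation has to happen on the owned copy, which is one way the "can't cheat" property described in the README could be enforced.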