update readme

2016-04-02 14:29:46 -07:00 · 2016-04-02 14:29:46 -07:00 · b19e6ff615
commit b19e6ff615
parent 81427e2dd5
2 changed files with 41 additions and 43 deletions
--- a/README.md
+++ b/README.md
@@ -3,73 +3,68 @@
 Hanabi is a cooperative card game of incomplete information.
 Despite relatively [simple rules](https://boardgamegeek.com/article/10670613#10670613),
 the space of Hanabi strategies is quite interesting.
-
-This repository provides a framework for implementing Hanabi strategies.
+This project provides a framework for implementing Hanabi strategies in Rust.
 It also explores some implementations, based on ideas from
 [this paper](https://d0474d97-a-62cb3a1a-s-sites.googlegroups.com/site/rmgpgrwc/research-papers/Hanabi_final.pdf).
+In particular, it contains an improved version of their "information strategy",
+which achieves the best results I'm aware of for games with more than 2 players ([see below](#results)).

-In particular, it contains a variant of their "information strategy", with some improvements.
-This strategy achieves the best results I am aware of for n > 2 (see below).
+Please feel free to contact me about Hanabi strategies, or this framework.

-Please contact me if:
- You know of other interesting/good strategy ideas!
- Have questions about the framework or existing strategies
-
-Some similar projects I am aware of:
+Most similar projects I am aware of:
 - https://github.com/rjtobin/HanSim (written for the paper mentioned above)
 - https://github.com/Quuxplusone/Hanabi

 ## Setup

-Install rust/rustc and cargo. Then,
+Install Rust (rustc and cargo) and clone this git repo.

-`cargo run -- -h`
+Then, in the repo root, run `cargo run -- -h` to see usage details.

+For example, to simulate a 5 player game using the cheating strategy, for seeds 0-99:
 ```
-Usage: target/debug/rust_hanabi [options]
-
-Options:
-    -l, --loglevel LOGLEVEL
-                        Log level, one of 'trace', 'debug', 'info', 'warn',
-                        and 'error'
-    -n, --ntrials NTRIALS
-                        Number of games to simulate (default 1)
-    -t, --nthreads NTHREADS
-                        Number of threads to use for simulation (default 1)
-    -s, --seed SEED     Seed for PRNG (default random)
-    -p, --nplayers NPLAYERS
-                        Number of players
-    -g, --strategy STRATEGY
-                        Which strategy to use. One of 'random', 'cheat', and
-                        'info'
-    -h, --help          Print this help menu
+cargo run -- -n 100 -s 0 -p 5 -g cheat
 ```

-For example,
-
-```
-cargo run -- -n 10000 -s 0 -p 5 -g cheat
-```
-
-Or, if the simulation is slow (as the info strategy is),
-
+Or, if the simulation is slow, build with `--release` and use more threads:
 ```
 time cargo run --release -- -n 10000 -o 1000 -s 0 -t 4 -p 5 -g info
 ```

-Or, to see a transcript of a single game:
+Or, to see a transcript of the game with seed 222:
 ```
-cargo run -- -s 2222 -p 5 -g info -l debug | less
+cargo run -- -s 222 -p 5 -g info -l debug | less
 ```

+## Strategies
+
+To write a strategy, you simply [implement a few traits](src/simulator.rs).
+
+The framework is designed to take advantage of Rust's ownership system
+so that you *can't cheat* unless you reach for types like `Cell`, `Arc`, or `Mutex`.
+
+Generally, your strategy will be passed something of type `&BorrowedGameView`.
+This game view contains many useful helper functions ([see here](src/game.rs)).
+If you want to mutate a view, you'll want to do something like
+`self.view = OwnedGameView::clone_from(borrowed_view);`.
+An `OwnedGameView` has the same API as a borrowed one.
+
+Some examples:
+
+- [Basic dummy examples](src/strategies/examples.rs)
+- [A cheating strategy](src/strategies/cheating.rs), using `Rc<RefCell<_>>`
+- [The information strategy](src/strategies/information.rs)!
+
 ## Results

-On seeds 0-9999, we have:
+On seeds 0-9999, we have these average scores and win rates:

-          |   2p    |   3p    |   4p    |   5p    |
----------|---------|---------|---------|---------|
-cheating  | 24.8600 | 24.9781 | 24.9715 | 24.9583 |
-info      | 18.5909 | 24.1655 | 24.7922 | 24.8784 |
+       |   2p    |   3p    |   4p    |   5p    |
+-------|---------|---------|---------|---------|
+cheat  | 24.8600 | 24.9781 | 24.9715 | 24.9570 |
+       | 90.52 % | 98.12 % | 97.74 % | 96.57 % |
+info   | 18.5915 | 24.1672 | 24.7924 | 24.8783 |
+       | 00.03 % | 48.59 % | 84.37 % | 90.33 % |


 To reproduce:
--- a/src/strategies/information.rs
+++ b/src/strategies/information.rs
@@ -766,6 +766,7 @@ impl PlayerStrategy for InformationPlayerStrategy {
            - (view.board.num_players * view.board.hand_size);

        // make a possibly risky play
+        // TODO: consider removing this, if we improve information transfer
        if view.board.lives_remaining > 1 &&
           view.board.discard_size() <= discard_threshold
        {
@@ -800,6 +801,8 @@ impl PlayerStrategy for InformationPlayerStrategy {
                let info = self.get_hint_sum_info(public_useless_indices.len() as u32, view);
                return TurnChoice::Discard(public_useless_indices[info.value as usize]);
            } else if useless_indices.len() > 0 {
+                // TODO: have opponents infer that I knew a card was useless
+                // TODO: after that, potentially prefer useless indices that aren't public
                return TurnChoice::Discard(useless_indices[0]);
            }
        }
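
A note on the borrowed/owned view pattern described in the new "Strategies" section of the README: the following is a minimal sketch of how such a split might look, not the framework's actual code. `BoardState`, `SketchStrategy`, `update`, and `demo` are hypothetical names invented for illustration, and the real `BorrowedGameView` and `OwnedGameView` in `src/game.rs` almost certainly carry more state and a different layout. The point is only that a strategy handed a shared reference cannot change the real game, while a cloned, owned copy can be mutated freely.

```rust
// Hypothetical stand-in for the real board type; only `lives_remaining`
// is borrowed from the diff above, everything else is assumed.
#[derive(Clone, Debug, PartialEq)]
struct BoardState {
    lives_remaining: u32,
}

/// Read-only view lent to a strategy for the duration of one call.
struct BorrowedGameView<'a> {
    board: &'a BoardState,
}

/// Strategy-owned copy of a view that can be mutated freely.
struct OwnedGameView {
    board: BoardState,
}

impl OwnedGameView {
    /// Copy everything visible through the borrowed view.
    fn clone_from(view: &BorrowedGameView<'_>) -> OwnedGameView {
        OwnedGameView {
            board: view.board.clone(),
        }
    }
}

/// Hypothetical strategy that keeps its own scratch copy of the view.
struct SketchStrategy {
    view: Option<OwnedGameView>,
}

impl SketchStrategy {
    /// Take a private copy of the view and speculate on it.
    fn update(&mut self, borrowed: &BorrowedGameView<'_>) {
        let mut owned = OwnedGameView::clone_from(borrowed);
        // "What if a risky play fails?" Only the private copy changes.
        owned.board.lives_remaining -= 1;
        self.view = Some(owned);
    }
}

/// Returns (lives on the real board, lives on the strategy's copy).
fn demo() -> (u32, u32) {
    let board = BoardState { lives_remaining: 3 };
    let borrowed = BorrowedGameView { board: &board };
    let mut strategy = SketchStrategy { view: None };
    strategy.update(&borrowed);
    let owned = strategy.view.as_ref().expect("view was just set");
    (board.lives_remaining, owned.board.lives_remaining)
}
```

In this sketch, writing `borrowed.board.lives_remaining -= 1` inside `update` would be rejected by the compiler, because the view only holds a shared reference. That is the sense in which the ownership system prevents accidental cheating.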