This project builds upon the fancy_gym library and aims to provide a framework for researchers to efficiently collect human feedback for preference-based RL.
conda create -n RLHF python=3.8.18
conda activate RLHF
Note: Any Python version $\geq 3.8$ should work. The project has been tested successfully on $3.8.18$, so use that version unless you have a reason to use a different one.
pip install RLDuels
Note: Some of the required packages may cause trouble during installation. Issues occur most often with Fancy_Gym, specifically its mujoco dependency.
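If the installation appears to succeed but you suspect a dependency is missing, a quick check along the following lines may help. This is a minimal sketch and not part of RLDuels: the helpers `python_ok` and `missing_modules` are hypothetical, and the import names `fancy_gym` and `mujoco` are assumptions about how those packages are exposed in your environment.

```python
import importlib.util
import sys

# Minimum version taken from the note on Python versions in this README.
MIN_PYTHON = (3, 8)


def python_ok(version=None):
    """Return True if the given (or the running) interpreter is at least MIN_PYTHON."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


def missing_modules(names):
    """Return the top-level module names that cannot be located, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python >= 3.8:", python_ok())
    # Assumed import names; adjust them if your environment differs.
    print("Missing modules:", missing_modules(["fancy_gym", "mujoco"]))
```

An empty list of missing modules only shows that the packages can be located. It does not guarantee that mujoco loads correctly at runtime.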
Installation from master
git clone git@github.com:theonetruekn/RLDuels.git
cd RLDuels
pip install -e .