Difference between Fine-tuning and RLHF

RLHF

RLHF Dataset - Preference Data: { input text, summary 1, summary 2, human preference }

Example

input_text: I live right next to a huge university, and have been applying for a variety of jobs with them through their faceless electronic jobs portal (the "click here to apply for this job" type thing) for a few months. The very first job I applied for, I go..
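
As a rough illustration of the preference-data record listed above (input text, two candidate summaries, and a human preference), the sketch below models one record in Python and turns it into a (chosen, rejected) pair, a pairing commonly used when training a reward model. The class name `PreferenceRecord`, the helper `to_chosen_rejected`, and the encoding of `human_preference` as 1 or 2 are assumptions made for illustration; the source does not say how the preference is stored. The summary strings are placeholders, not real data.

```python
from dataclasses import dataclass


@dataclass
class PreferenceRecord:
    """One preference-data record: an input and two candidate summaries."""
    input_text: str
    summary_1: str
    summary_2: str
    # Assumption: 1 means summary_1 was preferred, 2 means summary_2 was.
    human_preference: int


def to_chosen_rejected(record: PreferenceRecord) -> tuple[str, str]:
    """Return (chosen, rejected) according to the human preference label."""
    if record.human_preference == 1:
        return record.summary_1, record.summary_2
    if record.human_preference == 2:
        return record.summary_2, record.summary_1
    raise ValueError("human_preference must be 1 or 2")


# Placeholder values only; a real record would hold the full post
# and two model-written summaries.
record = PreferenceRecord(
    input_text="(post text to summarize)",
    summary_1="(first candidate summary)",
    summary_2="(second candidate summary)",
    human_preference=2,
)
chosen, rejected = to_chosen_rejected(record)
```

Keeping both summaries in one record, rather than storing only the preferred one, preserves the comparison itself: the pair says which output a human liked better for the same input, not just what a good output looks like.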