RLHF and Alignment
3 questions · difficulty 4-6 · intermediate

Question 1 of 3 · intermediate (4/10) · compare
Standard RLHF (as used for InstructGPT and early ChatGPT) has three stages. Which sequence is correct?
A. Supervised fine-tuning on demonstrations; reward model training on preferences; RL fine-tuning (PPO) against the reward model
B. Unsupervised embedding alignment; contrastive pretraining; reward optimization
C. RL pre-training from scratch; supervised fine-tuning; reward model filtering
D. Reward modeling on base outputs; supervised fine-tuning; RL against reward
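Background for this question: stage 2 of the pipeline fits a reward model to pairwise human preferences, typically with a Bradley-Terry style loss as in the InstructGPT paper, and stage 3 then optimizes the policy against that learned reward with PPO. The snippet below is a minimal PyTorch sketch of the stage-2 loss only; the function name and the toy reward values are hypothetical illustrations, not code from any referenced implementation.

    import torch
    import torch.nn.functional as F

    def reward_model_loss(r_chosen, r_rejected):
        # Bradley-Terry pairwise preference loss:
        # -log sigmoid(r(x, y_w) - r(x, y_l)), averaged over the batch.
        return -F.logsigmoid(r_chosen - r_rejected).mean()

    # Toy scalar rewards for four preference pairs (hypothetical values).
    r_w = torch.tensor([1.2, 0.3, 2.0, -0.5])   # chosen (preferred) responses
    r_l = torch.tensor([0.4, -0.1, 1.5, -1.0])  # rejected responses
    print(reward_model_loss(r_w, r_l))          # loss shrinks as chosen scores exceed rejected

In stage 3, PPO maximizes this learned reward, usually with a per-token KL penalty toward the stage-1 SFT policy so the model does not drift far from its supervised behavior.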