RLHF Explained: How Human Feedback Trains the World's Best AI Models
IXO Research Team
IXO Labs

What is RLHF?
Reinforcement Learning from Human Feedback (RLHF) is the training technique that transformed large language models from impressive text generators into genuinely useful AI assistants. It's the secret sauce behind GPT-4, Claude, Gemini, and virtually every frontier AI model.
At its core, RLHF is simple: humans evaluate AI outputs, and the model learns to produce responses that humans prefer. But the details — and the quality of those human evaluations — make all the difference.
The Three Stages of RLHF
1. Supervised Fine-Tuning (SFT)
The process begins with supervised fine-tuning, where human experts write high-quality responses to prompts. These demonstrations teach the model what good outputs look like in specific domains.
For example, a medical expert might write detailed, clinically accurate responses to health questions, while a legal expert provides nuanced answers to legal queries.
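To make the SFT step concrete, here is a minimal sketch in Python using PyTorch and Hugging Face Transformers. The model name, prompt, and demonstration are placeholders for illustration; real pipelines batch many thousands of expert demonstrations and usually mask the prompt tokens out of the loss.

```python
# Minimal SFT sketch: standard causal-LM fine-tuning, where the model learns
# to imitate an expert-written demonstration via cross-entropy loss.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; labs fine-tune much larger base models
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = AdamW(model.parameters(), lr=1e-5)

# One (prompt, expert demonstration) pair; real SFT uses many thousands.
prompt = "Q: What are common symptoms of iron deficiency?\nA:"
demonstration = " Fatigue, pallor, and shortness of breath are typical..."

inputs = tokenizer(prompt + demonstration, return_tensors="pt")
# Setting labels = input_ids trains the model to reproduce the sequence;
# a common refinement is masking prompt tokens with the label -100.
outputs = model(**inputs, labels=inputs["input_ids"])
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```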
2. Reward Model Training
Next, human evaluators compare pairs of AI-generated responses and indicate which is better. These preferences are used to train a reward model: a separate model that learns to predict which responses humans will prefer.
This is where domain expertise becomes critical. A general annotator might prefer a response that sounds confident, while a domain expert can identify subtle errors that make a confident-sounding response actually harmful.
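A standard way to turn these pairwise judgments into a trainable objective is the Bradley-Terry loss: push the reward model to score the human-preferred response above the rejected one. The sketch below shows only the loss on hypothetical scalar scores; in practice the reward model produces these scores from the full prompt and response text.

```python
# Pairwise (Bradley-Terry) loss for reward model training.
import torch
import torch.nn.functional as F

def pairwise_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # -log(sigmoid(r_chosen - r_rejected)): small when the preferred response
    # outscores the rejected one by a wide margin, large when it does not.
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy scores: the model currently ranks the rejected response higher,
# so the loss is large and gradients push the two scores apart.
r_chosen = torch.tensor([0.2], requires_grad=True)
r_rejected = torch.tensor([1.1])
loss = pairwise_loss(r_chosen, r_rejected)
loss.backward()
```

Pairwise comparisons are used rather than absolute ratings because humans are far more consistent at ranking two responses side by side than at assigning scores on a fixed scale.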
3. Reinforcement Learning
Finally, the language model is optimized with the reward model as its guide. Through techniques like Proximal Policy Optimization (PPO), the model learns to generate responses that score highly according to the reward model, while a KL-divergence penalty typically keeps it close to the SFT model so it cannot simply exploit the reward model's blind spots.
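A full PPO loop (value estimates, advantages, clipped policy updates) is beyond a blog post, but the KL-shaped reward at its center is easy to sketch. Everything in the snippet below, including kl_coef and the toy numbers, is illustrative: the point is that the reward model's score is discounted by how far the policy's token probabilities have drifted from the SFT model's.

```python
# Sketch of the KL-shaped reward used in PPO-based RLHF (simplified).
import torch

def shaped_reward(reward_model_score: torch.Tensor,
                  logprobs_policy: torch.Tensor,
                  logprobs_sft: torch.Tensor,
                  kl_coef: float = 0.1) -> torch.Tensor:
    # Per-token KL penalty: penalize the policy for assigning its sampled
    # tokens much higher log-probability than the SFT model does.
    kl_penalty = kl_coef * (logprobs_policy - logprobs_sft).sum()
    return reward_model_score - kl_penalty

# Toy values for one sampled response of three tokens.
score = torch.tensor(1.5)                      # reward model's score
lp_policy = torch.tensor([-1.0, -0.8, -1.2])   # policy log-probs per token
lp_sft = torch.tensor([-1.1, -1.0, -1.3])      # SFT model log-probs per token
print(shaped_reward(score, lp_policy, lp_sft))  # tensor(1.4600)
```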
Why Domain Experts Matter
The quality of RLHF depends entirely on the quality of human feedback. This is why platforms like IXO focus on recruiting verified domain experts rather than general crowd workers.
Consider the difference:
| Aspect | General Annotator | Domain Expert |
|---|---|---|
| Factual accuracy | Can check obvious errors | Can identify subtle inaccuracies |
| Nuance | Binary right/wrong | Understands degrees of correctness |
| Edge cases | May miss them entirely | Recognizes and flags them |
| Safety | Follows guidelines | Applies professional judgment |
"The difference between RLHF with general annotators and RLHF with domain experts is the difference between an AI that sounds smart and an AI that actually is smart." — IXO Research Team
The Scale Challenge
Training a single frontier model requires millions of human evaluations across dozens of domains. This creates an enormous demand for qualified experts who can provide reliable, nuanced feedback.
IXO addresses this challenge by maintaining a network of over 3,400 vetted experts across 50+ domains, ensuring that AI labs have access to the specialized knowledge they need at scale.
The Future of RLHF
As AI models become more capable, the bar for human feedback rises correspondingly. Future RLHF will likely require:
- Deeper domain specialization — evaluating AI outputs in increasingly technical domains
- Multi-turn evaluation — assessing AI performance across extended conversations
- Safety-critical review — ensuring AI systems behave safely in high-stakes scenarios
- Cultural sensitivity — training models to be appropriate across diverse cultural contexts
The experts who provide this feedback aren't just annotating data — they're shaping the behavior of AI systems that will be used by billions of people.