Constitutional AI vs. human feedback training

Constitutional AI differs from traditional reinforcement learning from human feedback (RLHF) mainly in that it relies on AI-generated feedback rather than large-scale human labeling[3][5]. Where RLHF has human crowdworkers rate model outputs, Constitutional AI gives the model a predefined set of written principles, the constitution, and has it critique and revise its own responses against them[3][5]. This approach scales more easily, makes the model's reasoning more transparent because the critiques are stated explicitly, and reduces the need for costly human annotation[4][5].
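
The critique-and-revise step described above can be sketched roughly as follows. This is a minimal illustration, not the published training procedure: the two principles, the prompt wording, the `critique_and_revise` function, and the `stub_model` stand-in are all hypothetical names introduced here, and a real system would call an actual language model in place of the stub.

```python
from typing import Callable, List

# Hypothetical principles for illustration; not taken from any published constitution.
CONSTITUTION: List[str] = [
    "Avoid giving instructions that could cause physical harm.",
    "Be honest about uncertainty.",
]


def critique_and_revise(
    prompt: str,
    generate: Callable[[str], str],
    principles: List[str],
) -> str:
    """Draft a response, then critique and revise it once per principle."""
    response = generate(prompt)
    for principle in principles:
        # Ask the same model to critique its own response against one principle.
        critique = generate(
            "Critique the response against this principle.\n"
            f"Principle: {principle}\nPrompt: {prompt}\nResponse: {response}"
        )
        # Ask the model to rewrite the response so it addresses the critique.
        response = generate(
            "Revise the response to address the critique.\n"
            f"Critique: {critique}\nPrompt: {prompt}\nResponse: {response}"
        )
    return response


def stub_model(text: str) -> str:
    """Assumed stand-in for a language model so the sketch runs offline."""
    if text.startswith("Critique"):
        return "critique: compare the response with the principle"
    if text.startswith("Revise"):
        return "revised response"
    return "draft response"


if __name__ == "__main__":
    print(critique_and_revise("How do I stay safe online?", stub_model, CONSTITUTION))
```

In this sketch no human rates any output: the only human-authored input is the list of principles, which is the contrast with RLHF that the paragraph draws.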