Manual Evaluation in the Langfuse UI
Collaborate with your team and add scores manually in the Langfuse UI.
Common use cases:
- Collaboration: Enable team collaboration by inviting other internal members to review a subset of traces. This human-in-the-loop evaluation can enhance the overall accuracy and reliability of your results by incorporating diverse perspectives and expertise.
- Evaluating new product features: Manual scoring is useful for new use cases where no other scores have been collected yet.
- Benchmarking of other scores: Establish a human baseline score that serves as a benchmark for comparing and evaluating other scores. This provides a clear standard of reference and makes your performance evaluations more objective (see the sketch after this list).
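
For the benchmarking use case, manually assigned scores can also be compared against automated scores programmatically. Below is a minimal sketch that reads scores from the public `GET /api/public/scores` endpoint using Basic Auth (public key as username, secret key as password). The score names `human-baseline` and `model-eval` are hypothetical placeholders, and the exact query parameters and response fields should be verified against the current API reference.

```python
import os
import statistics

import requests

# Hypothetical score names used for illustration only.
HUMAN_SCORE_NAME = "human-baseline"
AUTOMATED_SCORE_NAME = "model-eval"

# The public API authenticates with Basic Auth:
# public key as username, secret key as password.
auth = (os.environ["LANGFUSE_PUBLIC_KEY"], os.environ["LANGFUSE_SECRET_KEY"])
host = os.environ.get("LANGFUSE_HOST", "https://cloud.langfuse.com")


def fetch_scores(name: str) -> list[dict]:
    """Fetch scores with a given name from the public scores endpoint."""
    response = requests.get(
        f"{host}/api/public/scores",
        auth=auth,
        params={"name": name},
    )
    response.raise_for_status()
    return response.json().get("data", [])


# Group scores by trace so human and automated scores can be compared per trace.
human = {s["traceId"]: s["value"] for s in fetch_scores(HUMAN_SCORE_NAME)}
auto = {s["traceId"]: s["value"] for s in fetch_scores(AUTOMATED_SCORE_NAME)}

# Mean absolute difference between the human baseline and the automated score
# on traces that have both.
common = human.keys() & auto.keys()
if common:
    mean_abs_diff = statistics.mean(abs(human[t] - auto[t]) for t in common)
    print(f"Traces with both scores: {len(common)}")
    print(f"Mean absolute difference: {mean_abs_diff:.3f}")
```

A small divergence between the human baseline and an automated score suggests the automated evaluator can be trusted on similar traces; a large divergence is a signal to review the automated scoring setup.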
Get in touch
Looking for a specific way to score your executions in Langfuse? Join the Discord and discuss your use case!