Evaluation

Tools for evaluating functions and models on datasets. Includes evaluators, scoring utilities, and dataset management.

Classes

Class

ExperimentResults

Results container for experiment data with stats and examples.

Class

ExperimentResultRow

Class

ComparativeExperimentResults

Represents the results of an evaluate_comparative() call.

Class

AsyncExperimentResults

Class

EvaluationResult

Evaluation result.

Class

EvaluationResults

Batch evaluation results.

Class

RunEvaluator

Evaluator interface class.

Class

DynamicRunEvaluator

A dynamic evaluator that wraps a function and transforms it into a RunEvaluator.

Class

ComparisonEvaluationResult

Feedback scores for the results of comparative evaluations.

Class

DynamicComparisonRunEvaluator

Compare predictions (as traces) from 2 or more runs.

Class

StringEvaluator

Grades the run's string input, output, and optional answer.

Class

LLMEvaluator

A class for building LLM-as-a-judge evaluators.
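
The wrapping idea behind DynamicRunEvaluator (a plain function turned into an object that exposes an evaluator interface) can be illustrated with a minimal sketch. This is not the library's implementation: `SketchResult`, `SketchFunctionEvaluator`, and `length_check` are hypothetical stand-ins written for this example, and runs are modeled as plain dicts rather than real run objects. Only the `evaluate_run` method name mirrors the RunEvaluator interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class SketchResult:
    """Hypothetical stand-in for an evaluation result: a named score."""

    key: str
    score: Optional[float] = None
    comment: Optional[str] = None


class SketchFunctionEvaluator:
    """Hypothetical wrapper: takes a scoring function, exposes evaluate_run."""

    def __init__(self, func: Callable[[Any, Any], dict]):
        self.func = func

    def evaluate_run(self, run: Any, example: Any = None) -> SketchResult:
        raw = self.func(run, example)
        # Coerce the plain dict returned by the function into a result object.
        return SketchResult(**raw)


def length_check(run: dict, example: Optional[dict]) -> dict:
    """Toy scoring function: passes when the run produced a non-empty answer."""
    text = run.get("outputs", {}).get("answer", "")
    return {"key": "non_empty", "score": 1.0 if text else 0.0}


evaluator = SketchFunctionEvaluator(length_check)
result = evaluator.evaluate_run({"outputs": {"answer": "Paris"}})
print(result)
```

The point of the pattern is that user code stays a simple function returning a key and a score, while the wrapper supplies the uniform interface that an evaluation loop can call.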

Functions

Function

evaluate

Evaluate a target system on a given dataset.

Function

evaluate_existing

Evaluate existing experiment runs.

Function

evaluate_comparative

Evaluate existing experiment runs against each other.

Function

aevaluate

Evaluate an async target system on a given dataset.

Function

aevaluate_existing

Evaluate existing experiment runs asynchronously.

Function

run_evaluator

Create a run evaluator from a function.

Function

comparison_evaluator

Create a comparison evaluator from a function.
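
As a rough illustration of how the functions above fit together, the sketch below defines a target, a row-level evaluator, and a comparison evaluator as plain Python functions, then passes them to `evaluate` and `evaluate_comparative`. Several things here are assumptions rather than documented facts from this page: the `target`, `exact_match`, and `prefer_shorter` functions and the `answer`/`question` keys are hypothetical, the dataset and experiment names are supplied by the caller, and the argument-name-based evaluator signature (`outputs`, `reference_outputs`) is assumed to be supported by the installed SDK version. The `run_experiments` function needs the `langsmith` package and API credentials, so it is defined but not called.

```python
from types import SimpleNamespace


def target(inputs: dict) -> dict:
    """Hypothetical system under test: normalizes the question as its answer."""
    return {"answer": inputs["question"].strip().lower()}


def exact_match(outputs: dict, reference_outputs: dict) -> dict:
    """Row-level evaluator: 1.0 when the answer equals the reference answer."""
    matched = outputs.get("answer") == reference_outputs.get("answer")
    return {"key": "exact_match", "score": float(matched)}


def prefer_shorter(runs, example) -> dict:
    """Comparison evaluator: scores 1 for the run with the shortest answer."""
    lengths = {
        run.id: len(str((run.outputs or {}).get("answer", ""))) for run in runs
    }
    best = min(lengths, key=lengths.get)
    return {
        "key": "prefer_shorter",
        "scores": {run_id: int(run_id == best) for run_id in lengths},
    }


def run_experiments(dataset_name: str, experiment_a: str, experiment_b: str):
    """Assumed wiring; requires langsmith and credentials, so not called here."""
    from langsmith import evaluate
    from langsmith.evaluation import evaluate_comparative

    results = evaluate(target, data=dataset_name, evaluators=[exact_match])
    pairwise = evaluate_comparative(
        (experiment_a, experiment_b), evaluators=[prefer_shorter]
    )
    return results, pairwise


# Local check with stand-in run objects; no network access is needed.
fake_runs = [
    SimpleNamespace(id="a", outputs={"answer": "Paris"}),
    SimpleNamespace(id="b", outputs={"answer": "It is Paris"}),
]
print(exact_match(target({"question": " Paris "}), {"answer": "paris"}))
print(prefer_shorter(fake_runs, None))
```

Keeping the evaluators as ordinary functions makes them easy to unit test on stand-in data before they are attached to a real experiment.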