ESGenius

An expert benchmark for evaluating LLM knowledge of ESG and sustainability standards.

Source-grounded questions, reproducible evaluation code, and question-level model analysis for sustainability reporting, climate, governance, and standards-driven reasoning.

1,136 questions
50 models evaluated
7 framework families
A-D + Z answer protocol

Expert ESG knowledge, packaged for repeatable model evaluation.

The benchmark covers sustainability reporting, climate disclosure, biodiversity, energy, governance, and standards-driven ESG reasoning across IPCC, GRI, SASB, ISO, IFRS/ISSB, TCFD, and CDP sources.

query_id: Stable question identifier
query: Question stem
A-D: Answer options
Z: "Not sure" option
ref_doc: Source document in reference file
source_text: Supporting excerpt in reference file
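
As a rough illustration of how a record with these fields and the A-D + Z answer protocol might be scored, the following Python sketch maps a raw model reply to one of the outcome labels used in the heatmap legend (Correct, Wrong, Invalid, Not sure, Missing). The `Question` class, its `answer` field (a gold label), and the exact classification rules are assumptions made for illustration, not the benchmark's published evaluation code.

```python
from dataclasses import dataclass
from typing import Dict, Optional

VALID_CHOICES = {"A", "B", "C", "D"}
NOT_SURE = "Z"


@dataclass(frozen=True)
class Question:
    """One benchmark item; field names follow the schema listed above."""
    query_id: str
    query: str
    options: Dict[str, str]  # keys "A".."D" mapped to option text
    answer: str              # hypothetical gold-label field (assumption)


def classify(prediction: Optional[str], gold: str) -> str:
    """Map a raw model reply to an outcome label (assumed rules)."""
    if prediction is None or prediction.strip() == "":
        return "missing"      # the model produced no answer at all
    choice = prediction.strip().upper()
    if choice == NOT_SURE:
        return "not_sure"     # the model explicitly abstained with Z
    if choice not in VALID_CHOICES:
        return "invalid"      # reply is outside the A-D + Z protocol
    return "correct" if choice == gold else "wrong"
```

Under these assumed rules, a reply of `"Z"` is tracked separately from a wrong answer, so abstention and error can be analyzed independently.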

Model performance at a glance, with question-level drill-down.

The homepage loads quickly, while the full Plotly heatmap remains available as a dedicated report for deep inspection.

Main ESGenius benchmark results

Inspect every model-question outcome without leaving the benchmark story.

The full report covers 50 evaluated models across 1,136 ESGenius questions, sorted by model rank and question difficulty for fast error-pattern analysis.
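
The ordering described above can be sketched in a few lines of Python. This is a minimal illustration, assuming outcomes are stored as a nested mapping from model name to `query_id` to outcome label, that model rank is determined by the number of correct answers, and that question difficulty is determined by how few models answered correctly; the function name `sort_outcomes` and these ranking criteria are hypothetical and may differ from the report's actual logic.

```python
from typing import Dict, List, Tuple


def sort_outcomes(
    outcomes: Dict[str, Dict[str, str]]
) -> Tuple[List[str], List[str]]:
    """Return (models best-first, question ids hardest-first)."""
    question_ids = sorted({q for row in outcomes.values() for q in row})

    def correct_for_model(model: str) -> int:
        # Count how many questions this model answered correctly.
        return sum(1 for q in question_ids
                   if outcomes[model].get(q) == "correct")

    def correct_for_question(qid: str) -> int:
        # Count how many models answered this question correctly.
        return sum(1 for row in outcomes.values()
                   if row.get(qid) == "correct")

    # More correct answers ranks higher; ties break alphabetically.
    ranked_models = sorted(outcomes, key=lambda m: (-correct_for_model(m), m))
    # Fewer correct models means a harder question, so it sorts first.
    ordered_questions = sorted(
        question_ids, key=lambda q: (correct_for_question(q), q)
    )
    return ranked_models, ordered_questions
```

Sorting both axes this way tends to group failures together, which is what makes shared error patterns across models easier to spot.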

Heatmap legend: Correct, Wrong, Invalid, Not sure, Missing.
Heatmap axes: the 50 models are ranked top to bottom; questions run from hardest to easiest.