This is our living database of AI benchmarks and our evaluations of them under BetterBench, our assessment framework. Click any row for a detailed score breakdown and an explanation of each criterion for that benchmark. You can also sort most columns in ascending or descending order and filter by name, tags, or rating. This repository will be updated over time; feel free to suggest new benchmarks or leave feedback here.
| Benchmark | Tested Concept | Overall Rating | Design Score | Usability Score | Year | Tags |
|---|---|---|---|---|---|---|