Blog.

May 12, 2024

The Benchmark Trap: Why LLM Metrics Mislead and Evals Enlighten

The many issues with LLM benchmarks, the difficulty of documenting the complexities involved in testing LLMs accurately, and evals as an alternative

Apr 16, 2024

The discrepancy between AI's potential impact and its currently limited use in education, and the reasons behind it

Feb 2, 2024

The problems with AI alignment and how instrumental misalignment threatens to worsen it