A prominent approach to AI safety goes under the name of "evals" or "evaluations". These are a critical component of plans that various major labs have, such as Anthropic
The judge in the Musk v. OpenAI[1] case in the Northern District of California has issued an order on Musk's motion for a preliminary injunction, which asked for an order
Identifiability in IRL
One of my favorite papers is this one, titled "Occam's razor is insufficient to infer the preferences of irrational agents". It relates to an area of
Machine learning researchers and engineers love baselines. Baselines serve as an important starting point to make improvements, and the ability to check new ideas against baselines helps measure and incentivize progress. For hard
Whenever the topic of advanced machine learning systems comes up, people seem to inevitably end up discussing whether certain systems are or could be "intelligent" or whether a system is or