A prominent approach to AI safety goes under the name of "evals" or "evaluations". These are a critical component of plans that various major labs have, such as Anthropic&
I've written previously about factors impacting cooperation, especially in the presence of large disagreements. One factor that relates to cooperation that has been on my mind lately is trust. There are
I'm interested in how people can cooperate despite large disagreements. Part of this is because I believe such cooperation may be necessary for tackling issue in AI safety (e.g. some