AISC8: Research Summaries

Uncertainty -> Soft Optimization. Research lead: Jeremy Gillen. Team participants: Benjamin Kolb, Simon Fischer, Juan Azcarreta Ortiz, Danilo Naiff, Cecilia Wood. Short write-up: For AISC 2023, our team looked into the theoretical foundations of soft optimization. Our goal at the beginning was to investigate variations of the original quantilizer algorithm, in particular by following …

AISC6: Research Summaries

Impact of Human Dogmatism on Training. Team members: Jan Czechowski, Pranav Gade, Leo Mckee-Reid, Kevin Wang. External collaborators: Daniel Kokotajlo (mentor). The human world is full of dogma, and therefore dogmatic data. We are using this data to train increasingly advanced ML systems, and for this reason we should understand how dogmatic data affects the training …

AISC5: Research Summaries

Modularity Loss Function. Team members: Logan Smith, Viktor Rehnberg, Vlado Baca, Philip Blagoveschensky, Viktor Petukhov. External collaborators: Gurkenglas. Making neural networks (NNs) more modular may improve their interpretability. If we cluster neurons or weights together according to their different functions, we can analyze each cluster individually. Once we better understand the clusters that make up a …

AISC4: Research Summaries

The fourth AI Safety Camp took place in May 2020 in Toronto. Due to COVID-19, the camp was held virtually. Six teams participated and worked on the following topics: a survey on AI risk scenarios; options to defend a vulnerable world; extraction of human preferences; transferring reward functions across environments to encourage safety for agents in …

AISC3: Research Summaries

The third AI Safety Camp took place in April 2019 in Madrid. Our teams worked on the projects summarized below. Categorizing Wireheading in Partially Embedded Agents (Team: Embedded agents – Arushi, Davide, Sayan): they presented their work at the AI Safety Workshop at IJCAI 2019; read their paper here. AI Safety Debate and Its Applications (Team: Debate – …

AISC2: Research Summaries

The second AI Safety Camp took place in October 2018 in Prague. Our teams worked on exciting projects, which are summarized below. AI Governance and the Policymaking Process: Key Considerations for Reducing AI Risk (Team: Policymaking for AI Strategy – Brandon Perry, Risto Uuk): our project was an attempt to introduce literature from theories on the …

The first AI Safety Camp & onwards

by Remmelt Ellen and Linda Linsefors. Summary: Last month, 5 teams of up-and-coming researchers gathered to solve concrete problems in AI alignment at our 10-day AI safety research camp in Gran Canaria. This post describes: the event format we came up with; our experience & lessons learned in running it in Gran Canaria; and how you can contribute …

The participants of the first AI safety camp in Gran Canaria

AISC 1: Research Summaries

The 2018 Gran Canaria AI safety camp teams worked hard both in preparing for the camp and during the 10-day sprint. Each team has written a brief summary of the work they did during the camp. Irrationality (Team: Christopher Galias, Johannes Heidecke, Dmitrii Krasheninnikov, Jan Kulveit, Nandi Schoots): our team worked on how …
