Multiagent alignment: game theory, emotions, human values
While focusing on the alignment of LLMs in the traditional NLP sense and advancing well-established benchmarks, researchers often overlook a crucial aspect of natural human interaction: emotions.
Given that LLMs are trained on vast amounts of emotionally charged human data, they should exhibit some ability to emulate emotions, and research shows that they do.
Evaluating and improving the quality of this emotional emulation in LLMs is an active area of research in which we are engaged.
Our research direction focuses on how emotional reasoning influences strategic decision-making in both LLMs and humans within controlled environments.
Main direction: Alignment. The goal is to collect Human-Human, LLM-LLM, and Human-LLM data to enhance our benchmark, which currently relies on psychological research statistics as a human baseline.
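To make this concrete, below is a minimal sketch of the kind of controlled environment we study: two agents play an iterated Prisoner's Dilemma, with one agent conditioned on an emotional state. The `query_llm` stub, the prompt wording, and the payoff values are illustrative assumptions, not the HL-EAI implementation.

```python
import random

# Standard Prisoner's Dilemma payoffs: (my move, their move) -> my payoff.
PAYOFFS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def query_llm(prompt):
    """Placeholder for a real LLM (or human) response; here a random policy."""
    return random.choice(["C", "D"])

def run_episode(emotion=None, rounds=10):
    """Play an iterated game; optionally condition player 1 on an emotion."""
    history, scores = [], [0, 0]
    for _ in range(rounds):
        base = f"Iterated Prisoner's Dilemma. History so far: {history}. Reply C or D."
        m1 = query_llm((f"You feel {emotion}. " if emotion else "") + base)
        m2 = query_llm(base)
        history.append((m1, m2))
        scores[0] += PAYOFFS[(m1, m2)]
        scores[1] += PAYOFFS[(m2, m1)]
    coop_rate = sum(m == "C" for m, _ in history) / rounds
    return scores, coop_rate

scores, coop = run_episode(emotion="anger")
print(f"scores={scores}, player-1 cooperation rate={coop:.2f}")
```

In an actual experiment the stub would be replaced by a human participant or an LLM endpoint, yielding the Human-Human, LLM-LLM, and Human-LLM conditions, with cooperation rates compared across emotional states.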
Main Tasks:
Gather and analyze data from Human-Human, LLM-LLM, and Human-LLM experiments using the HL-EAI framework.
Develop an alignment measurement methodology (see the sketch after this list).
Study emergent behavioural patterns (e.g. manipulation, bluffing).
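As one candidate for the alignment measurement task (a sketch of ours, not the finalized methodology), the snippet below scores how closely an LLM's action distribution in a game matches the human baseline using the Jensen-Shannon divergence, which with log base 2 is bounded in [0, 1]. The action categories and frequencies shown are hypothetical.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2, so the result lies in [0, 1])."""
    p = np.asarray(p, dtype=float); p /= p.sum()
    q = np.asarray(q, dtype=float); q /= q.sum()
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0  # skip zero-probability terms, where a*log(a/b) -> 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical action frequencies over (cooperate, defect, bluff):
human_baseline = [0.55, 0.35, 0.10]   # e.g. from psychological studies
llm_observed   = [0.70, 0.25, 0.05]   # e.g. from LLM-LLM experiments

alignment_score = 1.0 - js_divergence(human_baseline, llm_observed)
print(f"alignment score: {alignment_score:.3f}")  # 1.0 = perfect match
```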
Participants
Ilya Makarov (Team Lead)
Mikhail Mozikov (ML Researcher)