Tune in for the live stream on YouTube or Twitter.
Abstract: Large language models (LLMs) are impressive. But do they have coherent "beliefs" about the world? And do they form anything like a mental model when reasoning? In the first part of this talk, I'll describe our explorations into these questions, using a set of question-answering probes to map out an LLM's latent knowledge as "belief graphs". We find that the LLM's world view is only partially coherent and often contains blatant inconsistencies. Taking this further, I'll describe how we can assemble these model-generated fragments into more coherent representations, creating more consistent views of the world. Finally, I'll describe how similar techniques can reveal what the model views as valid reasoning steps, how model-believed explanations can be assembled from them, and how users can "teach" the system when such explanations contain errors. Together, these suggest a new style of system architecture in which a cognitive layer is added on top of the language model, overcoming some of the latent inconsistencies present in the LLM alone.
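To give a flavor of the kind of probing the abstract alludes to, here is a minimal, purely illustrative sketch. It assumes a hypothetical `probe_yes_no` function that asks the model a true/false question and returns a confidence; the consistency check over (premise, conclusion) edges is a simplified stand-in for the belief-graph analysis described in the talk, not the authors' actual system.

```python
# Illustrative sketch only: a toy "belief graph" consistency check.
# probe_yes_no is a hypothetical stand-in for querying an LLM with a
# true/false question and reading off its confidence that the answer is "yes".

from typing import Callable, Dict, List, Tuple


def find_inconsistencies(
    probe_yes_no: Callable[[str], float],   # hypothetical: returns P("yes") for a statement
    statements: List[str],                  # nodes of the belief graph
    implications: List[Tuple[str, str]],    # (premise, conclusion) edges
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Return implication edges that the model's own answers violate."""
    beliefs: Dict[str, bool] = {s: probe_yes_no(s) >= threshold for s in statements}
    violated = []
    for premise, conclusion in implications:
        # An edge premise -> conclusion is violated when the model
        # asserts the premise but rejects the conclusion.
        if beliefs[premise] and not beliefs[conclusion]:
            violated.append((premise, conclusion))
    return violated
```

Collecting such violated edges over many probes is one way to surface the latent inconsistencies the talk discusses; a cognitive layer on top of the model could then resolve them into a more consistent overall view.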