Collins Conference Room
Seminar
  US Mountain Time

Our campus is closed to the public for this event.

Ishanu Chattopadhyay (University of Chicago)

The classical scientific method proceeds by making hypotheses, which are subsequently validated against empirical evidence. Despite the prevalence of helpful computational tools, the fundamental step of forming new hypotheses has always been ultimately driven by human insight. It has been the human scientist at center stage doing the “science”, with machines assisting by running set protocols and carrying out routine calculations. The notion of automating scientific discovery is based on the possibility of reversing these roles: we ask whether it is possible for unsupervised machines to duplicate human intuition. Can machines go beyond simple predictions and superficial correlations, and distill new scientific insight from data? In other words, can we automate the scientific method? In this talk, I will attempt to make the case in light of new breakthroughs in automated reasoning, zero-knowledge inference, and new computable metrics for universal similarity and statistical causality.

● Universal Similarity: Central to scientific inquiry is the ability to compare and contrast data. The discriminating characteristics to look for are typically determined by expert-designed heuristics, e.g., the shapes of “folded” light curves used as features to classify stars, or the Fourier coefficients of brainwaves used to identify anomalies. Finding good features is nontrivial and driven by insight. We propose a universal solution: quantifying similarity between data sources (some restrictions apply) without a priori knowledge, features, or training. Uncovering a universal algebraic structure on a space of symbolic stochastic models for quantized data, we show that such generators may be added and uniquely inverted, and that a model and its inverse always sum to the generator of flat white noise. Thus, every data stream has an anti-stream: data generated by the inverse model. Similarity between two streams, then, is the degree to which one, when summed with the other’s anti-stream, mutually annihilates all statistical structure to noise.
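The annihilation idea can be sketched for the simplest possible case. This is a toy illustration only, not the speaker's actual construction (which operates on probabilistic automata, not i.i.d. sources): here a "generator" is just a probability vector over symbols, the "sum" of two generators multiplies their probabilities and renormalizes, and the inverse is proportional to the reciprocal, so a model summed with its own inverse is exactly flat.

```python
import numpy as np

def stream_sum(p, q):
    # Toy "sum" of two i.i.d. symbol generators:
    # elementwise product of probabilities, renormalized.
    r = p * q
    return r / r.sum()

def inverse(p):
    # Toy inverse generator: proportional to 1/p, so that
    # stream_sum(p, inverse(p)) is the flat (white-noise) distribution.
    r = 1.0 / p
    return r / r.sum()

def similarity(p, q):
    # Degree to which p, summed with q's anti-stream, collapses to
    # flat noise: 1 minus the total-variation distance from uniform.
    s = stream_sum(p, inverse(q))
    u = np.full_like(s, 1.0 / len(s))
    return 1.0 - 0.5 * np.abs(s - u).sum()

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.69, 0.21, 0.10])
r = np.array([0.1, 0.3, 0.6])

print(similarity(p, p))  # identical sources annihilate to flat noise (maximal similarity)
print(similarity(p, q))  # near-identical sources: close to 1
print(similarity(p, r))  # dissimilar sources: residual structure, lower score
```

In the full framework the same algebra lives on inferred stochastic models of entire quantized data streams, so no features need to be chosen; this sketch only shows why summing against an anti-stream measures similarity.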

● Statistical Causality: While correlation is widely used to discern statistical relationships, we are really interested in causal dependence. Designing a causality test that can be carried out in the absence of restrictive presuppositions on the dynamical structure of the data is nontrivial. I present a new nonparametric test of Granger causality for quantized data from ergodic stationary sources. In contrast to the state of the art, this approach makes precise and computes the degree of causal dependence between streams, without making restrictive assumptions, linearity or otherwise. Additionally, without any a priori imposition of a specific dynamical structure, we infer explicit generative models of causal cross-dependence, represented as generalized probabilistic automata.
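The Granger principle behind such a test can be illustrated with a minimal counting-based sketch for symbolic streams. This is an assumed, simplified formulation, not the speaker's test and not the inferred probabilistic automata: a stream x is scored as Granger-causing y to the degree that y's next symbol becomes more predictable (lower empirical conditional entropy) when x's past is added to y's own past.

```python
import random
from collections import Counter
from math import log2

def cond_entropy(pairs):
    # Empirical H(next | context) from a list of (context, next-symbol) pairs.
    ctx = Counter(c for c, _ in pairs)
    joint = Counter(pairs)
    n = len(pairs)
    return -sum(cnt / n * log2(cnt / ctx[c]) for (c, _), cnt in joint.items())

def granger_score(x, y, lag=1):
    # Reduction in uncertainty about y[t] from conditioning on x's past
    # in addition to y's own past (in bits; ~0 means no detected influence).
    own  = [(tuple(y[t - lag:t]),                y[t]) for t in range(lag, len(y))]
    both = [(tuple(y[t - lag:t] + x[t - lag:t]), y[t]) for t in range(lag, len(y))]
    return cond_entropy(own) - cond_entropy(both)

random.seed(0)
x = [random.randint(0, 1) for _ in range(5000)]  # a fair-coin source
y = [0] + x[:-1]                                 # y copies x with one-step delay: x drives y
z = [random.randint(0, 1) for _ in range(5000)]  # an independent source

print(granger_score(x, y))  # close to 1 bit: x's past determines y's next symbol
print(granger_score(z, y))  # near zero: z carries no information about y
```

A working test would additionally need a significance threshold (e.g. against shuffled surrogates), since finite samples always yield a small positive score; the nonparametric framework in the talk goes further by returning the cross-dependence model itself.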

Purpose: Research Collaboration
SFI Host: David Wolpert
