Symposium on AI in Contemporary and Future Science

We are organizing a symposium at the British Society for the Philosophy of Science Annual Meeting, July 5–7 in Bristol, UK.

AI techniques are increasingly used in science, with striking results. Yet philosophers of science are only beginning to grapple with this development, both as it pertains to contemporary science and in its ramifications for future science. This symposium will bring together philosophers of science and working scientists to address some of these issues. It will pay particular attention to how AI techniques are used in various scientific subdisciplines, including radio astronomy and gravitational-wave astrophysics, and to the resulting challenges for future science in these areas. Topics to be considered include supervised vs unsupervised AI techniques for scientific discovery, how AI is being used to complement and supplant citizen scientists, and parallels between ML and traditional modelling techniques.

Idealization in ML and xAI

Interpretability and xAI methods are important for establishing trust in black-box models. Recently, however, criticism has mounted against current xAI methods: their explanations disagree with one another, are necessarily false, and can be manipulated, which has begun to undermine the deployment of black-box models. Rudin (2019) goes so far as to say that we should stop using black-box models altogether in high-stakes cases because xAI explanations ‘must be wrong’. However, strict fidelity to the truth is historically not a desideratum in science. Idealizations, the intentional distortions introduced into scientific theories and models, are commonplace in the natural sciences and are regarded as a successful scientific tool. Thus, it is not falsehood qua falsehood that is the issue. In this talk, I outline the need for ML and xAI research to engage in idealization evaluation. I discuss where current research can help with idealization evaluation and where innovation is necessary. I address questions surrounding how idealizations in highly idealized scientific models differ from those deployed in ML, and how ML idealizations can aid scientific inquiry.
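To fix ideas, here is a minimal sketch of one family of xAI methods under discussion: a local linear surrogate fitted to a black-box model’s predictions, in the spirit of LIME. The dataset, model, and kernel width below are illustrative placeholders, not any particular deployed system; the point is that the surrogate is deliberately false to the model’s true decision surface, i.e. an idealization.

```python
# A minimal sketch of a local surrogate explanation (LIME-style), assuming a
# generic tabular black-box model; all names and values here are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0]  # the instance to be explained
# Perturb the instance and query the black box in its neighbourhood.
Z = x0 + np.random.default_rng(0).normal(scale=0.3, size=(1000, X.shape[1]))
p = black_box.predict_proba(Z)[:, 1]
# Weight perturbed points by proximity to x0 (itself an idealizing choice).
w = np.exp(-np.linalg.norm(Z - x0, axis=1) ** 2)
# The linear surrogate is deliberately false to the model's true surface:
# it trades fidelity for interpretability, like an idealized scientific model.
surrogate = Ridge(alpha=1.0).fit(Z, p, sample_weight=w)
print("local feature attributions:", surrogate.coef_)
```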


Deep Learning Robustness for Scientific Discovery: The Case of Anomaly Detection

Machine Learning (ML) techniques such as Deep Neural Networks (DNNs) hold great promise for science today. In High Energy Physics in particular, they are expected to foster scientific discovery through the detection of anomalies, without reliance on any specific theory or model. Anomalies, in turn, have long been recognized as a major driving force of science: Kuhn ([1970]) held them responsible for paradigm shifts, Lakatos ([1970], [1976]), in severe cases, for the abandonment of a theory’s hard core, Laudan ([1977]) for the establishment of a preference order among rival theories, and even recent proponents such as de Regt ([2020]) for the advancement of science through an increase of understanding. However, DNNs also have astonishing shortcomings, as they are vulnerable to ‘adversarial examples’: data instances that are easily classifiable for humans but totally misclassified by DNNs. Adversarial vulnerability is a double-edged sword. On the one hand, it shows that discerning DNNs’ credible outputs from flukes requires some skill. On the other hand, adversarials exhibit DNNs’ sensitivity to subtle, often humanly inscrutable features that could also be scientifically productive (Buckner [2020]). Such features are, in fact, being utilised in anomaly detection. Against this backdrop, I offer in this talk an analysis of, and a cautionary tale about, DNNs’ present utility for scientific discovery. To do so, I will introduce a notion of performance robustness, which DNNs need to satisfy in order to deliver genuine discoveries. Furthermore, I will argue that the achievement of performance robustness often, if not always, implies limitations to fully ML-driven discovery.
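For readers unfamiliar with adversarial examples, the sketch below shows the standard fast gradient sign method (FGSM) for constructing one; the toy network, input, and epsilon are placeholders, not any system used in High Energy Physics.

```python
# A minimal FGSM sketch: nudge each input dimension slightly in the direction
# that increases the loss. The tiny network and random input are stand-ins.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 28 * 28, requires_grad=True)  # stand-in input image
y = torch.tensor([3])                           # its true label

loss = loss_fn(model(x), y)
loss.backward()
# A perturbation this small is imperceptible to humans, yet is often
# enough to flip the DNN's prediction.
epsilon = 0.1
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
print(model(x).argmax(1), "->", model(x_adv).argmax(1))
```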

Astronomical image interpretation by academics, citizen scientists and AI

LOFAR is currently the world’s largest ground-based low-frequency radio telescope, with stations situated across Europe. The distance between stations gives ‘baselines’ of up to 2000 km, allowing LOFAR to create high-resolution radio maps of the whole northern sky. The LOFAR Two-metre Sky Survey (LoTSS) has mapped more than 4.4 million radio sources, making it the largest radio survey carried out to date. Most of these sources originate from distinct radio-loud galaxies, or ‘hosts’, with active galactic nuclei (AGN). However, the radio structures can extend up to a million light years (hundreds of kiloparsecs or more) beyond the host galaxy. Correctly associating emission with its host galaxy is essential to understanding the physics both of the AGN itself and of galaxy evolution, but is very challenging for these large, complex sources.

In the current approach to dealing with these sources, AI and citizen scientists make fundamental contributions to the generation of the final dataset alongside professional astronomers. Users are asked to select the most likely host and its relationship to the radio emission based only on 2D images. Thus what counts as ‘the most likely host’, and its relationship to the radio emission, becomes an epistemological problem shaped by the prior knowledge of the agent making the decision, whether full-time radio astronomer, citizen scientist, or AI algorithm. One example is giant radio galaxies, where almost completely different subsets are found by astronomers, citizen scientists and AI. We will discuss some of the issues involved in generating a homogeneous catalogue from disparate sources of knowledge, and how AI can help or hinder progress in this area.
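As a toy illustration of the association step at issue, the sketch below performs a naive nearest-neighbour cross-match with astropy; the coordinates are invented, and real LoTSS association must handle extended, multi-component emission, which is precisely where astronomers, citizen scientists, and AI come to disagree.

```python
# Minimal sketch of naive nearest-neighbour host association with astropy.
# Coordinates are invented; real pipelines go well beyond angular proximity.
import astropy.units as u
from astropy.coordinates import SkyCoord

radio = SkyCoord(ra=[150.10, 150.45] * u.deg, dec=[2.20, 2.31] * u.deg)
optical = SkyCoord(ra=[150.11, 150.40, 150.46] * u.deg,
                   dec=[2.21, 2.25, 2.30] * u.deg)

# For each radio source, find the nearest optical candidate on the sky.
idx, sep2d, _ = radio.match_to_catalog_sky(optical)
for i, (j, sep) in enumerate(zip(idx, sep2d)):
    print(f"radio source {i} -> optical host {j} ({sep.arcsec:.1f} arcsec)")
```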

Machine learning and the problem of noise-dominated measurement

The detection of gravitational waves is among the most striking scientific successes of recent years (Abbott et al. 2016). One of the primary challenges faced by researchers attempting to detect a gravitational waveform is the inherently noisy nature of the available data. Researchers employ a variety of techniques to extract a target signal from the noisy background. These techniques are informed by background physical theory, which provides crucial information about the expected shape of a target signal.
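A minimal sketch of one such template-based technique, matched filtering, is given below; the sinusoidal ‘chirp’ is a toy stand-in for a theoretically modelled waveform, and real pipelines whiten the data and search over large template banks.

```python
# Minimal sketch of matched filtering: correlating noisy data against a
# theory-supplied template. The 'chirp' is a toy stand-in for a real waveform.
import numpy as np
from scipy.signal import correlate

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 4096)
template = np.sin(2 * np.pi * (20 + 80 * t) * t) * np.hanning(t.size)  # toy chirp

data = rng.normal(scale=2.0, size=16384)         # noise-dominated record
data[6000:6000 + t.size] += 0.5 * template       # signal buried in the noise

score = correlate(data, template, mode="valid")  # slide template over data
score /= np.sqrt(np.sum(template ** 2))          # normalise
print("best-fit offset:", int(np.argmax(np.abs(score))))  # recovers ~6000
```

A buried signal with a different morphology from the template would produce no clear peak here, which is the crux of the challenge discussed next.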

Yet this way of dealing with noisy data raises challenges for the prospect of novel discoveries and breakthrough science. In particular, it becomes difficult to see how one might identify novel phenomena in such data given the role of current theory in filtering out the noise. We call this challenge “the problem of noise-dominated measurement”. This paper investigates one aspect of the problem of noise-dominated measurement in more detail: the use of machine learning (ML) techniques to detect a target signal. 

First, we argue that the use of such techniques exacerbates the problem of noise-dominated measurement. ML systems are often trained, via supervised learning regimens, to look for quite specific signals. Accordingly, such systems are not in a position to distinguish novel and potentially interesting signals from background noise. Second, we consider whether unsupervised learning methods might help to address the problem. Our assessment is mixed. On the one hand, unsupervised methods are generally able to identify novel structures and patterns in data. On the other hand, it is generally difficult to interpret the outputs of such methods without an appropriate conceptual framework (Boge 2021, Kieval forthcoming). Yet such frameworks are typically absent in breakthrough science. Whether machine learning can provide a way forward thus remains to be seen.
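By way of illustration, the sketch below applies an isolation forest, one representative unsupervised method, to synthetic data; it flags the outlying points but, as the code makes plain, says nothing about what, physically, those points are.

```python
# Minimal sketch of unsupervised anomaly detection with an isolation forest:
# the method flags outliers without any labelled target, but interpreting
# them still requires a conceptual framework the method does not supply.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
background = rng.normal(size=(1000, 2))        # 'ordinary' events
oddities = rng.normal(loc=5.0, size=(10, 2))   # a novel cluster
X = np.vstack([background, oddities])

labels = IsolationForest(random_state=0).fit_predict(X)  # -1 marks anomalies
print("flagged as anomalous:", np.where(labels == -1)[0])
```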