
Data Skeptic

Kyle Polich
577 episodes   Last Updated: Jun 14, 25
The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.

Episodes

In this episode, Kyle gives an overview of the intersection of graph theory and computational complexity theory. In complexity theory, we care about how the runtime of an algorithm grows with its input size. For many graph problems, the interesting questions we want to ask take longer and longer to answer as the graph grows! This episode provides the fundamental vocabulary and signposts for exploring that intersection.
Jun 01, 2025
Actantial Networks
In this episode, listeners will learn about Actantial Networks: graph-based representations of narratives where nodes are actors (such as people, institutions, or abstract entities) and edges represent the actions or relationships between them. Our guest is Armin Pournaki, a joint PhD candidate at the Max Planck Institute for Mathematics in the Sciences and the Laboratoire Lattice (ENS-PSL). He specializes in computational social science, where he develops methods to extract and analyze political narratives using natural language processing and network science. Armin explains how these methods can expose conflicting narratives around the same events, as seen in debates on COVID-19, climate change, or the war in Ukraine. Listeners will also discover how this approach helps make large-scale discourse, from millions of tweets to political speeches, more transparent and interpretable, offering tools for studying polarization, issue alignment, and narrative-driven persuasion in digital societies.
Follow our guest: Armin Pournaki's Webpage, Twitter/X, Bluesky
Papers in focus: How influencers and multipliers drive polarization and issue alignment on Twitter/X, 2025; A graph-based approach to extracting narrative signals from public discourse, 2024
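The node-and-edge structure described above can be sketched in a few lines. This is only an illustration: the triples are made up for the example, and the helper names `build_actantial_network` and `incoming_actions` are hypothetical, not taken from the guest's papers. In practice the triples would be extracted from text with an NLP pipeline, which is not shown.

```python
from collections import defaultdict

# Hypothetical (actor, action, target) triples, invented for illustration.
triples = [
    ("government", "imposes", "lockdown"),
    ("scientists", "recommend", "lockdown"),
    ("protesters", "oppose", "lockdown"),
    ("government", "funds", "scientists"),
]


def build_actantial_network(triples):
    """Return {actor: [(action, target), ...]}, a directed graph with labeled edges."""
    graph = defaultdict(list)
    for actor, action, target in triples:
        graph[actor].append((action, target))
    return dict(graph)


def incoming_actions(triples, node):
    """List the (actor, action) pairs whose edge points at `node`."""
    return [(actor, action) for actor, action, target in triples if target == node]


network = build_actantial_network(triples)
stances_on_lockdown = incoming_actions(triples, "lockdown")
```

Listing the incoming edges of one node, as `incoming_actions` does, is one simple way to see which actors take conflicting stances on the same entity.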
How can we build artificial intelligence systems that understand cause and effect, moving beyond simple correlations? As we all know, correlation is not causation. "Spurious correlations" can show, for example, how rising ice cream sales might statistically link to more drownings, not because one causes the other, but due to an unobserved common cause like warm weather. Our guest, Utkarshani Jaimini, a researcher from the University of South Carolina's Artificial Intelligence Institute, tackles this problem by using knowledge graphs that incorporate domain expertise. Knowledge graphs (structured representations of information) are combined with neural networks in the field of neurosymbolic AI to represent and reason about complex relationships. This involves creating causal ontologies and incorporating the "weight" (strength) of causal relationships as well as hyper-relations. This field has many practical applications, such as AI explainability, healthcare, and autonomous driving.
Follow our guest: Utkarshani Jaimini's Webpage, LinkedIn
Papers in focus: CausalLP: Learning causal relations with weighted knowledge graph link prediction, 2024; HyperCausalLP: Causal Link Prediction using Hyper-Relational Knowledge Graph, 2024
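The ice cream and drownings example can be reproduced with a small simulation. This is a sketch on synthetic data, and every coefficient below is an arbitrary assumption: temperature drives both series, so they correlate strongly, but the correlation largely vanishes once temperature is regressed out of each. It illustrates the confounding idea only, not the knowledge-graph methods discussed in the episode.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """Residuals of y after a least-squares linear regression on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
# Synthetic data: temperature is the common cause of both series.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.1 * t + rng.gauss(0, 0.5) for t in temperature]

raw = pearson(ice_cream, drownings)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(drownings, temperature))
print(f"raw correlation: {raw:.2f}, after controlling for temperature: {adjusted:.2f}")
```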
May 16, 2025
Power Networks
In this episode we talk with Manita Pote, a PhD student at Indiana University Bloomington, specializing in online trust and safety, with a focus on detecting coordinated manipulation campaigns on social media. Key insights include how coordinated reply attacks target influential figures like journalists and politicians, how machine learning models can detect these inauthentic campaigns using structural and behavioral features, and how deletion patterns reveal efforts to evade moderation or manipulate engagement metrics.
Follow our guest: X/Twitter, Google Scholar
Papers in focus: Coordinated Reply Attacks in Influence Operations: Characterization and Detection, 2025; Manipulating Twitter through Deletions, 2022
Kyle discusses the history and proof for the small world hypothesis.
Kyle asks Asaf questions about the new network science course he is now teaching.  The conversation delves into topics such as contact tracing, tools for analyzing networks, example use cases, and the importance of thinking in networks.
Apr 01, 2025
Fraud Networks
In this episode we talk with Bavo DC Campo, a data scientist and statistician, who shares his expertise on the intersection of actuarial science, fraud detection, and social network analytics. Together we will learn how to use graphs to fight insurance fraud by uncovering hidden connections between fraudulent claims and bad actors. Key insights include how social network analytics can detect fraud rings by mapping relationships between policyholders, claims, and service providers, and how the BiRank algorithm, inspired by Google’s PageRank, helps rank suspicious claims based on network structure. Bavo will also present his iFraud simulator, which can be used to model fraudulent networks for detection training purposes. Do you have a question about fraud detection? Bavo says he will gladly help. Feel free to contact him.
-------------------------------
Want to listen ad-free? Try our Graphs Course? Join Data Skeptic+ for $5 / month or $50 / year https://plus.dataskeptic.com
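To give a feel for how the network-based ranking of claims mentioned in the Fraud Networks episode can work, here is a minimal BiRank-style sketch in plain Python. It is a simplified illustration under stated assumptions, not the guest's implementation: the claim and party names are invented, the function name `birank_style_scores` and the `alpha`/`beta` defaults are hypothetical, and the update rule is the commonly described one (symmetrically normalized edge weights, with scores pulled back toward a query vector that marks known fraudulent claims).

```python
import math

# Hypothetical bipartite edges (claim, party); a party could be a garage or broker.
edges = [
    ("c1", "garage_A"), ("c2", "garage_A"), ("c3", "garage_A"),
    ("c3", "broker_B"), ("c4", "broker_B"), ("c5", "garage_C"),
]
known_fraud = {"c1"}  # assumed label: c1 is a confirmed fraudulent claim


def birank_style_scores(edges, known_fraud, alpha=0.85, beta=1.0, iters=100):
    """Propagate suspicion from known fraudulent claims over a bipartite graph."""
    claims = sorted({c for c, _ in edges})
    parties = sorted({p for _, p in edges})
    deg_c = {c: 0 for c in claims}
    deg_p = {p: 0 for p in parties}
    for c, p in edges:
        deg_c[c] += 1
        deg_p[p] += 1
    # Symmetric normalization: each edge is weighted by 1 / sqrt(deg_c * deg_p).
    weight = {(c, p): 1.0 / math.sqrt(deg_c[c] * deg_p[p]) for c, p in edges}
    # Query vector: mass spread uniformly over the known fraudulent claims.
    query = {c: (1.0 / len(known_fraud) if c in known_fraud else 0.0) for c in claims}

    claim_score = dict(query)
    party_score = {p: 0.0 for p in parties}
    for _ in range(iters):
        # Parties collect score from their claims (parties have no prior here).
        gathered = {p: 0.0 for p in parties}
        for (c, p), w in weight.items():
            gathered[p] += w * claim_score[c]
        party_score = {p: beta * gathered[p] for p in parties}
        # Claims collect score from their parties, blended with the query vector.
        gathered = {c: 0.0 for c in claims}
        for (c, p), w in weight.items():
            gathered[c] += w * party_score[p]
        claim_score = {c: alpha * gathered[c] + (1 - alpha) * query[c] for c in claims}
    return claim_score, party_score


claim_score, party_score = birank_style_scores(edges, known_fraud)
ranking = sorted(claim_score, key=claim_score.get, reverse=True)
```

In this toy graph, claims that share a party with the known fraudulent claim pick up suspicion, while a claim in a disconnected component keeps a score of zero.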
Mar 17, 2025
Criminal Networks
In this episode we talk with Justin Wang Ngai Yeung, a PhD candidate at the Network Science Institute at Northeastern University in London, who explores how network science helps uncover criminal networks. Justin is also a member of the organizing committee of the satellite conference dealing with criminal networks at the network science conference in The Netherlands in June 2025. Listeners will learn how graph-based models assist law enforcement in analyzing missing data, identifying key figures in criminal organizations, and improving intervention strategies. Key insights include the challenges of incomplete and inaccurate data in criminal network analysis, how law enforcement agencies use network dismantling techniques to disrupt organized crime, and the role of machine learning in predicting hidden connections within illicit networks.