This course sits at the intersection of Data Science and Artificial Intelligence.
Its primary objectives are for students to:
- become fluent in the language of “data science and AI”,
- learn how to effectively communicate uncertainty,
- cultivate the ability to craft compelling narratives grounded in data,
- gain an introduction to data analytics for informed decision-making in uncertain environments,
- develop skills in analyzing and exploring large datasets,
- build and interpret (predictive) models with confidence.
Designed for students preparing for careers in data-driven environments, the course emphasizes practical concepts and tools commonly used by data scientists in business contexts. Rather than
focusing on coding, the course prioritizes data storytelling: the ability to interpret, analyze, and communicate insights from data.
Each lecture features the analysis of two to three real-world datasets, demonstrated live in class. Examples include consumer database mining, internet and social media tracking, asset pricing,
network analysis, sports analytics, and text mining.
The curriculum spans topics from classical statistics (e.g., hypothesis-driven decisions) and data science (e.g., dimensionality reduction) to modern machine learning techniques (e.g., deep learning). It
also explores cutting-edge advancements in generative AI. The course places particular emphasis on the analysis of text data in the context of both small and Large Language Models (LLMs) that
form the basis of popular text-generating systems. Techniques covered include large-scale testing and false discovery rates, modern regression and model choice, machine-learning-based classification,
network analysis, language and topic models, principal components, clustering, Bayesian analysis, deep learning, transformers, and attention.
By the end of the course, students will be equipped to perform machine-supported intelligent data analysis and communicate findings effectively.