Syllabus

This is a high-level description of the content units that will be covered in the course:

i) Introduction to the course (1 lecture): We first cover course specifics and logistics: office hours, homework, final project, bibliography, etc. We then introduce the notion of network and present a general notion of Network Science as a cross-disciplinary field. We will motivate this via several examples and highlight the associated challenges.

ii) Introduction to graph theory (2 lectures): We cover basic notions and definitions related to undirected and directed graphs: vertices, edges, simple graphs, weighted graphs, neighborhoods, degree, path, cycle. Also, we introduce concepts from algebraic and spectral graph theory, such as adjacency, incidence, and Laplacian matrices.

iii) Centrality measures (2 lectures): We introduce the notion of node centrality as a way to measure the importance of a node within the network. We compare classical measures such as degree, closeness, betweenness, eigenvector, and Katz centrality.  We then continue with link analysis algorithms for web search and describe the PageRank algorithm for ranking nodes in graphs in the context of random walks.

iv) Community detection (3 lectures): We first introduce the notions of cohesive subgroups, clustering coefficient, homophily, and modularity. We then discuss the famous `strength of weak ties’ hypothesis from sociology and its relation with communities in graphs. We then formulate the problem of community detection as a graph partitioning problem and study several algorithms in detail including spectral and hierarchical clustering.

v) Network models (4 lectures): We discuss the notions of degree distributions, power laws, and popularity in networks. These naturally lead to the definition of important classes of network graph models: Erdös-Rényi, small-world, and preferential attachment. The emphasis is on model construction, simulation, and inference of model parameters.

vi) Inference of network topologies and features (5 lectures): We first introduce the fundamental problem of estimating a network summary characteristic from sampled graph data, and present the Horvitz-Thompson estimator as well as the capture-recapture method.  Several network graph sampling designs are then discussed. We then analyze in quite detail the problem of network topology inference, where a portion of the network is unobserved and we wish to infer it from measurements.

vii) Inference of network processes (2 lectures): We present the problem of prediction static processes defined on networks. We start with the simple non-parametric method of nearest-neighbor prediction, and then transition to model-based approaches that describe vertex attributes in terms of the given network graph structure. Specifically, we study Markov random field models and graph-kernel regressions.

viii) Linear dynamics and diffusion in networks (2 lectures): We introduce models for the evolution of opinions in networks, and study their equilibria. We analyze how algebraic properties such as Cheeger’s inequality relates to the speed of information spread in networks. Applications in social agreement and synchronization will be discussed.

ix) Epidemics and contagion (2 lectures): We study the spread of epidemics in networks. Starting from branching processes and classical epidemiology we then study the network-based SI, SIR, and SIS models of diffusion. We analyze transient properties of these processes as well as the problem of inferring the source of an epidemic spread.

x) Analysis of flow data (2 lectures): We present a general framework to study flow through the edges of a network. We introduce the fundamental problem of traffic estimation, where the goal is to recover unobserved traffic matrix entries from link totals, and provide statistical methods for traffic estimation.  We conclude with the problem of modeling and inference of network cost parameters.