Course Detail (Course Description By Faculty)

Big Data (41201)

BUSN 41201 is a course about data mining: the analysis, exploration, and simplification of large, high-dimensional datasets. Students will learn how to model and interpret complicated "Big Data" and become adept at building powerful models for prediction and classification.

Techniques covered include an advanced overview of linear and logistic regression, model choice and false discovery rates, multinomial and binary regression, classification, decision trees, factor models, clustering, the bootstrap, and cross-validation. We learn both the basic underlying concepts and practical computational skills, including techniques for the analysis of distributed data.

Heavy emphasis is placed on the analysis of actual datasets and on the development of application-specific methodology. Among other examples, we will consider consumer database mining, internet and social media tracking, network analysis, and text mining.

Prerequisites: BUSN 41000 (or 41100). Students cannot enroll in BUSN 41201 if they have previously taken BUSN 20800.
This course will have a Canvas site.
Individual: 30% take-home midterm exam

Group: 30% weekly homework, 40% final project

  • Allow Provisional Grades (For joint degree and non-Booth students only)
Description and/or course criteria last updated: March 15, 2024
SCHEDULE
  • Spring 2024
    Section: 41201-01
    T 1:30 PM-4:30 PM
    Harper Center
    C01
    In-Person Only
  • Spring 2024
    Section: 41201-02
    W 1:30 PM-4:30 PM
    Harper Center
    C01
    In-Person Only
  • Spring 2024
    Section: 41201-81
    T 6:00 PM-9:00 PM
    Gleacher Center
    406
    In-Person Only

Big Data (41201) - Rockova, Veronika
