2020: Cancer Genomics and Class Discovery [MATH891.002]



Course title: Cancer Genomics and Class Discovery
Course Offering: MATH 891.002
Instructors: Chuck Perou, Katie Hoadley, Steve Marron, Andrew Nobel, Joel Parker, and Greg Forest
Course description: The objectives of this course are to:

  • Gain an understanding of the genomic data types being collected on human tumors
  • Learn about the basics of RNA and DNA NGS sequence analysis
  • Learn about pattern discovery tools including hierarchical clustering and biclustering
  • Understand the challenges of integrating heterogeneous data types
  • Learn from real-world examples of complex data integration for class discovery

Course expectations:

  • Group project with hands-on analysis of genomics data
  • Final report (3 pages)
Prerequisites: Current enrollment in a graduate program of any participating department in the NIH T32 BD2K graduate training program
Course Dates: Monday, Feb 3 – March 30
Meeting pattern: TBA
Location: MEJ 3116, 5pm – 7:15pm
Syllabus:

Feb 3, 2020 – Lecture 1 (Chuck Perou and Katie Hoadley)

Feb 10, 2020 – Lecture 2 (Andrew Nobel)


Feb 17, 2020 – Lecture 3 (Steve Marron)

  • Lecture
  • Reading Material:
    • Liu, Y., Hayes, D. N., Nobel, A., & Marron, J. S. (2008). Statistical significance of clustering for high-dimension, low–sample size data. Journal of the American Statistical Association, 103(483), 1281-1293.
    • Huang, H., Liu, Y., Yuan, M., & Marron, J. S. (2015). Statistical significance of clustering using soft thresholding. Journal of Computational and Graphical Statistics, 24(4), 975-993.


Feb 24, 2020 – In-Class Presentation of Project Proposal

March 2, 2020 – Lecture 4 (Joel Parker)


  • Plan/work on projects and email instructors for questions/advice as needed

March 16, 2020 – Class Cancelled

March 23, 2020 – Lecture 5 (Katie Hoadley)



March 30, 2020 – In-class presentations