One of the broad goals of data science is to examine raw data in order to identify its structure and trends and to derive conclusions and hypotheses from it. In a modern world awash with data, data analytics is more important than ever in fields ranging from biomedical research, space and weather science, finance, and business operations and production to marketing and social media applications. This course introduces various statistical learning methods and their applications. The R programming language, a very popular and powerful platform for scientific and statistical analysis and visualization, is introduced and used throughout the course. We discuss the fundamentals of statistical testing and learning, and cover linear and non-linear regression, clustering and classification, support vector machines, and decision trees. The datasets used in the examples are drawn from diverse domains such as finance, genomics, and customer sales and survey data.
Scientist IV, Head of Bioinformatics, Cystic Fibrosis Foundation Therapeutics Lab
Andrey Sivachenko
Dr. Andrey Sivachenko was trained in theoretical physics at the Moscow Institute of Physics and Technology and then did his PhD work in theoretical chemical physics at the Weizmann Institute of Science, Israel. After a postdoctoral research appointment at the Department of Physics at the University of Utah, he took an industry job in 2001 as a computational biologist. Sivachenko has worked ever since in bioinformatics and computational biology, applying computational tools and statistical analysis methods to biological data, and developing algorithms and software. He has done his research both in the private sector and in academic or non-profit settings (such as the Broad Institute and the Cystic Fibrosis Foundation) and has had the chance (and luck) to work on a diverse set of problems and projects involving basic research and drug discovery. Sivachenko is currently a co-leader of the Next Generation Sequencing group and head of bioinformatics for the Cystic Fibrosis Foundation Therapeutics Laboratory.
Executive Director, Data Sciences, Verve Therapeutics
Victor A. Farutin
Victor A. Farutin is a seasoned computational biology data scientist and team leader with over two decades of experience in the biotechnology and pharmaceutical industries, with emphasis on bioinformatics, non-clinical statistics, and translational research and development. His expertise spans epi-/genomics, glyco-/proteomics, pathway network analysis, high-throughput computing, and statistical modeling, with particular focus on leveraging these domains to empower multidisciplinary teams and projects at the interface between molecular characterization of biological systems by orthogonal technologies and translational research. Currently at Verve Therapeutics, a wholly owned subsidiary of Eli Lilly and Company, he leads the data sciences function. Previously, he held positions of increasing responsibility at Precede Biosciences, J&J, Momenta Pharmaceuticals, Pfizer, and Millennium Pharmaceuticals, where he started his industry career after obtaining a PhD in physics and math (biophysics) from the Moscow Institute of Physics and Technology and completing postdoctoral training at the University of Michigan.
Meeting Info
Th 8:10pm - 10:10pm (1/26 - 5/16)
Deadlines
Last day to register:
Additional Time Commitments
Optional sections to be arranged.
Prerequisites
Good programming skills, preferably in R, or solid experience in other languages; a good understanding of probability and statistics at the level of CSCI E-106 or STAT E-109. See the syllabus for the recommended pretest.
Notes
This course meets via web conference. Students may attend at the scheduled meeting time or watch recorded sessions asynchronously. Recorded sessions are typically available within a few hours of the end of class and no later than the following business day. See minimum technology requirements.