Machine Learning for Data Science and Analytics

Learn the principles of machine learning and the importance of algorithms.

Course Description

Machine learning is a growing field with applications in web search, ad placement, credit scoring, stock trading and many other areas.

This data science course is an introduction to machine learning and algorithms. You will develop a basic understanding of the principles of machine learning and derive practical solutions using predictive analytics. We will also examine why algorithms play an essential role in Big Data analysis.

Course Outcomes:
  • What machine learning is and how it is related to statistics and data analysis
  • How machine learning uses computer algorithms to search for patterns in data
  • How to use data patterns to make decisions and predictions with real-world examples from healthcare involving genomics and preterm birth
  • How to uncover hidden themes in large collections of documents using topic modeling
  • How to prepare data, deal with missing data and create custom data analysis solutions for different industries
  • Basic and frequently used algorithmic techniques including sorting, searching, greedy algorithms and dynamic programming 
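As a rough illustration of the algorithmic techniques named in the last outcome, here is a minimal Python sketch. It is not taken from the course materials: the function names and the coin-change example are assumptions chosen only to show sorting, searching, a greedy strategy and dynamic programming side by side.

```python
from bisect import bisect_left


def binary_search(sorted_items, target):
    """Searching: return the index of target in a sorted list, or -1 if absent."""
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1


def greedy_coin_count(coins, amount):
    """Greedy: always take the largest coin that still fits.

    Fast, but not guaranteed to use the fewest coins. Returns None if the
    greedy choices cannot reach the amount exactly.
    """
    count = 0
    for coin in sorted(coins, reverse=True):
        count += amount // coin
        amount %= coin
    return count if amount == 0 else None


def dp_coin_count(coins, amount):
    """Dynamic programming: fewest coins for each sub-amount from 0 to amount."""
    inf = float("inf")
    best = [0] + [inf] * amount  # best[s] = fewest coins that sum to s
    for sub in range(1, amount + 1):
        for coin in coins:
            if coin <= sub and best[sub - coin] + 1 < best[sub]:
                best[sub] = best[sub - coin] + 1
    return best[amount] if best[amount] != inf else None


values = sorted([42, 7, 19, 3, 25])     # sorting gives [3, 7, 19, 25, 42]
print(binary_search(values, 19))        # 2
print(greedy_coin_count([1, 3, 4], 6))  # 3 (4 + 1 + 1)
print(dp_coin_count([1, 3, 4], 6))      # 2 (3 + 3)
```

With the hypothetical coin set 1, 3 and 4, the greedy strategy needs three coins to make 6 while dynamic programming finds two, which hints at why the choice of algorithm matters.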
Course Details:

Prerequisites

  • High School Math
  • Some exposure to computer programming
About the Instructors:

Ansaf Salleb-Aouissi - Department of Computer Science

Ansaf is a Lecturer in Discipline in the Computer Science Department at the School of Engineering and Applied Science at Columbia University. She received her BS in Computer Science in 1996 from the University of Science and Technology (USTHB), Algeria. She earned her master's and Ph.D. degrees in Computer Science from the University of Orleans (France) in 1999 and 2003, respectively.


Cliff Stein - Professor of IEOR and of Computer Science

His research interests include the design and analysis of algorithms, combinatorial optimization, operations research, network algorithms, scheduling, algorithm engineering and computational biology. Professor Stein has published many influential papers in the leading conferences and journals in his field, and has held a variety of editorial positions at journals including ACM Transactions on Algorithms, Mathematical Programming, Journal of Algorithms, SIAM Journal on Discrete Mathematics and Operations Research Letters. His work has been supported by the National Science Foundation and the Sloan Foundation. He is the winner of several prestigious awards, including an NSF CAREER Award, an Alfred Sloan Research Fellowship and the Karen Wetterhahn Award for Distinguished Creative or Scholarly Achievement. He is also the co-author of two textbooks. Introduction to Algorithms, with T. Cormen, C. Leiserson and R. Rivest, is currently the best-selling textbook in algorithms; it has sold over half a million copies and been translated into 15 languages. Discrete Math for Computer Scientists, with Ken Bogart and Scot Drysdale, is a new textbook that covers discrete math at an undergraduate level.


David Blei - Professor of Computer Science and Statistics

David Blei joined Columbia in Fall 2014 as a Professor of Computer Science and Statistics. His research involves probabilistic topic models, Bayesian nonparametric methods, and approximate posterior inference. He works on a variety of applications, including text, images, music, social networks, user behavior, and scientific data.

Professor Blei earned his Bachelor's degree in Computer Science and Mathematics from Brown University (1997) and his PhD in Computer Science from the University of California, Berkeley (2004). Before arriving at Columbia, he was an Associate Professor of Computer Science at Princeton University. He has received several awards for his research, including a Sloan Fellowship (2010), an Office of Naval Research Young Investigator Award (2011), a Presidential Early Career Award for Scientists and Engineers (2011), and a Blavatnik Faculty Award (2013).


Itsik Pe’er - Associate Professor of Computer Science

Itsik Pe’er is an associate professor in the Department of Computer Science. His laboratory develops and applies computational methods for the analysis of high-throughput data in germline human genetics. Specifically, he has a strong interest in isolated populations such as Pacific Islanders and Ashkenazi Jews. The Pe’er Lab has developed methodology to identify hidden relatives — primarily in such isolated populations — that involves inferring their past demography, detecting associations between phenotypes and genetic segments co-inherited from the joint ancestors of hidden relatives, and establishing the exceptional utility of whole-genome sequencing in population genetics. With the arrival of high-throughput sequencing methods, Pe’er has focused on characterizing genetic variation that is unique to isolated populations, including the effects of such variation on phenotype.


Mihalis Yannakakis - Professor of Computer Science

He studied at the National Technical University of Athens (Diploma in Electrical Engineering, 1975), and at Princeton University (PhD in Computer Science, 1979).

He worked at Bell Labs Research from 1978 until 2001, as Member of Technical Staff (1978-1991) and as Head of the Computing Principles Research Department (1991-2001). He was Director of Computing Principles Research at Avaya Labs (2001-2002), and Professor of Computer Science at Stanford University (2002-2003). He joined Columbia University in 2004.

His research interests include design and analysis of algorithms, complexity theory, combinatorial optimization, game theory, databases, and modeling, verification and testing of reactive systems.


Peter Orbanz - Assistant Professor of Statistics

Before coming to New York, he was a Research Fellow in the Machine Learning Group of Zoubin Ghahramani at the University of Cambridge, and previously a graduate student of Joachim M. Buhmann at ETH Zurich.

His main research interests are the statistics of discrete objects and structures: permutations, graphs, partitions, and binary sequences. Most of his recent work concerns representation problems and latent variable algorithms in Bayesian nonparametrics. More generally, he is interested in all mathematical aspects of machine learning and artificial intelligence.

