This class studies the mathematical theory behind modern-day big data applications.
Talks an extensive amount about the mathematics behind machine learning. The book (which can be found here) is used at Princeton and CMU for graduate level courses; this course is taught at a lower level.
In general, the more math classes that you have previously taken, the better prepared you will be for this class. If you are not very comfortable with statistics, this class will be challenging. Having completed a class such as ORIE 3500 (Engineering stats II) or ECE 3100 is very helpful. The math topics that you really should know cold are:
- CS 2800
- Linear algebra
- Basic probability (EngrD 2700 is fine)
Big-Oh notation Topics that will come up include:
- Multivariable Calculus
- Generating functions
- Difference equations (Math 2930)
- High-dimensional space
- Random graphs (G(n,p) model) and threshold probabilities
- Branching processes
- Singular value decomposition
- Random Walks and Markov chains
- Machine Learning (Linear SVMs / Batch Perceptron, Ensemble Learning, Clustering)
- Learning Theory
- Randomized Algorithms (Time permitting, most likely not)
3 problem sets a week consisting of 2 to 3 problems each.
In Spring 2014, there was one problem set each week consisting of ~6 problems from the textbook. The difficulty of the problems varies greatly, but the overall difficulty of each problem set is approximately equal. There were two in class midterms and a final.
PLEASE do not take this class, it is an absolute shitfest.
The grading scheme is completely up in the air, and he may as well pick grades out of a hat. The exams can cover material never gone over in class. However, usually the exams do not have arduous or time-consuming problems.
If you put an honest effort and do the homework assignments, attend lecture, and go to office hours if you need help understanding the material, you’ll do fine. Most people stopped attending lecture and stopped doing homework because Hopcroft had no clear grading rubric. Don’t fall into this trap. Theres a lot of interesting material in this course. Hopcroft is not bad as a lecturer, but he does make some assumptions on how much background in math/stats students have. Its VERY easy to get behind in this class. It would helpful to read bits of the textbook before each lecture.
Clear grading rubrics are a reasonable thing to expect from a professor and Hopcroft is remiss in his duties in not providing one.
If you accept an organized class, where you have some idea of what your grade will be before you get it, and what standard you will be graded on, then don’t take Hopcroft. He is of the philosophical belief that people should not know what standards they are being held to (although he wouldn’t phrase it like that), or get feedback on whether they’re successfully completing their work. He also rejects answers that are correct on tests because they weren’t the answer he was looking for, etc. There were also several homework problems that were actually impossible, or where the answer Hopcroft had in mind was actually wrong.
The basic problem is, while Hopcroft is extremely smart, he has a disdain for things like concrete grading standards and similar, recognizing them as somewhat artificial and tangential to learning but not valuing their practical merit in making the class easy to reckon with. It’s easy to blame people for not doing homework, but students have busy lives they need to prioritize, and when you get literally no value for each homework assignment (no feedback, no guarantee that it will even register that you did it, and no indication of whether you got it right or wrong) but still have to work very hard and long to arrive at solutions (which you are going to be uncertain of because of the nature of the course), YES, people will stop doing it. Hopcroft could have easily prevented this. … I say this as someone who was more than adequately prepared for the material. Professors have duties, too.
Please take the “advice” on this page with a grain of salt. While the reviews were generally negative in Spring 2013 (the first time it was offered), I found the experience in Spring 2014 to be quite good. Part of the problem in 2013 was that no one had taken the course before the TA’s were generally unprepared compared to other classes in CS where most of the TA’s have taken the class before. In addition, Hopcroft himself was still learning the best way to teach the course and there was no precedent for what would work. An obvious example is the homework policy. In 2013, they were due every class (3 times a week) and mid-way through the semester, he stopped collecting them (though he still assigned them; also, no assignment was ever graded or given comments on). Both of these policies were very unpopular and they were changed in 2014 to 1 hw a week, each of which was graded by hand and returned the following week. I thought grading on both homeworks and exams was done quite fairly. I had quite a positive experience so I would suggest you try it out if the subject interests you.
Nobody seems to know what’s going on.
<--- Spring 2013
Requires quite a bit of mathematical intuition to solve the problems, and by quite a bit, I mean A LOT.
The topics were very interesting!
|Semester||Time||Professor||Median Grade||Course Page|
|Spring 2013||MWF 1:25-2:15||John Hopcroft||B||110||http://www.cs.cornell.edu/courses/CS4850/2013sp/|
Although this may only apply to a small number of students, it’s probably worth mentioning:
If you’ve taken CS 4780, ORIE 3500, ORIE 3510, and a have a strong intuition of Linear Algebra you’ll know the majority (> 90%) of the material covered in the class and most likely find the class too easy (as this is an undergraduate version).
Specifically, if you’ve groked certain supervised learning approaches (finding maximum-margin hyperplanes via hard-margin linear svms and batch perceptron), ensemble learning, unsupervised learning approaches (clustering, dimension reduction (SVD/Spectral decomp/various matrix factorizations)), “curse of dimensionality”, and learning theory (VC Dimension) you’ll be more than fine.
The homework problems at the 4850 level test basic understanding of the concepts covered in class, but there are a few other unassigned problems that cover more interesting research problems. The exams are extremely straightforward; if you’ve gone to class consistently and understand the deeper ideas, the prelims should take no more than 20-30 minutes as a lot of them are one line answers.
’’’'’If you’ve been involved within the math competition space in high school (usamo/arml/pumac/etc.) or regularly attend the Putnam practices on campus, you’ll have developed the mathematical intuition that’s required to solve a lot of the harder problems in the book. Unfortunately, the vast majority of kids who wish to take the class haven’t and have trouble constructing basic proofs, which hinders their ability to reason about the material.’’’’’
Additionally, a basic understanding of non-measure theoretic probability is required.