Mining of Massive Datasets

The Lagunita platform offers another very interesting MOOC, Mining of Massive Datasets. The MOOC is self-paced, and you pass if you score at least 50% across the homework questions and the final exam combined. There are 14 homeworks, with a total of 62 multiple-choice questions. Together, they account for 50% of your grade. The course is divided into 15 modules of videos and homeworks, plus a final exam. In the synchronous version of the course, the material is intended to be covered in seven weeks. However, you are free to spend more or less time learning this material. I started it a month ago and I must say that this is a tough one, probably the toughest I have taken to date. Two reasons for this:

  1. Some videos are too long, and it is difficult to stay fully concentrated for their entire duration.
  2. There is no official programming component. In practice, though, you need to write some programs to solve some of the homework questions.

There is a companion book, called MMDS, which can be downloaded free of charge.

The modules are as follows:

  1. MapReduce
  2. Link Analysis (PageRank)
  3. Locality-Sensitive Hashing
  4. Distance Measures and Nearest-Neighbor Learning
  5. Frequent Itemset Analysis
  6. Social-Network Graphs
  7. Algorithms for Data Streams
  8. Recommendation Systems
  9. Dimensionality Reduction
  10. Clustering
  11. Computational Advertising
  12. Machine Learning
  13. More on MapReduce Algorithms
  14. More on Locality-Sensitive Hashing
  15. More on Link Analysis

I got full points on module 2 but I am stuck on module 3 (one of the questions is particularly tricky and you really need to understand what they mean before attempting an answer). Did I mention that you have only 5 attempts for each homework?

Anyway, the initial survey asks you if you really, really intend to complete the course. I said yes… If you want to know whether I did well… try it for yourself!