The Lagunita platform offers another very interesting MOOC, Mining of Massive Datasets. The MOOC is self-paced, and you pass if you score at least 50% overall across the homework questions and the final exam. There are 14 homeworks, with a total of 62 multiple-choice questions. Together, they account for 50% of your grade. The course is divided into 15 modules of videos and homeworks, plus a final exam. In the synchronous version of the course, the material is intended to be covered in seven weeks. However, you are free to spend more or less time learning this material.
I started it a month ago and I must say that this is a tough one, probably the toughest I have taken to date. Two reasons for this:
- Some videos are too long and it is difficult to stay fully concentrated for the entire duration
- There are no programming assignments. In practice, you need to write a few programs to solve some of the homework questions, but this is not officially part of the course.
There is a companion book, Mining of Massive Datasets (MMDS), which can be downloaded free of charge.
The modules are as follows:
- MapReduce
- Link Analysis (PageRank)
- Locality-Sensitive Hashing
- Distance Measures and Nearest-Neighbor Learning
- Frequent Itemset Analysis
- Social-Network Graphs
- Algorithms for Data Streams
- Recommendation Systems
- Dimensionality Reduction
- Clustering
- Computational Advertising
- Machine Learning
- More on MapReduce Algorithms
- More on Locality-Sensitive Hashing
- More on Link Analysis
I got full points on module 2 but I am stuck on module 3 (one of the questions is particularly tricky and you really need to understand what they mean before attempting an answer). Did I mention that you have only 5 attempts for each homework?
Anyway, the initial survey asks you if you really really intend to complete the course. I said yes… If you want to know whether I did well… try it for yourself!