Mining of Massive Datasets

The Lagunita platform offers another very interesting MOOC, Mining of Massive Datasets. The MOOC is self-paced, and you pass by correctly answering at least 50% of the homework and final exam questions combined. There are 14 homework assignments, with a total of 62 multiple-choice questions. Together, they account for 50% of your grade. The course …

Statistical Learning

This resource is completely free: it is a course based on a book that is itself freely available. I came across it while browsing the forum discussions on Coursera Data Science Discussions. Somebody in a discussion compared this course to the Machine Learning course by Andrew Ng, and added that this …

Machine Learning – Andrew Ng

I have already mentioned this MOOC, taught by Stanford University professor Andrew Ng and available on Coursera, in a couple of articles. I had read about it in several forums but, until recently, had only taken a quick look. A couple of months ago I finally decided to dive in because I was really curious; the comments …

Build a Data Science skillset

I was invited to join the Coursera Data Science community a few weeks ago. I did, and it is a very interesting meeting point where people with all levels of Data Science experience and skill meet and discuss related topics. As I feel I am still not mature as a Data Scientist, I followed …

Data Science Specialization it is!

I have just completed the Johns Hopkins University Data Science Specialization on Coursera. The last course of this specialization was the Capstone project, which basically consists of learning about a new subject, Natural Language Processing (NLP for short), and producing a Shiny application hosted on Shinyapps.io that predicts the next word a …

NLP – Natural Language Processing

The Coursera JH Data Science Specialization closes with a Capstone Project based on Natural Language Processing. This course is listed in the references, and its lessons are still available for preview, but only until 30 June, at the following URL: https://class.coursera.org/nlp/lecture. The lessons in PDF format are still available from Dan Jurafsky at the …

Reproducible Research

This MOOC is conceptually one of the most interesting that I have taken to date. It is built around implementing the concept of “Literate Programming”, described by Donald Knuth in his 1992 book: basically, a system in which documentation and “live” source code are presented in the same document. In the …

Getting and Cleaning Data

“Getting and Cleaning Data”, the third course of the “Data Science Specialization” from Johns Hopkins Bloomberg School of Public Health on Coursera, was the first in this series of courses where some connection to a Data Scientist’s real tasks can be found. I imagine that the data collected on the fields in …

Plagiarism in MOOCs

I was on a discussion forum connected to the “Getting and Cleaning Data” MOOC on Coursera. During the peer evaluation of the course’s project, one question puzzled me, and I wanted to get some clues. The question is in no way related to the content of the project or the course, so I feel free …
