Tutorial Review: How to Build a Text Mining, Machine Learning Document Classification System in R!

This tutorial by Tim D’Auria on YouTube is shorter than 30 minutes. Without assuming much background, it gives you the basic tools and knowledge to build a basic document classification system. The classifier uses a simple KNN classification algorithm and text mining techniques to learn to distinguish the candidate who delivered the speeches of …

Shiny Application: A shaded Normal Distribution

I have played a bit with Shiny. The RStudio folks provide the possibility of deploying R “Shiny” applications on their server. The final result will look a lot like this: https://<YourAccount>.shinyapps.io/<YourApp>. There are several plans for this type of deployment, including one that allows you to deploy up to 5 applications for free. You …

Exploratory Data Analysis MOOCs

At the moment I am following two similar MOOCs, one on Coursera (Exploratory Data Analysis, part of the Johns Hopkins Data Science Specialization) and one on Udacity (Data Analysis with R, part of their Data Analyst Nanodegree). At first glance, the two should deal with roughly the same subject; however, I would rather say …

R Kernel in Jupyter: Update

I was using my own article, “A R kernel in IPython notebook: Jupyter”, to install both Jupyter and the IRKernel on a small ASUS Eee PC Seashell 12” laptop that I use when we are on holiday. It runs Kubuntu, like most of the PCs in my household. Unfortunately the instructions I provided in July …

Getting and Cleaning Data

“Getting and Cleaning Data”, the third course of the “Data Science Specialization” from Johns Hopkins Bloomberg School of Public Health on Coursera, was the first course in this series where some connection to a Data Scientist’s real tasks can be found. I imagine that the data collected on the fields in …

The Analytics Edge MOOC on Edx

After the MOOC “Data Visualization”, I joined a discussion on “What MOOC are you going to take next”. A participant in the discussion mentioned MIT’s “The Analytics Edge” course, on edx.org, as being one of the nicest he had taken. The name was intriguing, the description too. I decided to have a look. Dates: …

Mastering R Programming

I just completed the Coursera MOOC “R Programming” by Johns Hopkins University, which I have already written about in another recent post. I put together some thoughts on this course while responding to a question on the course’s forums. The question was: “How do I master R Programming?” My answer: If you …

R Programming

R Programming is the second course in the Data Science Specialization offered by Johns Hopkins University on Coursera. At the end of the Data Visualization course, while talking with fellow MOOCers on the “Your next MOOC” thread, I expressed the wish to take this course. Many people who had done it stated that …

R and Ggplot Courses on Udemy

Prof. Charles Redmond has produced a series of short courses on the udemy.com platform. Each course is built around a specific task, and the tasks in the first three courses of the series follow logically from each other. These first three courses are: R, GGPlot and Simple Linear Regression R, GGPlot …
