Forecasting using R

Rob J. Hyndman is Professor of Statistics in the Department of Econometrics and Business Statistics at Monash University. Together with George Athanasopoulos, he has published the freely available book “Forecasting: Principles and Practice”, which can be found here or bought in its paper version at Amazon (amazon.com, amazon.co.uk, amazon.fr) or in its electronic version at …

Working with SQLite in R

In the words of its creators, SQLite is a self-contained, high-reliability, embedded, full-featured, public-domain SQL database engine. It is apparently also the most widely used in the world. Libraries exist for interfacing R with SQLite, the minimum requirement being DBI (A Common Database Interface) and RSQLite (SQLite interface for R). The keyword here is “embedded”. You do …

Data Science Specialization it is!

I have just completed the Johns Hopkins University Data Science Specialization on Coursera. The last course of this specialization was the Capstone project, which basically consists of learning about a new subject, Natural Language Processing (NLP for short), and producing a Shiny application hosted on Shinyapps.io that predicts the next word a …

Tabular output in R

R provides several libraries to format tabular output. As I have had trouble finding one that works well on all occasions and without too much hassle, I would like to compare them from the point of view of aesthetics and ease of use. I will not use any of the options and parameters. Just …

Speak Like a Doctor: Basic NLP in R

I am now aiming at the Capstone project of Coursera’s Data Science Specialization from Johns Hopkins, in order to finish the Specialization. The project is focused on predicting the next word that somebody is going to type, based on several databases used to build the prediction algorithm. There is a lot of previous …

Tutorial Review: How to Build a Text Mining, Machine Learning Document Classification System in R!

This tutorial by Tim D’Auria on YouTube is shorter than 30 minutes. Without requiring too much background, it gives you the basic tools and knowledge to build a basic document classification system. The classifier uses a simple KNN classification algorithm and text mining techniques to learn to distinguish the candidate who delivered the speeches of …

GoogleVis and R – Tutorial

During the first week of the “Developing Data Products” MOOC on Coursera, one of the lessons deals with GoogleVis. This is one of the ways to publish and animate your R charts. In practice GoogleVis provides an interface between R and the Google Charts Tools, allowing you to create interactive web charts from R without …

Shiny Application: A shaded Normal Distribution

I have played a bit with Shiny. The RStudio folks provide the possibility of deploying R “Shiny” applications on their server. The final result will look a lot like this: https://<YourAccount>.shinyapps.io/<YourApp>. There are several plans for this type of deployment, including one that allows you to deploy up to 5 applications for free. You …
