Open Data Sources

In his MOOC “Datascience et Analyse situationnelle : dans les coulisses du Big Data” (Behind the scenes of Big Data), which can be found on the IONIS platform, Professor Jean-Pierre Malle made clear that the next big thing in IT is Big Data. (The course is in French.) It is not a -strictly …



TF-IDF stands for Term Frequency-Inverse Document Frequency. It is a method for measuring how important a term or a set of terms is in a collection of documents or, as Wikipedia calls it, in a “corpus”. I first met it in the MOOC Text Retrieval and Search Engines, but I have also retrieved …
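
To make the idea concrete, here is a minimal sketch of TF-IDF in plain Python. It assumes the most common textbook variant (term frequency as a share of the document's words, multiplied by the natural log of the number of documents divided by the number of documents containing the term); other variants exist, and this is not necessarily the one used in the MOOC. The function name `tf_idf` and the three sample sentences are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: weight} dict per document in the corpus.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    # Naive tokenization: lowercase and split on whitespace
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, only for illustration
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang",
]
scores = tf_idf(docs)
```

With this variant, a word that appears in every document (here "the") gets a weight of zero, while a word found in a single document (such as "mat") scores higher than one shared by several documents (such as "cat").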


The marvels of Python’s Matplotlib and Pandas

Today I will tell you a bit about my trip to the world of Python for scientific use, and about two of the libraries that I found most amazing: Matplotlib and Pandas. Matplotlib is an amazing 2D and 3D graphics library for generating mathematical plots. It includes: Support for LaTeX-formatted labels and text …


How to begin…

This is a lucky time in the history of humankind. Sure, there are many bad things out there, but there are tons of opportunities as well. The web is full of resources about Data Mining, Search Engines, Retrieval Systems and the like. However, for a total newbie I would recommend taking …



Hi. Before anything else, a big word of thanks! I would like to thank John Sonmez of Simple Programmer for his advice and his free blogging course. I have not followed all of his precious advice, and it took me a little longer to be up and running than he showed in his lessons. But …
