Introduction to Data Science in Python

This is the first course in a series of five offered on Coursera by the University of Michigan. The specialization is called “Applied Data Science with Python”. It was the first course I took after I had completed the Data Science Specialization from Johns Hopkins University on the same platform. That one was based entirely on the R programming language, while this one is based entirely on Python. I decided that it was a good idea because:

  • I am a programmer with a background in many programming languages, including some Python. The course should not be too difficult, but it should give me an overview and let me judge whether R or Python is better for me.
  • I should be able to pick up the various libraries and techniques rather quickly.
  • It is a specialization, so a coherent set of courses, but put together in a different way. It is always good to see the same subjects from a different perspective.

That was my idea. The Johns Hopkins Data Science specialization has been running for years, while this one is brand new, so it had to bring something new. And it did. Let me explain the good and the bad aspects, in my opinion. The good:

  • Well taught:
    • Clearly explained. Professor Christopher Brooks communicates the message well, with a good set of examples and detail.
    • Overall coverage is good.
    • Subjects are well chosen.
    • You get the impression that in some circumstances Python seems to be more complete/powerful than R. However, it is also more difficult to get things to work the first time.
    • The Jupyter Notebooks: This option makes it possible to complete the assignments even on a tablet or phone.
    • The mentors on the forums. These people are absolutely fantastic. Thanks, thanks and again thanks for your time.

The bad:

  • The course is too new and a lot is left to the student. In fact, without reading the recommended articles, it would not be possible to pass the assignments.
  • A lot of stackoverflow.com browsing is needed.
  • The other courses of the series were not actually ready when I completed the first one. That is bad, because it screws up your planning.
  • The automated grader
  • The automated grader
  • The automated grader.

Let me explain. This thing is a real bitch. In every programming language there are many ways to do the same task. In this course, the grader has been programmed so that the responses must match EXACTLY what is expected, all decimals included. It does not allow other ways of using the libraries that are perfectly legal, and it does not allow you to bring any aesthetics into the output. None of this is possible. Apart from this aspect I enjoyed the course. I rated it 3 stars out of 5, and I would improve the following aspects:

  • Ditch the automated grader, or put in the programming effort needed for it to cope with different possible solutions (with correct reasoning and results, of course).
  • Make more video material to increase the coverage.
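
To make the first suggestion concrete, here is a minimal sketch of what a more forgiving check could look like. It is only an illustration and not how the Coursera grader works: the function name `answers_match` and the tolerance values are my own assumptions. The idea is to compare numbers within a small tolerance instead of decimal by decimal, and to accept a list or a tuple as long as the values agree.

```python
import math


def _is_number(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def answers_match(submitted, expected, rel_tol=1e-6, abs_tol=1e-9):
    """Hypothetical grader check: equal within a tolerance, not digit by digit."""
    # Numbers are compared approximately, so 0.1 + 0.2 can match 0.3.
    if _is_number(submitted) and _is_number(expected):
        return math.isclose(submitted, expected, rel_tol=rel_tol, abs_tol=abs_tol)
    # Lists and tuples are compared element by element, ignoring the container type.
    if isinstance(submitted, (list, tuple)) and isinstance(expected, (list, tuple)):
        return len(submitted) == len(expected) and all(
            answers_match(s, e, rel_tol, abs_tol)
            for s, e in zip(submitted, expected)
        )
    # Anything else (strings, None, ...) falls back to exact equality.
    return submitted == expected


print(answers_match(0.1 + 0.2, 0.3))          # floating-point noise is tolerated
print(answers_match([1.0, 2.5], (1, 2.5)))    # list vs tuple does not matter
print(answers_match(1.0, 1.1))                # a genuinely wrong answer still fails
```

A real grader would also have to handle data frames, ordering and formatting, so this only shows the direction, not a complete solution.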

That’s it. This grader experience makes me say… I do not actually know if I want to take the rest of the specialization. I much prefer peer grading. At the beginning of this MOOC adventure I was against it in principle, but I have totally changed my mind. You get the great advantage of seeing other students’ work, their approaches and their different ways of thinking. You get suggestions to improve your work. This is the way to go, I think.