Another wordcloud visualization: Marlene

OK, this is not Data Science. This is just for fun. I have to admit, I tend to mix things up. Here I am mixing the German class I am taking with one of my favourite visualizations, the wordcloud. Not because it conveys a message about data: in this case it is purely for aesthetics, for the fact that it looks nice. I just needed a cover for my essay on Marlene Dietrich. (However, there seem to be businesses built on this visualization nowadays.)

So, this is how I built a Marlene Dietrich wordcloud, based on the text of two of her most famous songs (“Lili Marleen” and “Sag mir wo die Blumen sind”).

I used the concepts that I explained in two old articles that you can find on this site:

In the first one I scraped this site, gave the wordcloud a shape with a mask, and then used a custom font, thanks to the excellent wordcloud generator by Amueller. In the second post I used a text file, a custom mask and modified stop words to adapt to an ancient language. This time the language is modern German, so I do not modify the stop words at all, but I use GIMP to merge two layers: the wordcloud and the original styled image that I used to build the mask. A couple of final touches with GIMP added the title of Marlene’s biography, “Ich bin, Gott sey Dank, eine Berlinerin”.

The styled portrait as found on the Internet

The mask used to generate the wordcloud

The code used to read the wordcloud text from a file in which the lyrics of the two songs have been merged (sequentially):
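A minimal sketch of this reading step might look like the following. It assumes the merged lyrics are stored as UTF-8 plain text; the helper name `read_lyrics` and the file name `marlene_songs.txt` are placeholders for illustration, not necessarily the names used for the original cover.

```python
from pathlib import Path


def read_lyrics(path):
    """Return the full text of the merged lyrics file as one string.

    UTF-8 is assumed so that German umlauts and the eszett survive intact.
    """
    return Path(path).read_text(encoding="utf-8")


# Hypothetical usage (the file name is a placeholder):
# text = read_lyrics("marlene_songs.txt")
```

Reading the whole file into a single string is enough here, because the wordcloud generator does its own tokenizing and counting.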

This second bit of code actually generates the first version of the wordcloud.

This version looks as follows:

As I explained before, I did the rest with GIMP.