Edit: I no longer maintain this script. dgorissen forked it and made it better: https://github.com/dgorissen/coursera-dl

After taking two Coursera classes (db and algo), I wanted a simpler way to download the lecture assets (video files, PDFs, PPTs, etc.) from the site.

The features I was looking for in the downloader:
1. Bulk download from the command prompt.
2. The downloader should recreate the structure of the video listings page, i.e., it should create a directory for each weekly topic and each of its subtopics, and download the files into the appropriate directories.
3. The downloader should be smart: each time it is run from the command prompt, it should know which assets were downloaded previously and fetch only the ones added since. In programmer terms, it should diff the assets on the current page against all previous downloads and download only the new ones.

Hence, my free time during the last few days was dedicated to scripting this. The Python script that I wrote for this can be downloaded from GitHub. Extract the contents of the zipped file to the directory where you want the assets to be downloaded and run the Python script. More info is in the readme.txt file accompanying the script. The script recreates the structure present in the video listings page and downloads the assets to the appropriate directories.
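To give a rough idea of how a listing can be mapped onto directories, here is a minimal sketch. It is not the code from the script: the nested `listing` dictionary, the `safe_name` and `plan_paths` helpers, and the example URL are all made up for illustration, and it assumes the listings page has already been parsed into weekly topics, subtopics, and asset URLs.

```python
import re
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def safe_name(title):
    """Turn a topic title into a name that is safe to use as a directory."""
    # Replace characters that are not allowed in file names on common systems.
    cleaned = re.sub(r'[\\/:*?"<>|]+', " ", title)
    # Collapse runs of whitespace left behind by the replacement.
    return re.sub(r"\s+", " ", cleaned).strip()


def plan_paths(listing, root):
    """Map each asset URL to the local path it should be saved under.

    `listing` is a hypothetical structure:
    {weekly topic: {subtopic: [asset URLs]}}
    """
    planned = {}
    for week, subtopics in listing.items():
        for subtopic, urls in subtopics.items():
            folder = Path(root) / safe_name(week) / safe_name(subtopic)
            for url in urls:
                # Use the last segment of the URL path as the file name.
                filename = PurePosixPath(urlparse(url).path).name
                planned[url] = folder / filename
    return planned
```

Calling `plan_paths(listing, "automata")` returns a dictionary of URL to target path. Creating the directories is then a matter of calling `path.parent.mkdir(parents=True, exist_ok=True)` before each file is written.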

The image below shows the directories created and some of the files downloaded for the automata course.

Compare the above with the course structure present in the video listings page of the automata course:

Whenever new lessons are posted, just run the script again and it takes care of downloading the updated contents. It keeps track of already downloaded files and hence only downloads what is new.
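One simple way to get this behaviour is to compare the files the current page calls for against what is already on disk. The sketch below illustrates that idea only; it is not necessarily how the script keeps track of things. The `pending_downloads` helper and the `planned` dictionary (asset URL to target path) are hypothetical names.

```python
from pathlib import Path


def pending_downloads(planned):
    """Return the (url, path) pairs whose target file is not on disk yet.

    `planned` is a hypothetical mapping of asset URL -> local target path.
    Anything already present is treated as downloaded and skipped.
    """
    return [
        (url, Path(path))
        for url, path in planned.items()
        if not Path(path).is_file()
    ]
```

A check like this only looks at whether a file exists, so a partially downloaded file would be treated as complete. Downloading to a temporary name and renaming on success is one way around that.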

Give it a spin and let me know how it works for you.
