Saturday, March 13, 2021

Quick Exploratory Data Analysis with SweetViz

I recently came across SweetViz, a Python library that is useful for quick exploration of a dataset (or two) in a Jupyter notebook.


!pip install sweetviz --user
import sweetviz
import pandas as pd
df = pd.read_csv('')  # put the path or URL of your CSV file between the quotes
analysis = sweetviz.analyze(df)
analysis.show_notebook()  # or export with show_html()

Here is a brief notebook demonstration.

Saturday, January 9, 2021

Getting Student Submission Data from Brightspace with Python and Selenium

As a teacher with students in multiple Brightspace courses, I was looking for a dashboard to show which students have unsubmitted assignments. While Brightspace does have an API available, I decided that it wasn't going to work for a few reasons. There are also commercial (non-free) plugins that can do most of what I was looking for, but this was a good opportunity to explore scraping of content from dynamic web pages with Python.

You may be familiar with the Python Requests and Beautiful Soup libraries, which are great, but since our Brightspace site requires a Microsoft login I needed to go with Selenium. Selenium is designed for automating web browser interactions, which means we can use it to log in to a site and scrape pages.

While Selenium can be installed and run locally, it also works in a Colab notebook:

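!apt update
!apt install chromium-chromedriver
!pip install selenium
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--headless')  # Colab has no display, so run Chrome headless
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
browser = webdriver.Chrome(options=options)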

Once that is set up, we need to log in to our Brightspace server:

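from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import getpass # so you don't show your password in the source code
email = ''
base_url = ''
password = getpass.getpass()
email_field = (By.ID, 'i0116')
password_field = (By.ID, 'i0118')
next_button = (By.ID, 'idSIButton9')
browser.get(base_url)  # assumed flow: type email, click Next, type password, click Sign in
for field, text in ((email_field, email), (password_field, password)):
    WebDriverWait(browser, 10).until(EC.element_to_be_clickable(field)).send_keys(text)
    WebDriverWait(browser, 10).until(EC.element_to_be_clickable(next_button)).click()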

From there it's a matter of scraping the course progress pages as we loop through the course IDs and student IDs. I may update this post later with an automated way to scrape these IDs, but for now we need to look them up on Brightspace and code them in:

courses = {'LA':11111, 'Math':11111, 'Science':11111, 'Social':11111}
students = {'First Student':111111111, 'Second Student':111111111}
import pandas as pd
df = pd.DataFrame(index=list(students))  # one row per student
for course, course_id in courses.items():
    submissions = []
    for student, student_id in students.items():
        url = f'{base_url}/d2l/le/classlist/userprogress/{student_id}/{course_id}/Summary'
        browser.get(url)  # load the progress page before looking for elements
        WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.CSS_SELECTOR, "span[class^='d2l-textblock']")))
        elements = browser.find_elements(By.CSS_SELECTOR, "span[class^='d2l-textblock']")
        submitted = elements[8].text  # elements[8] is assumed to be the submitted-assignments text
        submissions.append(submitted)
    df[course] = submissions
df.to_csv('submissions.csv')  # placeholder filename, use whatever you like
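
If you would like to check the URLs before pointing Selenium at them, the pattern used in the loop can be pulled out into a small helper. This is only a sketch: the function names and the example.org address are placeholders I made up, and the IDs are dummy values.

```python
from itertools import product

def progress_url(base_url, student_id, course_id):
    """Build the user-progress summary URL for one student in one course."""
    return f"{base_url.rstrip('/')}/d2l/le/classlist/userprogress/{student_id}/{course_id}/Summary"

def all_progress_urls(base_url, students, courses):
    """Map each (student name, course name) pair to its progress URL."""
    return {
        (student, course): progress_url(base_url, students[student], courses[course])
        for student, course in product(students, courses)
    }

# Placeholder values for illustration only
urls = all_progress_urls(
    "https://brightspace.example.org",
    {"First Student": 111111111},
    {"LA": 11111, "Math": 22222},
)
for key, url in urls.items():
    print(key, url)
```

Printing the URLs first makes it easy to paste one into a browser and confirm that the course and student IDs are correct.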

This will give us a Pandas DataFrame (and a CSV file) with information about how many assignments each student has submitted in each course. We can manipulate this a bit to create some visualizations, but perhaps that's a post for later. For now, here's part of a chart that we could generate:

Let me know if you try this or if you come across anything that should be corrected.

Saturday, September 19, 2020

Bookmarklet for Generating a Link to Copy a Google Doc, Sheet, Slides, or Drawing

If you want students to create a copy of a Google Doc, Sheet, Slides, or Drawing, you can replace the /edit at the end of the link with /copy.
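
The same replacement can be sketched in Python, for example if you have a list of links to convert. The function name and DOCUMENT_ID are placeholders, and this assumes the link ends in /edit, optionally followed by a query string or fragment.

```python
import re

def make_copy_link(url):
    """Replace a trailing /edit (plus any query string or fragment) with /copy."""
    return re.sub(r"/edit([?#].*)?$", "/copy", url)

link = make_copy_link("https://docs.google.com/document/d/DOCUMENT_ID/edit?usp=sharing")
print(link)  # https://docs.google.com/document/d/DOCUMENT_ID/copy
```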

To make that easier, I've created a bookmarklet. To set it up for yourself, drag the following link to your bookmark bar or menu:


Then when you have a Doc, Sheet, Slides, or Drawing open (and you've set the sharing permissions) you can click the bookmarklet and it will generate a link that you can copy and send to your students. When they click the link it will prompt them to make a copy.

Wednesday, July 8, 2020

Creating, Editing, and Sharing Jupyter Notebooks with GitHub and Anaconda

For the past year, I have been working on the Callysto project, which involves fostering computational thinking and data science within the regular curriculum for grades 5 to 12. One of the services provided by Callysto is the Callysto Hub, a free online environment hosting an open-source instance of JupyterHub that allows you to run, edit, and create Jupyter notebooks such as these.

While some commercial platforms have good sharing options, there isn't currently a great open-source solution for sharing Jupyter notebooks with collaborators and students. I can show you my current workflow, though, for creating, editing, and sharing Jupyter notebooks.

I have two programs installed on my computer: Anaconda for running Jupyter locally, and GitHub Desktop for synchronizing with GitHub.

We'll start with GitHub, the site used by many software developers to share code and collaborate. You'll need to create a free account.

Next we'll use the GitHub Desktop application. Download, install, and run it, then log in with the GitHub account you just created. Now when you visit a GitHub repository, such as Callysto Curriculum Notebooks, you can click the green Code button near the top right of the page and then Open with GitHub Desktop. It should pop you back into the GitHub Desktop program, and you can click the Clone button to download the code from that repository.

Then install Anaconda. Run Anaconda Navigator and click the Launch button under Jupyter Notebook (or JupyterLab if you prefer). That will start up the Jupyter server on your computer, and launch your web browser to a page like http://localhost:8888/tree. From there you can browse to the folder where GitHub Desktop downloaded the repository for you in the previous step. You can run and edit any Jupyter notebooks, as well as create new ones.

If you encounter errors running notebooks, you likely need to install some Python libraries with commands like !pip install pandas. Post in the comments (or reach out on Twitter) if you need help with that.
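
To find out which libraries are missing before running a notebook, something like the following might help. It is a minimal sketch that only handles top-level module names, and the function name is my own.

```python
import importlib.util

def missing_modules(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# 'json' ships with Python; the second name is a deliberately fake placeholder
print(missing_modules(["json", "not_a_real_module_xyz"]))  # ['not_a_real_module_xyz']
```

Anything it reports can then be installed with !pip install in a notebook cell. Keep in mind that the import name and the pip package name are not always the same.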

At this point you're basically set up, but it gets a little more complicated if you want to share notebooks you create. You'll want to create a new GitHub repository, add your notebook file to that repository folder on your computer, and use GitHub Desktop to commit and push it to GitHub.

Once the notebook file is on GitHub, you can either have your students or colleagues go through this setup process to download your new repository, or you can send them a Callysto nbgitpuller link.
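
An nbgitpuller link is just a hub address with the repository details in the query string, so it can be assembled in Python. This sketch assumes the standard git-pull endpoint with the repo, branch, and urlpath parameters. The hub address, repository, and notebook names are placeholders, and your hub may need an extra path prefix.

```python
from urllib.parse import urlencode

def nbgitpuller_link(hub_url, repo_url, branch, urlpath):
    """Build a link that pulls `repo_url` into the hub and opens `urlpath`."""
    query = urlencode({"repo": repo_url, "branch": branch, "urlpath": urlpath})
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{query}"

# Placeholder values for illustration only
link = nbgitpuller_link(
    "https://hub.example.org",
    "https://github.com/your-username/your-repository",
    "main",
    "tree/your-repository/your-notebook.ipynb",
)
print(link)
```

If you would rather not build the link by hand, the nbgitpuller documentation includes a link generator that produces the same kind of URL.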

Hopefully that's not too complicated, but feel free to reach out if you have questions or would like help with this.

Tuesday, June 9, 2020

Streaming OBS Recordings to YouTube

Currently OBS Studio can only stream to a single service, such as Facebook or YouTube, but we are going to set up a way to stream to another service at the same time. Assuming that you are already comfortable streaming to Facebook, YouTube will be our second service.

You'll need to install FFmpeg and Python 3.

The following Python code can be saved as a .py file and run from there.

Replace xxxx-xxxx-xxxx-xxxx with your stream key from YouTube Studio, and /home/username/Videos with the path to the folder where OBS records your videos. You may also need to set ffmpeg_path to the location of your FFmpeg executable.

This code finds the most recent file in your OBS recordings folder and streams that file to YouTube. You may want to enable the setting "Automatically record when streaming" in OBS, otherwise you'll need to click "Start Streaming" and "Start Recording" each time.

Start recording in OBS, then run the code, and it should start streaming the recording to YouTube without interfering with your primary stream. You will, of course, need enough upload bandwidth for both streams.

Potentially you could have another copy of this Python script running to stream the recording to a third service, such as Twitch.
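
Here is a minimal sketch of what such a script might look like. It is a reconstruction under assumptions, not necessarily the original code: the function names are mine, it assumes YouTube's standard RTMP ingest address, and it passes the recording through without re-encoding, which only works if OBS records in a format YouTube accepts (such as H.264 video with AAC audio).

```python
import glob
import os
import subprocess

def latest_recording(folder):
    """Return the most recently modified file in `folder`, or None if it has no files."""
    files = [f for f in glob.glob(os.path.join(folder, "*")) if os.path.isfile(f)]
    return max(files, key=os.path.getmtime) if files else None

def build_ffmpeg_command(video_path, stream_key, ffmpeg_path="ffmpeg"):
    """Build an FFmpeg command that sends `video_path` to YouTube without re-encoding."""
    return [
        ffmpeg_path,
        "-re",            # read the input at its native frame rate
        "-i", video_path,
        "-c", "copy",     # pass the audio and video streams through unchanged
        "-f", "flv",      # RTMP expects an FLV container
        "rtmp://a.rtmp.youtube.com/live2/" + stream_key,
    ]

def stream_latest(folder, stream_key, ffmpeg_path="ffmpeg"):
    """Stream the newest recording in `folder`; raises if the folder has no files."""
    video_path = latest_recording(folder)
    if video_path is None:
        raise FileNotFoundError(f"No recordings found in {folder}")
    return subprocess.run(build_ffmpeg_command(video_path, stream_key, ffmpeg_path))
```

With recording already started in OBS, calling stream_latest('/home/username/Videos', 'xxxx-xxxx-xxxx-xxxx') would start the secondary stream. If FFmpeg is not on your path, pass its full location as ffmpeg_path.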

Hopefully that helps get you started with secondary streams from OBS Studio. Let me know if any of this doesn't work for you.

Saturday, February 29, 2020

Authoring Open Educational Resources using only Open Source Software

Recently a leader in the Alberta OER community, Michael McNally, suggested that it is difficult or impossible to only use open source software (OSS) when creating open educational resources (OER). I agree with his point that using only OSS doesn't make OER more "pure", but perhaps it is still an interesting challenge.

Here are some of my suggestions; please comment if anything is missing. And I do understand the hypocrisy of posting this on a Google-hosted blog.

Writing Text

Text is still often the primary medium for OER, and there are a number of great open-source text-authoring tools. LibreOffice is a great office suite, and it is similar to traditional office suites so there shouldn't be much of a learning curve.

If you prefer collaborative writing, perhaps check out Nextcloud. You'll need to host it somewhere; if you are in Alberta, consider Cybera's Rapid Access Cloud, which uses OpenStack.

Diagrams and Graphics

Inkscape is a great vector editing and layout program. For image editing and creation, check out GIMP, Krita, or MyPaint.


Audio

Audacity, one of the best simple audio recording and editing programs, is also open source. There are others, of course, but it should do everything you need.


Video

My favorite open-source video editing program is OpenShot, but you may also want to check out Shotcut.

Hosting video is another issue, though. You can host videos in a learning management system such as Moodle, or check out alternatives such as MediaGoblin, Kaltura, or ClipBucket.

Operating System

Linux has gotten much easier to install and use if you'd like to replace Windows or macOS. My current favorite distribution is Peppermint.


As previously mentioned, Albertans can avoid the big five cloud providers by running servers on the Rapid Access Cloud (RAC), and your institution may also have self-hosted instances of Pressbooks or similar open-source hosting platforms.

Hopefully those cover anything you may need to use when creating OER with OSS. In some cases these tools are preferable to commercial products.

Of course if you are philosophically opposed to proprietary software then you are probably already familiar with most of these.

As always, please comment if you have any other suggestions.

Monday, February 3, 2020

Getting new copies of Jupyter notebooks with shutil and nbgitpuller

Getting a Fresh Set of Jupyter Notebooks

If you would like to update your copy of notebooks, for example on the Callysto Hub, you can delete the folder and pull the files from GitHub again. This is useful if something no longer works, or if the repository has been updated.

Unfortunately you can’t just select a directory in JupyterHub and delete it if it contains files. One way to delete a folder, though, is to use shutil.rmtree(), a function from Python's shell utilities module that removes a whole directory tree.

To remove a folder, create a new Python 3 notebook in the folder that contains the one you want to delete (but not inside the folder to be deleted).

In a code cell, type (or paste) the following two lines:

import shutil
shutil.rmtree('curriculum-notebooks')

Replace curriculum-notebooks with the name of the folder you would like to delete. Then run the cell, and you should see that the folder no longer exists.
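
If you would rather not get an error when the folder has already been deleted, a slightly more defensive variant might look like this. The function name is my own, and this is just a sketch.

```python
import shutil
from pathlib import Path

def remove_folder(name):
    """Delete the folder `name` and everything in it; return True if it was removed."""
    folder = Path(name)
    if not folder.is_dir():
        return False  # nothing to delete, or the name points at a file
    shutil.rmtree(folder)
    return True
```

Calling remove_folder('curriculum-notebooks') then returns True if the folder was deleted and False if it wasn't there. Either way, there is no undo, so double-check the folder name first.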

Then you can click on an nbgitpuller link that pulls down a new copy of the repository or notebook files that you are interested in.

You can also see the process in this video.