All the information about the workshops happening on September 18th can be found here. We
will be using Discord for all workshop-related announcements and communications. Please check your email
for the Discord invite.
There will be a morning session of workshops, then a lunch break, followed by an afternoon session of workshops. Some workshops may provide additional "office hours" in the week leading up to the Datajam. Check back here for more information closer to the date!
Each workshop lasts up to three hours, and workshops will run in parallel.
|9:00 am - 12:00 pm|Morning Workshop Options|
| |Git and GitHub with Shannon Lo|
| |R Workshop - special topic (machine learning) with Zaid Haddad|
| |Python Workshop - Data Structures with Fatemeh Salehian Kia|
| |Python Dash Workshop with Laura Gutierrez Funderburk and Hanh Tong|
|12:00 pm - 1:00 pm|Lunch Break and Keynote Presentation|
|1:00 pm - 4:00 pm|Afternoon Workshop Options|
| |Intro to Noteable and the Python Data Stack with Carol Willing and Dave Stuart|
| |R Beginner Workshop with Yuka Takemon|
| |Python Beginner Workshop with Jennifer Walker|
| |Intro to Ploomber with Ido Michael and Eduardo Blancas|
Shannon is a data analyst who has worked in various industries including telecom, retail, and public health. Currently, her work involves building products backed by analytics and synthesizing data to highlight optimization opportunities and insights. When she's not geeking out about data, you can usually find her hiking, snowshoeing, or camping.
This workshop will equip you with the basics of the version control software Git and show you how to collaborate with others on GitHub. By the end of the session, participants will be able to pull, commit, and push, as well as branch and fork repositories. This workshop is recommended for everyone attending, as version control is an important part of collaborative coding online.
Zaid is a Data Scientist Leader at Slalom. He brings experience delivering high-quality data science solutions in personalized healthcare and big data while supporting commercialization in regulated environments. He has supported several organizations in developing and implementing roadmaps for data science products, from conception to deployment. Zaid holds a Bachelor of Science in Computing Science and Molecular Biology & Biochemistry from Simon Fraser University.
This is an advanced R workshop focused on machine learning. It provides a hands-on introduction to unsupervised machine learning, with a focus on clustering and dimensionality reduction. Basic knowledge of R and the tidyverse packages is required.
Fatemeh Salehian Kia is a data scientist. Her area of research is learning analytics and AI. Her interests are not limited to data science but are broadened by applying theory and understanding design.
This workshop will introduce the core data structures of the Python programming language. We will explore how we can use the Python built-in data structures such as lists, dictionaries, and tuples to perform data analysis. Basic knowledge of Python programming is recommended to attend this workshop.
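As a small taste of the topic, here is a minimal sketch of how tuples, lists, and dictionaries can work together in a simple analysis. It is not taken from the workshop materials: the sample readings and the helper name `summarize_by_city` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: each record is a tuple of (city, temperature in Celsius).
readings = [
    ("Vancouver", 18.0),
    ("Calgary", 12.0),
    ("Vancouver", 20.0),
    ("Calgary", 14.0),
    ("Halifax", 16.0),
]


def summarize_by_city(records):
    """Return a dictionary mapping each city to its mean temperature."""
    # Group the temperatures into one list per city.
    grouped = defaultdict(list)
    for city, temperature in records:  # tuple unpacking
        grouped[city].append(temperature)
    # Reduce each list to its mean with a dictionary comprehension.
    return {city: sum(values) / len(values) for city, values in grouped.items()}


averages = summarize_by_city(readings)
for city, mean_temp in sorted(averages.items()):
    print(f"{city}: {mean_temp:.1f}")
```

Tuples suit fixed-shape records, lists collect a variable number of values, and the dictionary gives direct lookup by key.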
Laura works as a data scientist at Cybera. Laura has a B.Sc. in Mathematics from Simon Fraser University, and was awarded a Terry Fox Gold medal for overcoming severe childhood trauma and helping the communities she forms part of. In her spare time, she is a co-organizer for PyLadies Vancouver, and is enthusiastic about cycling by the seaside and trail running.
Hanh works as a data scientist at Theory+Practice - a data science consulting company. She has a PhD in Economics from Simon Fraser University. She is interested in using experiments and data science to understand how people make decisions and why they behave the way they do. In her spare time, she enjoys dancing (West Coast Swing) with her two left feet.
In this workshop we will use pandas, plotly, and Dash to create a dashboard that explores changes in the average housing price in various Canadian provinces over the last five years. We will start by generating interactive visualizations using plotly and turning exploratory code into reusable functions. We will then work together to bring our functions into a script. Participants will be introduced to dashboarding and layout options, and will work together to generate and test a local dashboard. Participants will also learn how to deploy their dashboard to production.
Yuka is a PhD candidate in the Genome Science and Technology program at UBC. She is a member of the Marra Lab at Canada’s Michael Smith Genome Sciences Centre, where she conducts her research to understand genetic interactions of genes that are frequently mutated in cancers. Yuka is also a certified instructor at The Carpentries and RStudio, and is one of the organizers of RLadies Vancouver. When she’s not at the lab, Yuka is teaching introductory workshops in bioinformatics and programming to help “wet lab” sciences make the transition over to the “dry lab”.
This intro R workshop will use RStudio with a focus on data wrangling and data visualization with the tidyverse packages, and report generation using RMarkdown. This will be a good workshop for anyone who is completely new to programming or wants to refresh their R knowledge.
Jennifer is an environmental scientist who uses data science to study the Earth's atmosphere and climate. Her favourite programming language is Python and she loves to spread the joy of Python to others as a workshop instructor. She also enjoys volunteering with Data for Good, using data science to support local non-profit organizations that are working to improve our community.
This beginner level workshop will introduce data analysis with Python, focusing on Jupyter notebooks, working with data in Pandas, and visualization with Seaborn. A familiarity with Python basics will help you get the most out of this workshop, but you do not need any prior experience with Pandas or any other libraries.
Carol Willing is the VP of Learning at Noteable and is a member of both the Jupyter Steering Council and the Python Steering Council. Carol has a strong commitment to community outreach and education. She’s passionate about Open Science and Education and serves on the Chan Zuckerberg Initiative Open Science Advisory Board.
Dave Stuart is the Director of Customer Success at Noteable. Previously Dave was a senior executive at the National Security Agency where he founded and led a large-scale effort to use Jupyter Notebooks to empower business analysts.
This workshop will provide an introduction to Noteable, a collaborative notebook platform. We will cover how to easily get up and running with notebooks on Noteable, including how to access data and create rich visualizations. The workshop will then provide an introduction to the Python Data Stack, sharing popular libraries used to analyze and make sense of large data sets.
Ido Michael and Eduardo Blancas are both Ploomber co-founders.
Ido Michael leads data engineering teams at AWS and has a master's degree from Columbia University. Ido is a big believer in cloud technologies and Data & Analytics. He helps customers and executives build data platforms that produce business insights, and helps companies scale their data solutions to drive innovation.
Eduardo Blancas is interested in developing tools to deliver reliable Machine Learning products, which is why he developed Ploomber, an open-source Python library for reproducible Data Science, first introduced at JupyterCon 2020. He holds an M.S. in Data Science from Columbia University, where he took part in Computational Neuroscience research. He started his Data Science career in 2015 at the Center for Data Science and Public Policy at The University of Chicago.
Notebooks are hard to maintain. Teams often prototype projects in notebooks, but maintaining them is an error-prone process that slows progress down. Ploomber overcomes the challenges of working with .ipynb files, allowing teams to develop collaborative, production-ready pipelines using JupyterLab or any text editor. In this workshop, participants will learn about Ploomber (https://ploomber.readthedocs.io/en/latest/), a Python package that allows you to modularize your analysis into smaller tasks without losing the power of an interactive notebook.