Workshop: Exploratory Data Analysis in Python
Overview:
In this workshop, we will conduct Exploratory Data Analysis (EDA) on a given dataset. EDA covers a broad range of data analysis components, including but not limited to the following:
- Data Preprocessing: This encompasses activities such as data cleaning, summarization, and wrangling, ensuring that the dataset is in a usable and informative state.
- Data Visualization: EDA entails univariate, bivariate, and multivariate analyses, employing various visualization techniques to unveil underlying patterns and relationships within the data.
- Hypothesis Formulation: EDA aids in the generation of hypotheses, setting the stage for further testing and validation through statistical techniques.
- Time Series Analysis: For datasets with temporal aspects, EDA includes the examination of trends, seasonality, and patterns over time, providing insights into data evolution.
- Feature Selection: Identifying and selecting the most relevant features is a crucial step in EDA, as it can significantly impact the success of subsequent analyses and modeling.
By the end of this workshop, students will have a solid foundation in conducting EDA, a vital skill in the realm of data analysis and decision-making.
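As a rough illustration of the preprocessing, summarization, and bivariate steps listed above, here is a minimal pandas sketch. The dataset and its column names (`month`, `temperature`, `sales`) are made up for this example and are not taken from the workshop materials; filling the gap by linear interpolation is just one of several reasonable choices.

```python
import pandas as pd

# Hypothetical toy dataset (not from the workshop); None marks a missing value.
df = pd.DataFrame({
    "month": [1, 2, 3, 4, 5, 6],
    "temperature": [2.0, 4.0, None, 12.0, 18.0, 22.0],
    "sales": [10.0, 14.0, 19.0, 27.0, 38.0, 45.0],
})

# Data preprocessing: count missing values, then fill the gap
# by linear interpolation between the neighbouring rows.
missing_before = int(df["temperature"].isna().sum())
df["temperature"] = df["temperature"].interpolate()

# Summarization: descriptive statistics for each numeric column.
summary = df.describe()

# Bivariate analysis: Pearson correlation between two columns.
corr = df["temperature"].corr(df["sales"])

print(summary)
print(f"missing values filled: {missing_before}")
print(f"temperature/sales correlation: {corr:.2f}")
```

In a notebook, `df.plot.scatter(x="temperature", y="sales")` would give the matching visual check; it relies on matplotlib, which ships with Anaconda.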
Prerequisites:
- Introductory knowledge of Python
- You need to bring your own laptop for this workshop. Install Anaconda on your computer. You can find installation instructions here. Please contact us (cdsi.science at mcgill.ca) if you have trouble with the installation.
Instructor: Kiwon Lee, Faculty Lecturer, Department of Mathematics & Statistics
Registration: