Basic Data Cleaning for Education and Social Sciences

Three-part workshop series on basic data cleaning hosted by the EHE Quantitative Methodology Center (QMC)

Lizeng Huang

The QMC is offering a three-part series on basic data cleaning in education and social sciences. The workshops will take place on February 2nd, February 9th, and February 16th. The overall goal of the series is to equip participants, particularly those in the fields of education and social sciences, with the essential skills and knowledge needed to clean, prepare, and transform data for robust and accurate analysis in their research and studies.

The workshops will be hosted by Lizeng Huang, a Ph.D. student in Learning Technologies and Graduate Research Associate for the QMC.

Read below for more information about each workshop. You may register for one, two, or all three workshops; registration for all of them is completed through a single link.

Registration is now closed. Recordings of the presentations can be found below!

Workshop 1: Fundamental Procedures for Data Cleaning

Friday, February 2nd, from 10:00 AM to 11:00 AM

During this session, the instructor will introduce data cleaning and its importance in the data analysis process. Participants will learn the key differences between raw data and clean data and why clean data is essential for accurate analysis. The workshop will cover data validation techniques to ensure data integrity and standardization methods to make data consistent. Participants will also engage in hands-on activities using Excel to practice data cleaning procedures, such as identifying and handling duplicates. Presented by the EHE QMC. Participants should have their own laptops with access to Excel and SPSS.

  • Data cleaning
  • Raw data vs clean data
  • Initial data preparation
  • Data validation and data standardization
  • Duplicate data
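
For readers who want a feel for these steps before watching the recording, here is a minimal sketch of standardization, validation, and duplicate removal. The workshop itself uses Excel; Python is used here only for illustration, and the column names, sample rows, and the 0 to 100 score range are all assumptions rather than workshop materials.

```python
# Hypothetical raw survey rows; names and values are invented for illustration.
raw_rows = [
    {"student_id": "S001", "school": " Lincoln High ", "score": "85"},
    {"student_id": "s001", "school": "lincoln high", "score": "85"},
    {"student_id": "S002", "school": "Roosevelt Middle", "score": "105"},
    {"student_id": "S003", "school": "Roosevelt Middle", "score": "78"},
]


def standardize(row):
    """Trim whitespace, unify letter case, and convert the score to a number."""
    return {
        "student_id": row["student_id"].strip().upper(),
        "school": row["school"].strip().title(),
        "score": int(row["score"]),
    }


def is_valid(row, low=0, high=100):
    """Assumed validation rule: scores must fall within the 0-100 range."""
    return low <= row["score"] <= high


def drop_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Standardize first so that rows differing only in case or spacing match.
standardized = [standardize(r) for r in raw_rows]
deduplicated = drop_duplicates(standardized)

# Separate rows that fail validation so they can be reviewed, not silently lost.
invalid = [r for r in deduplicated if not is_valid(r)]
clean = [r for r in deduplicated if is_valid(r)]

print("clean:", clean)
print("needs review:", invalid)
```

Standardizing before checking for duplicates matters here: the first two rows only become identical once case and whitespace are unified.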

Watch the recording for Part 1 of the workshop series here!

Workshop 2: Intermediate Techniques in Managing Outliers and Missing Data

Friday, February 9th, from 10:00 AM to 11:00 AM

During this session, the instructor will explain what outliers are and how they can affect analysis results. Participants will learn how to identify outliers using various statistical techniques. The workshop will also cover different strategies for handling outliers effectively. In addition, the session will briefly explore missing data mechanisms and techniques for handling missing data, with a focus on using SPSS. Participants will take part in hands-on activities to apply outlier identification and missing data handling methods to real datasets. Presented by the EHE QMC. Participants should have their own laptops with access to Excel and SPSS.

  • Outliers
  • Identifying outliers
  • Handling outliers
  • Missing data mechanisms
  • Different methods to deal with missing data
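
As a rough illustration of these topics, the sketch below flags outliers with the common 1.5 × IQR rule and then fills missing values with the mean. The session focuses on SPSS and may teach other methods; the Python code, the sample scores, and the choice of these two techniques are assumptions made for illustration only.

```python
import statistics

# Hypothetical test scores; None marks a missing response.
scores = [72, 75, 78, None, 80, 81, 83, 85, None, 150]

# Work with observed values only when computing quartiles.
observed = [s for s in scores if s is not None]

# IQR rule: flag values more than 1.5 * IQR below Q1 or above Q3.
q1, _median, q3 = statistics.quantiles(observed, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [s for s in observed if s < lower or s > upper]

# One handling strategy: set flagged values aside before summarizing.
retained = [s for s in observed if lower <= s <= upper]

# Mean imputation: replace each missing value with the mean of retained scores.
mean_retained = statistics.mean(retained)
imputed = [mean_retained if s is None else s for s in scores]

print("outliers:", outliers)
print("imputed:", imputed)
```

Mean imputation is shown because it is easy to follow, but it understates variability in the data, so it is best treated as a starting point rather than a recommendation.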

Watch the recording for Part 2 of the workshop series here!

Workshop 3: Advanced Transformation: Computing, Recoding, and Merging Data

Friday, February 16th, from 10:00 AM to 11:00 AM

During this session, the instructor will guide participants through advanced data cleaning techniques. This includes computing new variables and recoding existing variables to better suit analysis needs. Participants will also learn how to merge different datasets, allowing for more comprehensive analysis. Additionally, the session will cover data format transformations, such as converting data from long to wide or wide to long format, which can be crucial for specific analysis requirements in social sciences. The final hands-on activity will integrate content from all three sessions, giving participants a comprehensive understanding of data cleaning and transformation in practical data analysis scenarios. Presented by the EHE QMC. Participants should have their own laptops with access to Excel and SPSS.

  • Merging different datasets
  • Transforming data format (long to wide, wide to long)
  • Computing and recoding variables
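
The sketch below shows what these operations look like on a tiny dataset: computing a new variable, recoding a score into categories, merging two tables on a shared key, and reshaping between wide and long formats. It is written in Python for illustration only (the workshop uses Excel and SPSS), and the variable names, cut-off values, and sample data are hypothetical.

```python
# Hypothetical wide-format scores: one row per student, one column per time point.
wide = [
    {"student_id": "S001", "pre": 60, "post": 75},
    {"student_id": "S002", "pre": 82, "post": 88},
]

# Hypothetical demographics table keyed by the same student_id.
demographics = {"S001": {"grade": 7}, "S002": {"grade": 8}}


def recode_band(score):
    """Recode a numeric score into an assumed three-level category."""
    if score >= 80:
        return "high"
    if score >= 65:
        return "middle"
    return "low"


# Compute a new variable, recode an existing one, and merge on student_id.
merged = []
for row in wide:
    new_row = dict(row)
    new_row["gain"] = row["post"] - row["pre"]
    new_row["post_band"] = recode_band(row["post"])
    new_row.update(demographics.get(row["student_id"], {}))
    merged.append(new_row)

# Wide to long: one row per student per time point.
long_rows = [
    {"student_id": row["student_id"], "time": time, "score": row[time]}
    for row in wide
    for time in ("pre", "post")
]

# Long to wide: collect each student's time points back into a single row.
rebuilt = {}
for rec in long_rows:
    student = rebuilt.setdefault(rec["student_id"], {"student_id": rec["student_id"]})
    student[rec["time"]] = rec["score"]
wide_again = list(rebuilt.values())

print(merged)
print(long_rows)
```

Converting to long format and back returns the original table, which is a quick way to check that a reshape has not dropped or duplicated any records.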

Watch the recording for Part 3 of the workshop series here!

Click here to download SPSS

Contact us if you have questions or need assistance downloading SPSS.

This event will be presented with automated closed captions. If you wish to request traditional CART services or other accommodations, please contact Brian Timm at 614-247-6490. Requests made by January 21 will generally allow us to provide seamless access, but the university will make every effort to meet requests made after this date.