In the summer of 2016, I sent a call for proposals to many social service agencies in the Greater Lehigh Valley with the goal of finding a suitable database systems project for my Introduction to Database Systems Class (CS 320). Ms. Cassaundra Amato, M.A., Assistant Director of Measurement & Evaluation of United Way of the Greater Lehigh Valley, responded. She said, in part:
Pennsylvania Department of Education data is stored in different sections of the website, and there is no one place that lists all of the available data for a school or makes it easy to compare schools, levels (elementary, middle, high), or district to one another or as a group. This makes it very difficult for service providers, residents, parents and social impact funders to gather the most comprehensive view of what is going on in the community.
She went on to ask me to have my class:
combine and make searchable data points located in different Pennsylvania Department of Education datasets for a more holistic view of individual schools and school district in the Lehigh Valley (and possibly Pennsylvania), with the possible addition of district academic achievement gap and socioeconomic indicators from the Stanford Center for Education Policy Analysis data archive.
Since that time, my database and project classes have worked in teams to integrate and visualize this data. The most recent class to work with the data was the Fall 2017 Database Class. They worked on visualizations of the data for the Lehigh Valley Community in response to a request from the Lehigh Valley Research Consortium.
One helpful set of visualizations was produced by the Byerly’s Boys Team of Joe Cenci, Wassim Garbi, Zurabi Mestiashvili, and Darren Norton. See the following 16-minute video for an overview of their project.
The earlier Spring 2017 Senior Project Class, working in a team of three, audited and consolidated the work of the Fall 2016 database class, which had performed the initial integration of the data. They produced one integrated relational database and a tool for comparing data side by side. You can find out more about this project:
The Fall 2016 class, working in teams of four, downloaded more than 60 different spreadsheets, designed a database for them, uploaded the data, and built data visualization applications. You can learn more about three of the projects:
The following data about schools in Pennsylvania were included in the overall project:
- School fact summaries from school year 2012-2013 to the present
- Academic performance data from 2012-2013 to the present
- Enrollment reports for public schools from 2004-2005 to the present
- Enrollment reports for private schools from 2005-2006 to the present
- Cohort graduation rates from 2010-2011 to the present
- Dropout rates from 2004-2005 to the present
- Low-income data for public schools from 2005-2006 to the present
- Low-income data for private schools from 2005-2006 to the present
- Professional support personnel from 2012-2013 to the present
- Fiscal data from 2012-2013 and 2014-2015
- Keystone Exam results for schools from 2014-2015
- PSSA results for schools from 2014-2015 to the present
- State Accountability Assessment from 2013-2014
- SPP scores from 2012-2013 to the present
- Federal accountability data from 2013-2014
- School location information (latitude and longitude)
The following data about school districts in the United States were included in the project:
- Stanford Education Data Project: aggregated performance means in grade-equivalent units, constant population units, NAEP-referenced units, and state-referenced units, including white-black and white-Hispanic achievement gaps, along with model covariates
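To give a sense of what integrating these separate datasets involves, here is a minimal sketch in Python using the standard-library `sqlite3` module. It is not the schema or code the student teams produced. The table names, column names, and sample rows are hypothetical placeholders, chosen only to show how per-dataset tables keyed on a shared school identifier can be joined into one side-by-side view.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE school (
            school_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        CREATE TABLE enrollment (
            school_id   INTEGER REFERENCES school(school_id),
            school_year TEXT,
            students    INTEGER,
            PRIMARY KEY (school_id, school_year)
        );
        CREATE TABLE graduation (
            school_id   INTEGER REFERENCES school(school_id),
            school_year TEXT,
            cohort_rate REAL,
            PRIMARY KEY (school_id, school_year)
        );
        """
    )
    # Placeholder rows only; these are not real Pennsylvania data.
    conn.executemany(
        "INSERT INTO school VALUES (?, ?)",
        [(1, "Example School A"), (2, "Example School B")],
    )
    conn.executemany(
        "INSERT INTO enrollment VALUES (?, ?, ?)",
        [(1, "2014-2015", 500), (2, "2014-2015", 300)],
    )
    conn.executemany(
        "INSERT INTO graduation VALUES (?, ?, ?)",
        [(1, "2014-2015", 90.0)],
    )
    return conn


def compare_schools(conn, school_year):
    """Return one row per school with data points drawn from each dataset.

    LEFT JOINs keep a school in the result even when one dataset has no
    entry for it in the requested year (the missing value comes back as None).
    """
    query = """
        SELECT s.name, e.students, g.cohort_rate
        FROM school AS s
        LEFT JOIN enrollment AS e
               ON e.school_id = s.school_id AND e.school_year = ?
        LEFT JOIN graduation AS g
               ON g.school_id = s.school_id AND g.school_year = ?
        ORDER BY s.name
    """
    return conn.execute(query, (school_year, school_year)).fetchall()


if __name__ == "__main__":
    for row in compare_schools(build_demo_db(), "2014-2015"):
        print(row)
```

The left joins are a deliberate choice in this sketch: the datasets listed above cover different year ranges, so a school may be present in one and absent from another for a given year, and an inner join would silently drop it from the comparison.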
The challenges faced by these projects are an inspiration for my research in data integration and visualization.