Data Integration and Visualization

Big Data is the result of advances in technology that give us more capacity to collect, store, and manipulate data. It is a paradigm shift in data collection, measurement, and evaluation that has made orders of magnitude more data available in nearly every domain. Everyone has more data, and my goal is to help them make effective use of it, to understand it, … to harness data science insights. At its best, data science is accompanied by a thoughtful examination of the strengths and limitations of big data and the techniques applied to it. For example, data science helps us understand that correlation is not causation, and that hypothesis generation by empirical induction is not, by itself, explanation.

One of my research goals is to help data science practitioners, especially users untrained in databases, use big data effectively and correctly. My efforts span (1) data collection and integration, (2) data modeling, and (3) data analysis and visualization. My teaching is enhanced by my research, and my students often engage in service projects that harness big data for the welfare of the community. Two of these projects are the Soccer Project, which explores what makes a great college soccer player, and the United Way Project, which integrates more than 60 data sources on K-12 school information and performance in Pennsylvania and provides a web-based visualization platform for the data.

Another of my research goals is to imagine how the new types of data we can collect, such as crowd-sourced data, can be used to benefit society. This is part of the inspiration for my Artful Recommender Project.