I wanted to make sure that everyone gets to read this blog post by Skillman Library’s Digital Scholarship Services featuring our very own 2018 Summer Scholar, Ben Gordon, talking about his experience as a Data Science major at Lafayette, as well as his DHSS research project, New York City’s Subways, Bridges, Highways, and Expressways in the 20th Century.
I’m reposting the piece here:
My journey to Digital Scholarship Services and Data Science has been a long and rewarding one. I came into Lafayette with an interest in math because of a statistics course I took in high school. I was always successful in my math classes, but often didn’t understand the relevance of geometry, algebra, and trigonometry. Yes, problem solving was interesting, but for what purpose? This changed when I immersed myself in that statistics course. I truly fell in love with the applied nature of the discipline, using mathematical equations to find meaning in the real world. Assignments weren’t just problem solving, but included some form of essay writing to explain what our mathematical answer meant.
By the time I was a freshman in college, this course had stuck with me and I was itching to continue. I was hoping for an applied version of mathematics in college, so I decided to major in math. I did not know what engineering entailed, although it might have been closer to what I was looking for. However, when I took Transition to Theoretical Mathematics, the infamous “weed out” course in the math major, I found it challenging to connect with the material.
This experience prompted me to try to forge my own applied statistics path in the major, so I met with the head of the math department. He understood my difficulties with the theoretical nature of the required classes for the major, and told me that the department’s future plans included new statistics teachers and maybe a statistics major. But at the time, there was no statistics major.
The next semester, I decided to switch into a computer science major as a junior. The summer before my junior year, I made sure that I was going to approach my Data Structures and Algorithms course (the “weed out” course in the CS major) very differently from how I had approached the math major. I spent my summer taking notes from the textbook when I rode the subway to and from work, and practicing coding on my computer. I wanted to put myself in the best possible situation to succeed going into the class.
Yet when I started the class, I got failing grades on my first two labs. I had done all I could possibly do to prepare for this next step over the summer; what was I missing? Why was I still failing? I met with my teacher and tried to work through the mistakes, devoting most of my time here at Lafayette to passing this class.
Eventually, I hit my stride and started getting passing grades and better on my labs, tests, and projects. I finally felt some sort of relief that my change in attitude towards school, combined with a new interest in what I was learning, was going to get me through this class, and subsequently the computer science major. But one day after class, when I did not expect it, my professor approached me and asked what my plan was for the next two years. I explained what I have written above: I was driven out of the math major because I wanted to do statistics, and was going to try to squeeze a computer science major into four semesters.
He said that he thought I could still make my own Data Science major. I was ecstatic: Data Science sounded a lot like statistics. Finally, after trying a year earlier and giving up, I was going to be able to make my own major and study the discipline I actually wanted to. Then I met the director of Digital Scholarship Services, Charlotte Nunes, who told me there was an academic planning committee for Data Science & Digital Scholarship. I was excited, and realized that I was actually the guinea pig for Data Science at Lafayette College.
This long process was how I got introduced to Digital Scholarship Services in the library, and to all of the different intersections between Computer Science, Data Science, and what is being done in DSS. I then started doing research for Charlotte on topic modeling and other text analyses in R. Matthew Jockers is one of the leading scholars on this buzzworthy subject in the field of digital scholarship. We spent the semester working through Jockers’ how-to book and discussing how his work was received by scholars.
This is a long and winding story, but I ended up in Data Science and Digital Scholarship Services for the same reasons I wanted to study statistics in the first place. I was interested in understanding the world around us in ways that were only possible with mathematical and technological methods. And Digital Scholarship Services is building capacity in this area. Topic modeling and Matthew Jockers’ scholarship, for example, are all about trying to discover new aspects of text that would be impossible to find without technology.
Response from Charlotte Nunes
Ben Gordon’s research project, New York City’s Subways, Bridges, Highways, and Expressways in the 20th Century, offers a great example of how Lafayette College Libraries supports data-oriented undergraduate and faculty research. Ben completed the project under the supervision of Angela Perkins, Research & Instruction Librarian and director of the Lafayette College Libraries Digital Humanities Summer Scholars (DHSS) program. As part of this program, Ben consulted with members of Digital Scholarship Services, including John Clark, Data Visualization & GIS Librarian, on geospatial data discovery, research data management, data transformation, and data visualization. Ben currently collaborates with Janna Avon, Digital Initiatives Librarian, and me on a text analysis project featuring oral history transcripts.
In the academic year leading up to his DHSS project, I appointed Ben as a student worker to help me clarify where DSS might build on departmental strengths to better consult on introductory data science methods for analyzing data in the humanities and humanistic social sciences. Together we explored data science as a varied, multidisciplinary field involving data analytics, data visualization, and data ethics. The field requires skills in finding, cleaning, and organizing data, articulating research questions, drawing interpretive conclusions from statistical inference, and communicating persuasively about the results of data analysis.
As Lafayette College advances its Data Science & Digital Scholarship academic planning initiative, I anticipate that DSS will continue to provide wide-ranging data services while growing in new areas, including:
- Providing consultation and workshop instruction on humanistic uses of R, a statistical computing language that supports a variety of modeling, clustering, and visualization techniques for text corpora.
- Building research-ready digital archival collections guided by the principle of Collections as Data, and consulting on data mining in primary source databases such as Adam Matthew Digital.
- Exploring uses of artificial intelligence and natural language processing for humanistic data analysis, as addressed in the Institute of Museum and Library Services grant Investigating the National Need for Library Based Topic Modeling Discovery Systems.
I thank Ben for his work exploring the field of data science, building skills as a practitioner, and helping to set a vision for the future of DSS. Check out the Lafayette News coverage of Ben’s research titled Uncovering Political History of NYC Subways!