Data Mining Ethics Course Syllabus

_________________________________________________________

Data Mining Ethics (STSC)

EGRS Department

_____________________________________________

 

Course Description:

 

In today’s world, data is one of the most valuable resources a business can have. To create more equitable systems going forward, we need to understand where data comes from and how it is used. This course engages with the ethical dilemmas surrounding data mining. Its four main focuses are socio-technical systems, an introduction to data mining, bias in data mining, and privacy in data mining. In the first module, students will gain a better understanding of our relationship with technology and its place in society. Once they view data mining as part of a broader socio-technical system, students will be better equipped to analyze the ethical issues discussed later in the class. A background in data mining then helps students understand what data mining is technically capable of and the industries in which it is used. In the two main modules, bias in data mining and privacy in data mining, the course uses practical data lessons that let students see ethical issues in algorithms firsthand.

 

This course does not require a significant computer science background, as it is meant to encourage students from different disciplines to join the discussion surrounding data mining ethics.

 

Assignment Values:

40%   Data set activities

30%   Essays, reflections, and misc. writing assignments

30%   Class participation

 

Specific Student Outcomes:

  1. Analyze technology across different contexts
  2. Develop technical understanding of data mining & its applications
  3. Identify ethical issues relating to bias in data mining
  4. Identify ethical issues relating to privacy and data mining
  5. Demonstrate proficiency in using and manipulating data to show an outcome
  6. Introduce methods of ethical arguments and analysis
  7. Develop teamwork, organization, and communication skills

 

_______________________________________________________________________

 

Module 1: Technology as a Socio-Technical System

Week               Subject                           Activity
1
  • What is technology?
  • Components of technology
  • S. Matthewman, Technology and Social Theory [read ch. 1, pp. 8-28]
  • Breaking down a technology into its components  
2
  • Technology and society
  • Technology is value laden
  • Continued discussion of Matthewman
  • D. Nye, Technology Matters [read abridged ch. 2 & 4 on Moodle]

 

Module 2: Introduction to Data Mining

 

3
  • Essay on The Human Face of Big Data
  • Rise of big data
  • Importance of data
  • Watch and discuss The Human Face of Big Data
4
  • What is data mining?
    • Machine Learning Algorithms
  • Proliferation of data mining
    • How data mining is used
  • Read “All Machine Learning Models Explained in 6 Minutes”
  • Discuss how data mining and machine learning models work
  • Discuss various applications across industries 

 

Module 3: Bias in Data Mining

5  

  • Reflection paper on both evolution of bias and bias in technology
  • Evolution of bias
  • Bias in technology
  • R. Benjamin, Race after Technology [read ch. 1, pp. 49-76]
  • S. Noble, Algorithms of Oppression [read the intro, pp. 1-14]
  • V. Eubanks, Interview with J. Rossmann: “Virginia Eubanks on Digital Surveillance and People Power”
6
  • Facial recognition
    • Societal uses
    • Facial recognition in schools
    • Summary of “Coded Bias”
  • Interact with the “How Normal Am I?” website
  • Read “Technology Assessment Report on Facial Recognition”
  • Watch “Coded Bias” 
7
  • Predictive policing
    • PredPol
    • LASER
  • Watch CBSN’s documentary Racial Profiling 2.0
  • Read “High-Tech Policing” by Barbara Mantel for background
8 Activity 1 

  • Students will be given two large data sets, Data Set A and Data Set B, describing K-12 students by age, family background, extracurriculars, and grades.
  • Students will experiment with different data models and draw conclusions based on their analysis.
  • Students will then present the conclusions drawn from the data, learning how two similar data sets can lead to very different outcomes.
  • The only difference between Data Set A and Data Set B is whether the students come from a low-income or a high-income neighborhood.
9
  • Presentation on data set activity findings
  • Discussion of potential solutions to bias in data mining

 

Module 4: Privacy & Data Mining

10
  • Privacy in the digital age
  • Current privacy laws
  • Discuss the right to privacy
  • Are our laws behind technological advancement?
11
  • Privacy on social media
    • Lack of regulation?
  • Surveillance capitalism
  • Essay on The Social Dilemma
  • Watch and discuss The Social Dilemma
  • S. Zuboff, The Age of Surveillance Capitalism [read intro]
12
  • Political ramifications
  • Erosion of democracy
    • Cambridge Analytica 
  • Watch and discuss The Great Hack
  • Watch and discuss Zuckerberg Senate Hearing Highlights (link on Moodle)
13 Activity 2

  • Students will break into groups to discuss and research a privacy policy they would like to implement
  • They must then analyze, in an essay, how this change in policy will affect the individual, the corporate sector, society as a whole, and the government
14
  • Presentation on policy prescriptions

 

Class Materials:

Amer, K., & Noujaim, J. (2019, January 26). The Great Hack. Netflix. https://www.netflix.com/title/80117542

Benjamin, R. (2019). Race After Technology: Abolitionist Tools for the New Jim Code [Excerpt]. Polity. Retrieved December 3, 2020, from https://moodle.lafayette.edu/pluginfile.php/588004/course/section/252639/Race-after-Technology-Excerpt.pdf

CBS News. (2020, March 6). Racial Profiling 2.0 | Full Documentary [Video]. https://www.youtube.com/watch?v=DAoe22-r-QQ

CNET. (2018, April 10). Zuckerberg’s Senate hearing highlights in 10 minutes [Video]. https://www.youtube.com/watch?v=EgI_KAkSyCw

Kantayya, S. (2020, November 11). Coded Bias [Documentary]. https://www.codedbias.com/

Matthewman, S. (2011). Technology and Social Theory. Red Globe Press. https://moodle.lafayette.edu/pluginfile.php/588004/course/section/252637/Matthewman.Intro.2011.pdf

Noble, S. U. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism [Excerpt]. NYU Press. Retrieved December 3, 2020, from https://moodle.lafayette.edu/pluginfile.php/588004/course/section/252639/Algorithms-of-Oppression-Excerpt.pdf

Nye, D. (2006). Technology Matters: Questions to Live With. MIT Press. https://moodle.lafayette.edu/pluginfile.php/588004/course/section/252637/Nye.Technology%20Matters.Ch%202%20and%204%20Excerpt%202018.pdf

Orlowski, J. (2020, January 26). The Social Dilemma. Netflix. https://www.netflix.com/title/81254224

Public Thinker: Virginia Eubanks on Digital Surveillance and People Power. (2020, July 9). Public Books. https://www.publicbooks.org/public-thinker-virginia-eubanks-on-digital-surveillance-and-people-power/

Schep, T. (n.d.). How normal am I? Retrieved November 30, 2020, from https://www.hownormalami.eu/

Shin, T. (2020, October 17). All Machine Learning Models Explained in 6 Minutes. Towards Data Science. https://towardsdatascience.com/all-machine-learning-models-explained-in-6-minutes-9fe30ff6776a

Smolan, R., & Erwitt, J. (2012). The Human Face of Big Data. Against All Odds Productions. https://play.hbomax.com/feature/urn:hbo:feature:GXp5LawJYbaC7uAEAAAOW?camp=googleHBOMAX

Zuboff, S. (2019). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power (1st ed.). PublicAffairs.