Course Advice
Students Considering a Major in Data Science
Data science is new and evolving; there are many important combinations of theoretical, applied, and field-specific knowledge that may provide a foundation for future work. If you are interested in a data science major, we recommend that you work with your advisor to choose a set of related courses that reflect your interests and priorities from the list of electives. Course combinations that focus on individual topics, disciplines, or domains are strongly recommended. We also strongly recommend substantial engagement with issues of ethics, which could be in one focused course or across multiple courses.
Students Considering Graduate School or a Career as a Data Scientist
While there are many fields for which the combination of data analysis and computational tools may be valuable, we have particular recommendations for students seeking a future as a data scientist. We strongly recommend that you take both COMSC-335 Machine Learning and STAT-340 Applied Regression Methods. Ideally, at least one course should involve an extended project requiring the analysis of data. We also recommend that you contextualize your data science preparation in the content of a domain or area of study that is theoretically and empirically cohesive.
Course Offerings
DATA-113 Introduction to Data Science
Data scientists answer questions with scientific and social relevance using statistical theory and computation. We will discuss elementary topics in statistics and learn how to write code (in Python) to visualize data and perform simulations. We will use these tools to answer questions about real data sets. We will also explore ethical issues faced by data scientists today.
DATA-225 Topics in Data Science
DATA-225AR Topics in Data Science: 'Ethics and Artificial Intelligence'
Artificially intelligent technologies are prominent features of modern life -- as are ethical concerns about their programming and use. In this class we will use the tools of philosophy to explore and critically evaluate ethical issues raised by current and future AI technologies. Topics may include issues of privacy and transparency in online data collection, concerns about social justice in the use of algorithms in areas like hiring and criminal justice, and the goals of developing general versus special purpose AI. We will also look at ethics for AI: the nature of AI 'minds,' the possibility of creating more ethical AI systems, and when and if AIs themselves might deserve moral rights.
DATA-225DH Topics in Data Science: 'Introduction to Digital Humanities'
This class is an interdisciplinary course that examines the application of computational tools and methodologies to humanities research, with a strong emphasis on practical Python programming. It covers key topics such as image processing, data visualization, and statistical analysis applied in various domains, including history, archaeology, and the arts. Students engage with diverse case studies and projects, employing computational and statistical techniques to analyze and interpret complex real-world datasets. The course also critically explores methodological challenges in digital humanities, including issues related to sparse data, noisy contexts, and the inherent limits of interpretation.
DATA-295 Independent Study
DATA-350 Advanced Topics in Data Science
DATA-350TE Advanced Topics in Data Science: 'Technology, Ethics, and Public Policy'
In this course, we study the most pressing ethical concerns relating to emerging technology and envision novel policy solutions to address them. Existing regulatory and policy instruments are often unable to provide sufficient oversight for emerging technology. Can legal anti-discrimination doctrine address biased algorithmic decision-making systems? How does generative artificial intelligence challenge traditional ways of thinking about intellectual property? Do we have rights over the personal data that private firms collect about us? We examine these gaps in the context of contemporary regulatory proposals on national, multinational, and international scales.
DATA-390 Research and Topics in Data Science
This seminar provides an opportunity for students from all disciplines to do guided research using data science tools in a research project of their choice. Students will develop an understanding of the full pipeline of successful data science research by selecting a topic, identifying relevant datasets, designing research methods, conducting in-depth analyses, deriving meaningful conclusions, and submitting a final report. Opportunities for students to present their work and review journal articles create a scaffolded approach. Past project topics include geology, music, demographics, art, economics, government, religion, transportation, and law.
DATA-395 Independent Study
DATA-395P Independent Study w/Practicum
Required Core Courses for the Data Science Major
Course List
Code | Title | Credits |
COMSC-151 | Introduction to Computational Problem Solving | 4 |
COMSC-205 | Data Structures | 4 |
COMSC-335 | Machine Learning | 4 |
MATH-211 | Linear Algebra | 4 |
STAT-140 | Introduction to the Ideas and Applications of Statistics | 4 |
STAT-242 | Intermediate Statistics | 4 |
STAT-340 | Applied Regression Methods | 4 |
Note: Majors need to take either COMSC-335 or STAT-340.
Elective Courses for the Data Science Major
Course List
Code | Title | Credits |
BIOL-223 | Ecology with Laboratory | 4 |
BIOL-234 | Biostatistics with Laboratory | 4 |
BIOL-350GE | Topics in Biological Sciences: 'Genomics and Bioinformatics with Laboratory' | 4 |
COMSC-235 | Applications of Machine Learning | 4 |
COMSC-312 | Algorithms | 4 |
COMSC-334 | Artificial Intelligence | 4 |
COMSC-335 | Machine Learning | 4 |
COMSC-341CD | Topics: 'Causal Inference for Data Science' | 4 |
COMSC-341NL | Topics: 'Natural Language Processing' | 4 |
COMSC-341TE | Topics: 'Text Technologies for Data Science' | 4 |
DATA-113 | Introduction to Data Science | 4 |
DATA-225AR | Topics in Data Science: 'Ethics and Artificial Intelligence' | 4 |
DATA-350TE | Advanced Topics in Data Science: 'Technology, Ethics, and Public Policy' | 4 |
DATA-390 | Research and Topics in Data Science | 4 |
ECON-220 | Introduction to Econometrics | 4 |
ECON-320 | Econometrics | 4 |
EOS-299AR | Topic: 'Ethics and Artificial Intelligence' | 4 |
GEOG-205 | Mapping and Spatial Analysis | 4 |
GEOG-210 | GIS for the Social Sciences and Humanities | 4 |
MATH-339PT | Topics in Applied Mathematics: 'Optimization' | 4 |
MATH-339SP | Topics in Applied Mathematics: 'Stochastic Processes' | 4 |
MATH-342 | Probability | 4 |
PHIL-260AR | Topics in Applied Philosophy: 'Ethics and Artificial Intelligence' | 4 |
PHIL-350TE | Topics in Philosophy: 'Technology, Ethics, and Public Policy' | 4 |
POLIT-387EC | Advanced Topics in Politics: 'U.S. Elections' | 4 |
PSYCH-326CP | Laboratory in Personality and Abnormal Psychology: 'Advanced Statistics in Clinical Psychology' | 4 |
SOCI-216TX | Special Topics in Sociology: 'Text as Data I: From Qualitative to Quantitative Text Analysis' | 4 |
SOCI-316TX | Special Topics in Sociology: 'Text as Data II: Computational Text Analysis for the Social Sciences' | 4 |
STAT-244MP | Intermediate Topics in Statistics: 'Survey Sampling' | 4 |
STAT-244NF | Intermediate Topics in Statistics: 'Infectious Disease Modeling' | 4 |
STAT-244NP | Intermediate Topics in Statistics: 'Nonparametric Statistics' | 4 |
STAT-244SC | Intermediate Topics in Statistics: 'Computational Statistics' | 4 |
STAT-331 | Design of Experiments | 4 |
STAT-340 | Applied Regression Methods | 4 |
STAT-343 | Mathematical Statistics | 4 |
STAT-344ND | Seminar in Statistics and Scientific Research: 'Analysis of Neural Data' | 4 |
STAT-344TM | Seminar in Statistics and Scientific Research: 'Time Series Analysis' | 4 |
STAT-351 | Bayesian Statistics | 4 |
Note: DATA-113 can only count toward the major if taken before completing a 200-level course in Statistics or Computer Science.