Data Science

Undergraduate

Data science extracts meaningful knowledge from large amounts of complex data by using scientific methods, statistics and computing tools.

Program Overview

The data science major lives at the intersection of many fields. Cross-disciplinary by definition, this major is an exercise in integration, one that brings together technical skills and domain-specific knowledge to create new pathways for learning. The data science major will prepare you to face emerging questions and challenges in the communication of data concepts and outcomes. Offering rigorous training in statistics and computer science, the major emphasizes how to be an effective, ethical and judicious consumer of data.

The major offers a foundational understanding of the data generating process, the appropriate and efficient translation of analytic strategies to specific data settings, the potential biases arising from missing data or data collection, the means for drawing accurate conclusions, and the techniques and principles of integrity in data visualization and communication. As part of your data science education, you will develop excellent communication skills and the ability to make clear and persuasive arguments framed by logic and supported by data.

The curriculum is flexible and innovative, serving diverse backgrounds and disciplinary interests, and deep enough to accommodate anyone who ultimately wants to pursue advanced study in statistics and computer science. The Data Science curriculum reflects the increasingly collaborative and interdisciplinary academic landscape.

Courses and Requirements

Culminating in an integrative capstone and data ethics course, the data science curriculum includes required coursework in mathematics and statistics with a choice of domain area, such as English, economics or chemistry.

Learning Goals

  • Apply core concepts of statistics, computing, and domain knowledge to extract insight from data sets.
  • Understand the social and ethical issues surrounding data collection, analysis and use.
  • Communicate the results of large-scale data analysis in multiple modalities.

Requirements for the Major

A minimum of 40 credits:

STAT-140 Introduction to the Ideas and Applications of Statistics (4 credits)
STAT-242 Intermediate Statistics (4 credits)
As a prerequisite for MATH-211: MATH-102 Calculus II (or above)
MATH-211 Linear Algebra (4 credits)
COMSC-151 Introduction to Computational Problem Solving [1] (4 credits)
COMSC-205 Data Structures (4 credits)
12 credits at the 300 level from at least two departments or programs and chosen from the approved list of elective courses for Data Science [2, 3, 4, 5] (12 credits). One course must be either:
  COMSC-335 Machine Learning [6, 7]
  or STAT-340 Applied Regression Methods
8 additional credits chosen from the approved list of elective courses for Data Science [2, 3, 4, 5] (8 credits)
Total Credits: 40
[1] Any COMSC-151 offering, for example, COMSC-151CP, COMSC-151DS, or COMSC-151HC.
[2] Students who do not elect both COMSC-335 and STAT-340 will need to choose two other 300-level courses from this list, one of which is from a department other than their COMSC-335/STAT-340 choice.
[3] Many elective courses require prerequisites. Students are encouraged to plan their elective courses early in order to ensure that they meet the requirements to access chosen courses.
[4] Students are strongly encouraged to take an elective course in ethics.
[5] Other courses that focus on ethics, cover data analytic methods, or involve an independent project with data can be substituted with approval of the Data Science Program Committee.
[6] Students intending to attend graduate school in data science are advised to take both of these courses.
[7] COMSC-335 Machine Learning requires MATH-232 as a prerequisite.

Additional Specifications

  • Students who declare a Data Science major automatically fulfill the College's "outside the major" requirement.

Course Advice

Students Considering a Major in Data Science

Data science is new and evolving; there are many important combinations of theoretical, applied, and field-specific knowledge that may provide a foundation for future work. If you are interested in a data science major, we recommend that you work with your advisor to choose a set of related courses that reflect your interests and priorities from the list of electives. Course combinations that focus on individual topics, disciplines, or domains are strongly recommended. We also strongly recommend substantial engagement with issues of ethics, which could be in one focused course or across multiple courses.

Students Considering Graduate School or a Career as a Data Scientist

While there are many fields for which the combination of data analysis and computational tools may be valuable, we have particular recommendations for students seeking a future as a data scientist. We strongly recommend that you take both COMSC-335 Machine Learning and STAT-340 Applied Regression Methods. Ideally, at least one course should involve an extended project requiring the analysis of data. We also recommend that you contextualize your data science preparation in the content of a domain or area of study that is theoretically and empirically cohesive.

Course Offerings

DATA-113 Introduction to Data Science

Fall and Spring. Credits: 4

Data scientists answer questions with scientific and social relevance using statistical theory and computation. We will discuss elementary topics in statistics and learn how to write code (in Python) to visualize data and perform simulations. We will use these tools to answer questions about real data sets. We will also explore ethical issues faced by data scientists today.

Applies to requirement(s): Math Sciences

DATA-225 Topics in Data Science

DATA-225AR Topics in Data Science: 'Ethics and Artificial Intelligence'

Spring. Credits: 4

Artificially intelligent technologies are prominent features of modern life, as are ethical concerns about their programming and use. In this class we will use the tools of philosophy to explore and critically evaluate ethical issues raised by current and future AI technologies. Topics may include issues of privacy and transparency in online data collection, concerns about social justice in the use of algorithms in areas like hiring and criminal justice, and the goals of developing general versus special purpose AI. We will also look at ethics for AI: the nature of AI 'minds,' the possibility of creating more ethical AI systems, and when and if AIs themselves might deserve moral rights.

Crosslisted as: PHIL-260AR, EOS-299AR
Applies to requirement(s): Humanities

DATA-225DH Topics in Data Science: 'Introduction to Digital Humanities'

Spring. Credits: 4

This class is an interdisciplinary course that examines the application of computational tools and methodologies to humanities research, with a strong emphasis on practical Python programming. It covers key topics such as image processing, data visualization, and statistical analysis applied in various domains, including history, archaeology, and the arts. Students engage with diverse case studies and projects, employing computational and statistical techniques to analyze and interpret complex real-world datasets. The course also critically explores methodological challenges in digital humanities, including issues related to sparse data, noisy contexts, and the inherent limits of interpretation.

Applies to requirement(s): Math Sciences
Prereq: DATA-113 (or COMSC-151 and STAT-140) or equivalent familiarity with Python and statistics. Contact the instructor if needed.

DATA-295 Independent Study

Fall and Spring. Credits: 1 - 4

Restrictions: Contact instructor for independent study declaration form and signatures.
Instructor permission required.

DATA-350 Advanced Topics in Data Science

DATA-350TE Advanced Topics in Data Science: 'Technology, Ethics, and Public Policy'

Not Scheduled for This Year. Credits: 4

In this course, we study the most pressing ethical concerns relating to emerging technology and envision novel policy solutions to address them. Existing regulatory and policy instruments are often unable to provide sufficient oversight for emerging technology. Can legal anti-discrimination doctrine address biased algorithmic decision-making systems? How does generative artificial intelligence challenge traditional ways of thinking about intellectual property? Do we have rights over the personal data that private firms collect about us? We examine these gaps in the context of contemporary regulatory proposals on national, multinational, and international scales.

Crosslisted as: PHIL-350TE
Applies to requirement(s): Humanities
Other Attribute(s): Writing-Intensive
Prereq: 8 credits in Philosophy.

DATA-390 Research and Topics in Data Science

Fall. Credits: 4

This seminar provides an opportunity for students from all disciplines to do guided research using data science tools in a research project of their choice. Students will develop an understanding of the full pipeline of successful data science research by selecting a topic, identifying relevant datasets, designing research methods, conducting in-depth analyses, deriving meaningful conclusions, and submitting a final report. Opportunities for students to present their work and review journal articles create a scaffolded approach. Past project topics include geology, music, demographics, art, economics, government, religion, transportation, and law.

Applies to requirement(s): Math Sciences
Prereq: STAT-242 and COMSC-205, or a 300-level class in Statistics or Computer Science.

DATA-395 Independent Study

Fall and Spring. Credits: 1 - 8

Restrictions: Contact instructor for independent study declaration form and signatures.
Instructor permission required.

DATA-395P Independent Study w/Practicum

Fall and Spring. Credits: 1 - 8

Restrictions: Contact instructor for independent study declaration form and signatures.
Instructor permission required.

Required Core Courses for the Data Science Major

Computer Science
  COMSC-151 Introduction to Computational Problem Solving (4 credits)
  COMSC-205 Data Structures (4 credits)
  COMSC-335 Machine Learning (4 credits)
Mathematics
  MATH-211 Linear Algebra (4 credits)
Statistics
  STAT-140 Introduction to the Ideas and Applications of Statistics (4 credits)
  STAT-242 Intermediate Statistics (4 credits)
  STAT-340 Applied Regression Methods (4 credits)

Note: Majors need to take either COMSC-335 or STAT-340.

Elective Courses for the Data Science Major

Biological Sciences
  BIOL-223 Ecology with Laboratory (4 credits)
  BIOL-234 Biostatistics with Laboratory (4 credits)
  BIOL-350GE Topics in Biological Sciences: 'Genomics and Bioinformatics with Laboratory' (4 credits)
Computer Science
  COMSC-235 Applications of Machine Learning (4 credits)
  COMSC-312 Algorithms (4 credits)
  COMSC-334 Artificial Intelligence (4 credits)
  COMSC-335 Machine Learning (4 credits)
  COMSC-341CD Topics: 'Causal Inference for Data Science' (4 credits)
  COMSC-341NL Topics: 'Natural Language Processing' (4 credits)
  COMSC-341TE Topics: 'Text Technologies for Data Science' (4 credits)
Data Science
  DATA-113 Introduction to Data Science (4 credits)
  DATA-225AR Topics in Data Science: 'Ethics and Artificial Intelligence' (4 credits)
  DATA-350TE Advanced Topics in Data Science: 'Technology, Ethics, and Public Policy' (4 credits)
  DATA-390 Research and Topics in Data Science (4 credits)
Economics
  ECON-220 Introduction to Econometrics (4 credits)
  ECON-320 Econometrics (4 credits)
Entrepreneurship, Orgs & Soc
  EOS-299AR Topic: 'Ethics and Artificial Intelligence' (4 credits)
Geography
  GEOG-205 Mapping and Spatial Analysis (4 credits)
  GEOG-210 GIS for the Social Sciences and Humanities (4 credits)
Mathematics
  MATH-339PT Topics in Applied Mathematics: 'Optimization' (4 credits)
  MATH-339SP Topics in Applied Mathematics: 'Stochastic Processes' (4 credits)
  MATH-342 Probability (4 credits)
Philosophy
  PHIL-260AR Topics in Applied Philosophy: 'Ethics and Artificial Intelligence' (4 credits)
  PHIL-350TE Topics in Philosophy: 'Technology, Ethics, and Public Policy' (4 credits)
Politics
  POLIT-387EC Advanced Topics in Politics: 'U.S. Elections' (4 credits)
Psychology
  PSYCH-326CP Laboratory in Personality and Abnormal Psychology: 'Advanced Statistics in Clinical Psychology' (4 credits)
Sociology
  SOCI-216TX Special Topics in Sociology: 'Text as Data I: From Qualitative to Quantitative Text Analysis' (4 credits)
  SOCI-316TX Special Topics in Sociology: 'Text as Data II: Computational Text Analysis for the Social Sciences' (4 credits)
Statistics
  STAT-244MP Intermediate Topics in Statistics: 'Survey Sampling' (4 credits)
  STAT-244NF Intermediate Topics in Statistics: 'Infectious Disease Modeling' (4 credits)
  STAT-244NP Intermediate Topics in Statistics: 'Nonparametric Statistics' (4 credits)
  STAT-244SC Intermediate Topics in Statistics: 'Computational Statistics' (4 credits)
  STAT-331 Design of Experiments (4 credits)
  STAT-340 Applied Regression Methods (4 credits)
  STAT-343 Mathematical Statistics (4 credits)
  STAT-344ND Seminar in Statistics and Scientific Research: 'Analysis of Neural Data' (4 credits)
  STAT-344TM Seminar in Statistics and Scientific Research: 'Time Series Analysis' (4 credits)
  STAT-351 Bayesian Statistics (4 credits)

Note: DATA-113 can count toward the major only if taken before completing a 200-level course in Statistics or Computer Science.
