Data Science

Undergraduate

Data science extracts meaningful knowledge from large amounts of complex data by using scientific methods, statistics and computing tools.

Program Overview

The data science major lives at the intersection of many fields. Cross-disciplinary by definition, this major is an exercise in integration, one that brings together technical skills and domain-specific knowledge to create new pathways for learning. The data science major will prepare you to face emerging questions and challenges in the communication of data concepts and outcomes. Offering rigorous training in statistics and computer science, the major emphasizes how to be an effective, ethical and judicious consumer of data.

The major offers a foundational understanding of the data generating process, the appropriate and efficient translation of analytic strategies to specific data settings, the potential biases arising from missing data or data collection, the means for drawing accurate conclusions, and the techniques and principles of integrity in data visualization and communication. As part of your data science education, you will develop excellent communication skills and the ability to make clear and persuasive arguments framed by logic and supported by data.

The curriculum is flexible and innovative, serving students with diverse backgrounds and disciplinary interests, and deep enough to accommodate anyone who ultimately wants to pursue advanced study in statistics and computer science. The Data Science curriculum reflects the increasingly collaborative and interdisciplinary academic landscape.

ALUM CONNECTIONS

Stories from Data Science Alums

Valerie Barr ’77, Jean E. Sammet Professor of Computer Science and chair of Computer Science

Courses and Requirements

Culminating in an integrative capstone and data ethics course, the data science curriculum includes required coursework in mathematics and statistics with a choice of domain area, such as English, economics or chemistry.

Learning Goals

  • Apply core concepts of statistics, computing, and domain knowledge to extract insight from data sets.
  • Understand the ethical challenges and potential privacy issues involved in data analysis.
  • Be able to communicate the results of large-scale data analysis in multiple modalities.

Requirements for the Major

A minimum of 40 credits:

STAT-140 Introduction to the Ideas and Applications of Statistics (4 credits)
STAT-242 Intermediate Statistics (4 credits)
STAT-340 Applied Regression Methods (4 credits)
As a prerequisite for MATH-211: MATH-102 Calculus II (or above)
MATH-211 Linear Algebra (4 credits)
COMSC-151 Introduction to Computational Problem Solving [1] (4 credits)
COMSC-205 Data Structures [2] (4 credits)
COMSC-335 Machine Learning, or 300-level alternative to COMSC-335 (4 credits)
Two courses at the 200 level or above within a single domain area [3] (8 credits)
DATA-390 Data Science Capstone [4] (4 credits)
Total Credits: 40

[1] The combinations of COMSC-150 and COMSC-121 or of COMSC-150 and COMSC-161 are equivalent to COMSC-151.

[2] The combination of COMSC-205PY and COMSC-122 is equivalent to COMSC-205.

[3] A domain area, chosen in consultation with the student's Data Science advisor, is defined as any College-defined major excluding mathematics, statistics, and computer science. Course selection must be approved by the student's Data Science advisor.

[4] The study of ethics in relation to data science is integrated throughout the curriculum and emphasized in this integrative capstone course.

Other Requirements

  • At the time of major declaration, a domain area will be selected by the student in consultation with an advisor from Data Science.

  • Prior to the DATA-390 Capstone course, each Data Science major will submit to their advisor a brief reflection document on the domain area, its connection to data science, and topics they might pursue for their capstone research. The Capstone will be offered in the spring term and run as a research seminar.

Additional Specifications

  • Course substitutions through the Five Colleges require pre-approval in writing by an advisor from Data Science.

  • With the possible exception of the capstone, independent studies cannot be used to satisfy any of the above requirements unless approved by the Data Science Program Committee.
  • Students who declare a Data Science major automatically fulfill the College's "outside the major" requirement.

Sample Domain Pathways

At the time of major declaration, the student selects a domain area in consultation with an advisor from Data Science. Some sample pathways are described below:

Chemistry

Analytical and physical chemists often generate and analyze significant amounts of data. Analysis methods learned in analytical or physical chemistry courses are regularly applied to organic, inorganic, or biochemical systems. A two-course sequence highlighting both methods and systems could include (a) a course in analytical or physical chemistry and (b) a course with a focus on organic, inorganic, or biochemical materials. A sequence focused more heavily on data generation and analysis could instead consist of two courses in analytical and/or physical chemistry. All first courses in the above sub-areas of chemistry require CHEM-150 General Chemistry: Foundations of Structure and Reactivity, and some also require CHEM-202 Organic Chemistry I and/or MATH-203 Calculus III.

Economics

Data touches nearly all parts of economics by informing models and revealing patterns and causal relationships. Data science is becoming an essential part of every subfield in economics. For example, students interested in (1) finance might take ECON-270 Accounting and ECON-215 Economics of Corporate Finance; (2) development might take ECON-213 Economic Development; (3) theory might take ECON-201 Game Theory and ECON-212 Microeconomic Theory. Almost all 200-level courses in economics require ECON-110 Introductory Economics as a prerequisite.

English

Digital Humanities and New Media Studies represent two humanities avenues for potential cross-pollination with data analysis. Topic modeling, text mining, and database construction for interactive editions of texts are examples of areas of digital humanities that lend themselves to asking interesting questions about large humanities corpora. Students interested in English and Data Science would take courses in literary analysis and at least one upper-level course in digital humanities in the Five Colleges. For example, students interested in (1) text analysis of literature and the environment might take ENGL-240 American Literature I and ENGL-366 Love, Sex, and Death in the Anthropocene, or Living Through the Age of Climate Change and Other Disasters; (2) exploring large corpora might take survey courses offering breadth, e.g., ENGL-251 Contemporary African American Literature II or ENGL-241 American Literature II, and ENGLISH-302 (UMass) Studies in Textuality and New Media or ENGL-390 (Amherst) Digital Humanities. Ideally, students would also take ENGL-199 Introduction to the Study of Literature.

Course Advice

The courses listed below form the core of the Data Science curriculum. In addition to core courses, students majoring in Data Science will take courses from their selected domain areas in consultation with their Data Science advisors.

Course Offerings

DATA-295 Independent Study

Fall and Spring. Credits: 1 - 4

The department
Instructor permission required.

DATA-390 Data Science Capstone

Spring. Credits: 4

The Capstone is a research seminar that brings together the three pillars of the Data Science curriculum. The course will start with common readings about research projects across a range of disciplines, including readings that address issues of ethics involved with the collection, treatment, and analysis of data. Concurrently, each student will develop an individual research topic and identify relevant data resources. The remainder of the term will be dedicated to exploring these topics through extensive data analysis, visualization, and interpretation, leading to a final report with complete results and a presentation.

Applies to requirement(s): Math Sciences
V. Barr
Prereq: COMSC-205 and STAT-340. STAT-340 may be taken concurrently (contact instructor for permission).

DATA-395 Independent Study

Fall and Spring. Credits: 1 - 8

The department
Instructor permission required.

DATA-395P Independent Study w/Practicum

Fall and Spring. Credits: 1 - 8

Instructor permission required.

Courses in Other Departments Counting toward the Major in Data Science

Chemistry
CHEM-348 Using Data Science to Find Hidden Chemical Rules (4 credits)

Computer Science
COMSC-151AA Introduction to Computational Problem Solving: 'Algorithmic Arts' (4 credits)
COMSC-151AR Introduction to Computational Problem Solving: 'Artificial Intelligence' (4 credits)
COMSC-151CP Introduction to Computational Problem Solving: 'Computing Principles' (4 credits)
COMSC-151DS Introduction to Computational Problem Solving: 'Big Data' (4 credits)
COMSC-151HC Introduction to Computational Problem Solving: 'Humanities Computing' (4 credits)
COMSC-151SG Introduction to Computational Problem Solving: 'Computing for Social Good' (4 credits)
COMSC-205 Data Structures (4 credits)
COMSC-335 Machine Learning (4 credits)

Data Science
DATA-390 Data Science Capstone (4 credits)

Mathematics
MATH-211 Linear Algebra (4 credits)

Statistics
STAT-140 Introduction to the Ideas and Applications of Statistics (4 credits)
STAT-242 Intermediate Statistics (4 credits)
STAT-340 Applied Regression Methods (4 credits)
