Requirements for the Major
A minimum of 40 credits:
Course List
Code | Title | Credits |
STAT-140 | Introduction to the Ideas and Applications of Statistics | 4 |
STAT-242 | Intermediate Statistics | 4 |
STAT-340 | Applied Regression Methods | 4 |
| |
MATH-102 | Calculus II (or above) | |
MATH-211 | Linear Algebra | 4 |
COMSC-151 | Introduction to Computational Problem Solving 1 | 4 |
COMSC-205 | Data Structures 2 | 4 |
COMSC-335 | Machine Learning | 4 |
| |
3 | 8 |
DATA-390 | Data Science Capstone 4 | 4 |
Total Credits | 40 |
Other Requirements
- At the time of major declaration, a domain area will be selected by the student in consultation with an advisor from Data Science.
- Prior to the DATA-390 Capstone course, each Data Science major will submit to their advisor a brief document of reflection on the domain area, its connection to data science, and topics they might pursue for their capstone research. The Capstone will be offered in the spring term and be run as a research seminar.
Additional Specifications
- Course substitutions through the Five Colleges require pre-approval in writing by an advisor from Data Science.
- Independent studies cannot be used to satisfy any of the above requirements unless approved by the Data Science Program Committee (with the possible exception of the capstone).
- Students who declare a Data Science major automatically fulfill the College's "outside the major" requirement.
Sample Domain Pathways
At the time of major declaration, the student selects a domain area in consultation with an advisor from Data Science. Some sample pathways are described below:
Chemistry
Analytical and physical chemists often generate and analyze significant amounts of data. Analysis methods learned in analytical or physical chemistry courses are regularly applied to organic, inorganic, or biochemical systems. A two-course sequence highlighting both methods and systems could include (a) a course in analytical or physical chemistry and (b) a course with a focus on organic, inorganic, or biochemical materials. A sequence focused more on data generation and analysis could instead consist of two courses in analytical and/or physical chemistry. All first courses in the above sub-areas of chemistry require CHEM-150 General Chemistry: Foundations of Structure and Reactivity, and some also require CHEM-202 Organic Chemistry I and/or MATH-203 Calculus III.
Economics
Data touches nearly all parts of economics by informing models and revealing patterns and causal relationships, and data science is becoming an essential part of every subfield. For example, students interested in (1) finance might take ECON-270 Accounting and ECON-215 Economics of Corporate Finance; (2) development might take ECON-213 Economic Development; (3) theory might take ECON-201 Game Theory and ECON-212 Microeconomic Theory. Almost all 200-level courses in economics require ECON-110 Introductory Economics as a prerequisite.
English
Digital Humanities and New Media Studies represent two humanities avenues for potential cross-pollination with data analysis. Topic modeling, text mining, and database construction for interactive editions of texts are examples of particular areas of digital humanities that lend themselves to asking interesting questions about large humanities corpora. Students interested in English and Data Science would take courses in literary analysis and at least one upper-level course in digital humanities in the Five Colleges. For example, students interested in (1) text analysis of literature and the environment might take ENGL-240 American Literature I and ENGL-366 Love, Sex, and Death in the Anthropocene, or Living Through the Age of Climate Change and Other Disasters; (2) exploring large corpora might take a survey course offering breadth, e.g., ENGL-251 Contemporary African American Literature II or ENGL-241 American Literature II, along with ENGLISH-302 (UMass) Studies in Textuality and New Media or ENGL-390 (Amherst) Digital Humanities. Ideally, students would also take ENGL-199 Introduction to the Study of Literature.