PDF of syllabus available here

Spring 2023 Mondays 1-4pm

Course Description

Cultural production is often thought of at the individual level - an artist, author, auteur - but with the turn of the century, new theories, methods, and fields have started to coalesce around the possibilities of producing and studying culture with computers, and specifically at larger and larger scales. Rather than reading a novel or watching a film, or even dozens, scholars are increasingly utilizing mass digitization and born digital materials to explore hundreds, if not thousands or millions of cultural objects, in essence undertaking what might be described as culture at scale. Whether labeled computational humanities or cultural analytics, or digital humanities or distant reading (to name just a few!), this new area of research is focused on understanding what Andrew Piper succinctly described as “computation plus culture.” This area is often treated as largely a technical practice, or as distinct fields that require interdisciplinary teams. This seminar is instead devoted to understanding how we might engage with computation and culture constitutively and simultaneously, uncovering how computation can transform our understandings of culture and how, in turn, focusing on culture can impact how and when we use computation.

None of these terms (culture, computation, or scale) are self-evident. Culture for whom and by whom? Do we mean culture with a capital C or culture in the sociological sense? What is computation, and what is its relationship to statistics and programming? And even more oblique is scale. While Big Data has become ubiquitous, scale itself is not necessarily new, but the proliferation of relatively available digital storage and increasingly powerful processors has profoundly accelerated and at times even transformed the horizons of possibilities. Rather than outright define these terms, this course will explore how scholars have determined and navigated these questions, as well as traversed and created new scales to make knowledge claims.

This course will therefore require both technical and theoretical expertise, as we engage with materials from multiple Humanities disciplines and sub-fields, use a variety of computational and statistical methodologies, and ultimately consider the very sociology of knowledge production and culture itself, which is increasingly happening at scale. Students will have the opportunity to develop specific research projects, and also read broadly across disciplines and methodologies. Seminar will be devoted both to discussing assigned readings and to exploring relevant code, methods, and datasets that were used to produce the readings. Prior to enrolling in the class, students should therefore feel comfortable with the foundations of research programming and working with data programmatically, as well as humanities-centric research (this does not mean humanities disciplines per se, but research concerned with culture). The goal in combining what has previously been called “yack and hack”1 is that this seminar will explore how combining culture and computation at scale can “in an ideal world, […] equal more than the sum of its parts.”2

Pre- and Co-requisites

This course is inherently experimental and emergent, as many of the topics we will discuss are still being developed and refined by researchers – from statistical methods to novel model architectures to theoretical and intellectual frameworks.

Therefore, no one prerequisite is required, but students should feel comfortable with programming and some area(s) of humanities research (again both broadly defined). Many of the courses in the iSchool provide an excellent foundation for this course, including:

However, this is not a definitive list. Interested students who have prior experiences that may be relevant, but not formal instruction, are also welcome to enroll.

Interested students should contact the instructor if they have any questions.

Assignments and Methods of Assessment

Setting the Foundation: Flexibility, Commitments, and Intellectual Journeys

What does it mean to do “assignments” or be assessed as a graduate student? Ideally this question wouldn’t matter, and you enrolled in this course with a deep passion for this research, so grading would be of minimal concern. In reality, we know that grades have enormous influence in both how we have been taught to value ourselves and in turn how society often values us. While I cannot wave a magic wand and restructure society, I do want to address this situation up front and honestly for a few reasons.

First, we are still in a global pandemic that has affected all of us, but not in equal ways. I address policies for what should happen if you or I contract COVID-19 below under Attendance, but fundamentally no course is worth jeopardizing anyone’s health (whether physical or mental!). While having a hybrid option mitigates some of our risk, the fact is that teaching and learning are fundamentally different in a COVID world.

That being said, while this may all sound very depressing, I firmly believe we have an opportunity to build a more equitable and supportive learning environment even in these circumstances. But this requires some ground rules and (re)setting of our foundations when it comes to this course.

First, I hope that we can all approach this course as an experiment in learning where flexibility and patience are prioritized. Such a perspective does not mean that we don’t have deadlines or expectations, but rather we acknowledge that we are all coming into this course with differing levels of expertise, responsibilities, and bandwidth (your instructor included!!). Rather than trying to erase these differences, we will strive to accommodate them as much as possible.

But to have this flexibility, we also want to make sure that we are committed to doing our best in this course. Otherwise, we lose this rare opportunity to think collectively and grow intellectually. So how do we balance these goals, flexibility and commitment, in an increasingly difficult-to-predict world? Historically this is where grades and deadlines would come in. While I do have to submit grades for you, I want to emphasize that this course is about your individual intellectual development and journey. Therefore, I will try to make all expectations for grading as transparent as possible and also ask you to reflect on your intellectual journey during the course and remember an important adage throughout the course: comparison is the thief of joy.

The reason I want to emphasize this point is that rather than compare yourselves to each other, I hope that you can focus on your individual growth during the semester and see each other as colleagues to learn from, instead of competition. Some of you will already be in this mindset, but I find that this is one of the most important shifts from undergraduate to graduate education, so hopefully this reminder is helpful!

So, what does this all look like in practice?

There are three assignments in this course.

Assignments

Lead Seminar Discussion 25%

Each student will be responsible for leading seminar discussion once over the course of the semester. In this capacity, they will be given the opportunity to develop a lesson plan and also delve deeper into the assigned materials with the goal of reproducing (as much as possible) the analysis of the assigned week (i.e. re-running the code and exploring relevant datasets). Students will be graded on their preparation for discussion, the questions they prepare for the course, and their ability to engage with the materials. Students are not responsible for how much their peers participate (though hopefully that will not be an issue). Students will consult with the instructor at least one week prior to leading discussion to confirm assigned readings and the topic focus for the week.

Students can co-lead discussion if there is more than one student interested in the topic, and the only extra requirement is that co-leading students will have to individually submit a short summary detailing how they divided the work. If for some reason a student(s) cannot lead the week they are scheduled, they must contact the Instructor as soon as possible, and collectively we will work to reschedule.

Ideally, students should select a weekly topic that furthers their engagement with their domain and selected methods for their research project.

Book & Code Review 25% Due Mar 6, 2023 (Extension Possible)

Students will select one of the semester’s readings (or suggest one to the instructor that fits the course themes) to review in two senses: first, as an exercise in writing a book review, and second, in performing a code review. Both activities are fundamental to culture at scale but often treated as distinct. We will experiment with what it looks like to combine these two activities; to consider both the interpretation and infrastructure simultaneously, and how that can inform both our critical perspectives and understanding of scholarship.

If there is no book that meets your interests, you are welcome to select a subset of articles (no more than four) to write a thematic review across them. You may also review more than one book in your review if that would be useful to your research goals. Finally, you may also undertake a more methodological review, where you review either the use of a method in multiple articles or infrastructure for undertaking a method.

This review will be written individually, though students may review the same materials. The review should be 2-5 pages double spaced or no more than 1500 words (not counting references or footnotes). Students are welcome to use whatever style guide they prefer for citations and bibliographies.

The goal of this assignment is to help you engage with the secondary literature in your domain/methodology of interest. Ideally, you should select something that you will already be reading for the week you lead seminar discussion. Students will be assessed on the quality of their writing, their ability to position their review materials in larger debates around culture at scale, and their assessment of both the strengths and weaknesses of their selected materials.

Research Project 50% Due May 15, 2023

Students working either individually or in groups will develop a research project with input from the Instructor and their peers. Beginning from the first class, students will decide whether they plan to test and develop new methods for an existing domain and research question, or to apply methods they already know to a new domain of cultural production (other permutations exist as well). The only requirement of this project is that students use computation, broadly defined, to study some aspect of culture, also broadly defined. While there are few constraints on the research question per se, students will be encouraged to develop their project with an orientation towards the audiences for this research and to consider how this project can further their larger research agenda (whether as a dissertation chapter or a future article).

The final two weeks of the semester are currently allocated for student presentations of their research projects, which will each run between 20 and 30 minutes. Students will then submit their final paper, 15-30 pages double-spaced, along with their code and datasets. Students again can use any style guide they prefer, but please be consistent in your usage. If working in a group, students will also submit a short summary outlining the division of labor in their project. Students will be assessed on the quality of their writing, the formulation of their research questions, the implementation and suitability of their methods and data, and their ability to engage with relevant secondary research from both the course and beyond.

Students should plan to meet with the Instructor at least once over the course of the semester (ideally in the first two weeks) to discuss their proposed project in-depth and to ensure that it fits the course remit.

Some Caveats and Clarifications

What if our project fails?

This is a question that every researcher faces but is certainly exacerbated by the constraints of a semester-long course. First, I would encourage you to reach out to the Instructor and your peers if you are concerned that your initial research question and project are not panning out the way you hoped. That is often the case in research, and needing to pivot should be expected. Furthermore, I would encourage you to consider what you might mean by failure. Is it a null result? That can still be written up and theorized as research. Is it a method that failed to adequately measure what you hoped? Again, that still counts as research. Often, we need to adjust our expectations for what we can achieve, and it is certainly acceptable to have future directions included in your paper. However, if you are struggling to implement methods or interpret results, please reach out to the Instructor for a consultation.

No participation grades?

I have intentionally not assigned any assessment to participation since this is a graduate seminar, which inherently expects you to engage with assigned materials through discussion. Furthermore, trying to assess what counts as good participation is always fraught. However, you may be wondering if this lack of participation grades means that you could theoretically not attend any course meeting and still do well in the course. The answer is hypothetically yes, but there are a few things to consider. Much of what we will discuss in this course cannot be gleaned from just reading the assigned materials. So, if you never attend, you will miss out on learning from your peers and Instructor, and furthermore, your submitted assignments will struggle to engage with these materials in a sufficiently rigorous manner. I have no interest in forcing anyone to attend a course, so the choice is ultimately yours.

This question of attendance is further discussed in the COVID-19 & Attendance section.

Using AI Tools?

You are welcome to use any AI tools that will help you in this course, whether that is tools like GitHub Copilot or OpenAI’s ChatGPT. I personally do not think these tools are going anywhere soon, and so learning to leverage them in your research is likely beneficial.3 However, I realize that many of these tools increasingly charge subscription fees, so please let me know if you would like to try a tool and are constrained for financial reasons, and I will try to advocate for some temporary funds from the iSchool.

Course Schedule

The schedule will be finalized after our first meeting, but it will involve combining three thematic threads:

Notes on reading the schedule and assigned materials

This course is somewhat unique in that we are not focused on one domain or discipline, or one set of methods, which means we do not have a more traditional course schedule. I have initially selected materials around broad themes for each week, but these are mostly suggestions and will likely be altered depending on the interests of whoever is leading discussion and our collective interests. There are general framing questions that we will discuss each week that are listed below. These questions transcend the assigned materials and are intended to help us both work through larger meta-issues in working with culture at scale, and also consider how we will deal with these issues in our respective research projects. Many of these questions could be entire courses on their own, so we will engage with them as much as possible, as well as our weekly case studies and theoretical readings.

When leading discussions, students are welcome to engage with these questions but are primarily responsible for discussions relating to the assigned materials. My tentative plan for seminar discussions is that we will spend most of the session discussing the assigned materials, but then reserve some time towards the end of the seminar for discussing these framing questions, as well as any methodological or project questions students may have. I want to emphasize that this is a tentative plan and will largely depend on student participation and input.

You will also notice that on our schedule we have different categories of assigned materials: core and background, as well as applied and theoretical. The first binary is my attempt to manage the scope of readings for this course, while at the same time pointing you to relevant materials for further research. You are only required to read core materials but are welcome to bring in any background material to discussions as well. The second binary is a bit hazier and is mostly to indicate to you the ways that you should expect to engage with the assigned readings. Applied readings will likely be examples of scholars using computation to make knowledge claims about culture, whereas theoretical readings are more likely to be arguments both for and against culture at scale. Some theoretical readings will include code and datasets, and some applied readings will also have theoretical and intellectual arguments as well.

Weekly Schedule

Introductions and First Next Steps Jan 23, 2023

What is culture at scale? How do we make knowledge claims about culture and the past? How have we in the past? Why does scale matter? What changed once we had scale? What are the origins of these practices, whether known as Cultural Analytics or something else?

Theoretical Materials

Core
Background

Applied Materials

Core

Assignments

Arguments and Algorithms: Theories Jan 30, 2023

How has mass digitization and the digital age made culture at scale a possibility? How do we start to frame hypotheses and think with scale? Where do we find data and how do we create it? What sorts of arguments can we make about culture with data? How do we balance domain with method when developing a research question?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

Close and Distant: Genre Feb 6, 2023

How does scale complement and contrast non-scalable methods for working with cultural materials? How can we start to explore datasets but also operationalize hypotheses? How do we start to turn culture into machine-readable data and what is gained/lost in this process?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

Models and Measurements: Race Feb 13, 2023

How can we further transform datasets to answer our questions? How can we turn our questions into models? What are models? How do we deal with minimizing information loss and maximizing algorithmic power? How do we use current methods but make them work for our purposes? How much do we care about statistics?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

Vectors and Clusters: Publishing Feb 20, 2023

How can we represent culture in space? What are the benefits of high dimensional spaces and the challenges of dimensionality reduction? How well do unsupervised or off-the-shelf methods find patterns? How much validation work should we do with these methods?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

Similarities and Distances: Laws Feb 27, 2023

How can we aggregate data to find patterns? How do we know certain patterns are meaningful? How can we find similarities and differences in our results? How do we define these terms and what do we consider statistically meaningful and at the same time interesting to our domain areas?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

Categories and Classifications: Gender Mar 6, 2023

Review Piece Due

How can we build models to categorize and classify culture? What is the power of prediction for understanding cultural trends? How can we use computation to understand cultural categories of the past and present?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

SPRING BREAK Mar 13, 2023

Speculation and Prediction: Networks Mar 20, 2023

How can we study and predict the past with scale? How can speculation be used to understand culture? How can computation create and uncover connections?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

Influence and Embeddings: Ideas Mar 27, 2023

How do we incorporate the latest technological developments? How are these new infrastructures and architectures changing how we study culture at scale?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

Viewing and Visible: Images Apr 3, 2023

How do we deal with non-textual data? What are generative methods, and what are the tradeoffs for dealing with accuracy of our phenomenon of interest? How do we balance considerations of ownership and ethics with data-driven research?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

Capitalism and Scalability: Movements Apr 10, 2023

What are the limitations of culture at scale? How can we deal with missing-ness in our data? How can this work be beneficial for society, and at the same time how can we critically understand the political economy of scale? Is it possible to have scale without capitalism? What might that look like?

Theoretical Materials

Core
Background

Applied Materials

Core

Background

Interactive and Persuasive: Collections Apr 17, 2023

How do we visualize our results? What are meaningful visualizations and how can you integrate them into your narrative? Do you publish digitally or in print?

Theoretical Materials

Core
Background

Applied Materials

Core
Background

Plots: Narratives and Storytellings Apr 24, 2023

Student Presentations

How do we tell stories with our data? How do we give constructive feedback? How do we balance what we have achieved with what we hoped to achieve? How do we balance explaining knowledge claims with detailing methods and data selection?

Presentations: Interpretations and Final Future Steps May 1, 2023

Student Presentations

What are our next steps with our projects? How do we document what we have and explain future directions? What is the relationship between coding documentation and written publications?


  1. Bethany Nowviskie, “On the Origin of ‘Hack’ and ‘Yack,’” Debates in the Digital Humanities 2016.

  2. Andrew Piper, “There Will Be Numbers,” Cultural Analytics, May 23, 2016, p. 2. DOI: 10.22148/16.006.

  3. I initially included a somewhat confusing double negation in this section, which I was tempted to keep for historical accuracy, especially since I’m usually writing a lot of these policies last minute. But I decided to revise for clarity. Thanks to Scott Weingart for catching this 😊🙏🏽.