We are what we say…

I tweeted about this a little earlier, but it stayed in my mind for a while today, so I thought it deserved a fuller exploration.  This is all very sketchy and not fully thought-out, but it’s something I’ve thought about before and don’t have any easy answers for.

I spent this morning curled up with a cup of coffee and the Spring/Summer issue of The American Archivist, which I have been slowly working my way through.  I finished Jennifer Douglas and Heather MacNeil’s article applying genre studies to the calendars and inventories produced at the Public Archives of Canada, and Heather Soyka and Eliot Wilczek’s on documenting the Iraq and Afghanistan wars.  Both are excellent pieces, with a lot to think about.  For some reason, the thing that popped out at me from both pieces was the role of archival jargon in what we do and the way we present ourselves as a profession.

Douglas and MacNeil discuss jargon more directly, arguing that archivists at PAC used more archival jargon in their finding aids as they adopted a more “professional” attitude towards their work and an increasing differentiation from the historical profession (160-161, 166).  Using words like provenance and original order was a way for the PAC archivists to indicate a level of expertise about the records in their care.  The change in descriptive practices that Douglas and MacNeil describe (moving from calendars to general inventories) is clearly a move towards what we today consider “correct” archival practice, so as I was reading, I found myself mentally cheering the PAC for “getting it right.”  I also think that archivists, individually and as a group, do have a unique and valuable skill set for doing the work that we do, so I was also happy to see an increased sense of professional identity, expressed through shared concepts.

But on the other hand, adding this extra layer of professional jargon onto the PAC’s finding aids seems like an obstruction for users.  Now, in addition to identifying the records they want to look at, a researcher also has to puzzle through a description of evidentiary and informational value.  A valuable concept for us, but is it equally valuable for users?

So, issues of language and professionalization were on my mind as I read Soyka and Wilczek’s article.  One of the challenges to adequately documenting these two wars is the lack of records management capacity in the US military, with smaller military units relying on a person who has been assigned the job of keeping important records, on top of the rest of their responsibilities.  Soyka and Wilczek point out that the people assigned this responsibility often don’t have adequate training in it or an understanding of why it is important, especially relative to the other things they have to do (187-188).  They note that “army history doctrine and instructions generally describe appraising records and documenting activities as ‘collecting’ documents” (179).  A little thing, but Soyka and Wilczek realize that talking about “collecting documents” indicates a different way of thinking than “appraising records and documenting activities.”  The archival way of looking at this problem, reflected in the different way we talk about it, would be better from my perspective (and I think Soyka and Wilczek would agree) – it would hopefully lead to better record keeping and a fuller picture of these two wars.  But is it necessary to explain functional appraisal and documentation strategies to the military personnel tasked with preserving records?  Can they get the general concepts without learning the jargon?  Does using the right words actually mean anything, other than a declaration that the person speaking is fluent in archives-ese and therefore a member of the club?

At work right now, I am working with a colleague to minimally process and create a finding aid for a large archival collection.  Several of my co-workers, who are librarians but not archivists, refer to the finding aid as a “pathfinder.”  It’s not wrong, and it’s actually not a bad description of what the document is – no worse than “finding aid,” to be honest.  But I still cringe a little bit inside whenever they say it.  Does it really matter?  Probably not.  Am I just being snobby?  Almost certainly that is at least some of it.  When I tweeted about jargon earlier, a friend from grad school replied that “acronym overload” is the worst.  Which I can definitely agree with.  But does that mean we should do away with jargon altogether?

Isn’t there some value in all using the same words?  So that when I say, “here is the finding aid for this collection,” whoever I’m talking to has some idea of what I mean, and some expectation that it will be similar to other finding aids they have seen?  Part of me thinks that yes, there is value to that.  (Even though I know that what users think of when they hear “finding aid” isn’t necessarily the same as what we mean.)  So when I hear people complain about too much archival jargon, part of me cheers along with them, but part of me also knows that we do have specialized skills and ways of looking at our work, and that is naturally reflected in the language we use.

This is one of those blog posts where I don’t really have a convenient conclusion.  The obvious answer is to use as much jargon as we want when we talk to each other (well, maybe within limits), but explain things in simpler terms when talking to non-archivists.  If only it were that easy.  I’d love to hear other people’s thoughts about jargon and our professional discourse – let me know what you think!

Reclassifying DVDs, Part 2: Wrestling with Genres

Over the past few months, I have been reclassifying the “Popular Movie Collection” at the small academic library where I work. This is the second of two blog posts about the project.  In the first entry, I wrote about finding a way to fit movies into the Library of Congress classification system.  In this post, I am going to talk about implementing that system, and in particular, my attempts to make sense of movie genres.

As I explained in Part 1, the classification scheme I am using categorizes movies from the U.S. by genre.  I created a list of genres to use by synthesizing the lists given in the Library of Congress classification system and the Library of Congress Genre/Form Thesaurus (LCGFT).  I used only “broad” genres, hoping to have a small set of terms that could describe all of the movies in our collection.  I assumed I might make some adjustments to my genre list as I processed the first few dozen movies, but I didn’t expect too much difficulty.

Very quickly, however, I realized that assigning genres to movies is HARD.  More specifically, deciding on one single genre that is best to describe a movie can be extremely challenging.  The LCGFT is designed as a faceted thesaurus, with catalogers encouraged to assign multiple terms, and at different levels of specificity, to individual movies.  That way, for example, Skyfall is described in its catalog record as an “action and adventure film,” “spy film,” and “crime film.”  This makes perfect sense from a search perspective — if you search for any one of those terms in our catalog, Skyfall will be in your results.  But since I am using genre to decide where to shelve the movie, each film has to be assigned to one and only one category, since it can only sit in one place on the shelf.

Compounding the challenge was my desire for as short a list of genres as possible.  For example, I opted to not include “Romantic Comedy” as one of our genres, although it is a valid LCGFT term.  But what, then, to do with romantic comedies?  Are they romances, or comedies?  The answer, of course, is both (hence the name) — unless it is neither.  Is a romantic comedy really the same as a pure romance, or a pure comedy?  To some degree, such philosophizing is beside the point. For me, what matters most is where our users are most likely to look for a movie. But it was hard for me to not get caught up in trying to make logical sense of genre structures.

It turns out that defining genres is notoriously difficult.  Where is the line drawn between fantasy and science fiction?  Or between horror and suspense?  When I started working on this project, I decided to do some exploring into the literature about genre in film & media studies.  Judging from my extremely cursory foray into that literature, there is no accepted way of delimiting the boundaries of a genre.  According to genre theorist Rick Altman, some approaches create a short list of archetypal movies that define a genre, while others draw up more expansive lists to show what a genre could be.  This means that when we think about what a “fantasy film” or a “horror film” should be, we often imagine one of the defining archetypal examples. But those examples are obviously only a small subset of all movies; many other movies can be considered fantasies or horror films, or at least incorporate some of the elements of those genres. There were some movies in our collection that were easily identifiable as typical of a given genre (The Hobbit is obviously fantasy; Cabin in the Woods is obviously horror), but many movies seem to fall into the interstices.

Hugh Jackman as Wolverine threatening with his claws

Wolverine is frustrated that he doesn’t know his genre.

X-Men was one that I struggled with in this regard.  Is it action and adventure? Fantasy? Science fiction?  It has elements of all of these, but doesn’t easily fit within the typical definition of any of them.  Again, my goal was to put it in the category that would make the most sense to our students.  With titles that I found particularly difficult (like X-Men), I did a little bit of informal user testing by asking our student workers what made the most sense to them (I also asked friends on Facebook, launching what turned out to be a very lively conversation).  This sort of crowd-sourcing seems particularly appropriate for movie genres, since as some film theorists argue, genres are defined in part by the popular understanding of them. The results of these informal surveys were sometimes useful, but often they had as many conflicting opinions as I did (no one, in fact, could agree on where X-Men belonged, and it ended up in science fiction).  In part due to genres’ reliance on cultural convention for definition, absolute certainty and agreement is often impossible.

Another reason it is so difficult to assign every movie to a genre category is that the idea of genre as it is sometimes defined is not meant to encompass all movies.  Film theorists often treat “genre movies” as a distinct type of movie, one that follows a predefined set of norms regarding narrative, theme, and tone.  Often, “genre movies” are associated with a particular historical time period, that of the Hollywood studio system in the mid-20th century.  This definition of genre is at odds with the one I was using in my classification, in which I treated genre as a characteristic that all movies possess.  To iron out this contradiction, I created a genre category for “Drama,” a term that is notably absent in both the LCGFT and LC classification system.  I essentially treat this genre as a placeholder for otherwise “non-genre” films, those that don’t conform to the kind of subject matter or narratives that belong to an established genre.  In practice, this works fairly well, since I think most people do think of “dramas” as a distinct genre.  Intellectually, however, I remain somewhat unsatisfied with “Drama” as a genre, since its definition does not have any real content of its own, but instead relies on being opposed to other existing genres.

Also complicating matters is the fact that a genre’s definition can depend on various elements of a film, including content, theme, style, and so on.  In fact, different sources emphasize some elements over others, but without consistency.  The Library of Congress’ 1998 Moving Image Genre-form Guide states that genres “are recognizable primarily by content, and to a lesser degree by style.”  A more recent source, OLAC’s Best Practices Guide for LCGFT for Moving Images, describes genres as consisting of a “packaging of various topical and stylistic elements.”  One thing that is clear is that a genre is more than subject matter; genre terms themselves are distinct from subject terms for that exact reason.  To put it differently, subject terms describe what a film is about, while genres describe what a film is.  Catalogers are familiar with this distinction, but putting it into practice is often more challenging than we would like to think.  For one thing, as Martha Yee has pointed out, many genre terms are in fact defined by subject matter.  The LCGFT, for example, defines “Crime Films” as “fictional films that feature the commission and investigation of crimes” – clearly a definition that relies on what the film is about.

Animated gif of Jack Lemmon and Tony Curtis in drag in

Nope, not a “gangster film.”

This sort of content-based ascription of genres, though, leads to imperfect categorizations.  Transformers is a movie about robots from outer space, which would lead to thinking it is a science fiction movie.  However, in discussions with students at my library, I found that many people think of it primarily as an action & adventure film.  Similarly, Some Like It Hot features gangsters prominently in its plot, but no one thinks of it in the same genre as The Godfather.  Genres describe more than just a movie’s content; defining precisely what that is, however, can be tricky.  From a cataloger’s standpoint, the challenge is even greater when trying to assign a single genre to a movie that I have not seen.  It’s easy to determine what a movie is about, from the container and the description on IMDb, but it isn’t always possible to get at the subtler elements of tone and theme that can distinguish, say, a thriller from an action movie.  Assigning genres based on a subset of plot elements is often the easiest route, and one that I found myself taking for many movies, especially those that I was not familiar with, even while I had to acknowledge that it was an imperfect way to go about the task.

Ultimately, the process of determining a single genre for each movie involved a combination of looking at what other catalogers had put in the catalog record; checking IMDb (including user reviews to get a sense of how viewers understood the film); reading the movie’s case, to see how it described itself; and asking students and co-workers about tough cases.  None of these sources were perfect, and they often disagreed with one another, but together they helped me get a sense of the movie.  I have come to accept that genres are, at best, a very inconsistent science, and that I am trying to put them to a use that does not exactly match their inherently fuzzy nature.  In a great many cases, there is no one “right” answer.  However, as the project is nearing its completion, I do think that basing call numbers on genres has helped group similar movies together and has given the collection an overall structure.

My next steps are to finish the reclassification, then create some signs or labels that will help make the new structure understandable to users as they approach the Popular Movies shelf. Is this the perfect solution for classifying and shelving movies? Probably not. But for this collection, it works well enough.  I hope it makes the shelves seem approachable for a casual user — if you see a movie that catches your eye, the other movies close to it should be similar.  It also helps librarians find movies easily, because if I know the genre, I can guess approximately where on the shelf to find the case.  The system isn’t perfect, but it is at least a starting point and an improvement on shelving by title.

Plus, now I know a lot about movie genres, which should come in handy at a cocktail party at some point…

———————————————————————————–

My readings about genre in film & media studies came from Film Genre Reader III, edited by Barry Keith Grant (2003).  Some of the pieces are a bit unapproachable for non-experts, but the collection as a whole was an extremely interesting look into a field I had never explored before.

Reclassifying DVDs, Part 1: Creating a System

Over the last few months, I have been working on reclassifying the “Popular Movie Collection” at the small academic library where I work. This is the first of two blog posts I am planning about the project, describing the process of deciding how to classify the movies. [Update: Part 2 can be found here.]

The collection

The “Popular Movie Collection” includes DVDs, Blu-rays, and videogames, and as the name suggests, it is provided primarily for students’ entertainment, rather than as direct support for coursework. The collection has been built up piecemeal through purchases and donations, and consequently has a rather eclectic range of new movies, classic films, and documentaries. It also includes our library’s small collection of Xbox and Playstation games, which students can check out for home use or to use with the gaming consoles in one of our group rooms.

The collection had been shelved by title, I presume because of its initially small size. As the collection grew, however, the decision was made to begin classifying and shelving the collection using Library of Congress call numbers. Some months before I began working here, new movies started receiving LC call numbers as they were added to the collection, but no effort was made towards retroactively re-classifying the entire collection. Now that I have somewhat settled into my role and have the time and energy to devote to larger projects, I decided to tackle the DVDs. We also have a large donation of DVDs waiting to be added to the collection, and I decided it was wise to handle the existing collection before adding to it.

Popular DVD shelf

Options…

Doing some initial research, however, I quickly realized a problem: the Library of Congress Classification system isn’t designed to handle movies. Documentaries would be no problem; they could easily be assigned LC numbers based on subject. The small number of fiction television shows and videogames in the collection meant that they could be given one number each (PN1992.55 and GV1436, respectively). But LC has no easy way of accommodating fiction movies. There are ranges of numbers associated with motion pictures in the PN range, but they are clearly designed for books about movies, rather than for movies themselves. One number, PN1997 (1997.2 for movies produced after 2000), is designated for “Individual motion pictures,” and while it was originally intended for scripts, it is often assigned to recordings of movies. PN1995.9 covers “Motion pictures – Other special topics,” and is sometimes used, since it includes numbers for genres, countries, and other topics. Local guidelines on adapting the LC system for this collection were clearly needed, and in the process of creating those, I had to think through how the shelving system could best support our goals for the collection.

This presentation from Maryke Barber at Hollins University (PDF) helped me think through some of the options available. I considered and quickly discarded putting all fiction movies under PN1997 and PN1997.2, then sub-arranging them by title. This had the appeal of a simple solution, but was not really scalable, even to the needs of my small collection. Things would get very complicated very fast, leading to unhelpfully long Cutter numbers, which in turn would lead to shelving errors. It also didn’t add any value to our shelving system — if I was going to do all this work, I wanted the end result to be more useful to our users than what we started with.

I also thought about following the rules for creating call numbers for literature, in which the author’s last name determines the number, and substituting the director for the author. One of the big draws of this approach was that it would let me utilize the complete run of LC numbers in the PR-PZ range. This makes sense in some cases — it might be useful to have all of Alfred Hitchcock’s movies together, for example — but doesn’t really make sense for most movies. I didn’t want to break up the Star Wars films, for example, because the entire series wasn’t directed by the same person. Shelving by director might make more sense for a film studies collection, in which the users have more specific needs, but for a popular collection, it seems a poor fit. I also wasn’t crazy about assigning so much importance to the director, since movies are a much more collaborative form than novels.

The solution (or at least our solution)

Ultimately, I decided to divide the movies by country of origin, and then further divide U.S. movies by genre. Movies from countries other than the U.S. are assigned to PN1997 and subdivided by country of origin; U.S. movies are classed PN1995.9 and subdivided by genre. My university has a very international student body, so providing access by country seems appropriate to me, since presumably some users would want to easily find movies from a particular country. I do have some qualms about separating international movies from U.S. movies, since I worry that it creates a sense of “other-ness” and flattens those movies to only their nationality, discouraging some users from checking them out. Part of me thinks, for example, that putting a South Korean action movie alongside Fast & Furious and Skyfall might encourage more users to look at it, but then it is much harder for a user who specifically wants a South Korean film to find it. I decided that the value of being able to group all of the movies from one country together outweighed those concerns, but I do still have some doubts about it.

I chose to subdivide U.S. movies by genre again based on my perception of users’ needs. Partly, the idea came from the fact that the PN1995.9 range in LC Classification already includes subdivisions for some genres, among other subjects (the list of subjects in that range is actually quite hilarious, including everything from Colors to Existentialism to Postage stamps). Arranging by genre, I hoped, would also seem familiar to our users, since that is how most commercial video rental outlets organize movies, from Blockbuster to Netflix. Of course, I feel rather uncomfortable suggesting that libraries should act more like commercial rental places, but for this collection, for these users, I think it makes sense. My main goal is to increase browsability on the shelves, and imitating the contexts that people are used to finding movies in made sense for that goal.

I based my genre list on the Library of Congress Genre/Form Terms list, specifically on the sub-list for movies created by Scott M. Dutkiewicz for OLAC (PDF). Immediately, I realized that I needed to narrow the list down, or I would have scores of genres, many with only one or two movies in them. That would obviously not serve my purpose of co-locating similar movies. I created my short list by comparing the LCGFT list with those genres given LC Cutter numbers. I opted for genres that are at the top of the LCGFT hierarchy (i.e. those with no “broader terms” listed). I also used my knowledge of the collection and its goals to decide on what terms to include. I then created a spreadsheet with Cutter numbers for each genre that I have used to apply this new system to each movie.

I also made one last local decision: to use the year of the movie’s original release in the call number, rather than the year of the DVD’s publication. Again, my goal was for the call number to include as much information about the movie as possible. Because DVD is a recent format (and Blu-ray even more so), the publication year has no necessary relationship to when the movie came out and, in most cases, is meaningless. The exceptions are DVDs that are released as “special editions” with lots of new material and collections of movies released as a set, for which the publication year is used.

The final product:

Le Jour Se Leve
PN
1997
.F73       (Cutter number for France)
J68         (Cutter number for title)
1939      (Date of original release)

Lady Sings the Blues
PN
1995.9
.B55       (Cutter number for Biography)
L33        (Cutter number for title)
1972      (Date of original release)
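
For anyone who keeps their Cutter numbers in a spreadsheet the way I do, the rules above can be written down as a few lines of Python. This is only a rough sketch under my own assumptions, not a tool I actually used: the `Movie` fields, the two lookup tables, and the function name are all made up for illustration, and the only Cutter values in it are the ones from the two examples above. Title Cutters are typed in by hand rather than computed.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical lookup tables. The only entries are the Cutters that
# appear in the two worked examples in this post.
COUNTRY_CUTTERS = {"France": "F73"}
GENRE_CUTTERS = {"Biography": "B55"}


@dataclass
class Movie:
    title: str
    title_cutter: str             # assigned by the cataloger, not computed here
    release_year: int             # year of original release, not the DVD's publication
    country: str
    genre: Optional[str] = None   # only needed for U.S. movies


def call_number_lines(movie: Movie) -> List[str]:
    """Return the call number as the lines that would go on a spine label."""
    if movie.country == "U.S.":
        # U.S. movies: PN1995.9, subdivided by genre.
        if movie.genre is None:
            raise ValueError("U.S. movies need a genre")
        class_number = "1995.9"
        subdivision = GENRE_CUTTERS[movie.genre]
    else:
        # Movies from other countries: PN1997, subdivided by country.
        class_number = "1997"
        subdivision = COUNTRY_CUTTERS[movie.country]
    return [
        "PN",
        class_number,
        "." + subdivision,
        movie.title_cutter,
        str(movie.release_year),
    ]


if __name__ == "__main__":
    examples = [
        Movie("Le Jour Se Leve", "J68", 1939, country="France"),
        Movie("Lady Sings the Blues", "L33", 1972, country="U.S.", genre="Biography"),
    ]
    for movie in examples:
        print(movie.title)
        print("\n".join(call_number_lines(movie)))
        print()
```

A genre or country that is missing from the tables raises a `KeyError`, which seemed like the right behavior for a sketch: it is a reminder to add the Cutter to the list before classifying the movie. Special editions and boxed sets, which use the publication year, would simply be entered with that year in `release_year`.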

With the system for creating call numbers established, I thought the hardest part was over and began the process of retrospectively applying it to our collection. Little did I know, however, that genres would become an enormous source of confusion. In my next post, I’ll explain why and go into the details about creating call numbers for the collection.

Archives of Queer Geography

I just got home from a great talk at the Stonewall National Museum and Archives.  The talk was about the queer geography of South Florida from the 1970s to today, using census data and advertisements in archived gay magazines to track movements of lesbian and gay households and businesses through different communities in the area.  It was fascinating research, and helped me learn quite a bit about the region I currently call home.  The most exciting thing to me, though, was getting to see people be excited about archives and archival research.

The Stonewall archives are special to me, since they are where I did most of the research for my master’s thesis.  During that research, I had wondered about some of the questions that this talk addressed, but didn’t have the time or the methodological know-how to answer them the way this man did.  He went through decades of magazines and created a database of business addresses, which he then used to create a series of GIS maps showing distributions and change over time.  This is probably a fairly common method in cultural geography, but I thought it was pretty exciting.  Even more so, though, was the clear enthusiasm that this young researcher, who is working on an MA, had when he talked about realizing that this source existed.  He talked about discovering that the Stonewall had copies of these old magazines, many of which are fairly marginal “bar rags” that would be hard to track down otherwise, and it was clear that doing archival research was somewhat new to him.  I’m so glad he got connected to those resources, and it really speaks to the importance of archives reaching out to a wide range of potential user communities — something we all know, but I think bears repeating.

It also speaks to the importance of community-focused archives that not only preserve records, but provide a venue for disseminating the research that is done on them.  I think it is phenomenal that the Stonewall provided a space for this research to be shared in a non-academic context.  In addition to sponsoring this talk, they are also displaying some of the maps that the researcher created from his data in one of their gallery spaces, alongside related artifacts.  Even more incredible was the number of people who came out to a talk about geography at 7pm on a Friday night!  The audience was so engaged, and during the Q&A, several people brought their own memories of being gay in South Florida into the conversation.  It was really great to see that kind of conversation happening between an academic and a public audience, a kind of conversation that is far too rare.  Local, community-based archives can be great venues for those kinds of events and dialogs, I think largely due to the fact that they already have community buy-in and trust.

During the Q&A, I asked the speaker if he had plans for preserving the data set and maps that he had created during his research.  Perhaps not surprisingly, he didn’t seem to have given that much thought before.  I don’t know what the scholarly conventions are for preserving research data in geography, but I think those would be a great resource for future researchers.  Someone from the Stonewall Museum said that they are definitely planning on keeping copies of his maps, which is wonderful.  But I would really love to see them preserve the raw data and the GIS files he used to create the maps.  Unfortunately, I don’t think the Stonewall archives, which are entirely volunteer-staffed (albeit by trained archivists) and part of a small non-profit organization, have the capacity to accept and preserve those kinds of complicated digital objects.  There is a real challenge in how community-run archives can preserve the increasingly digital materials that document their communities.  But tonight reminded me how much I believe in those institutions, and how exciting they can be, and I’m confident that they are going to meet the digital challenge, just like the rest of the archival world.

Allen Ginsberg and the Power of Archives

I recently read this article about the Allen Ginsberg collection at Stanford, which I saw on Twitter, and was really struck by several things about it.

The first was the description of how thinking about his personal collection inspired Ginsberg himself to become more interested in photography:

Morgan, [Ginsberg's personal archivist] who spent 20 years cataloging Ginsberg’s materials, said the organization process spurred Ginsberg’s interest in photography.

In the early years of their working relationship, Morgan continually came across unidentified photos as he pored through boxes upon boxes of Ginsberg’s ephemera. Morgan nagged Ginsberg to identify the people in the photos, and when Ginsberg finally started to review them, Morgan said, Ginsberg realized his images “were not only the history of the people he knew” but were also “kind of good works of art.”

So Ginsberg, who had taken about 1,200 photos before he put photography on the back burner in the early 1960s, consulted his friend and photographer Robert Frank on photography techniques, and between 1982 and his death in 1997 took “the other 78,800 photographs,” Morgan said.

I have occasionally heard about archives providing inspiration for artists, but I haven’t heard before about a person’s own records inspiring them in this way.  I love the idea that paying attention to one’s own records can be the impetus for opening up new avenues in someone’s life. As personal archiving is becoming more prominently discussed, and not only for people as prominent as Ginsberg, I hope we will hear more about these kinds of unexpected inspirations.

One of the things I love about that anecdote is that it reminds us that archival records are much more than just historical evidence (as important as that is).  Which brings us to the second thing I loved about this article: its acknowledgement of the emotional power of archival records.

[Special Collections librarian] Taormina especially remembered a patron who came in to see the original Howl manuscript.

“He sat quietly in the back and when I walked the room I noticed that he was quietly weeping.” That kind of response, she said, “is not common in our reading room and I will most likely always remember it.”

Documents can be powerful repositories of meaning and emotion.  While such responses may not be common in reading rooms, I have felt them myself when working with records, and I think many other archivists could say the same.  Archives can and should do more to allow reactions such as that man’s to occur, to not wall our collections off from emotional responses.  I don’t know exactly how we can do that, but I believe it is important.  By opening our repositories to the feelings evoked by our records, we can do away with the stereotype of archives as dusty and lifeless, encouraging more people to engage with archives.  And wouldn’t that be a great thing for all of us?

Describing Queer Mathematicians

Just in time for Pride month, I spent most of June processing the collections of two (probably) lesbian mathematicians.  I really enjoy my job at the Archives of American Mathematics, which collects the papers of mathematicians and mathematics organizations, and over the past year, I have developed a real affection for the world of mathematics.  I was very excited to work with materials that combine my newly discovered love of mathematicians with my interest in LGBTQ history.  One of the things I love best about working in archives is the ability to get to know people through their collections, and I was eager to learn more about these two women and their relationship through processing their records.

These two women, Dorothy Bernstein and Geraldine “Jerry” Coon, were professional mathematicians in the mid-20th century.  I suspected they were more than just colleagues when I found out that they had lived together while teaching at a women’s college in the 1970s and continued to make a home together after retiring.  My suspicions grew stronger when I found out that Jerry had served as the executor of Dorothy’s will after her death in 1988.  Although Dorothy’s obituary does not mention Jerry, when Jerry passed away in 2008, her obituary refers to Dorothy as her “life companion.”  Based on that evidence, I feel confident that the two women had a relationship that was much more than simply professional, and that probably had a romantic and sexual dimension.  I do not, however, have any definite proof of that.  Nor do I know how their relationship evolved over the 50 years they knew each other or what others knew or thought about it.  As nearly as I have been able to piece together their stories, they met in graduate school at Brown University in the 1930s.  Both pursued their separate careers, Dorothy as an academic and Jerry as an applied mathematician, and Dorothy served as Jerry’s dissertation advisor in the 1950s.  By the mid-1960s, both women taught at Goucher, a women’s college in Maryland, where they worked closely together and co-authored several papers.  Both women retired in 1979 and moved together to New England, where they lived until Dorothy’s death.  Their collections contain no personal correspondence between the two, no photos of them together, really nothing of a personal nature at all.  While that is not entirely unusual for collections we get at the math archives, it was somewhat disappointing for me to not learn more about that part of their lives.

The lack of information about their relationship caused me to think hard about how to describe their collections.  In writing the finding aids for these two collections, I struggled with how to talk about the two women’s relationship.  As I mentioned, there is no indication in either woman’s collection about how they referred to each other, either privately or publicly.  I was very committed to respecting these two women’s identities, and did not want to assign them an identity as “lesbians” if that was not something they themselves claimed.  At the same time, I definitely did not want to be complicit in obscuring their relationship.  Historical studies of LGBTQ people, especially projects relating to lesbians and queer women, have struggled for too long with archival practices that silence and hide relevant collections, and I definitely did not want to be a part of that problem.  How, then, to describe these women and their collections, in a way that both respects the creators and makes the collections discoverable as belonging to queer mathematicians?  Where is the line between describing the creators of these collections and interpreting the evidence in them?

Ultimately, I referred to the women as each other’s “long-time companion, mathematical collaborator, and colleague.”  The term “companion” came from Jerry’s obituary.  I also took guidance from the Harry Ransom Center’s collections guide for LGBTQ Studies (which I found through the help of my friend Sam Bruner).  The guide recommends “companion” as one possible search term for finding individuals in queer relationships.  Although clearly originally used as a euphemism, “companion” fits my needs well, since it implies a close connection between two people without naming it specifically (while still conjuring some sense of queerness, given its frequent use in that context).  The term’s matter-of-fact nature seems to fit with the few documents that referred to their relationship using direct yet unrevealing terms, stating simply that the two lived and retired together.  Whatever else may have been the case about their relationship over the years, it is undeniable that these two women were companions in many meanings of the word.

The descriptive language I chose does not explicitly mark them as lesbian or queer, and I did not use any terms that would show up in most searches for LGBTQ materials.  However, since the collections themselves contain almost no materials specifically useful to a historian of queer sexuality, that seems acceptable to me.  I tried to describe the collections in a way that the queerness of these women’s relationship comes through, making it possible for archives users to be aware of that part of their lives.  Most of all, I hope I was able both to respect Dorothy and Jerry and to provide the information that will make these collections discoverable and useful.

What do you think?  Any ideas for the best ways to describe collections like these?

Click here for the finding aids for the Dorothy L. Bernstein Papers and Geraldine A. Coon Papers.

All the usual caveats apply that the opinions in this post are my own and do not reflect the views of the Archives of American Mathematics.

Presentation at SSA

This morning, I am presenting my first-ever talk at a professional conference!  This week is the Society of Southwest Archivists annual meeting here in Austin, and I am giving a talk entitled “Web Archiving for University Records” on a panel about digital records in university archives.  I’m a little nervous, but glad to be giving a presentation at a conference that everyone says is friendly and welcoming.  Plus it doesn’t hurt to have the home court advantage!

An interest in web archiving is something that I have fallen into somewhat by accident, but through several projects this past year, I had the chance to delve into it.  Working on this presentation was a great opportunity to synthesize some of my thinking about web archiving, and to realize how much more I have to learn about it.

I’ve posted the slides from my presentation, as well as some related links, here.  I’m looking forward to getting feedback after the talk, and I’d love to hear your thoughts here, as well.

Queer Time, Archives Time

I was lucky enough to spend most of the day yesterday at a symposium put on by a group of UT graduate students entitled “Queer Archives, Queer Affect.” The symposium was the culmination of a seminar taught by Ann Cvetkovich exploring the intersections of queer theory, affect theory, and theories of “the archive.”  The presenters were graduate students from English, Women’s and Gender Studies, Radio/Television/Film, and the School of Information, and the day was capped off with a keynote talk by Heather Love.  It was a really great symposium, and a pretty wonderful way to wrap up my life as a graduate student.

I am really fascinated by academic theories about archives, even as I am sometimes frustrated by them.  I believe very strongly that theory is important and useful, even when it doesn’t immediately seem that way.  Thinking theoretically helps me be much more intentional about what I do and look at things in exciting new ways, and I think that literary and cultural theories about the archive can help archivists do exactly that.  At the same time, though, many of the theorists who talk about “the archive” are so disengaged from what archives actually look like in the traditional sense that their arguments can seem totally foreign.  One of my classmates quoted the Princess Bride when discussing academic uses of the word “archive,” and it seems quite appropriate: “You keep using that word; I do not think it means what you think it means.”  The archivists I know often talk about needing to educate academics about what we do and what material archives are really like, and I agree that that’s important.  But I think in doing that, we shouldn’t lose sight of the fact that we have a lot to learn from them, too.

As I was sitting in the symposium yesterday, I thought quite a bit about how queer theory’s interest in temporality is really relevant to what archivists do.  In the last decade or so, quite a few queer theorists have started to think about different ways of experiencing time and how a queer perspective changes what it means to think about the relationship between past, present, and future.  This “turn to temporality” has taken a lot of different forms, not all of which I fully understand.  It is fairly easy to caricature and dismiss out of hand the idea that queer people experience time differently, but for me, reading this work has been a really productive way of rethinking our relationship to history, which I believe is an important question for archivists.

Several of the papers yesterday referenced Carolyn Dinshaw’s distinction between “amateur time” and “professional time.”  Dinshaw argues that amateurs and professionals have different ways of focusing their attention and experiencing the time spent working on a project.  Because amateurs often feel deeply, emotionally attached to their work, they are free of the professional’s expectation of detachment, which fosters an ability to experience time in ways other than the purely linear, highly regimented temporality associated with modernity.  It is in that affective relationship to work and to the past that Dinshaw locates the “queerness” of this way of experiencing time, since it allows the amateur to create all sorts of attachments other than those required by work and the family.

This idea of amateur vs. professional temporality really resonated for me as a way of thinking about what it is like to work with archival records.  Isn’t it a very different experience to leaf casually through a box of records with no real purpose than to be actively scanning for one particular document or one particular subject?  And isn’t it affectively different to process a collection as quickly as possible to reduce your backlog, than it is to painstakingly remove every staple and unfold every crease?  I think it would be valuable for archivists to better understand the differences between a casual (or “amateur”) and a scholarly (or “professional”) use of our materials.  Does that lead to different interpretations of the material, on the part of both archivists and users?  Does it lead to different ways of valuing archival records?  How do our finding aids and the physical environments of our reading rooms encourage or discourage either amateur or professional ways of interacting with our records?  These are not explicitly questions of temporality, but I think that Dinshaw’s work is an inspiration to think more deeply about how both archivists and users experience working with archival records, and to admit that there can be multiple, very different experiences based on different positionalities.

I write this feeling very aware of how sketchy and unfinished this analysis is – how unexplored some of my assumptions are, how much deeper the engagement with queer temporality theories could be.  But maybe that’s appropriate – appropriate for me to engage with these ideas as an amateur theorist.  Trying to engage with these “high theory” texts is challenging for me, and I think for most archivists, but it is nonetheless a challenge that inspires me to think more imaginatively about archival work, making it (I hope) a productive avenue for more exploration.

Dinshaw’s theory of amateur time comes from her book How Soon Is Now?: Medieval Texts, Amateur Readers, and the Queerness of Time (Durham, NC: Duke University Press, 2012). It’s well worth taking a look at, especially if you enjoy medievalism.

Web archiving in a virtual machine

This semester, I took a class on Digital Archives and Preservation, taught by the inestimable Dr. Pat Galloway.  One of the distinctive things about this course is that students are assigned into teams that spend the semester tackling a real digital preservation problem.  This semester, groups worked with materials from the School of Information and from the Videogame Archives at the Briscoe Center for American History.  Each group researched the technical and policy challenges of their collection, worked with stakeholders, ensured the long-term stability of their materials, and ingested them into a digital repository.  So not only did we learn about the theoretical bases of digital archiving, we also had the chance to gain practical experience and contribute to the actual preservation of our digital heritage.

I was part of a group that was tasked with preserving the School of Information’s website, along with my classmates Jarred Wilson, Laura Vincent, and Kathryn Darnall.  The iSchool’s website had been undergoing a major re-design, and the new version of the site launched in March.  This was a perfect time to archive the website, as it gave us a natural “cut-off” date for what version of the site to preserve.  The iSchool recognizes the value of the website, both as a record of the school’s official policies and as a historical document providing evidence about the evolution of the field of information studies.  We were therefore lucky to have the cooperation of the administration and website administrators in archiving the site.  Early in the semester, we began meeting with Sam Burns, the iSchool’s Content and Communications Strategist and the person who has the most direct control of the website.  He was an enormous help to us, helping us understand the back-end structure of the site and assisting with many of the more technical challenges we faced over the semester.


Archived version of the iSchool homepage, featuring the infamous “bendy people”

From the beginning, our group was interested in moving beyond just crawling the website.  Web crawling has become the most generally accepted method of web archiving, and it is a great way to create interactive copies of web records.  However, crawling has limitations.  There are several kinds of online materials that crawlers can’t capture, including many forms of dynamic and database-driven content.  More fundamentally, crawlers only document how a website is displayed in a browser; they capture the experience of an end-user of a site, but not the many pieces that exist behind the scenes to generate the site.  Sam encouraged us to think about how we include site administrators in our designated community of users, and that prompted us to re-think the significant properties that we considered necessary for documenting the website.  Future administrators of the site might be very interested in knowing how the archived website was built, but the components they would need to study, for example the PHP code, would not be included in a crawled version.
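To make that distinction concrete, here is a minimal sketch in Python of what a crawler does and does not see.  Everything in it is an assumption made up for illustration: the template, the tiny “database,” and the function names are hypothetical stand-ins, not anything taken from the iSchool’s actual site or from a real crawling tool.

```python
from string import Template

# Hypothetical stand-ins for a server-side script and its database.
# Neither is taken from the real site; they only illustrate the idea.
PAGE_SOURCE = Template("<h1>$title</h1><p>Taught by $instructor</p>")
DATABASE = {"course-1": {"title": "Digital Archives", "instructor": "Dr. Example"}}


def render(course_id: str) -> str:
    """Play the role of the server: look up a record and fill in the template."""
    return PAGE_SOURCE.substitute(DATABASE[course_id])


def crawl(course_id: str) -> str:
    """Play the role of a crawler: it only ever receives the rendered HTML."""
    return render(course_id)


captured = crawl("course-1")
print(captured)
```

The captured string contains the finished page, but nothing of the template or the lookup that produced it, which is roughly the gap a future site administrator would run into with a crawl alone.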

For those reasons, we decided to explore alternative web archiving methodologies, in addition to crawling the site.  (I wrote about our struggles with using Web Curator Tool for crawling in my last post.)  The solution we decided to attempt was to use a virtual machine to create a fully functional replica of the website.  Our goal was to recreate the environment in which the website was run as completely as possible, so that users could interact with the site as it was displayed in a browser and with the component databases and file directories.  With Sam’s help, we obtained all of the files used to generate the site from the iSchool’s servers, including HTML and PHP files, copies of databases, images, and other files.  We then created a virtual machine using Oracle VM VirtualBox, placed those files inside, recreated the databases, and installed the software that was originally used to run the site, including PHP, an Apache web server, and a MySQL database.  After a few challenges and hurdles, we successfully managed to get all of the various components communicating and working together.  We then saved the virtual machine as a VMDK file (an openly documented format), which we ingested in the iSchool’s DSpace repository, along with the individual component files and a crawled version of the site.

The page on the left is displayed in the Virtual Machine. The item on the top right is the PHP code that when processed by the server into HTML, and then rendered by the browser, generates that page. The items on the bottom right show the database query and result which the PHP code then utilizes. All of these components are accessible in the virtual machine.
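Because a virtual machine replica depends on gathering many different kinds of files, a quick inventory can help confirm what was actually handed over.  The following is only a rough sketch in Python, not the process we used: the extension-to-category mapping and the `site_files` directory name in the usage note are hypothetical, and a real site would need a longer list.

```python
from collections import Counter
from pathlib import Path

# Hypothetical grouping of file extensions into broad categories.
CATEGORIES = {
    ".html": "HTML", ".htm": "HTML",
    ".php": "PHP",
    ".sql": "database dump",
    ".png": "image", ".jpg": "image", ".gif": "image",
}


def inventory(root: Path) -> Counter:
    """Count the files under root by broad category."""
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            # Anything with an unrecognized extension is counted as "other".
            counts[CATEGORIES.get(path.suffix.lower(), "other")] += 1
    return counts
```

Calling `inventory(Path("site_files"))` on a folder of copied files would return a tally such as how many PHP scripts, images, and database dumps it contains, which could then be recorded alongside the ingested materials.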

We were very pleased that we were able to get the website up and running within a virtual machine environment, and all of us who worked on the project believe that this is an exciting new method for web archiving.  However, we also recognize that there are considerable challenges in using virtualization for web preservation.  Obviously, this project was quite time consuming and required considerable commitment not just from the archivists, but also from the site administrators.  It remains to be seen if this method can be made scalable, or will be useful only as a boutique solution for high-value websites.  Another considerable challenge regards access, privacy, and intellectual property.  To recreate the website in the virtual machine, we obtained copies of every file used to generate the website, and almost without a doubt, some of those files include material that should not be publicly accessible for various reasons.  We did make some efforts to clean up and redact sensitive information, for example by changing the passwords in the databases to prevent breaches of the iSchool’s security.  However, we were unable to commit the time necessary to granularly examine all of the files.  For those reasons, we chose to make the archived virtual machine and individual component files closed in the digital repository.  Obviously, though, open access would be much more desirable, and further study would be necessary to determine how to appraise and redact the files to allow that.
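As a small illustration of the kind of redaction described above, here is a sketch in Python of overwriting stored passwords in a database copy.  It rests on several assumptions: SQLite stands in for the MySQL database the site actually used, and the `users` table, its columns, and the sample rows are all invented for the example.

```python
import sqlite3

# Assumptions: SQLite stands in for MySQL, and the table, columns, and
# sample rows are hypothetical rather than taken from the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "hunter2"), ("bob", "correct-horse")],
)


def redact_passwords(connection: sqlite3.Connection) -> int:
    """Overwrite every stored password with a placeholder; return rows changed."""
    cursor = connection.execute("UPDATE users SET password = 'REDACTED'")
    connection.commit()
    return cursor.rowcount


changed = redact_passwords(conn)
rows = conn.execute(
    "SELECT username, password FROM users ORDER BY username"
).fetchall()
print(changed, rows)
```

A blanket update like this only handles the one column someone already knows is sensitive; finding everything else that should be restricted is the part that would still require examining the files one by one.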

Despite these challenges, I am very proud of our work on this project, and excited about the possibilities that virtualization offers for expanding the field of web archiving.

To learn more about this project, download our final report or view the poster we presented at the iSchool 2013 Spring Open House.  You can also view the archived website in the iSchool’s digital repository at https://pacer.ischool.utexas.edu/handle/2081/30462

Open-source software: Expertise required?

I have spent a lot of time in the last few weeks installing software.  Or, to put it more precisely, I have spent a lot of time trying to install software.  Some of my attempts have been successful; some, less so.  The time I spent tinkering with various programs (and occasionally wanting to bang my head against a wall) has led me to reflect on the evolution of my own technical skills and the ways that software programs and documentation encode certain expectations of their users.

Two years ago, I would have considered myself a fairly competent computer user.  I was able to do pretty much everything I needed to do on a computer, but I had very little knowledge about (or frankly, interest in) what went on under the hood.  In my time at the iSchool, though, I’ve learned quite a bit more, mostly through my coursework in digital archiving and records management, but also from just being in a more technical milieu.  I’m still very much a novice, but I am at least at the beginning stages of becoming an “advanced” user.  I know the basics of how a computer works on a physical and logical level, have a very rudimentary understanding of a few programming languages (not enough to do much on my own, but enough to interpret basic code), and most importantly, I am gaining the confidence to explore and try new things (just recently I made my first foray into a Linux command line).

I explain all of that mostly to express my frustration with the way most software installation guidelines are written.  In my attempts to install various open-source software programs, I’ve discovered that installation guidelines aren’t really intended for people like me.  For example, a group of classmates and I are currently trying to install Web Curator Tool on a computer running Ubuntu.  (My good friend and project collaborator Jarred Wilson describes the website project as a whole on his blog.)  WCT has two support documents available: one is for “users,” who presumably don’t need to understand the rather complicated back-end, and the other is for “system administrators,” who are assumed to already have a significant set of technical skills, to be able to do things like configure a database through the command line and deploy programs on a web server.  That’s understandable, but it excludes people like me, who need to install the tool but do not have an extensive technical background.  Before we can even get to installing the program itself, my group has had to spend hours teaching ourselves how to use the command line, how to deploy programs in Apache Tomcat, and other “preparatory” issues.  While that is frustrating and time-consuming for us as students, we still have access to resources and support staff who are willing and able to help us figure it out.  (In fact, one of our IT staff confirmed to us that the WCT installation guide was particularly unhelpful.)  It is very easy to imagine how issues exactly like this make it extremely difficult for small institutions or lone arrangers to take advantage of the really exciting open-source tools that are being developed.

One of my other projects this semester has actually been to try and create a beginner-friendly installation guide for the archival management program ICA-AtoM.  Working in a group, five of my classmates and I installed ICA-AtoM on our own laptops (including MAMP or WAMP software to create a server environment).  Then, we extended and simplified the installation instructions provided with the software, to explain exactly what needs to be done to run the program, step-by-step.  Writing those instructions has been a fascinating window into both how many different components are necessary to get a program up and running, and how difficult it is to clearly articulate the steps needed to get each of those components running in turn.

The experience of creating the ICA-AtoM installation guide has helped me understand how difficult and time-consuming it can be to create beginner-friendly documentation.  However, I believe that it is necessary work.  For open-source software to be truly “free,” more work needs to be done to make it accessible to a wider audience of users, including those without a strong technical background or access to IT support staff.  Combined with that, I think the archival profession and society at large need to encourage computer literacy at a much higher level, to begin decreasing the gap between “experts” and “beginners.”  As someone who is still working to bridge that gap, I know how difficult that is.  But our work (and our world) is only going to continue getting more technical, and we should all be able to participate in it fully.