How and why to cite museum specimens in research

About once a year, I take the time to comb through the Internet for references in books and journal articles to museum specimens in the collections I manage. Although I give every researcher who visits the collections instructions for keeping the museum informed if and when their research gets published, sometimes it doesn’t happen. Sometimes it’s an innocent mistake: it can be a decade between collecting data from specimens and publication, and in the tweaking of manuscripts, remembering to let the museum know about publications citing its specimens can drop off the priority list. Sometimes, however, it seems that researchers failed to listen to what those annoying museum people said and just ‘forget’, or make the citation up entirely.

Recently the researchers and collections managers at the Oxford University Museum of Natural History have undertaken a big drive to find orphaned citations of our collections going back to 2010 for our reporting cycles. With dogged determination to leave no stone unturned, we’ve managed to find an order of magnitude more citations that weren’t previously linked to the collections.

It’s fundamental to the scientific process, the future of museums and the legacy of biological sciences that hypotheses and research can be repeated and that we can trace the theory back to the evidence that leads to new conclusions being made. It’s really important to properly cite specimens and here’s why and how.

The End Is In Cite. What is citation? A citation is when a specific museum specimen is mentioned in a paper, and this can happen for a number of reasons. In taxonomic work, it may be that the specimen(s) is the basis for a new species and the characters shown by individual museum specimens define that group. It may be that molecular samples taken from museum specimens have been analysed to work out the relationships or evolution of entire groups of organismal life. It can also be that an individual specimen shows something completely unique, be it a bizarre colour variant, evidence of behaviour, pathology or extreme size. Citing a specimen should be as formal as referencing previous literature and should point future workers to the exact objects that were examined so that they can be rechecked or analysed. Normally this is a reference to the specimen’s museum or accession number.

Accession numbers. An accession number is one of the key pieces of documentation we use in museums. For accreditation we have to have policies, documents and procedures for how we generate and use accession numbers and associate specimens with them. Of course, useful specimens are more than just the physical object, and we need to keep all the associated information about them together. That can be field notebooks, loan forms, photos, scans, samples, legal documents, old labels and of course publications in which they are cited. It’s not practical to keep all this information physically with a specimen so instead we link it all together with the accession number. Accession numbers should be the unique identifying number that a museum hangs all the information off and are normally made up of two parts, a number and a prefix. The number part is usually the museum’s internal reference and the prefix is one that is used to separate it from all the other numbers in all the other museums. So at Oxford University Museum of Natural History, a mollusc would be OUMNH.ZC-M1234, a bird OUMNH.ZC-B1234 etc. In the UK there is a national list of accession number prefixes, so in theory there isn’t overlap. If you found a random specimen in the street, from the number you should be able to work out which museum it came from. However, not all museums use theirs, which I’ll get onto later. So if you use a specimen in research, whether you sample, examine, figure, scan or measure it, the gold standard is to reference the accession number so that it can be traced back to a specific rhinoceros, shrimp, dinosaur model or newt.
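To make the prefix-plus-number idea concrete, here is a minimal Python sketch of the kind of check you could run on a manuscript before submission. It assumes the layout seen in the two Oxford examples above (capital-letter prefix, a dot, then the museum’s internal number); that pattern is my guess from those two examples only and other museums will format theirs differently. The function names are made up for illustration.

```python
import re
from typing import Optional, Tuple

# ASSUMPTION: "PREFIX.internal-number", inferred only from the two examples
# in the text (OUMNH.ZC-M1234, OUMNH.ZC-B1234). Not an official specification.
ACCESSION_RE = re.compile(r"([A-Z]+)\.([A-Za-z0-9-]*\d[A-Za-z0-9-]*)")


def split_accession(text: str) -> Optional[Tuple[str, str]]:
    """Split an accession number into (prefix, internal number).

    Returns None when the text does not follow the assumed layout.
    """
    match = ACCESSION_RE.fullmatch(text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def has_expected_prefix(text: str, expected: str) -> bool:
    """True only if the text parses and carries the expected museum prefix."""
    parts = split_accession(text)
    return parts is not None and parts[0] == expected


print(split_accession("OUMNH.ZC-M1234"))  # ('OUMNH', 'ZC-M1234')
print(split_accession("OUM 1234"))        # None
```

A check like this only catches formatting slips; it can’t tell you whether the number belongs to the specimen you actually examined, so it’s still worth asking the museum to confirm.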

Don’t make the number up. In the recent exercise to update the citations of our collections, we found nineteen and counting versions of our accession numbers and half that many variants of the name of the Museum. Some of them were in papers from this academic year. Variants include OUM, ONM, OMNH, HOPE, HED, ONHM and names of the museum have been: Oxford Museum, Oxford Museum Natural History, Oxford University Museum and so on. This not only makes it hard for us to find citations when we’re not told about them, but making up numbers, and sometimes museums, is sloppy. It may not seem a huge deal now but I’ve run into acronyms and numbers which can’t be traced to any institution, existing or closed. Not being able to go back to the original material can lead to:

Trash science. Okay, so mis-citing a number might seem like no big deal now, but it creates confusion in the future. As a collections manager, I get weekly enquiries for specimens that were published back in the 20th, 19th and 18th Centuries, and if the accession number is wrong, it can make tracking that specimen down difficult or impossible. That’s when the specimen number is even referenced in the first place. A number of papers didn’t cite numbers at all. ‘Seven owls’ or ‘material’ was examined and here’s the new hypothesis nobody can substantiate. That’s trash science. Scientists bang on all the time about the scientific method and repeatability. It’s one of the main reasons palaeontologists are against expensive auctions of important material that end up in private collections (not any more it seems) because research on them is not repeatable if access isn’t guaranteed. If you don’t cite the specimens you looked at (or even took samples from), nobody can check your work or your missed assumptions and biases. This is particularly pertinent for collections I look after, because there’s often very little awareness of the biases in a collection, variation caused by preparation method, ‘pollution’ of wild samples with captive bred animals as well as more general temporal or geographical biases in sampling that vary from museum to museum. Research becomes less objective and more subjective if the steps in the process and original material cannot be retraced. As data is getting linked ever more, it’s really important to be able to go back to the primary evidence, especially when genetic data from a limited number of samples is ending up in phylogenies (and contamination can be a pain).

Waste of Time. It can also be a waste of time, as biologists working on applied research sometimes get bogged down slogging through the busy work of historical taxonomy. This problem is particularly rife in 19th-century literature, which is plagued with dodgy and vague type descriptions that can’t be traced back to a specimen. I’ve had researchers working on poison evolution and subspeciation of insular reptiles run into it, and my entomologist colleagues have researchers trying to piece together historic ranges of species which have been misidentified in the past. By not citing specimens properly, this problem is perpetuated.

Leach's description of the type specimen of Loligo banksii

Leach’s original description of Loligo banksii roughly translates as normalish arms with rhomboid fins, like you know, most squid. (Leach 1817).

From reality to ‘data’. When I tweeted some of my frustration with citations, Adam Jenkins linked me to this great blog post from 2011 by Roderick Page on Dark Taxa in GenBank (the comments are worth a read too for an insight into the stamp collectors vs sequence scrutinisers). In brief, there was, and I’ve noticed continues to be, an awful lot of animal genetic data which isn’t properly identified (or linked to an original specimen). This data is then widely used in further genetic work, and for most species there are still precious few sequences, which get used often. Having seen colleagues struggle with this, I know it can be impossible to tell rogue sequences from closely related species’ genetic data alone, especially if there are only sequences from a few individuals in the first place. If a sequence isn’t linked back to an original specimen (and so many aren’t) that’s deposited in a permanent and accessible museum or tissue bank, that sequence can never be rechecked to see if the original identification was wrong or if it’s a genetically distinct organism or (probably) a contamination error. Having been involved with a number of “we’re clearing out the lab do you want this stuff” events in the past, I suspect many of those samples that never left the lab don’t exist anymore.

Reference Rot. This is less of an issue now that most reputable journals encourage electronic publishing of supplementary data with or in publications and funding bodies stress the openness and permanence of raw data, but in the past it was possible to ‘host’ supplementary data elsewhere. Some authors prefer to tuck methods and materials citations in supplementary data rather than in the body of the paper itself, and I’ve come across a few where the data was stored on some university webpage which, three website restructures later, is no longer available. I’ve chased down former PhD students and heads of labs only to find out that the backup data died with a laptop. Again, this makes your science trash as there’s no way to try to repeat the science and compare. It’s also impossible for museums to retrospectively link those citations to the specimens that were used in them if the clunky museum numbers are only referenced in the supplementary data.

You will publish, you will perish. Publish or perish is sadly the mantra/motto for many academic scientists and is seen as necessary for career progression, securing the next grant and occasionally advancing science. However, in museums we take a longer term view and preserve your legacy of research on our specimens too. We have specific documentation systems that can record a potentially infinite amount of information about our specimens so we don’t lose those links and that history, and so that research is repeatable 2, 4, 50 or 100 years later. Our time scales are often ‘perpetuity’. We record this information partly because, as I mention above, hundreds of years later people are still referring back to scientific papers of the past, and partly because your research then becomes part of the history of the collection and the museum. It helps us prioritise what to preserve and it helps us with modern research enquiries. I have lost count of the number of enquiries I get which end in a dead end because I couldn’t find ‘the rather fine bat’ that was once described in the museum.

Help museums help you. For biologists who work with museum material, you may see us as the ‘annoying forms people’ who won’t let you saw the only known specimen of species X in half so an undergrad student can do a speculative analysis. In museums we fight extremely hard on your behalf to remain accessible, often at no extra cost to students and researchers at all levels, be it specimen access, specimen loans, image reproduction or assistance with identification. Particularly now, when there is pressure to generate income across museum services, we fight hard to keep that access as inexpensive as possible. For many natural history museums, although we work with a range of audiences, scientific publications using our specimens are one of our key measures of success, justifying our staff and existence to local, national and international funding bodies. We’re ecstatic when publications using the collection come out because it’s one of the reasons we are there in the first place. We’re actively keen to promote your research on our collections to our audiences through exhibitions, online and social media. All we ask for in return (and often this is on one of those forms you sign) is that you let us know when you publish, give us a copy of the paper or book and cite the specimens properly. We’re happy to prioritise giving accession numbers to new specimens and even check the numbers (and the name of the museum) are correct. In the sad event of a museum closure, we can prioritise ensuring that cited specimens are transferred to another publicly accessible institution. If you don’t cite our specimens or let us know you’ve cited them, it’s not guaranteed we’ll know.

We’re getting better. Museums have problems. Many of us have backlogs. In the ‘Dark Ages’ for natural history museums, from the 1930s through to the 1980s, thousands of specimens and specimen data were lost, destroyed and misplaced. Many museums are run on a shoestring. It can be hard for researchers to find museum specimens and there’s room for improvement in how we respond to your occasionally less than helpful enquiries. Many of our collections still don’t have accession numbers but we prioritise doing this for specimens that will be published and we actively collect cited material (ideally between acceptance and publication) as a repository of important material to be made available to the worldwide community of scientists (and all of our audiences). I give every visiting researcher a form with all the relevant information on and explain how important properly citing material is and I encourage colleagues who work with researchers in other museums to do the same thing.

Check your citations. In the most recent exercise of hunting down publications there was a large variation in the visibility of references. For publishing researchers, I’d really recommend taking the time to see if you can find your own publications through free and paywalled portals. At Oxford University, I’m fortunate to have the University’s access to libraries and journals. Even with that privilege it’s very interesting to see what can be found through search engines, archives and publishers’ websites. In some instances, it was impossible to find fairly recent papers even when I had the specific title to search. In other instances, references were only caught on people’s staff pages, blogs or research profiles. Again, in an age where citations and impact are important metrics, it’s really worth checking that your papers aren’t ‘fired off and forgotten’, as even with the fairly strict search criteria of finding research on Oxford University Museum of Natural History specimens between 2010 and 2016, a journal by journal approach is sometimes more fruitful than searching through scholarly portals. It’s also important to note that many expensive or smaller journals do not come up at all through Google Scholar even with strict search terms.

Editors note. This is also a plea for journal editors to pick up sloppy citations more, or even update guidelines about referencing specimens. It’s equivalent to not referencing properly but seems to get through more than it should, particularly in open access journals. I’ve seen photographs without citations, unrecognisable institution names, phylogenies with uncited sources of information and even plaster casts and models cited without mention of the fact that they aren’t biological material. As I said above, most museums highly value research undertaken on their collections, so do check in with your friendly neighbourhood museum curator to check that the museum information is present and correct and make your science even better.

This isn’t to say that there isn’t great research practice out there and this is a shout out to those researchers who do diligently cite material, deposit their material with us in a timely fashion, keep us informed with progress on their research and send us copies of their publications for us to archive.

References

Leach, 1817, Zoological Miscellany; being Descriptions of New or Interesting Animals, 3(30): 137–141 [141].

11 thoughts on “How and why to cite museum specimens in research”

  1. Enjoyed that article. It was lovely to read about the perspective of museums. This – in my opinion – does not reflect many issues I experience.

    Especially molecular biodiversity research has a huge collection bias – only researches museum collections. Fair enough, but not all museums are well-organised and actually care about their collected biosamples much.

    Samples get lost, are being destroyed, and researchers who had access make parts of them disappear apart from failing to have proper expertise to identify the specimens according to scientific (rather than religious or money-driven) principles.

    Museums have become places where science is being blocked, held back and blackmailed in my area (mollusca). Samples are hidden away by ambitious curators who claim taxonomist expertise rather than curative ambition/skills.

    New specimens of new species are systematically excluded from being officially collected. And the collective academia is looking away and keep playing the pseudo scientific game.

    Anyway, the question to you would be: are there efforts to have Museum’s Access Numbers and Vouchers of all collected material available in a global database? Not just individually for well-funded/established organisations.

    Furthermore, in times of political funding uncertainties, how can the loss of seemingly worthless collections be avoided?

    When even Museum CEOs openly admit dubious commercially focussed (and sometimes borderline illegal) collection efforts, how long – I ask provocatively – will we have and need museum collections (and curators by the way)?

    Thank you for your article.
    Regards
    Patrik Good

    P.S. Written from my mobile and hence not proofread and possibly not quite stringent which I apologise for.


  2. Patrik thank you for your comment and the problems you point out are pressing.

    There are efforts to make museum specimens available online, ideally in one portal, but as you point out it tends to be larger museums that experiment with a potential portal and then move on. In the UK we’ve been trying to redress this issue with a project called NatureData, to create a portal that allows scaleable data to be added (so starting with who has ‘frogs’ to families to species to specific specimens). However, as you point out, the key problem is always that the original data doesn’t exist yet. In the past this has been explained away because we have ‘too many specimens’, which is not satisfactory to me.

    The project is also currently without resourcing.

    The other question I have about creating these portals is where is the user base? In my experience working at two university museums, many researchers start with the museums that they know and aren’t using online databases, where they exist. Couple that with the fact that many museums are still not very user friendly for researchers (even finding contact details is prohibited at some institutions) and we’re missing those connections. I cover this a bit in my blog post about finding biological specimens. GBIF is probably the largest of these portals but not many researchers know it exists AND there are only three(?) UK museums with some data on there. Myself and others are pushing to train new generations in the art of using museums and trying to encourage them to think beyond the Natural History Museum when it comes to looking for samples.

    The loss of collections is something that the Natural Sciences Collections Association works hard to prevent http://www.natsca.org/ and it is difficult work as often closures of collections are sensitive and secretive. It’s not so much wholesale collections closures which cause the issues but the loss of subject specialist staff at institutions. So although the collection may be mothballed it’s effectively ‘inaccessible’ without a ‘curator’.

    Personally, I think we will always need museum collections and we need growing ones at that, however, the fate of natural history museums as research institutions is tied to that of whole organismal biology, which is ever a shrinking world despite, taxa by taxa, us knowing very little about any given species and it being increasingly important to have ‘organismal people’ in the field and lab.


  3. A bit of a tangent Mark but I wondered how you automate (if you do) the “combing through for references” to collection items ? Or is it just a case of knowing who has contacted you and what journals are likely to have citations, and then lots of very manual trawling through them?


  4. Hi Richard, sorry for the moderation delay. A lot of researchers do keep us up to date and we’re fortunate enough to have taxonomic experts who keep on top of the publications in their group.

    For the yearly search, I hit up Google Scholar, Web of Science and the portals for the major science publishers. Some of these tools are useful, some have really limited search engines. E.g. Google Scholar has issues if you run a standard search and organise by date. The whole thrust of the post above is that there aren’t one or two useful terms to search, as the name of the museum or the accession number format is often incorrect. It also makes setting up notifications quite tricky.

    Many ‘museological’ publications don’t appear on these, presumably because they are too small or not ‘high impact’ enough, so I manually search the likely suspects. There are a handful of museum publishers too that I search under.

    Unfortunately, as we get so many researchers and because the gears of science publishing can be quite slow it’s not possible to try to track the research from the researcher (they also move about quite a bit).

    I normally set aside an afternoon a year and, without fail, it has always yielded new research on the collections that the museum was previously unaware of.


    • Damn, I was hopeful you had some magic search term you were using to retrieve them all in one easy query! I ask because I’m looking into the need for us (V&A) to set up an institutional repository, and am assuming there will be a future requirement for museum data to fit into that ‘somehow’. Given the attempts to give research data citable identifiers (e.g. https://www.datacite.org/), I’m puzzling out where collections data could/would fit into that model. But obviously adding another identifier doesn’t solve the basic problem of people not bothering to put in the citation in the first place.


  5. Hi Richard. Even for a quite distinctly titled museum you’ve got, the V&A (502 Google Scholar 2016 publications), V and A (82), Victoria and Albert Museum and Victoria & Albert Museum (2120) and not much coming up for combinations with VAM in it….


