Sep 06 2016

University of Edinburgh/RBGE student David Bell, studying for the Masters degree in the Biodiversity and Taxonomy of Plants; thesis submitted August 2009.

Supervisors: Dr David Long and Dr Michelle Hart.


David used the plastid DNA barcode markers rbcL (from 34 accessions) and psbA-trnH (from 36 accessions) to look at the four species of Herbertus in Europe: H. aduncus subsp. hutchinsiae (British Isles, Norway and Faroes), H. stramineus (British Isles, Norway and Faroes), H. borealis (Scotland and Norway) and H. sendtneri (European Alps).

In addition to the four recognised taxa, David’s study identified a fifth species, later named as H. norenus, that occurs in Norway and the Shetland Isles.

A paper based on David’s MSc thesis work was published in Molecular Ecology Resources in 2012.

Herbertus norenus, photographed by David Long

Mixed sward including Herbertus norenus, photographed in Shetland by David Long


Bell et al. 2012, MER



Other student projects at the Gardens:

Student projects at RBGE: DNA barcoding British liverworts: Lophocolea

Student projects at RBGE: Barcoding British Liverworts: Plagiochila (Dumort.) Dumort.

Student projects at RBGE: Barcoding British Liverworts: Metzgeria

Aug 30 2016
EDNA label printer

The EDNA label printer in the office

Over the years, many different people have used the molecular laboratories at RBGE, to work on a multitude of projects on a multitude of plants and fungi. Some are staff members who stay for decades, others students who are only in the lab for a matter of months. Every time DNA is extracted and used in a molecular project, the amplified gene regions are processed and then the plastic tubes that they were in are sent for recycling – but the extracted DNA is kept in a DNA bank, in case it is needed for further research. Logistically, managing this DNA can be problematic. Scientists like to use their own numbering systems when they’re working (mine used to be one of the commonest – my initials followed by consecutive numbers, a system which worked perfectly until some of my extractions ended up in the same freezer as extractions by Dr Linda Fuselier), something quick and easy to scrawl onto the plastic tubes. This can link to collection information written in a lab-book, including who collected the plant, what date it was collected, and what country it came from. However, as people move on, and as the years pass, it becomes increasingly difficult to find any particular sample or set of samples, particularly when several sets of people share the same initials – and this is compounded by having to rummage through boxes of frozen DNA samples being kept at either -20° or -80°C. Few places at the Botanics are less pleasant than the dank room that contains our -80°C freezers!



Printed labels getting stuck onto EDNA tubes, Lab 32

The frustrations associated with rooting through inconsistently labelled DNA collections led Dr Michelle Hart and Alex Clarke, in 2006, to instigate a standardised format for DNA labelling. Samples of DNA are identified as part of the RBGE DNA bank and assigned EDNA numbers, the format of which consists of the year the DNA was banked, followed by a multi-digit identification number. For example, the last EDNA number that we have issued is EDNA16-0045851, for DNA extracted from the moss Weissia controversa. Due to uncertainties about institutional databases, in its early years the DNA bank was curated through Excel spreadsheets; this was revamped and upgraded in 2011 to the database that we still use today. Information about the methods and date of DNA extraction, the material’s collector, and the place of collection are all stored and easily retrieved, critical information if the DNA is going to be used to provide data for future publications. The EDNA number stays on all downstream files that are created from the DNA – lab books, raw sequence files, and it is also included as the isolate number in GenBank submissions – meaning that all molecular data generated at RBGE is still valuable after people have moved on and lab books have been mislaid.
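As a rough illustration of the number format described above, here is a minimal Python sketch that splits an EDNA label into its year and identification number. The pattern is an assumption inferred from the single example EDNA16-0045851 (prefix, two-digit year, hyphen, digits); the function and class names are hypothetical and are not part of the RBGE database.

```python
import re
from typing import NamedTuple

# Assumed pattern, inferred from the one example EDNA16-0045851:
# "EDNA", a two-digit year, a hyphen, then a run of digits.
EDNA_PATTERN = re.compile(r"^EDNA(\d{2})-(\d+)$")


class EdnaNumber(NamedTuple):
    year: int    # four-digit year the DNA was banked
    serial: int  # identification number, leading zeros dropped


def parse_edna(label: str) -> EdnaNumber:
    """Split an EDNA label into banking year and identification number."""
    match = EDNA_PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not an EDNA number: {label!r}")
    # The scheme began in 2006, so a two-digit year is read as 20xx.
    return EdnaNumber(year=2000 + int(match.group(1)),
                      serial=int(match.group(2)))


print(parse_edna("EDNA16-0045851"))
```

Something like this could be used to pull the EDNA number back out of a sequence file name when sorting old data, provided the file names follow the same pattern.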



A labelled EDNA tube ready for the DNA sample, Lab 32

As to what happens to the actual DNA extraction, long-term storage involves transferring the liquid into a small barcoded and labelled tube in a lockable and numbered 96-tube rack, which will be kept on a labelled shelf in a -80°C freezer. The system is not perfect, however – banking or recovering the DNA samples still involves a trip to our mildewy bank room…


Pipetting DNA samples into labelled EDNA tubes, Lab 31

Aug 23 2016

When people extract DNA in the RBGE molecular lab, we insist that it’s given something we call an EDNA (Edinburgh DNA) number. This links to a database that is internal to RBGE.

The EDNA number is used for all internal molecular lab processes – it’s written on the tube of DNA, used to refer to the sample in lab books, and part of the file name for all DNA sequences that are generated from that sample. Using this standard system across all projects means that we can keep track of what DNA we have, we can store it in a way that makes it relatively easy to retrieve, it can be used in other projects, and critical information like which specimen voucher is linked to a DNA extraction is not lost if people move on from RBGE.

Getting an EDNA number involves filling in a simple Excel spreadsheet with some basic collection information, and uploading it to a database. The Excel spreadsheet is accessible to RBGE lab users on an internal server (DNA, Molecular lab registration forms, EDNA (DNA), EDNA_submission_sheet_v01), and has two sets of fields, required and additional. If anything’s missing from the required fields, an EDNA number will not be issued, whereas the additional data is recommended but not essential… However, the more complete the data entry is, the faster it is to generate GenBank submissions and publication voucher tables from it, which justifies spending a little extra time on getting the forms completed.

Two points to remember when filling in the spreadsheet are not to use special characters, and not to make any of the entries too long, as there’s a maximum character number.



Taxon name: this should not have authority information (Bellis perennis L.), just the genus and specific epithet (Bellis perennis).

Collector name: this cannot begin with an initial (J. Smith) as it will be rejected by the database; either use a full Christian name (John Smith), or put the surname first (Smith, J.).

Collector number: if there is none, s.n. is accepted.

Country code: two-letter standard codes; when filling in the spreadsheet, there is a tab with all the codes that you can look up (e.g. DE for Germany).

Material type: drop-down menu choices – fresh, frozen, herbarium, seed, silica gel dried.

Extraction type: drop-down menu choices include tissue maceration type, e.g. pestle, or mixer mill, and chemistry used, e.g. CTAB, Plant DNeasy minikits, Qiextractor.



User DNA ID: this is the number that was given to the extraction in the lab; it’s extremely useful to have this for various troubleshooting in the lab – it can help match accessions to tubes, sort out issues with sample order, etc.

Extraction Date: entered in standard format year-month-day. Again, this can be useful for later troubleshooting, e.g. for separating batches of extractions by date, in case something went wrong on a particular day.

Herbarium barcode: this is ONLY for RBGE herbarium barcodes, not those from other institutes. If this is available, filling this in will propagate specimen data from the herbarium database. However, the required fields still need to be filled in.

Living Accession Number: this is ONLY for RBGE living accessions, not those from other institutes. If this is available, filling this in will propagate specimen data from the living collection database. However, the required fields still need to be filled in. The qualifier letter should not be filled in here.

Living Qualifier: this field is for any alphabetical character after the Living Accession Number.

Silica Gel Box Number: this field is best left empty unless silica material came from a box numbered in the same format as “SGN12345”.

Sample note: free field, but there is a limit on how many characters are allowed, so should be kept short, and free from special characters. It may be useful to note e.g. if the extraction was from sporophyte versus gametophyte tissue, or flower versus leaf.

Location: free field, but there is a limit on how many characters are allowed, so should be kept short, and free from special characters.

Coordinates: free field, but there is a limit on how many characters are allowed, so should be kept short.

Decimal longitude:

Decimal latitude:

Collection Date Verbatum: this is for dates that cannot be turned into the correct date format, e.g. “Spring 1920”, “October 1976”.

Collection Date: entered in standard format year-month-day. This can be very useful in relation to DNA quality. If this is filled in, there is no point also filling in the Collection Date Verbatum field.

Note: free field, but there is a limit on how many characters are allowed, so should be kept short, and free from special characters.
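The field rules above can be checked before uploading with a few lines of Python. This is only a sketch under stated assumptions: the dictionary keys (taxon_name, collector_name and so on) are hypothetical names rather than the real spreadsheet column headers, the 50-character limit is a placeholder because the actual maximum is not given, and "special characters" is approximated as anything outside plain ASCII. It does not reproduce the database's own validation.

```python
import re
from datetime import datetime

# Drop-down choices listed for the Material type field.
MATERIAL_TYPES = {"fresh", "frozen", "herbarium", "seed", "silica gel dried"}
MAX_LEN = 50  # hypothetical: the database's real character limit is not stated


def check_row(row: dict[str, str]) -> list[str]:
    """Return a list of problems found in one spreadsheet row (empty if none)."""
    problems = []

    # Taxon name: genus and specific epithet only, no authority.
    if not re.fullmatch(r"[A-Z][a-z]+ [a-z-]+", row.get("taxon_name", "")):
        problems.append("taxon name should be genus plus specific epithet only")

    # Collector name: must not begin with an initial ("J. Smith").
    collector = row.get("collector_name", "")
    if not collector or re.match(r"[A-Z](\.|\s)", collector):
        problems.append("collector name is missing or begins with an initial")

    # Collector number: any non-empty value, "s.n." if there is none.
    if not row.get("collector_number", ""):
        problems.append("collector number is empty (use s.n.)")

    # Country code: two capital letters, e.g. DE.
    if not re.fullmatch(r"[A-Z]{2}", row.get("country_code", "")):
        problems.append("country code should be two letters")

    if row.get("material_type", "") not in MATERIAL_TYPES:
        problems.append("material type is not one of the drop-down choices")

    # Optional dates: year-month-day.
    for field in ("extraction_date", "collection_date"):
        value = row.get(field, "")
        if value:
            try:
                datetime.strptime(value, "%Y-%m-%d")
            except ValueError:
                problems.append(f"{field} is not in year-month-day format")

    # Every entry: short, and plain ASCII as a stand-in for "no special characters".
    for field, value in row.items():
        if len(value) > MAX_LEN or not value.isascii():
            problems.append(f"{field} is too long or has special characters")

    return problems


example = {
    "taxon_name": "Bellis perennis L.",   # authority included: flagged
    "collector_name": "J. Smith",         # begins with an initial: flagged
    "collector_number": "s.n.",
    "country_code": "DE",
    "material_type": "herbarium",
}
for problem in check_row(example):
    print(problem)
```

Running a check like this on each row before pasting the sheet into the importer would catch the commonest rejections early, although the database's own validation remains the final word.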


Once the EDNA form is filled in, it can be uploaded to the EDNA database, which is available to users at RBGE who have a Username and Password.

Once logged on, the tab ‘Importer’ becomes highlighted; at the bottom of the Importer screen is a “Load” button. The information in the Excel sheet should be pasted into the ‘Load data’ window, and mapped to the fields. This will leave four fields that need to be filled in manually, three required fields: User (the lab user’s name, available from a drop-down list); Project (again, from a drop-down, e.g. MSc, barcoding, Leguminosae); Contact (a permanent staff member who will take long-term responsibility for the project, chosen from the drop-down list) – and one optional field, EDBANK Format (how the DNA will be stored long term – Plate, Strip or Tube; for most phylogenetics projects DNA will be stored in individual tubes, while for some population genetic projects it will be stored in strips or plates – check with the molecular lab staff if unsure which format to choose).

After this information is filled in, the tab “Validate” becomes available. The entered data is screened for things like collector names that start with initials, accession numbers, dates, latitudes and longitudes that are in the wrong format, or other errors. If any are found, then these need to be corrected in the Excel spreadsheet and the information all needs to be reloaded and re-entered. If there are no validation errors, the “Import to EDNA” button becomes available. At this point, the data will either successfully import, or other errors will be identified (e.g. non-standard characters, or too many characters). Unfortunately errors identified at this later point only stop EDNA numbers being generated for individual samples rather than for the whole batch, and it is not possible to cancel the issued EDNA numbers. This means that, for example, if entering a plate of 96 DNA extractions to EDNA, it’s quite possible for some samples in the middle of the plate to not be assigned a number. Obviously this becomes a sample labelling headache that is optimally sorted by redoing the entire batch to get consecutive EDNA numbers for all the samples, although this will lead to apparent duplicates of samples in the database. Molecular lab staff should be informed of redundant numbers, so that the duplicates are not also assigned places in the DNA bank.
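One way to spot samples that dropped out of a batch at the import step is to compare the list of User DNA IDs that were submitted with the rows that come back from the database. The sketch below assumes the returned rows have been read into dictionaries with the hypothetical keys user_dna_id and edna_number; the sample IDs and EDNA numbers are made up for illustration.

```python
def unassigned_samples(submitted_ids, returned_rows):
    """List submitted User DNA IDs that came back without an EDNA number."""
    assigned = {
        row["user_dna_id"]
        for row in returned_rows
        if row.get("edna_number")
    }
    # Keep the submission order so gaps can be matched to plate positions.
    return [sample for sample in submitted_ids if sample not in assigned]


# Made-up batch: the third sample failed at the import step.
submitted = ["XX001", "XX002", "XX003", "XX004"]
returned = [
    {"user_dna_id": "XX001", "edna_number": "EDNA16-0000001"},
    {"user_dna_id": "XX002", "edna_number": "EDNA16-0000002"},
    {"user_dna_id": "XX004", "edna_number": "EDNA16-0000003"},
]
print(unassigned_samples(submitted, returned))
```

A non-empty result is the cue to fix the offending rows and, if consecutive numbers matter, to redo the batch and tell the molecular lab staff which numbers are now redundant.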

When the numbers have been generated, they can be downloaded from the database by clicking on the “Tasks” tab, and the “As Spreadsheet” option – this will return all the information that has just been entered, along with the EDNA accession numbers for each sample.


See also:

The RBGE DNA bank

Jan 30 2016

This last week I’ve actually managed to spend a bit of time in the lab, trying to get some gaps filled in a DNA barcoding matrix for the simple thalloid liverwort Aneura. David Long and I are heading off to Trondheim in just over a week to combine our data set with one generated by Ana Maria Séneca Cardoso, working with Lars Söderström and Kristian Hassel at NTNU.


A fridge shelf piled with racks and plates of DNA in the RBGE PCR lab

Many of the DNA extractions that I have been trying to amplify are old (with a very few that were extracted 14 years ago at Southern Illinois University). Most of them have already been tried, and have previously failed to amplify, for the four selected barcode markers (three plastid genes, rbcL, rpoC1 and matK, and one plastid intergenic spacer region, psbA-trnH; a nuclear marker, ITS2, was originally included, but proved difficult to get good sequence data from). Most of the DNA extractions I’ve needed were scattered across a large number of Edinburgh DNA bank (EDNA) plates, while some of the rest of it had never been aliquoted out of the original QiaXtractor plates.


Our hard-working PCR machines wait for samples in the RBGE molecular lab

Because the amount of time needed to chase down the DNA samples was more than the amount of time needed to set up the reactions, and because my default protocol has changed, over the course of this project, from using CES as a PCR enhancer to using TBT-PAR as a PCR enhancer, I decided for three of the loci to test the amplification with both CES and TBT-PAR, building on a previous Botanics Stories posting. (The exception was matK, for which we use a different polymerase, Invitrogen’s Platinum, and 5M betaine as a PCR enhancer; we have found that this gives better sequence reads than amplification with a standard Bioline Taq does.)


Getting ready for PCR – my reagents defrosting on ice

Reactions were set up in 20 µl, using exactly the same reagents (Sigma water, 5x buffer, magnesium, dNTPs and forward and reverse primers, with 1 µl each of the Aneura DNA extractions), with the exception of the 4 µl of either CES or TBT-PAR per sample.
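For anyone setting up a similar comparison, the master mix arithmetic can be sketched in Python. Only three per-reaction volumes here come from the description above: 1 µl of DNA, 4 µl of additive, and 4 µl of 5x buffer (which follows from diluting a 5x stock to 1x in 20 µl). All the other volumes, and the polymerase entry itself, are hypothetical placeholders chosen to make the example add up, and are not the RBGE recipe.

```python
TOTAL_UL = 20.0  # reaction volume from the text

# Per-reaction volumes in microlitres.
COMPONENTS = {
    "5x buffer": 4.0,                  # 5x stock diluted to 1x in 20 ul
    "additive (CES or TBT-PAR)": 4.0,  # from the text
    "template DNA": 1.0,               # from the text
    "magnesium": 1.0,                  # hypothetical placeholder
    "dNTPs": 2.0,                      # hypothetical placeholder
    "forward primer": 1.0,             # hypothetical placeholder
    "reverse primer": 1.0,             # hypothetical placeholder
    "polymerase": 0.2,                 # hypothetical placeholder
}


def water_per_reaction(components, total=TOTAL_UL):
    """Water needed to bring one reaction up to the total volume."""
    used = sum(components.values())
    if used > total:
        raise ValueError("components exceed the reaction volume")
    return round(total - used, 2)


def master_mix(components, n_reactions, spare=1):
    """Scale everything except the template DNA, which is added per tube."""
    per_tube = dict(components, water=water_per_reaction(components))
    per_tube.pop("template DNA")
    return {name: round(volume * (n_reactions + spare), 2)
            for name, volume in per_tube.items()}


# One mix per additive: 8 samples plus one spare reaction's worth.
for name, volume in master_mix(COMPONENTS, 8).items():
    print(f"{name}: {volume} ul")
```

Because the two sets of reactions differ only in the additive, the same function can be run twice, once for the CES mix and once for the TBT-PAR mix.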


Inside the laminar flow hood and ready to go

The PCR products were all run out on standard 1% agarose TBE gels, at 80 volts, for 40 minutes, stained with SybrSafe, and visualised under a blue light, to test for amplification success. As a size standard, 3.5 µl of an Invitrogen 1 kb ladder was loaded at intervals on each gel.

Several of these gels are shown below. In these images, DNA is stained so that it fluoresces in bright blue or UV light. The samples have been loaded into the gel in holes, or wells, that can be seen at the top of, and at regular intervals down, the gel, and migrate in an electric field towards the positive electrode, which would be at the bottom of the image. Smaller fragments move faster through the gel, meaning that samples can be separated by the lengths of the DNA fragments.

The comparisons of amplification success with CES and TBT-PAR are unfortunately ambiguous.

For the psbA-trnH region, using TBT-PAR was more successful than using CES – as can be seen particularly clearly in the second row down, where only 3 of the 8 extractions amplified with the CES additive, but all 8 amplified with TBT-PAR.


Aneura DNA amplified for the psbA-trnH region: left hand side – with CES additive; right hand side – with TBT-PAR additive

For both rbcL and rpoC1, it’s harder to get a clear picture. Some samples amplified with one additive rather than the other, while most samples amplified with both.


Aneura DNA amplified for rpoC1 region – first two rows with TBT-PAR additive; second two rows with CES additive

For a set of the rpoC1 amplifications, although all samples amplified using both additives, the bands from the reactions with CES (the three lowest rows in the gel image) are rather brighter than those from the reactions with TBT-PAR (the upper three rows), meaning that there is more amplified product in them. However, the CES bands do appear slightly more smeary, so it may be worth comparing the quality of sequence data from both sets of reactions as well as just considering amplification success.

The gels for the rbcL samples (shown below) are harder to interpret – the consistent bright bands are not the PCR product that I am looking for, but represent an artefact of the reaction known as “primer dimer”; the PCR product is the second, slower (and therefore longer) DNA fragment that is sometimes present. There seem to be more samples that have amplified with the CES additive, although a few samples that have failed with CES have instead amplified with TBT-PAR, meaning that in this instance, having used both protocols in parallel will allow me to generate DNA sequences from more accessions of Aneura than I would have been able to, had I just used one PCR protocol.


Aneura DNA amplified for rbcL plant barcode region with CES additive


Aneura DNA amplified for rbcL plant barcode region with TBT-PAR additive (2 gel images)


The only recommendation that I can think of from this is that, if time is pressing, the DNA samples are difficult to access, and you need as much amplification as possible as fast as possible, do two sets of PCR reactions, using both additives.

This does, however, have the unfortunate side effect of doubling the cost of your PCR, as well as giving you twice as many samples to load onto your gels…
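The reasoning behind running both additives in parallel can be put as simple set arithmetic: the accessions worth sequencing are the union of those that amplified with either additive. In this Python sketch the sample names are made up and the function name is hypothetical; it just tallies gel scores that have been recorded by hand.

```python
def compare_additives(ces_success, tbt_success):
    """Summarise which samples amplified with each additive."""
    ces, tbt = set(ces_success), set(tbt_success)
    return {
        "both": sorted(ces & tbt),
        "CES only": sorted(ces - tbt),
        "TBT-PAR only": sorted(tbt - ces),
        # Everything that can go forward to sequencing.
        "either": sorted(ces | tbt),
    }


# Made-up gel scores for five samples.
ces_bands = ["A1", "A2", "A4"]
tbt_bands = ["A2", "A3", "A4"]
for group, samples in compare_additives(ces_bands, tbt_bands).items():
    print(f"{group}: {len(samples)} {samples}")
```

The "either" group is the pay-off for the doubled cost: it can never be smaller than the better of the two single-additive sets.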


See also: Botanics Stories: Sparking additions in the Molecular Lab.

Apr 12 2013

It’s a confusing world out there – betaine, DMSO, bovine serum albumin (BSA), trehalose, glycerol, formamide – the list of things that you can throw into a PCR to make it work better (or work at all) is long, and to make it even worse, there’s no need to stick to one additive; sometimes two or more can make the difference between getting no bands and getting bands on your gel. Then, of course, you can vary the amount of your selected additive/s (and when you’re done with that, you can vary the amount of magnesium, or polymerase, or template DNA, or all of them, and then you can vary the reaction times and temperatures; in fact, it would probably be possible to spend the rest of your life just optimizing a single PCR, if you were so inclined).

The additives themselves are a little like mops and irons. The mops (e.g. BSA, Tween-20) mop up ‘bad stuff’ that you’ve inadvertently added to your PCR mix along with your genomic DNA, and the irons (e.g. DMSO, betaine, trehalose) weaken bonds holding DNA together and straighten out DNA that has strong secondary structure, so that the polymerase enzyme can get in and work better. Although this applies particularly to loci like the nuclear ribosomal regions, where the DNA sequences are full of the stronger G-C bonds, even plastid regions that are not considered to have very strong secondary structure and are biased towards weaker A-T bonds often amplify better if you throw a few ‘irons’ into the mix.

At RBGE something called Combinatorial Enhancer Solution (CES) has become one of the most popular additives. The recipe we use contains betaine, DMSO and BSA, while Ralser et al.’s original published recipe also includes dithiothreitol (DTT). For many loci our default PCR recipe involves adding the pertinent amount of a 5X CES stock and our success rates from these are usually rather good. However, a couple of months ago a new enhancer solution, TBT-PAR, came to our attention. This solution contains trehalose, BSA, Tween-20 and a bit of Tris-HCl, and we have subsequently handed out trial aliquots to any of our lab users who are prepared to give it a go.

And how have we found it? Well, while it would be lovely to present a nice scientific study with replicates and statistical significance, showing the relative benefits of one additive over another, the more I thought about it, the more complicated it seemed – we already know that some additives work better for some loci, and other additives work better for others – so which loci should we try to amplify? Some additives also seem to work better for some taxa than for others, so how many taxa should be included? Even for a standard PCR, the amount of magnesium and the cycling conditions can also make so much difference – should these be tested too? Of course one would need more than one replicate per sample. Lastly, it’s not enough to just see bands on a gel to know whether the additive is successful; everyone who has put in their time at the gel-front will have either experienced or heard anecdotes about beautiful bright bands that simply refuse to sequence, and this is often blamed on PCR additives. Therefore each band should be Sanger sequenced in at least one direction. So, by now we’re talking about the sort of major project that eats up time and money, and recommendations resulting from which may only be applicable to one particular taxon for one particular locus. As you can probably tell, I quickly talked myself out of even starting.

However, that’s not quite the end of the story. What I have done is taken a set of Nepeta (catmint) DNAs from RBGE Honours student Sarah Carlton’s work. These were extracted from herbarium samples of various ages (from 1931 to 2012), and compounding the problems of degraded DNA in this sort of material, mints and their relatives are known to contain secondary compounds that can inhibit PCR. We had initially tried to amplify these samples for the nuclear ITS region using the CES additive, with low success (there were bands for 12 out of 24 samples, but full length good quality sequences were only generated for 3 of these). Using the PCR product from this initial amplification for a second round of ‘nested PCR’ gave us many more bands – but on sequencing it became clear that a proportion of these were due to contamination and not from Nepeta at all, which then cast doubt on other sequences generated the same way (perhaps they were Nepeta, but there was no way to be sure they were the right Nepeta). As a last-ditch effort, I repeated the reactions using TBT-PAR instead of CES. The PCR success rate shot up, and good quality sequences were generated for 18 of the samples (including one from 1931 and two from 1952).


PCR amplification of herbarium DNA using TBT-PAR additive

We routinely generate molecular data from a huge taxonomic array of organisms, including flowering plants, conifers, ferns, liverworts, hornworts, mosses, diatoms, the fungal and algal partners in lichens, rusts and other fungi. Using TBT-PAR I have obtained sequences from Gesneriaceae (for nuclear ITS and plastid spacer psbA-trnH) and Fossombroniaceae (for plastid gene rbcL) that did not amplify using CES, and have also heard promising noises from colleagues working on lichens, mosses and Begonia, who are now getting PCR products from DNA that was previously unamplifiable. On the other hand, for Castratella (Melastomataceae) it seems as if PCR success is similar whether CES or TBT-PAR is added.

So, my unscientific gut-feeling-for-now advice for sub-optimal plant DNA is currently: sling some TBT-PAR into the mix first time around, and if that fails, give CES a go – unless, of course, you have another methodology that works better for you…





See also:

A mixed message on PCR additives in Aneura