Taxon ID Trees for Validation

The Taxon ID Tree on BOLD is a useful tool to identify problem sequences. Seven cases are described below.

Figure: Taxon ID Tree with seven problematic scenarios highlighted

Case 1: False outgroup resulting from contamination

An outgroup may be caused by contamination, or it may be a real phenomenon resulting from a genetically distant taxon. The only way to tell whether an outgroup is the result of a contaminant is to compare its nucleotide sequence against the BOLD ID Engine database.

To run a sequence against the BOLD ID engine:

  1. In the Project Console, select View All Records
  2. Select the Process ID of the outgrouped record to open the Sequence Page
  3. In the nucleotide sequence box, select Species DB (refer to the section on BOLD ID engine for information on the other databases)

Figure: Sequence Page with the nucleotide sequence and database links

The Specimen Identification Request window will appear, showing the top similarity matches. When the top match is at 99% similarity or higher and does not agree with the taxonomic name provided, it usually indicates contamination.

Figure: Specimen Identification Request page. The numbers highlight useful tools on this page and are explained in detail in the table below.

Information available from the Specimen Identification Request results page

  1. Top Hit: Highlights the record with the most similar sequence.
  2. Tree Based Identification: Top matches are illustrated on an Identification Tree.
  3. Summary Scores: Graphical representation of the similarity scores for the top 100 matches, including taxonomy hierarchy and record statistics.
  4. List of Records: List of matching records organized by maximum sequence similarity.
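
The 99% rule described above can be written down as a small check. The following is a minimal Python sketch, not a BOLD feature: the `Match` class and the `suggests_contamination` function are hypothetical names, and the match values are assumed to have been copied by hand from the list of matching records on the results page.

```python
from dataclasses import dataclass


@dataclass
class Match:
    """One row copied from the list of matching records (hypothetical structure)."""
    taxon: str
    similarity: float  # percent similarity, 0-100


def suggests_contamination(record_taxon: str, matches: list[Match],
                           threshold: float = 99.0) -> bool:
    """Return True when the top hit reaches the threshold but carries a different name."""
    if not matches:
        return False  # no match at all: nothing can be concluded from this check
    # The top hit is the match with the highest similarity.
    top = max(matches, key=lambda m: m.similarity)
    same_name = top.taxon.strip().lower() == record_taxon.strip().lower()
    return top.similarity >= threshold and not same_name
```

A result of `True` only signals that the record deserves a closer look. The decision to tag a record as contaminated remains with the user.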

When contamination is suspected, users can annotate the record to flag it. To add an annotation to the affected record:

  1. Go back to the Sequence Page and, under the Annotation box, select Add Tags and Comments
  2. Select and add the Contaminated tag

Case 2: Real outgroup resulting from a genetically unrelated taxon

Real outgroups can sometimes appear on a tree. To determine whether an outgroup is real or a contaminant, the sequence needs to be compared against the BOLD ID Engine (refer to Case 1 for instructions on how to access the Identification Engine). If the outgroup represents a species new to BOLD, no matching record will be displayed. The sequence should then be searched against GenBank with BLAST, which can be done directly from BOLD.

When the BOLD ID Engine fails to find a match, click Blast Sequence on GenBank to go directly to the Standard Nucleotide BLAST on GenBank. If the top GenBank hit carries the same name as the record on the tree and its similarity is above 99%, the identification can be considered correct. The record is then a real outgroup and does not need to be tagged.

Figure: Specimen Identification Request without any valid match
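
The GenBank fallback described in this case can be summarized as a short decision function. This is a minimal Python sketch under stated assumptions: `genbank_fallback` is a hypothetical name, the number of BOLD matches and the top GenBank hit are assumed to be read off the result pages by hand, and the "review BOLD matches" branch is added here only to make the function complete.

```python
from typing import Optional


def genbank_fallback(record_taxon: str,
                     bold_match_count: int,
                     genbank_top: Optional[tuple[str, float]],
                     threshold: float = 99.0) -> str:
    """Decide what the GenBank result says about an outgroup with no BOLD match.

    genbank_top is a (taxon name, percent similarity) pair, or None if BLAST
    returned nothing.
    """
    if bold_match_count > 0:
        # BOLD found matches, so the Case 1 comparison applies instead.
        return "review BOLD matches"
    if genbank_top is None:
        return "unresolved"
    name, similarity = genbank_top
    same_name = name.strip().lower() == record_taxon.strip().lower()
    # The text asks for agreement at more than 99%, hence the strict comparison.
    if same_name and similarity > threshold:
        return "real outgroup"
    return "unresolved"
```

Only the "real outgroup" result corresponds to the situation where no tag is needed. Any other result calls for further examination of the record.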

Case 3: Single branch resulting from a unique record

Some species or haplotypes may appear as a single branch on the tree. It is important to check the identification of all single branches in a tree because these records cannot be compared with other records within the same cluster. The Barcode Index Numbers (BIN) database can be used to confirm an identification; if the sequence meets the requirements to be clustered into BINs, the record will have a BIN assigned. Refer to the Barcode Index Numbers (BIN) section in the Handbook.

To navigate to the BIN page:

  1. In the Project Console click View All Records
  2. Find the record you are interested in inspecting and select its associated BIN
  3. A new window for the BIN record will appear. Each BIN page contains information on the associated records including: distance summaries, taxonomy, collection locations, associated publications, specimen images, and sampling sites.

The data provided on the BIN page may help confirm the identity of single branch records on the tree, if other members of that species appear in other projects on BOLD. Where the correct identity of a single branch record cannot be confirmed right away, users should monitor the BIN page over time, since new specimens are continuously being added to BOLD and the contents of a BIN page can change.

Figure: BIN page for DVWE001-12, Dicrostonyx richardsoni
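
For larger projects, the records to check can be shortlisted before opening their BIN pages. The following is a minimal Python sketch, assuming the record list has been exported or transcribed by hand into (Process ID, species name, BIN) tuples. The function name `single_record_species` is hypothetical, and treating a species name that occurs once in the project as a single branch is only an approximation of what the tree shows.

```python
from collections import Counter
from typing import Optional


def single_record_species(
    records: list[tuple[str, str, Optional[str]]],
) -> list[tuple[str, str, bool]]:
    """List records whose species name occurs only once in the project.

    Each input tuple is (process_id, species_name, bin_id), with bin_id set to
    None when no BIN has been assigned. Each output tuple is
    (process_id, species_name, has_bin).
    """
    counts = Counter(species for _, species, _ in records)
    return [
        (process_id, species, bin_id is not None)
        for process_id, species, bin_id in records
        if counts[species] == 1
    ]
```

Records returned with `has_bin` set to `True` can be checked on their BIN page as described above. Those without a BIN have to be verified by other means.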

Case 4: Incomplete identification on a cluster

Some clusters on the tree may contain records that are identified to species and records that are not. It is possible to add full taxonomy to these records based on the tree and BOLD ID engine by sending a taxonomy update through the BOLD Submission Protocol.

Tips and Troubleshooting

When updating the taxonomy of a record based on the results from the identification engine, the Identified By field should be updated to "BOLD ID Engine". This informs other users that the identification was based on the record's nucleotide sequence without further examination of the voucher specimen, and that it should be reviewed by a taxonomic expert when possible. Further notes about taxonomic identifications can be added to the Taxonomy Notes and Identification Method fields.

Case 5: Single branch resulting from contamination or misidentification

When two or more records with the same species name appear on a tree in separate branches, it is often the result of contamination or misidentification. If a misidentification can be confirmed and the correct identification is known, it is recommended that the taxonomy be updated as soon as possible without tagging the record. If a misidentification is not certain or the correct name is unknown, the record should be tagged and re-examined in the future.

How to access record annotation:

  1. In the Project Console click View All Records
  2. Click the Process ID or Sample ID of the record to be tagged to open the Sequence or Specimen page, respectively.
  3. Click Add Tags and Comments
  4. Add the appropriate tag to the record. If the source of the issue is unknown, add both the Contaminated and Misidentified tags
  5. Add any additional comment or explanation as to why the record has been tagged

Figure: Tagging options available
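
The situation described at the start of this case, one species name spread over separate branches, can be spotted with a simple grouping. This is a minimal Python sketch, assuming each record has been given a cluster label by hand while reading the tree. The function name `species_in_multiple_clusters` and the labels are hypothetical.

```python
from collections import defaultdict


def species_in_multiple_clusters(
    assignments: list[tuple[str, str, str]],
) -> dict[str, list[str]]:
    """Map each species name found in more than one cluster to its cluster labels.

    Each input tuple is (process_id, species_name, cluster_label).
    """
    clusters_by_species: dict[str, set[str]] = defaultdict(set)
    for _, species, cluster in assignments:
        clusters_by_species[species].add(cluster)
    return {
        species: sorted(clusters)
        for species, clusters in clusters_by_species.items()
        if len(clusters) > 1
    }
```

Every species returned is a candidate for tagging or a taxonomy update. The check itself cannot tell contamination from misidentification.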

Case 6: Misidentified record in a cluster

Some species can be difficult to identify based solely on morphological characteristics. Sometimes Taxon ID Trees cluster together records that were believed to belong to two or more species. In certain cases this can be easily resolved by updating the taxonomy of the misidentified records. Refer to the section on Updating Specimen Data.

Tips and Troubleshooting

Before updating the taxonomy of any record in a project, it is important to check the sequence against the Identification Engine or BIN records (refer to the sections on the Identification Engine and BINs in this handbook) to ensure that the corrected name matches other records on BOLD.
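
Candidate records for this case can be shortlisted by comparing each record's name with the most common name in its cluster. The following is a minimal Python sketch, assuming cluster labels have been assigned by hand from the tree. The function name `minority_name_records` is hypothetical, and the most common name is not necessarily the correct one, which is why the check against the Identification Engine or BIN records is still needed.

```python
from collections import Counter, defaultdict


def minority_name_records(
    assignments: list[tuple[str, str, str]],
) -> list[tuple[str, str, str]]:
    """Find records whose name differs from the most common name in their cluster.

    Each input tuple is (process_id, species_name, cluster_label). Each output
    tuple is (process_id, current_name, most_common_name_in_cluster).
    """
    by_cluster: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for process_id, species, cluster in assignments:
        by_cluster[cluster].append((process_id, species))

    flagged = []
    for members in by_cluster.values():
        counts = Counter(species for _, species in members)
        if len(counts) < 2:
            continue  # every record in this cluster carries the same name
        # On a tie, most_common keeps the name that was encountered first.
        majority, _ = counts.most_common(1)[0]
        flagged.extend(
            (process_id, species, majority)
            for process_id, species in members
            if species != majority
        )
    return flagged
```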

Case 7: Image mismatch

A mismatched image occurs when an incorrect picture is associated with a record. It is recommended to always create a matching image library when building a tree to examine records for this possible issue. Refer to the section on Taxon ID Trees in this handbook for more instructions on how to build a tree with matching images.

When building the tree, choose "Matching Images and Spreadsheet" in the parameters window. Then from the Tree Result window choose the option to View Image List. Each branch on the tree will be automatically assigned a number that will correspond to a photo in the image library. See the screenshots below for an illustration.

Figure: Taxon ID Tree with the Sorex hoyi cluster highlighted.

Figure: Matching image library for the Sorex hoyi cluster, showing that image [32] for BIOUG MCHU-0043 is incorrect

To correct an image mismatch:

  1. Email the BOLD support team (support@boldsystems.org) to request deletion of the image, including the Sample ID and Process ID of the record with the incorrect image.
  2. All images associated with the record will be deleted.
  3. Re-upload the correct photos following the protocol described in the Image Submission section of this handbook.

If the image mismatch cannot be resolved immediately, add a tag to the image to inform other users that this issue has been acknowledged.

To add a tag on an image:

  1. Open the Specimen Page for the record in question.
  2. Under Photograph, click the "Add Tags and Comments" button.
  3. Select Edit Tags.
  4. Choose the Photo-Mixup tag.

Figure: Specimen Page with the photograph tagged as a Photo-Mixup
