Handbook


Managing Data

BOLD Systems provides an intuitive interface for uploading, accessing, and managing data. BOLD is a project- and specimen-centric database that focuses on functionality for creating, modifying, and searching for records and projects. The User Console is the landing page once users log in; it provides an overview of the activities related to the user and gives quick access to frequently visited projects.

New functionality has been added to the system to improve users' data management capabilities. An improved search bar has been added to the Project Console and Record List to make records easier to reach, and new search options based on GPS polygons are now available. More information on the new features can be found in the sections below.


User Console

The User Console is BOLD's landing page when users log in. This console provides a quick overview of the user's data management activities. Features available in this console include rapid access to frequently visited projects, reports on project activity, and improved search tools.

Users can rapidly view activities associated with their projects through the real-time Activity Report and quickly navigate to recently accessed projects in the Projects You Recently Accessed box, which automatically updates when the User Console is opened. Users can also view and create new datasets right from this console. Both the user's private datasets and all the public ones are visible. Public datasets are sorted by dataset manager, so users can easily find a set of data by scanning the list of contributors.

Upload options are also provided here, so there is no need to open a specific project to upload images, trace files, or sequences. The system will match the Sample and Process IDs in the upload file and add the data to the correct records. New primers and publications can also be registered to the system from this console. These shortcuts allow users to upload data across multiple projects simultaneously.
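
The matching step described above can be pictured as a lookup keyed on Process ID. The following is a minimal sketch under stated assumptions, not BOLD's implementation: the `route_uploads` helper, the row format, the index structure, and the example identifiers are all hypothetical.

```python
from collections import defaultdict

def route_uploads(rows, project_by_process_id):
    """Group upload rows by the project that owns the matching record.

    rows: iterable of dicts with a "process_id" key (hypothetical upload format).
    project_by_process_id: dict mapping Process ID -> project code (hypothetical index).
    Returns (matched, unmatched), where matched maps project code -> list of rows.
    """
    matched = defaultdict(list)
    unmatched = []
    for row in rows:
        project = project_by_process_id.get(row["process_id"])
        if project is None:
            unmatched.append(row)   # no record with this Process ID
        else:
            matched[project].append(row)
    return dict(matched), unmatched

# Made-up identifiers, for illustration only.
index = {"ABCDE001-20": "ABCDE", "XYZ002-21": "XYZ"}
rows = [
    {"process_id": "ABCDE001-20", "file": "a.ab1"},
    {"process_id": "XYZ002-21", "file": "b.ab1"},
    {"process_id": "NOPE999-00", "file": "c.ab1"},
]
matched, unmatched = route_uploads(rows, index)
```

In this sketch a single upload touches two projects at once, and the row with no matching record is reported back rather than silently dropped.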

New Integrated Search Bar

The newest version of BOLD introduces an integrated search bar accessible from the User Console, Project Console, and Project List. Users no longer have to go back to the User Console to search for records, project codes, tags, or titles on BOLD. By selecting the Record Search from the Project Console or Record List, users can choose whether to search records within the project or within BOLD. For more information on the new search capabilities, see the Searching the Workbench section.

[Figure: Illustration of the User Console]

Definitions of objects on User Console
1. Project Search: Jump directly to a project by entering the code or title in the project search bar. If the code is not known, generate a short list of matching projects by entering a project tag or part of the project title.
2. Project Management
  • Full Project List: A complete list of projects the user has access to, along with all public projects.
  • New Project: Create a new container or project to store specimen and sequence data. See the Create a Project section for more details.
3. Your Data: A breakdown of all records accessible to the user, separated by barcoding campaigns.
4. Your Datasets: A list of the user's datasets as well as publicly accessible ones.
5. Collaborators: A list of frequent collaborators ranked by the number of shared projects.
6. Data Uploads: Upload sequences, traces, and/or images directly from the User Console without having to go into the specific projects. These options are available to all users who own projects or have editing access to records. Every user can upload primers and bibliographies.
7. Recently Accessed Projects: A list of recently accessed projects based on the user's history. Click the Project Code to jump straight into the selected project.
8. Recent Activities
  • Search for specific events by typing a user's name, action, date, or project into the search bar.
  • Click on the action name to get a detailed report of the event and the records affected by it. Users can search the records from this report.
  • Click on the Project Code to jump directly to the Project Console with its own activity feed.
  • Events highlighted in blue identify actions performed by the user.

Real-time Activity Reporting

The Activity Report can be viewed from the User Console and the Project Console, where it logs all the activities pertinent to the user and the project, respectively. This tool is useful for keeping track of collaborative work, as users can view the list of changes being made to their records. The Activity Report logs the addition of new specimens, images, traces, and sequences to the system, updates to specimen information, sequence deletions, and insertions of GenBank accessions. New tags, flags, and comments are also recorded, so project managers can stay informed of the actions taken in their projects. Logs can be downloaded, allowing users to keep personal records and perform additional analysis.

[Figure: Activity Log from the User Console]

[Figure: Event Details from the Recent Activity Report]


Searching the Workbench

The newest version of BOLD introduces an integrated search bar accessible from the User Console, Project Console, and Project List. Users no longer have to go back to the User Console to search for records, project codes, tags, or titles on BOLD. By selecting the Record Search from the Project Console or Record List, users can choose whether to search records within the project or within BOLD. To conduct a BOLD-wide search, users simply clear the "Search within the Project" check box.

The search form allows users to perform searches based on many parameters. These parameters can be used simultaneously to get an intersection of records (for example, searching a taxon and a list of Sample IDs will return only the records matching both).
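
The intersection behaviour can be illustrated with a small filter. This is only a sketch of the idea: the `search` function and the record dictionaries are hypothetical and do not reflect how BOLD stores or queries records.

```python
# Toy records; the field names are assumptions made for this example.
records = [
    {"sample_id": "S1", "taxon": "Aves"},
    {"sample_id": "S2", "taxon": "Aves"},
    {"sample_id": "S3", "taxon": "Insecta"},
]

def search(records, taxon=None, sample_ids=None):
    """Return the records that satisfy every criterion that was supplied."""
    hits = records
    if taxon is not None:
        hits = [r for r in hits if r["taxon"] == taxon]
    if sample_ids is not None:
        wanted = set(sample_ids)
        hits = [r for r in hits if r["sample_id"] in wanted]
    return hits

# S3 is in the ID list but is not Aves; S1 is Aves but is not in the ID list.
result = search(records, taxon="Aves", sample_ids=["S2", "S3"])
```

Each criterion narrows the previous result, so only the record that matches both the taxon and the identifier list remains.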

  • To search across all projects accessible to the user, select the Search Records button in the User Console or Project List. The form gives the option to include all public records on BOLD in addition to the records the user has access to.
  • To search within a merged set of projects, a single project, or a dataset, select the “Search Records” button from the Project Console.

[Figure: Workbench search form accessible from the User Console, Project Console, and Record List]

Field Definitions for Workbench Search Box
Taxonomy: Accepts specific taxonomic names (scientific names only). Multiple names can be space separated, but species names need to be enclosed in quotation marks ("").*
Geography: Accepts country and/or province names.*
Marker: Allows the user to restrict the records returned to a specific marker.*
Sequence Length: Allows the user to set a minimum and/or maximum bp limit by entering a number.
Tags: Accepts tags that appear on specimen or sequence records.
Depository: Accepts depository institution names.
Extra Info: Accepts terms that appear in the "Extra Info" field on specimen records.
Identifiers: Accepts Sample IDs, Process IDs, GenBank Accessions, or BIN URIs. Users can search multiple IDs by putting each on a new line or copying from a spreadsheet.
Collection Dates: Allows a date range to be specified.
Public Records: This option appears when searching from the User Console and should be checked if all public records should be included in the search.
GPS Polygons: Searches for records whose GPS coordinates fall within a user-defined polygon on the map. A polygon can consist of many points, but it must contain at least 3 points and form a closed loop. The results include all records that have GPS coordinates within the polygon and to which the user has access.
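
The GPS polygon test can be pictured with the standard ray-casting algorithm. The sketch below is an illustration only, not BOLD's search code: the `point_in_polygon` function is hypothetical, it treats latitude and longitude as planar coordinates, and it ignores polygons that cross the antimeridian.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test for a point against a polygon.

    polygon is a list of (lat, lon) vertices; the last vertex is joined
    back to the first, so the loop is closed implicitly.
    """
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least 3 points")
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside   # each crossing toggles the state
    return inside

# A 10 x 10 degree square used as the search area.
square = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
centre_hit = point_in_polygon(5.0, 5.0, square)
outside_hit = point_in_polygon(15.0, 5.0, square)
```

A record would be returned when its coordinates pass this test and the user has access to it.
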
*Notes on use of the Workbench record search:
  • Quotation marks should be used for multi-word terms. For example, entering Bos taurus will return different results than "Bos taurus".
  • Multiple taxa, geography, markers, or tags may be entered in the same search, but have to be space separated. For example, entering Canada "United States" Mexico into the Geography field will return results from all these locations.
  • Multiple criteria can be searched simultaneously to get the intersection of records. For example, one can search for Aves and "Costa Rica" at the same time.
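
The quoting rules above behave much like shell-style tokenizing. As a rough illustration (an assumption about the behaviour, not BOLD's actual parser), Python's standard `shlex.split` splits on spaces while keeping double-quoted phrases together; the `split_terms` wrapper is a hypothetical name.

```python
import shlex

def split_terms(field_text):
    """Split a search field on spaces, keeping double-quoted phrases whole."""
    return shlex.split(field_text)

geography = split_terms('Canada "United States" Mexico')
unquoted = split_terms('Bos taurus')
quoted = split_terms('"Bos taurus"')

print(geography)  # three terms, with United States kept as one term
print(unquoted)   # two separate terms: Bos and taurus
print(quoted)     # one term: Bos taurus
```

Note that `shlex` also gives special meaning to single quotes and backslashes, so this stand-in would need adjusting for terms that contain apostrophes.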

Create a Project

Once logged into BOLD, select Create New Project in the User Console. This opens the New Project Submission form, which must be filled in with the appropriate information to create a new project. Details on the information needed are presented below.

After a project is created, the Project Manager can modify the project properties and user permissions at any time by clicking on Modify Project Properties from within the Project Console. Only Project Code and Project Type cannot be changed after the project is created.

With the addition of datasets, BOLD has created a new type of permission to give project managers more control over the level of access users have in their projects. To allow users to add records to datasets, project managers need to ensure the "Add to Dataset" permission box is checked.

[Figure: Project Properties form, which is used to create and modify BOLD projects]

List of fields used in creating BOLD projects
Project Title: A descriptive name for the project. This title has to be unique on BOLD.
Project Code: A 3-5 letter code for quick reference to a project. This code needs to be unique on BOLD and cannot be edited after a project is created. A good approach is to use initials and 2 or 3 other letters as an acronym for the title.
Project Type: BOLD offers two types of projects: data projects, which contain specimen and sequence records; and folder projects, which contain other projects and are used to group various projects together, but cannot contain any records themselves.
Primary Marker: CO1-5P is the default primary marker. The others available are:
  • Ribulose-bisphosphate carboxylase (RbcL)
  • Maturase K (MatK)
  • Internal Transcribed Spacer (ITS)
  • 18S Ribosomal RNA (18S-3P)
  • 18S Ribosomal RNA (18S-V4)
  • Intergenic Spacer Region between trnH and psbA (trnH-psbA)
Supporting Marker(s): Supporting markers include other genes the users have decided to sequence for their specimens, in addition to the primary marker. To register a new marker in BOLD, or request more than 4 markers on one project, please contact the BOLD Support Team.
Campaign: Projects may be added to existing Barcoding Campaigns (e.g., Fauna of Germany) to group them with similar community projects.
Place in container: Projects may be added to existing user-created folder projects (e.g., "LAB B - Plant Barcoding Projects"). Only folder projects that the user has access to will be listed.
Tags: Project Tags are annotations that appear on the Project List and Project Console. BOLD offers several standard tags to help organize projects and manage large volumes of data. If a preexisting tag is not suitable, users can create their own tags by entering them one by one in the text box.
Project Description: A summary of the use and intention of the project. This description will be displayed on the Project Console.
Bounding Box: The bounding edges of the collection area covered by the project, given as 2 pairs of GPS coordinates for the top left and bottom right positions.
Project Access: Check this box to make the project and its records publicly visible to all BOLD users.
Project Manager: The person who creates a project is automatically the Project Manager and has full specimen and sequence access. By clicking on ‘Modify Project Properties’ in the Project Console, the Project Manager can change project details and add or remove other users.
Assign Users: Other BOLD users can be added to or removed from a project at any time by the Project Manager. Up to 20 users may be added to one project, and different levels of access are available depending on the amount of involvement for each user.
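
Two of the constraints in the list above (a 3-5 letter Project Code and at most 20 assigned users) are easy to check before submitting the form. The sketch below is a hypothetical pre-check, not part of BOLD: `validate_project` is an invented name, and uniqueness of the code or title can only be verified by BOLD itself.

```python
import re

PROJECT_CODE = re.compile(r"[A-Za-z]{3,5}")  # 3-5 letters, per the field list
MAX_USERS = 20                               # user limit stated in the field list

def validate_project(code, users):
    """Return a list of problems; an empty list means the inputs look acceptable."""
    problems = []
    if not PROJECT_CODE.fullmatch(code):
        problems.append("project code must be 3-5 letters")
    if len(users) > MAX_USERS:
        problems.append("at most 20 users can be assigned")
    return problems

ok = validate_project("LABPB", ["alice", "bob"])
too_short = validate_project("AB", [])
too_many = validate_project("ABC", [f"user{i}" for i in range(21)])
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with the form at once.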

Sequence Access:

  • Analyze Only - users can perform analysis on the data, but cannot view sequences and related information.
  • View & Download - users can view or download the sequence data as well as analyze them.
  • Edit Sequences - users can upload and manipulate trace files and sequences, view and download the sequences, and analyze them.
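
As described, each sequence access level includes the capabilities of the one before it. One way to picture this is an ordered enumeration; the sketch below is purely illustrative, and the `SequenceAccess` class and helper names are assumptions rather than anything exposed by BOLD.

```python
from enum import IntEnum

class SequenceAccess(IntEnum):
    """Sequence permission levels, ordered from least to most access."""
    ANALYZE_ONLY = 1
    VIEW_AND_DOWNLOAD = 2
    EDIT_SEQUENCES = 3

def can_analyze(level):
    # Every level allows analysis.
    return level >= SequenceAccess.ANALYZE_ONLY

def can_view(level):
    # Viewing and downloading start at the second level.
    return level >= SequenceAccess.VIEW_AND_DOWNLOAD

def can_edit(level):
    # Only the highest level may upload or manipulate sequences.
    return level >= SequenceAccess.EDIT_SEQUENCES

viewer = SequenceAccess.VIEW_AND_DOWNLOAD
```

Because the levels are ordered integers, a single comparison answers whether a given user may perform an action.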

Specimen Access:

  • Edit Specimens - users can edit sample identifiers, taxonomy, collection data, and images in the specimen record. This permission level is recommended for project managers, collectors, and taxonomists only.

Dataset Access:

  • Add to dataset - users can add records to a "virtual project" which can be used for analysis and publication purposes. For more information on datasets, see the Datasets and DOIs section.


Project Console

The Project Console presents the status of records in the project as well as an audit trail of the activities being performed. This console includes an overview of the number of specimens and sequences, counts of missing components in the records, a breakdown of the specimen depositories, and a list of users, with their emails, who have access to the project.

Most of the information in the Project Console is available to all users of the project, but access to certain tools is determined by the level of permission granted by the project managers to specific users. See the Create a Project section for more information on access permissions.

Only project managers have access to the Submit to GenBank and Modify Project Properties tools. Submit to GenBank allows project managers to send records directly from the BOLD project to GenBank. See the GenBank & BOLD Public data submissions section for more information on publishing records to GenBank. Modify Project Properties allows project managers to update the project title and description, add supplementary genetic markers and tags to the project, and add and remove users.

To view all records in the project, users can select View All Records. Alternatively, a subset of the project records can be searched using the Record Search menu at the top of the page.


[Figure: BOLD Project Console]


Record List

To access the Record List, click on View All Records in the Project Console or use the Record Search to find a set of records. The Record List provides access to the specimen and sequence data for each record. It also displays the sequence length for the record, any important "extra information" about the record submitted by the user, and the code for the BINs associated with each record. Select specific records to analyze or download using the checkboxes to the left of the records.

Ascending and Descending sorting

BOLD now offers the ability to sort Identification, Specimen Page, Sequence Page, Markers, Extra Info, and BINs in ascending or descending order. Click on the "up" or "down" arrows of the column of choice to reorganize the list.

Select a Sample ID under the Specimen Page column or a Process ID under the Sequence Page column to access the Specimen Data and Sequence Data for each record, respectively.

Once the Record List is opened, users can choose the number of records displayed by selecting from the menu in the upper right corner of the page.


Home Projects in the Project List

A new feature on BOLD v3.6 is the Home Project button on merged projects and on searched record lists. Clicking on this button adds a "Project" column to the list, which shows the project code associated with each record.

The project manager, or a user with full editing access, can move records from one project to another by selecting the appropriate records and clicking on Move Records to another Project, or add records to a dataset by clicking on Add Records to Dataset in the left hand side menu.

Flags on records:

  • Icons will appear next to a record to indicate the presence of certain data components.
  • A red-highlighted sequence length means that the sequence has more than 1% ambiguous characters. For COI, MatK, and RbcL sequences, this means they will not meet the Barcode Compliance standard.

[Figure: BOLD Record List]

List of BOLD Record icons
GPS: GPS coordinates present for sample.
Camera: Record contains images of the specimen.
Trace count: The number of traces present for the sample.
Compliant: Sequence is Barcode Compliant.
Stop: Stop codons present in sequence.
Contamination: Contamination present in sequence.
Flagged: Flagged record, filtered from the ID engine.

 

Notes on Barcode Compliance:

Barcode Compliance flags on BOLD are now applied to plant barcode markers (MatK, RbcL, RbcLa, trnH-psbA) and fungal barcode markers (COI, ITS, and ITS2) using the same standards applied to animal COI sequences. These standards, as set out by the Consortium for the Barcode of Life (CBOL), include a minimum sequence length of 500 bp, less than 1% ambiguous bases, the presence of two trace files, a minimum of low trace quality status, and the presence of a country specification in the record.
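
Most of these criteria are simple to check locally before uploading. The sketch below is an approximation for illustration, not BOLD's compliance check: the function names are hypothetical, gap characters are assumed to be excluded from the length and ambiguity counts, and the trace quality criterion is left out because it depends on trace data.

```python
UNAMBIGUOUS = set("ACGT")

def ambiguous_fraction(sequence):
    """Fraction of bases that are not A, C, G or T (gap characters '-' are ignored)."""
    bases = [b for b in sequence.upper() if b != "-"]
    if not bases:
        return 0.0
    return sum(b not in UNAMBIGUOUS for b in bases) / len(bases)

def is_barcode_compliant(sequence, trace_count, country):
    """Check length, ambiguity, trace count and country (trace quality is NOT checked)."""
    length = len(sequence.replace("-", ""))
    return (
        length >= 500                            # minimum sequence length of 500 bp
        and ambiguous_fraction(sequence) < 0.01  # less than 1% ambiguous bases
        and trace_count >= 2                     # two trace files present
        and bool(country)                        # country specified in the record
    )

clean = "ACGT" * 150             # 600 unambiguous bases
noisy = "N" * 10 + "ACGT" * 150  # 10 ambiguous bases out of 610
```

With these inputs, the clean sequence passes when it has two traces and a country, while the noisy one fails because its ambiguous fraction is above 1%.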


Specimen and Sequence Pages

Specimen Page

This window is opened by clicking on a Sample ID in the Record List. It highlights voucher information, taxonomy, collection data, and specimen details about the record. If a record has a BIN assignment, there will be a link to the specific BIN Page within the taxonomic section of the record. Specimen information can be edited by selecting Edit Specimen from the upper right corner of the screen. This button will only be visible to users who have editing access to the specimens.

An interactive Google map is also available. In this map users may zoom in to the exact collection site and, where available, select the street view of the area. This feature can be useful to gain an idea of the habitat conditions at the collection site. Collection locations are displayed on the map based on GPS coordinates only when these are provided in the submission.

Comments and annotations can be added to the Specimen Page in different forms. There are two buttons available on the form to add a tag or a comment to the record. Comments or tags can be displayed for the whole record or just for the image. By selecting the Add Tags and Comments button, users may choose to use one of the pre-created tags, create a new tag, or add a personal comment. More information on how and why to annotate a record can be found here.

[Figure: Specimen Data Page]

Clicking on the main image opens a zoomable copy. This larger image is useful for a more extensive taxonomic examination of the specimen: by zooming in to different parts of the organism, users can identify morphological features that may not be clear at the image's original size. This window also displays the licensing and attribution details for the image, including the email address of the license holder for those who may be interested in using the image.

[Figure: Image Viewer, accessible by clicking on the main image in the Specimen Record. Hovering over the image produces a zoomed view of that section of the photo.]

Sequence Page

The sequence page is opened by clicking on a Process ID from the Record List for users with sequence viewing permission. This page provides details on the trace file(s) and sequence(s) for the specimen. Where multiple markers were sequenced for the same specimen, individual sequence information can be accessed by using the tabs at the top of the screen. Sequence details include a sequence overview, the nucleotide and amino acid sequence, and an illustrative barcode.

Users can automatically run an identification request using the BOLD ID Engine by selecting one of the four databases available: Full DB, Species DB, Published DB, and Full Length DB.

Trace files can be viewed by selecting at least one trace file and clicking on the View Trace Files button. This will open a new window which displays information on the traces and quality scores as well as a graphical representation of the chromatograms.

BOLD currently supports a BOLD-LIMS report for samples processed at the Canadian Centre for DNA Barcoding (CCDB), which provides the user with a report on the progress of a sample, along with details on the protocol used.

If a publication is associated with a record, details on the publication title, authors, and source will be available on this page. By selecting the title of the publication, users will be provided with the details of the article, including publication date and abstract where the information is available. Publications are sequence dependent, which means they are directly associated with a specific marker.

Comments and annotations can be added to the Sequence Page and the Trace Viewer. By selecting the Add Tags and Comments button, users may choose to use one of the pre-created tags, create a new tag, or add a personal comment. More information on how and why to annotate a record can be found here.

Edit and Delete Sequences Manually

Nucleotide sequences can now be edited or deleted directly on this page by users with the appropriate permission level. Clicking on Clear Sequence removes the full nucleotide sequence from the record, while clicking on Edit Sequence lets users add or remove base pairs in a text box.

[Figure: Sequence Data Page]

Trace files can be viewed by selecting at least one check box in the Sequence Runs in the Sequence Page. Upon submission, it may take up to 24 hours for users to be able to download or view trace files.

By clicking on View Trace Files, a new window will open, which details each trace run along with a graphical representation of the trace's chromatogram. In cases where quality scores for the traces are not provided in the submission package, values will be automatically calculated by BOLD.

Summary statistics on the overall quality scores for each trace are included, along with a visual display of individual quality values for each peak. Users can scroll through the chromatograms in order to interpret the intensity and quality of the traces. Every trace is displayed in its original orientation and direction, so reverse traces are not displayed in the complementary alignment. By selecting View Sequence, users may examine the nucleotide sequence for each individual trace.

[Figure: Trace Viewing Page, accessed from the Sequence Page]
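
Because reverse traces are shown in their original orientation, comparing a reverse read with a forward read by eye requires reverse-complementing one of them. A minimal sketch of that operation follows; the `reverse_complement` helper is not a BOLD function, and it only maps A, C, G, T and N, leaving any other IUPAC ambiguity codes unchanged.

```python
# Translation table for the four bases plus N, in upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")

def reverse_complement(sequence):
    """Return the reverse complement of a nucleotide sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]

forward = "ATGCCN"
reverse_read = reverse_complement(forward)
round_trip = reverse_complement(reverse_read)
```

Applying the function twice returns the original sequence, which is a quick sanity check when lining up reads from both directions.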

Sequence Editor

The newest version of BOLD introduces a customized, built-in sequence editor. Users now have the ability to view and assemble trace files directly on BOLD. The Sequence Editor tool allows users to call and delete base pairs and then to upload the sequences directly into their records.

[Figure: Sequence Editor, accessible from the Sequence Page]


Project Summary Report

The Project Summary Report is found under the publication menu on the Project Console, Dataset Console, and the Record List. The report highlights the summary information for the project or dataset, including the day the report was generated, the number of specimens, and the number of sequences obtained. In addition, it includes a distribution map for the records in the project or dataset.

The report also lists basic data on the specimens, including provisional identifications, Sample ID, Museum ID, Process ID, storing institution, country/ocean, and state/province. It contains additional details on the presence of GPS points and the number of images in the record. Finally, it lists the GenBank accessions associated with a record, where they exist, and the number of traces uploaded.

BIN names added to the Project Summary

BIN names are now also included in the Summary Report to facilitate adding these identifications into publications. BIN names are listed for each record individually, so users can copy the information available on the table and add it directly to their publications.

[Figure: Project Summary Report Page]


Datasets and DOIs

Datasets are virtual copies of a set of records that users can create when they want to analyze a subset of records over an extended period of time, submit some but not all records in a project to GenBank, publish a subset of the data, or request a DOI for their dataset. By creating a dataset, a user can select records from a variety of projects on BOLD and group them together in one permanent list without having to move the records out of their home projects.

For instance, a researcher working on Lepidoptera may want to create a Dataset for one genus due to an upcoming publication on those records, but wishes to leave these records in their home projects. When adding records to a dataset, the actual records remain in the original projects where they can continue to be edited as necessary. Any changes to the actual records will appear simultaneously on the records in the dataset.

Creating a new Dataset

Once logged into BOLD, select the New Dataset button in the "Your Datasets" section of the User Console. By choosing to make the dataset public, the user is given the opportunity to request a DOI, now available via a partnership between BOLD and DataCite. The assigned DOI can be used in manuscripts in place of supplementary tables, and can be referenced when the dataset is used by others.

[Figure: The BOLD Dataset creation form with DOI request pop-up highlighted]

Fields in the BOLD Dataset properties form. (* denotes required fields)
Dataset Title*: A descriptive name for the scope of the Dataset.
Dataset Code*: A 4-8 character alphanumeric code that is unique in BOLD, for quick reference. (This cannot be modified after a Dataset is created.)
Dataset Description*: Summary of the use and intention of the Dataset. This description will be displayed on the Dataset Console.
Dataset Access: Makes the Dataset and records publicly visible to all BOLD users. Dataset managers can request a DOI to be registered in association with their datasets once these are made public.
Bounding Box: The bounding box of the collection area covered by the dataset, given as 2 pairs of GPS coordinates for the top left and bottom right positions.

Most of the information provided during the dataset creation can be edited by the dataset manager by clicking on Modify Project Properties from within the Dataset Console.

Adding and Removing records

Users can add any records they have access to into a dataset, in batches of up to 2,500 records at a time. Datasets can overlap; that is, the same record can be used in multiple Datasets by multiple users. A single Dataset can contain up to 25,000 records.
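
When a selection is larger than one batch, it has to be added in several passes. The sketch below shows one way to plan those passes using the limits stated above; `plan_batches` is a hypothetical helper, and it assumes every record in the selection counts toward the 25,000 limit.

```python
BATCH_SIZE = 2500      # records that can be added at one time
DATASET_LIMIT = 25000  # maximum records in a single Dataset

def plan_batches(record_ids, already_in_dataset=0):
    """Split record IDs into batches that respect the per-add and per-dataset limits."""
    if already_in_dataset + len(record_ids) > DATASET_LIMIT:
        raise ValueError("dataset would exceed 25,000 records")
    return [
        record_ids[start:start + BATCH_SIZE]
        for start in range(0, len(record_ids), BATCH_SIZE)
    ]

# 6,000 made-up identifiers need three passes: 2,500 + 2,500 + 1,000.
batches = plan_batches([f"ID{i}" for i in range(6000)])
batch_sizes = [len(b) for b in batches]
```

Checking the dataset limit up front avoids adding several batches only to have the last one rejected.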

To remove records from a Dataset, select them from the Record List and click Remove Records from the menu on the left hand side.


GenBank & BOLD Public data submissions

BOLD shares a tightly integrated data exchange pipeline with NCBI (GenBank) that allows for automatic submission of data to GenBank. Users are only required to fill in the author and publication information, which is sent to GenBank along with the specimen, sequence, and trace data (only for the COI gene), all transformed to the required formats. GenBank responds directly to the user with the accessions for their records, to be included in publications. Accessions are also sent to BOLD to ensure bidirectional linkage.

The data exchange pipeline is also used to send record updates to GenBank. Identifications of records submitted through BOLD to GenBank can still be refined and updated as new information is obtained. Changes to the taxonomy of BOLD records are automatically sent to GenBank so that the GenBank records stay up to date.

How to submit records from BOLD to GenBank

BOLD has a simplified submission tool for Project Managers to submit sequences along with trace files to GenBank.

Within the Project or Dataset Console, click on Submit to GenBank. Please note that only a project or dataset manager has access to this function. If only a subset of records needs to go to GenBank, the relevant records can be submitted via a Dataset.

After submission of the form, the project or dataset will be reviewed by the BOLD staff to make sure the data meet the requirements for GenBank. The accession numbers are returned by email from GenBank to the project manager, and are associated with the records on BOLD for quick reference in the Project Summary form. GenBank accessions will appear wherever the matching sequence appears on BOLD, and will provide direct links to the GenBank page for that sequence.

GenBank puts a default 1 year privacy period on records submitted through BOLD, where the records are deposited in GenBank but are still inaccessible to the public. This privacy period allows BOLD users to gain accessions early in the manuscript writing process and removes the need for rushing to gain accessions once the manuscript is in its final stages of acceptance by a journal.

[Figure: GenBank Submission Form with example data]

Steps after publication
  • When the manuscript is published, the user will need to make the BOLD Project or Dataset public so that the sequences are accessible to the general public. The project manager can do this by clicking on Modify Project Properties within the project or dataset and checking the box to “Make this project publicly visible”.
  • Submit a bibliography to the BOLD Publication Database following the directions in the Publication Submission Protocol. This process allows the user to associate the publication details with the records on BOLD, using the GenBank Accession numbers.
  • Records submitted to GenBank through BOLD are automatically held on GenBank for 1 year to allow time for publication. Should the article be published in less than a year, or should the authors want the sequences public on GenBank sooner, the corresponding author should contact GenBank directly to request public release. For more details on GenBank's policies, please visit their site at http://www.ncbi.nlm.nih.gov/genbank/.

Contact the BOLD support team through support@boldsystems.org with questions on any aspect of the publication process.
