What is the dbGuide database?

It is a resource containing validated guide RNA sequences that have been used in gene editing experiments in human and mouse cells. We continuously aggregate information from publications that use CRISPR/Cas9 technology, as well as data generated by our lab assessing editing efficiency by Illumina sequencing. In this help guide, we demonstrate how to use the database with the human BAP1 gene as an example.

How to properly fill in information:

Step 1: From the drop-down menu, select the desired species: human or mouse.
Step 2: Search using a HUGO gene symbol, an Ensembl transcript ID, an Ensembl gene ID, or genomic coordinates in BED format.

Below are the proper formats to search for the BAP1 gene:
  • HUGO gene symbol = BAP1
  • Ensembl transcript = ENST00000478368
  • Ensembl gene = ENSG00000163930
  • Genomic coordinates in BED format = chr3:52401008-52410008
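The four query formats listed above can be told apart by pattern. The following is a minimal Python sketch for checking a query before pasting it into the search box. It is not part of dbGuide: the function name `classify_query` and the regular expressions are assumptions inferred only from the BAP1 examples above (plus the standard ENSMUST/ENSMUSG prefixes for mouse Ensembl IDs), and the database itself may accept or reject other inputs.

```python
import re

# Patterns inferred from the example queries above. These are assumptions,
# not dbGuide's own validation rules.
# Human Ensembl IDs start with ENST/ENSG; mouse IDs start with ENSMUST/ENSMUSG.
TRANSCRIPT_RE = re.compile(r"^ENS(MUS)?T\d{11}$")
GENE_RE = re.compile(r"^ENS(MUS)?G\d{11}$")
COORD_RE = re.compile(r"^(chr[0-9A-Za-z_]+):(\d+)-(\d+)$")
SYMBOL_RE = re.compile(r"^[A-Za-z][A-Za-z0-9.\-]*$")


def classify_query(query):
    """Return a label for the kind of search term, or "invalid"."""
    q = query.strip()
    # Check the Ensembl patterns first, because an Ensembl ID would
    # otherwise also look like a plain gene symbol.
    if TRANSCRIPT_RE.match(q):
        return "ensembl_transcript"
    if GENE_RE.match(q):
        return "ensembl_gene"
    m = COORD_RE.match(q)
    if m:
        start, end = int(m.group(2)), int(m.group(3))
        # A usable interval needs the start to come before the end.
        return "coordinates" if start < end else "invalid"
    if SYMBOL_RE.match(q):
        return "gene_symbol"
    return "invalid"


for example in ("BAP1", "ENST00000478368", "ENSG00000163930",
                "chr3:52401008-52410008"):
    print(example, "->", classify_query(example))
```

Checking the Ensembl patterns before the gene-symbol pattern is a deliberate choice in this sketch, since the symbol pattern is the most permissive of the four.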

    Results Page

    To reach the results page shown below, we selected Human (hg38) in Step 1 and entered the HUGO gene symbol BAP1 in Step 2.

    As you scroll from left to right, you will see a title for each column. Each column has up and down arrows that you can use to sort the table by that column. Below is a description of each column in the results table.

    Column Name             Description
    Select                  Allows the user to select individual rows to export information using the Copy, CSV, View Data, and Design buttons at the top of the table.
    Gene-Symbol             HUGO gene symbol of the gene being searched.
    guide_rna               Guide RNA sequence.
    cirspr_system           Bacterial species and class of the CRISPR system.
    position                The sgRNA target site genomic location (including the PAM sequence). This is a clickable genomic coordinate that links to the UCSC Genome Browser.
    in_protein_coding_exon  True if the guide RNA targets a protein-coding exon.
    num_transcripts         Number of transcripts the guide RNA can target.
    transcript_id_list      A clickable list of Ensembl transcript IDs that link to the Ensembl database.
    sgrnascorer             Predicted activity using the sgRNA Scorer 2.0 algorithm (-3 to 3). The higher the value, the greater the predicted activity. PMID: 28146356
    rule_set_2              Predicted activity using the Rule Set 2 algorithm (0 to 1). The higher the value, the greater the predicted activity. PMID: 26780180
    guide_scan_off          GuideScan off-target score (0 to 1). The higher the value, the higher the specificity. PMID: 28263296
    forecast                Favored Outcomes of Repair Events at Cas9 targets (FORECasT) score for predicting mutational outcomes (in-frame indel %; lower values are preferred for knockouts). PMID: 30480667
    total_nhej_rate         Percentage of non-homologous end joining (NHEJ) mutation events observed from targeted sequencing. These events are filtered against SNPs/indels present in the control (untransfected) PCR amplicon.
    oof_rate                Percentage of out-of-frame (OOF) events among all NHEJ mutations observed from targeted sequencing.
    num_publications        Number of publications in which the selected guide RNA has been cited.
    ind_pmids               A list of PubMed IDs (PMIDs) for the publications in which the guide RNA has been cited.
    num_screens             Number of pooled screen publications in which the selected guide RNA was over- or under-represented.
    screen_pmids            A list of PMIDs for the pooled screen publications in which the guide RNA has been cited.
    sources                 The sources from which the sgRNAs were compiled. Please see Supplementary Table 1 of our publication for the full list of sources.
    status                  If the guide has been used in a publication or has quantitative editing data, its status is validated; otherwise it is design. Typing validated in the search box filters the table to validated guides only.
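Because oof_rate is expressed relative to NHEJ events rather than to all sequenced reads, the two editing columns can be combined into a single rough estimate of how often a guide produces a frameshift. The Python sketch below illustrates that arithmetic. It assumes that both columns are reported on a 0 to 100 percent scale and that total_nhej_rate is relative to all reads from targeted sequencing; the function name `overall_oof_percent` and the input values are hypothetical and are not taken from the database.

```python
def overall_oof_percent(total_nhej_rate, oof_rate):
    """Estimate the percentage of all reads carrying an out-of-frame mutation.

    Assumes both inputs are percentages on a 0-100 scale:
      total_nhej_rate: NHEJ mutation events as a percentage of reads
      oof_rate:        out-of-frame events as a percentage of NHEJ events
    """
    for name, value in (("total_nhej_rate", total_nhej_rate),
                        ("oof_rate", oof_rate)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    # Multiply the two fractions and express the result as a percentage.
    return total_nhej_rate * oof_rate / 100.0


# Hypothetical values for illustration only:
# 80% of reads edited, 75% of those edits out of frame.
print(overall_oof_percent(80.0, 75.0))  # 60.0
```

Under these assumptions, a guide with high total_nhej_rate but low oof_rate may still be a weaker knockout candidate than one with moderate values in both columns.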

    How to generate a CSV file:

    To make a CSV file, first select the guide RNA sequences you want. For this demonstration, we select the first 5 guides. Once the desired guides are selected, click the CSV button, indicated by the blue arrow in the image below. This generates a downloadable CSV file containing the selected guide RNAs.
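Once downloaded, the CSV file can be filtered further outside the browser. The Python sketch below keeps only validated guides with measured editing data and ranks them by total_nhej_rate. It assumes that the export has a header row using the same column names as the results table (guide_rna, status, total_nhej_rate), which is not confirmed here; the function name `validated_guides` and the sample rows are placeholders made up for illustration, not real database entries.

```python
import csv
import io


def validated_guides(csv_file, min_nhej=0.0):
    """Return validated rows with total_nhej_rate >= min_nhej, best first.

    csv_file is any open text file or file-like object. Column names are
    assumed to match the results table; adjust them if the export differs.
    """
    kept = []
    for row in csv.DictReader(csv_file):
        if (row.get("status") or "").strip().lower() != "validated":
            continue
        raw = (row.get("total_nhej_rate") or "").strip()
        try:
            nhej = float(raw)
        except ValueError:
            # Guides validated only by publication have no numeric value
            # here, so this sketch skips them.
            continue
        if nhej >= min_nhej:
            kept.append((nhej, row))
    # Sort on the numeric rate only, highest first.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in kept]


# Placeholder data standing in for a downloaded export.
SAMPLE = """guide_rna,status,total_nhej_rate
GUIDE_A,validated,42.5
GUIDE_B,design,
GUIDE_C,validated,87.0
GUIDE_D,validated,
"""

rows = validated_guides(io.StringIO(SAMPLE), min_nhej=10.0)
print([r["guide_rna"] for r in rows])  # ['GUIDE_C', 'GUIDE_A']
```

To run this on a real export, pass an open file instead of the in-memory sample, for example `open("dbguide_export.csv", newline="")`, where the file name is hypothetical.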

    How to view data for the selected guide RNAs:

    For this demonstration, we first sorted the table using the up and down toggle arrows in the "total_nhej_rate" column, indicated by the black arrow. Next, we selected all the guide RNAs with listed NHEJ data. Finally, we clicked the View Data button, indicated by the blue arrow. The data table opens below the results table, as shown in the image below.

    How to Design Oligos for the Desired Guide RNA Sequences:

    Finally, once you have decided which guide RNAs you want to use, select them and click the "Design" button to have oligos designed for you. This provides two documents: a PDF and a text file.

    The Text File:
    Below is an image of the text file generated for the guide RNAs we selected from the results table. Once you decide how you would like to conduct your experiments, you can order oligos by simply entering the sequences from this file into IDT. The file also lists the guide RNA sequences used and their genomic locations in BED format.

    The Generated PDF File:

    The attached PDF is an example for the 9 selected guide RNAs that we used.
    The PDF contains a methods section describing how to complete the experiments, along with catalog numbers for the required reagents.

    Download the generated PDF file, which contains the quick protocol, here.

    How to cite

    If you are using a published guide RNA for your experiments, please cite BOTH the original source publication and our database.

    If you are using a guide RNA that was not previously published but for which we had data, please cite our database.

    Gooden AA, Evans CN, Sheets TP, Clapp ME, Chari R (2020). dbGuide: A database of functionally validated guide RNAs for genome editing in human and mouse cells. bioRxiv

