Rare Cancer Explore (RaCE) is a dedicated database and analytical platform focused on rare cancers. It provides innovative tools to explore and visualize complex data related to these uncommon cancer types. Designed to be user-friendly, it enables clinicians, researchers, and biologists—even those without bioinformatics expertise—to comprehensively investigate the clinical significance and biological functions of rare cancer biomarkers. Accessible online, RaCE simplifies data analysis, allowing users to gain valuable insights and contribute to advancements in rare cancer research.
Number of datasets
69
Number of samples
5451
Number of cancer types
13
This database includes more than 50 different cancer datasets. These datasets encompass a wide range of cancer types, providing comprehensive data for research and analysis. Each dataset includes patient information, overall statistics, and outcomes.
This table provides an overview of the data used for the analysis. Select a dataset to view more information about it below.
This module provides the expression data for the selected cancer type and gene.
In this module, you are able to:
Please go through the generic guide on the Tutorial page for cancer type and gene selection.
Unlike other modules, this module allows you to select multiple genes for expression data visualization. There are two modes for plotting using the same gene list:
Use the dropdown menu to add genes, or type in a gene name to search for the gene(s) of interest. To remove a gene, press Backspace, or left-click a selected gene and press the Delete key.
Switch between the single and multiple gene mode.
Single gene expression data plot: This plot shows the expression data of the selected gene in different datasets. Within each dataset, the samples are sorted by expression from low to high.
Multiple gene expression data plot: This plot shows the expression data of the selected genes in different datasets. The genes appear in exactly the same order as the selection in (1).
Step 1: Select a cancer type
Number of cancer types
13
Step 2: Choose a gene to plot
Differentially Expressed Genes (DEG) Analysis provides a comprehensive view of gene expression changes between different groups of samples. In this module, one representative dataset is selected from each cancer type to perform DEG analysis.
In this module, you are able to:
adj.p, usually set to 0.05. The smaller the value, the more stringent the filter.
Terms with p.adjust < 0.05 will be shown.
This page provides Differentially Expressed Genes (DEG) Analysis.
Step 1: Select cancer type
Step 2: Select filters
Step 3: Show results
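The adj.p filter described above can be sketched as follows. This is a minimal illustration that assumes the Benjamini-Hochberg procedure; the page does not state which multiple-testing correction RaCE uses. The function names and gene names are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from the largest p-value to the smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def filter_deg(results, adj_p_cutoff=0.05):
    """Keep (gene, adjusted p) pairs whose adjusted p-value is below the cutoff."""
    adjusted = bh_adjust([p for _, p in results])
    return [(gene, adj) for (gene, _), adj in zip(results, adjusted)
            if adj < adj_p_cutoff]


# Hypothetical (gene, raw p-value) pairs, for illustration only.
example = [("GENE_A", 0.001), ("GENE_B", 0.02), ("GENE_C", 0.03), ("GENE_D", 0.5)]
significant = filter_deg(example)
```

Lowering `adj_p_cutoff` keeps fewer genes, which matches the note that a smaller value gives a more stringent filter.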
Survival analysis provides a comprehensive view of patient outcomes over time. This module allows you to explore the survival analysis of different cancer types and genes of interest using the Kaplan-Meier (KM) method or the Cox proportional hazards model.
In this module, you are able to:
Please go through the generic guide on the Tutorial page for cancer type and gene selection.
Step 1: Select a cancer type, survival analysis method, and grouping
Step 2: Choose a gene to plot
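For readers unfamiliar with the Kaplan-Meier method mentioned in this module, the estimator can be sketched in a few lines. This is an illustrative sketch only, not the code RaCE runs; the function name and the sample data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  -- follow-up time of each sample
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored at t
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
```

Each step of the curve multiplies the running survival probability by the fraction of at-risk samples that survive that time point. Censored samples leave the risk set without lowering the curve.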
The Cancer Cell Line Encyclopedia (CCLE) is a comprehensive resource that provides detailed genetic and pharmacological information on a wide array of human cancer cell lines. The project is hosted by the Broad Institute. Instead of archiving data of all cancer types, RCCLE in this database focuses only on selected rare cancer types and on the gene effect in different cancers and cell lines.
In this module, you are able to select your gene of interest and:
Step 1: Select a gene
Number of cancer types
12
Number of cell lines
168
Number of genes
17916
Somatic mutations are the driving force of cancer development. In this module, the top 15 most frequently mutated genes in each cancer type (according to the TCGA PanCancer Atlas) can be explored.
Samples of the selected cancer type are divided into high and low expression groups. The mutation frequency and tumor mutation burden (TMB) are calculated for each group.
In this module, you are able to:
Please go through the generic guide on the Tutorial page for cancer type and gene selection.
On the plot, each small tile represents a sample. A red tile indicates the presence of a mutation in the oncomarker.
The TMB difference between the two groups is tested with the Wilcoxon test.
Step 1: Select a cancer type
Number of cancer types
10
Step 2: Choose a gene to plot
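The TMB comparison between the high and low expression groups can be illustrated with a two-sample Wilcoxon rank-sum test. The sketch below is an assumption about the test variant: it uses the normal approximation with a tie correction and no continuity correction, so p-values may differ slightly from those of statistical packages. The function names and TMB values are hypothetical.

```python
import math


def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (U statistic of group_a, two-sided p-value by normal approximation).

    Not defined when every value is identical (the variance is zero).
    """
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    ranks = midranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    # Tie correction: sum of (t^3 - t) over each group of t tied values.
    counts = {}
    for r in ranks:
        counts[r] = counts.get(r, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_a - n_a * n_b / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_a, p_value


# Hypothetical TMB values for the low and high expression groups.
tmb_low = [1.0, 2.0, 3.0]
tmb_high = [4.0, 5.0, 6.0]
u_stat, p_value = rank_sum_test(tmb_low, tmb_high)
```

Because the test works on ranks rather than raw values, it does not assume that TMB is normally distributed.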
This module calculates the immune infiltration score of immune cells in the tumor microenvironment based on the selected cancer type and gene of interest.
In this module, you are able to:
Please go through the generic guide on the Tutorial page for cancer type and gene selection.
Step 1: Select a cancer type
Number of cancer types
13
Step 2: Choose a gene to plot
DNA methylation is a key epigenetic modification that regulates gene expression by altering chromatin structure, playing a critical role in cancer initiation, progression, and prognosis. Aberrant DNA methylation patterns are frequently observed in rare cancers, making them valuable biomarkers for understanding disease mechanisms and developing targeted therapies.
This module integrates DNA methylation data from the UCSC Xena platform. Data were downloaded using the UCSCXenaTools package and cover 9 rare cancer types from two major sources: TCGA and TARGET. The methylation profiles were generated using the Illumina Human Methylation 450K BeadChip. Probe annotation was performed using the HM450.hg38.manifest.gencode.v36.probeMap file from UCSC Xena.
In this module, you are able to:
Please go through the generic guide on the Tutorial page for cancer type and gene selection.
There are two major cutoffs in this module:
Step 1: Select a cancer type, survival analysis method, and grouping
Step 2: Choose a gene to plot
Cancer immunotherapy survival analysis provides a comprehensive view of patient outcomes over time across different cancer treatment cohorts, based on the selected cancer type and gene of interest.
In this module, you are able to:
The data used in this module are from different cancer cohorts. See the following table for details:
Cohort | Primary | Citation |
---|---|---|
Mariathasan | Bladder | Mariathasan S, Turley SJ, Nickles D, et al. TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature. 2018 Feb 22; 554(7693):544-548. doi: 10.1038/nature25501. PMID: 29443960; PMCID: PMC6028240. |
Braun | Kidney | Braun DA, Hou Y, Bakouny Z, et al. Interplay of somatic alterations and immune infiltration modulates response to PD-1 blockade in advanced clear cell renal cell carcinoma. Nature Medicine. 2020 Jun; 26(6):909-918. doi: 10.1038/s41591-020-0839-y. PMID: 32472114; PMCID: PMC7499153. |
Jung | Lung | Jung H, Kim HS, Kim JY, et al. DNA methylation loss promotes immune evasion of tumours with high mutation and copy number load. Nature Communications. 2019 Sep 19; 10(1):4278. doi: 10.1038/s41467-019-12159-9. PMID: 31537801; PMCID: PMC6753140. |
Liu | Melanoma | Liu D, Schilling B, Liu D, et al. Integrative molecular and clinical modeling of clinical outcomes to PD1 blockade in patients with metastatic melanoma. Nature Medicine. 2019 Dec; 25(12):1916-1927. doi: 10.1038/s41591-019-0654-5. PMID: 31792460; PMCID: PMC6898788. |
Padron | Pancreas | Padrón LJ, Maurer DM, O’Hara MH, et al. Sotigalimab and/or nivolumab with chemotherapy in first-line metastatic pancreatic cancer: clinical and immunologic analyses from the randomized phase 2 PRINCE trial. Nature Medicine. 2022 Jun; 28(6):1167-1177. doi: 10.1038/s41591-022-01829-9. PMID: 35662283; PMCID: PMC9205784. |
Snyder | Ureteral | Snyder A, Nathanson T, Funt SA, et al. Contribution of systemic and somatic factors to clinical response and resistance to PD-L1 blockade in urothelial cancer: An exploratory multi-omic analysis. PLoS Medicine. 2017 May 26; 14(5). doi: 10.1371/journal.pmed.1002309. PMID: 28552987; PMCID: PMC5446110. |
Please go through the generic guide on the Tutorial page for cancer type and gene selection.
Step 1: Select a cancer type, immunotherapy analysis method, and grouping
Step 2: Choose a gene to plot
GDSC: The Genomics of Drug Sensitivity in Cancer (GDSC) project is a database focused on cancer cell drug sensitivity and molecular markers of drug response. GDSC provides a unique resource that combines drug sensitivity and genomic datasets to aid in the discovery of new therapeutic biomarkers for cancer treatment. The cancer genomic mutation information in the database includes point mutations, gene amplifications and deletions, tissue types, and expression profiles, among others. The database has characterized 1,000 human cancer cell lines and screened them with over 100 compounds. https://www.cancerrxgene.org/
CTRP: The Cancer Therapeutics Response Portal (CTRP) links genetic, lineage, and other cellular features of cancer cell lines to small-molecule sensitivity with the goal of accelerating discovery of patient-matched cancer therapeutics. The CTRP is a living resource for the biomedical research community that can be mined to develop insights into small-molecule mechanisms of action and novel therapeutic hypotheses, and to support future discovery of drugs matched to patients based on predictive biomarkers. https://portals.broadinstitute.org/ctrp/
PRISM: Developed by the Broad Institute of MIT and Harvard, Profiling Relative Inhibition Simultaneously in Mixtures (PRISM) is a novel DNA barcoding technology that allows for rapid viability screening of more than 900 human cancer cell-line models in mixtures. These 900 cell lines represent more than 45 major lineages of cancer. https://www.theprismlab.org/
Test Set: A total of 13 cancer types were used, with one representative dataset selected for each cancer type as the test set.
Training Set: GDSC2, CTRP2, and PRISM
Analysis Method and Screening Criteria: Ridge regression analysis was performed using the oncoPredict package. For each cancer type, the expression levels of the target gene were correlated with the predicted scores for each drug using Spearman correlation analysis. Drugs with a correlation coefficient below -0.4 were selected, and the top 10 drugs were chosen by ranking the correlation coefficients from low to high.
Prediction Values: GDSC2 and CTRP2 provide IC50 values, while PRISM provides AUC values. Lower IC50 and AUC values indicate stronger targeting effects of the drug on the target gene.
Significance of IC50 and AUC Values:
IC50 (Half Maximal Inhibitory Concentration): IC50 refers to the concentration of a compound or drug required to inhibit a biological process or activity by 50% under certain conditions. It is commonly used to assess the biological activity of drugs. A lower IC50 value indicates a more potent drug, as it can inhibit the target biological molecule at lower concentrations.
AUC (Area Under the Dose-Response Curve): In drug sensitivity screens such as PRISM, AUC is the area under the curve of cell viability plotted against drug concentration. It summarizes the response over the whole tested concentration range rather than at a single concentration. In general, a lower AUC value indicates increased sensitivity of the cells to treatment. This is distinct from the pharmacokinetic AUC (area under the concentration-time curve), which reflects the total amount of drug entering the bloodstream and is used to assess bioavailability.
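The screening criteria described above (Spearman correlation below -0.4, then the top 10 drugs ranked from low to high) can be sketched as follows. This is a minimal illustration under stated assumptions, not the oncoPredict workflow itself: the function names, drug names, and values are hypothetical, and the predicted scores are taken as already computed.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    ordered = sorted(values)
    first_position = {}
    tie_count = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
        tie_count[value] = tie_count.get(value, 0) + 1
    return [first_position[v] + (tie_count[v] - 1) / 2 for v in values]


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks (inputs must vary)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return covariance / (spread_x * spread_y)


def select_drugs(expression, predicted_scores, cutoff=-0.4, top_n=10):
    """Keep drugs with correlation below the cutoff, lowest correlation first."""
    kept = []
    for drug, scores in predicted_scores.items():
        rho = spearman(expression, scores)
        if rho < cutoff:
            kept.append((drug, rho))
    kept.sort(key=lambda pair: pair[1])
    return kept[:top_n]


# Hypothetical gene expression and predicted drug scores for five samples.
expression = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_scores = {
    "drug_A": [5.0, 4.0, 3.0, 2.0, 1.0],
    "drug_B": [1.0, 2.0, 3.0, 4.0, 5.0],
    "drug_C": [5.0, 3.0, 4.0, 1.0, 2.0],
}
selected = select_drugs(expression, predicted_scores)
```

In this toy input, `drug_B` is positively correlated with expression and is dropped, while the two negatively correlated drugs are kept in order of increasing correlation.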
In this module, you are able to:
Please go through the generic guide on the Tutorial page for cancer type and gene selection.
There are two major cutoffs in this module:
Gene-drug response correlation cutoff. This cutoff is used to keep only drugs whose correlation coefficient is lower than the cutoff value. A negative correlation coefficient means that higher expression of the gene is associated with a lower predicted IC50 or AUC value, that is, with greater predicted sensitivity to the drug.
Top N drugs filter. Even with a correlation cutoff, many drugs may remain. This filter selects the top N drugs, ranked from the lowest correlation to the highest.
Other settings, including font size, color, etc.
This plot shows the correlation between the selected gene and the drug response score (GDSC or CTRP), or AUC value (PRISM).
This plot shows the drug response prediction for the selected gene in the low and high gene expression groups. The low-high groups are divided by the median expression of the selected gene.
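The median split behind the low and high groups can be sketched as below. This is a hedged illustration: the page does not say which group receives samples whose expression equals the median, so assigning them to the low group is an assumption, and the function name and values are hypothetical.

```python
from statistics import median


def split_by_median(expression, scores):
    """Split per-sample scores into (low, high) groups by median expression.

    Assumption: samples with expression equal to the median go to the low group.
    """
    cutoff = median(expression)
    low = [s for e, s in zip(expression, scores) if e <= cutoff]
    high = [s for e, s in zip(expression, scores) if e > cutoff]
    return low, high


# Hypothetical expression values and predicted drug response scores.
expression = [2.0, 8.0, 4.0, 6.0]
scores = [0.9, 0.3, 0.8, 0.4]
low_group, high_group = split_by_median(expression, scores)
```

The two returned lists hold the predicted drug response scores that would be compared between the low and high expression groups.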
Step 1: Select a cancer type
Number of cancer types
13
Step 2: Choose a gene to plot
This page provides a comprehensive guide on how to use the database.
Use the Data module to explore the available datasets and see which modules are supported for each dataset.
Cancer type and gene selection is a common feature in most modules.
Select a cancer type of interest.
Some additional selections may be available depending on the module.
Cancer statistics: please wait for a few seconds for data and statistics to load. A panel of statistics will be shown. Different modules may have different statistics.
Select a gene of interest. You must select a cancer type first. The available genes vary depending on the selected cancer type.
Most modules allow only one gene selection. Some modules may allow multiple gene selections. Please hover over the icon for more information.
By default, a gene is selected. You can change the gene selection by:
To add a gene:
Click this button to generate the results. After changing any option/selection above, you must click this button to update the results.
Many modules provide plot options to customize the plot.
Most modules provide at least one data table that contains the data used to generate the plots.
A CSV button is located at the bottom left corner of each data table. Clicking this button downloads the data table as a CSV file.
For bulk download, a separate Download module is provided for large data files. It can be accessed from the top navigation bar.
All plots can be viewed in full screen. Hover over a plot and a full screen icon will appear on the bottom right corner of the plot. Click the icon to view the plot in full screen. Please wait a few seconds for the plot to rerender and adjust to the full screen.
There is a dark mode available. You can switch between light and dark mode by clicking the icon on the top right corner.
In this database, one can download the core data from different datasets. This includes:
However, there are some exceptions where the data is not available for download:
Select a dataset you want and click the download button below the table.
Will be updated once the manuscript is accepted.