We will illustrate a simple search by searching for proteins that are targeted by the drug Aspirin.
- 1. Type Aspirin into the search box. Select Aspirin (Drug) from the Autosuggest drop-down menu.
- Important: Clicking on "Complete Profile" in the drop-down menu will retrieve all known relationships of Aspirin.
- 2. Type Protein into the search box and select Protein from the menu.
- 3. Now select "Targets Protein" as the relationship you are interested in.
- 4. Click Search. Aspirin’s protein targets will be presented in the Protein facet in the RESULTS tab.
NOTE 1: The search term can be a concept such as Gene, Protein, Disease or Drug, or a specific name such as BRCA1, MAPK or Sitagliptin.
NOTE 2: DistilBio interprets the queries as a graph where the nodes are the search terms and the selected relationship is the edge. The query graph can be viewed in the QUERY tab.
Link to Result
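The graph interpretation described in NOTE 2 can be pictured with a small data structure. The following Python sketch is illustrative only: the `QueryGraph` and `Edge` classes and the `add_term` method are hypothetical names, not part of DistilBio, and the sketch assumes each new term is linked to the term added just before it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Edge:
    """A selected relationship linking two search terms (hypothetical model)."""
    source: str
    relationship: str
    target: str


@dataclass
class QueryGraph:
    """Search terms are nodes; selected relationships are edges."""
    nodes: List[str] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def add_term(self, term: str, relationship: Optional[str] = None) -> None:
        # Assumption: a new term connects to the most recently added term.
        if self.nodes and relationship is not None:
            self.edges.append(Edge(self.nodes[-1], relationship, term))
        self.nodes.append(term)


# The simple search above: Aspirin (Drug) --Targets Protein--> Protein
query = QueryGraph()
query.add_term("Aspirin (Drug)")
query.add_term("Protein", relationship="Targets Protein")

for edge in query.edges:
    print(f"{edge.source} --{edge.relationship}--> {edge.target}")
```

Each further term added with a relationship would append one more node and one more edge, which is roughly what the QUERY tab visualizes.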
DistilBio allows the user to ask complex queries and discover new and interesting connections across the data.
A few complex queries are illustrated below:
- 1. What are the common drug targets between Aspirin and Acetaminophen?
- 1. Type “Aspirin” in the search box and select the <type> drug.
- 2. Extend the query by typing “protein” and selecting <type> protein from the dropdown. Select “targets protein” as the connection.
- 3. Type “acetaminophen” and select <type> drug. Select “is target of drug” as the connection.
Link to Result
- 2. What are the protein targets of the drug Sitagliptin and what are the proteins interacting with these targets?
- 1. Type “Sitagliptin” in the search box and select <type> drug.
- 2. Type “protein” and select <type> protein from the dropdown. Select “targets protein” as the connection.
- 3. Again type “protein” and select type protein. Select “interacts with” as the connection.
Link to Result
- 3. What are the compounds targeting the protein CP19A and the assays associated with the compound? Also find assays performed only in humans.
- 1. Type “CP19A” in the search box. The autosuggest displays CP19A entries from various organisms. Select “CP19A_human” from the drop-down.
- 2. Type “compound” and select the <type> compound. The connection is automatically selected since there is only one available relationship.
- 3. Type “assay” and select assay from the drop-down list.
- 4. To find assays performed only in humans, now type “human” and select <organism> human.
Link to Result
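The first two complex queries above can be thought of as set operations over (subject, relationship, object) triples. The sketch below is a rough Python illustration under stated assumptions: the triple layout, the function names and all drug and protein names are placeholders invented for the example, and it does not describe how DistilBio actually evaluates queries.

```python
# Toy triples (subject, relationship, object). All names are placeholders,
# not real DistilBio data.
TRIPLES = [
    ("drug_x", "targets protein", "protein_1"),
    ("drug_x", "targets protein", "protein_2"),
    ("drug_y", "targets protein", "protein_2"),
    ("drug_y", "targets protein", "protein_3"),
    ("protein_2", "interacts with", "protein_4"),
]


def follow(subjects, relationship, triples):
    """Return every object reachable from `subjects` via `relationship`."""
    return {o for s, r, o in triples if s in subjects and r == relationship}


def common_targets(drug_a, drug_b, triples):
    """Query 1: proteins targeted by both drugs (a set intersection)."""
    targets_a = follow({drug_a}, "targets protein", triples)
    targets_b = follow({drug_b}, "targets protein", triples)
    return targets_a & targets_b


def interactors_of_targets(drug, triples):
    """Query 2: proteins interacting with the drug's targets (two hops)."""
    targets = follow({drug}, "targets protein", triples)
    return follow(targets, "interacts with", triples)


print(common_targets("drug_x", "drug_y", TRIPLES))   # {'protein_2'}
print(interactors_of_targets("drug_x", TRIPLES))     # {'protein_4'}
```

Adding a further term to a query corresponds to one more `follow` step, and a restriction such as “human” corresponds to filtering the final set.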
DistilBio covers the following databases currently, and we're working on adding more.
|ALFRED*||Gene -> Mutation, Mutation Properties|
|BioGRID||Gene->Gene Interaction, Gene Interaction->Publication|
|CCLE||Drug->Screening Study, Gene->Mutation, Gene->Cell Line, Cell Line->Tissue, Cell Line->Disease, Cell Line->Mutation, Mutation->Tissue, Compound->Screening Study, Screening Study->Cell Line|
|Cell Image Library||Image->GO-Molecular function, Image->GO-Biological Process, Image->GO-Cellular Component, Image->Imaging Mode, Image->Organism, Image->Source Of Contrast, Image->Image Type, Image->Parameters Imaged, Image->Cell Type, Image->Publication, Cell Type->Image, Image->Visualization Method, Cross-Refs|
|CGP||Cell Line->Tissue, Cell Line->Disease, Cell Line->Gene, Cell Line->Publication, Mutation->Gene, Mutation->Cell Line, Mutation->Publication, Cross-Refs|
|Chebi||Compound->Patent, Compound->Publication, Compound->Compound Properties|
|Chembl||Activity->Publication, Activity->Assay, Assay->Publication, Assay->Protein, Assay->Organism, Compound->Protein, Compound->Assay, Compound->Activity, Compound->CellLine, Compound->Organism, Compound->Publication, Compound->Tissue, Compound->Compound Properties|
|ChemSpider||Cross-Refs for Compounds and Drugs|
|ClinicalTrials*||ClinicalTrial -> Drug, ClinicalTrial -> Disease, ClinicalTrial -> Agency|
|ClinVar*||Gene -> Disease|
|Cosmic||Cell Line->Publication, Cell Line->Tissue, Cell Line->Disease, Cell Line->Gene, Mutation->Publication, Mutation->Gene, Mutation->Cell Line|
|CTD||Drug->Disease, Drug->Publication, Disease->Publication, Cross-Refs|
|DailyMed*||Drug -> SideEffect|
|Drugbank||Drug->Protein, Drug->Drug Interactions, Drug->Patent, Drug->Publication, Protein->Publication, Drug->Drug Properties|
|eggNOG*||Protein -> Protein [orthologs], OrthologousGroup -> Protein, OrthologousGroup -> Properties|
|Entrez Gene||Gene->Gene, Gene->Disease, Gene->Organism, Gene->Publication, Gene->GO-Molecular Function, Gene->GO-Biological Process, Gene->GO-Cellular Component, Disease->Publication, Gene->Gene Properties, Cross-Refs|
|GeneOntology*||Protein -> GO-MolecularFunction, Protein -> GO-CellularComponent, Protein -> GO-BiologicalProcess, GO-BiologicalProcess -> Publication, GO-MolecularFunction -> Publication, GO-CellularComponent -> Publication, Gene -> GO-MolecularFunction, Gene -> GO-CellularComponent, Gene -> GO-BiologicalProcess|
|Gramene*||Gene -> Pathway, Gene -> AnatomicalEntity, Gene -> DevelopmentStage, Pathway -> BiochemicalReaction, Pathway -> GO-CellularComponent, Pathway -> Compound, BiochemicalReaction -> Component, BiochemicalReaction -> Publication, BiochemicalReaction -> Protein, BiochemicalReaction -> GO-CellularComponent, BiochemicalReaction -> Compound, Compound -> GO-CellularComponent, Organism -> Pathway|
|HomoMINT||Protein->Protein Interaction, Protein Interaction->Publication|
|ICGC||Mutation -> Gene, Disease -> Gene, Disease -> Patient, Patient -> Specimen, Specimen -> Specimen Properties, Mutation -> Mutation Properties, Patient -> Patient Properties, Cross-Refs|
|IntAct||Protein->Protein Interaction, Protein Interaction->Publication|
|InterPro||Protein->Publication, Protein Domain->Publication, Protein->GO-Cellular Component, Protein Domain->GO-Biological Process, Protein->Protein Domain, Protein->GO-Molecular Function, Protein Domain->GO-Cellular Component, Protein->GO-Biological Process, Protein Domain->GO-Molecular Function|
|Methdb||Gene -> GenomicRegion, GenomicRegion -> Experiment, GenomicRegion -> MethylationPattern, GenomicRegion -> Tissue, GenomicRegion -> MethylationProfile, GenomicRegion -> MethylationContent, Experiment -> Experiment Properties, GenomicRegion -> GenomicRegion Properties, MethylationContent -> MethylationContent Properties, MethylationPattern -> MethylationPattern Properties, MethylationProfile -> MethylationProfile Properties|
|MINT||Protein->Protein Interaction, Protein Interaction->Publication|
|NCBI Taxonomy||Organism Labels and Names|
|NCI||Protein->Disease, Protein->Drug, Gene->Drug, Gene->Publication, Gene->Disease, Disease->Publication, Protein->Publication, Drug->Publication|
|NCI-DTP||Screening Study->Cell Line, Compound->Cell Line, Drug->Cell Line, Drug->Screening Study, Compound->Screening Study, Disease->Publication, Drug->Publication, Drug->Disease, Cross-Refs|
|OMIM||Disease->Publication, Disease->Disease Properties|
|Orphanet*||Disease -> Gene, Disease -> Protein, Disease -> Properties|
|PharmGKB||Drug->Disease, Drug->Publication, Disease->Publication|
|PhosphoSite*||Protein -> Protein (Substrates)|
|Phytozome*||Gene -> GO-MolecularFunction, Gene -> GO-CellularComponent, Gene -> GO-BiologicalProcess, Chromosome -> Mutation, Chromosome -> CDS, Chromosome -> Gene, Chromosome -> 5'UTR, Chromosome -> 3'UTR, Chromosome -> mRNA, Chromosome -> Exon|
|PlantOntology*||Gene -> DevelopmentStage, Gene -> AnatomicalEntity|
|Reactome||Protein -> Protein, Protein -> Publication, Protein -> Pathway, Compound -> GO-CellularComponent, BiochemicalReaction -> Component, BiochemicalReaction -> GO-CellularComponent, BiochemicalReaction -> Publication, BiochemicalReaction -> Complex, BiochemicalReaction -> Organism, BiochemicalReaction -> Protein, BiochemicalReaction -> Compound, BiochemicalReaction -> GO-MolecularFunction, BiochemicalReaction -> PhysicalEntity, BiochemicalReaction -> Image, BiochemicalReaction -> GO-BiologicalProcess, PhysicalEntity -> GO-CellularComponent, BiochemicalReaction -> Catalyst, Pathway -> Organism, Pathway -> GO-BiologicalProcess, Pathway -> Image, Pathway -> BiochemicalReaction, Pathway -> Protein, Pathway -> GO-MolecularFunction, Pathway -> GO-CellularComponent, Pathway -> Publication, Pathway -> Compound, Pathway -> Complex, Pathway -> PhysicalEntity, Complex -> GO-CellularComponent|
|Sanger*||CellLine -> Disease, CellLine -> Publication, Disease -> Publication, Compound -> ScreeningStudy, Compound -> Publication, Drug -> ScreeningStudy, Drug -> Publication, ScreeningStudy -> CellLine, ScreeningStudy -> Publication, ScreeningStudy Properties|
|String||OrthologousGroup -> Protein, Protein -> Protein Orthologs|
|Swissprot||Protein->GO-Molecular Function, Protein->GO-Biological Process, Protein->GO-Cellular Component, Protein->Organism, Protein->Gene, Protein->Sites, Protein->Regions, Protein->Pathway, Protein->Protein Interaction, Protein->Disease, Protein->Publication, Protein->Protein Properties|
|TAIR*||Gene -> Publication, Gene -> GO-CellularComponent, Gene -> GO-BiologicalProcess, Protein -> Protein (interactions), Gene -> GO-MolecularFunction, Gene -> AnatomicalEntity, Gene -> DevelopmentStage, DevelopmentStage -> Publication, AnatomicalEntity -> Publication|
|Urinary Protein Biomarker Database*||Protein -> Disease[biomarker], Protein -> Publication, Disease -> Publication|
|VirusMINT||Protein->Protein Interaction, Protein Interaction->Publication|
|WikiPathways*||Pathway -> Organism, Pathway -> Protein, Pathway -> Gene, Pathway -> Compound|
* Newly added databases.
- AHFS code
- ATC code
- Brand name
- CAS number
- Disease indication
- Inchi key
- Half life
- Mechanism of action
- Molecular weight
- Protein binding
- Route of elimination
- Affected Organism Name
- Brand Mixtures
- Food Interactions
- Molecular Formula
- Volume Of Distribution
- Water Solubility
- Protein-Activity
- Protein-Calcium Binding Region
- Protein-Coiled-Coil Region
- Protein-Compositional Bias Region
- Protein-DNA Binding Region
- Protein-Glycosylation Site
- Protein-Intramembrane Region
- Protein-Lipid Binding Region
- Protein-Metal Binding Site
- Protein-Mutagenesis Site
- Protein-Nucleotide Binding Region
- Protein-Transmembrane Region
- Protein-Zinc Finger Region
- Sequence Length
- Sequence Mass
- Sequence Similarity
- Related Gene Name
- Related Pathway Description
- Related Polymorphism Description
- Related Disease Description
- Recommended FullName
- Alternative FullName
- Recommended ShortName
- Alternative ShortName
- Induction Description
- Developmental Stage Expression Description
- Subcellular Location Description
- Tissue-Specificity Description
- Subunit Structure Description
- Domain Description
- Cofactor Description
- Disruption Phenotype Description
- Enzyme Regulation Description
- RNA-Editing Description
- Catalytic Activity
- Redox Potential
- Pharmaceutical Use
- Biotechnological Use
- Allergen Property
- Toxic Dose
- Compound-Cell line
- Acidic pKa
- Basic pKa
- Canonical SMILES
- Chemical structure
- H-bond acceptor
- H-bond donor
- Inchi key
- Log D
- Log P
- Molecular formula
- Molecular species
- Molecular weight
- Standard Inchi
- Standard Inchi key
- Mutation-Cell line
- Somatic Status
- GRCh genome Position
- NCBI genome Position
- CDS Mutation
- Codon Change
- Amino Acid Mutation
- Validation platform
- Sequence Coverage
- Validation status
- Read count
- Tumour genotype
- Reference genome allele
- Consequence type
- Nucleotide change
Frequently Asked Questions
- 1. How do I get started?
You can start your search by simply typing a query in the search box. Your query could be a gene, protein, drug, compound or a biological process. As you type, the autosuggest will make relevant suggestions regarding the query term. You can add more terms to the query and select connections between the terms. For instance, you could select "Complete profile" from the autosuggest to view all the connections available for your term in DistilBio.
The query that you build appears in the query tab with each node representing a search term and the connections between the search terms. Results can be viewed by clicking on the result tab.
Take a tour by clicking on "Take a tour" on the homepage.
Examples can be found here
- 2. How complex can my query be?
Queries can range from simple keywords such as “aspirin” or “colon cancer” to complex ones such as proteins associated with colon cancer and the drugs targeting these proteins. In this case, the query would read as colon cancer > associated protein > is target of drug.
More examples can be found here
- 3. Can I get a tour of DistilBio?
Yes. Click the "Take a tour" button on the left side of the DistilBio homepage to get an idea of how to query and view results in DistilBio.
- 4. What is "Complete profile"?
"Complete profile" gives you a bird's-eye view of all the connections available for your search term in DistilBio.
- 5. What does the graph in the query tab represent?
The graph is a visual representation of your query with each node representing a search term and the connections between the search terms.
- 6. Can I use the results to make a new query?
Yes, you can select any of the results and extend your query by clicking the “Extend Search” button that appears on the facet.
- 7. How do I clear a query?
You can clear your query by clicking on the “clear” button in the search box.
- 8. Where can I find my recent searches?
You can find your recent searches in the panel to the left of the page marked “Recent Searches”. The last 10 searches will be listed in the panel.
- 9. How can I view the results?
You can view the results by clicking on the “Results” tab next to the query tab.
- 10. I have a list of terms and I want to run a search in DistilBio to find connections. What can I do?
You can now upload your list of terms by clicking on the "Upload list of terms" link below the search box. This will take you to a login page where you can sign up if you are new to DistilBio or log in if you already have a username and password. Once you log in, follow the instructions, fill in the details of your data and upload it to find all connections for your terms.
- 11. How does “Upload sequence” work in DistilBio?
Once you upload your sequence in the box, DistilBio runs a BLAST search of your sequence against SWISSPROT. Once the results are fetched from the server, a page lists the matching proteins. Select the protein(s) of interest to use them in your query.
- 1. Can I have a brief overview of the results page?
- The results are displayed in a faceted view with one facet per concept and a connecting link between the facets. Each facet shows the number of results fetched and a “Show details” link to view more details about the particular term.
- Clicking on the connecting link loads the source of the data in a new pop-up facet named “Source”.
- Clicking on "Show evidence" in the source facet loads curated references from the source databases in a new pop-up window.
- Results can also be exported by clicking on the “Export” button.
- 2. What is the "Table view" tab?
The "Table view" tab displays the results of your query in a table format.
- 3. How do I minimize/maximize the facets?
Facets can be minimized/maximized by clicking on the “+” at the top of the facet.
- 4. How do I select items listed in a facet?
To select an item in the facet click the check-box to the left of the item. To select all the items in a facet click on the check-box at the top left side of the facet.
- 5. What are the various ways of filtering the results?
There are two ways of filtering the results. First, the results can be selected and filtered to display only the instances and relationships that you are interested in. For instance, if a query is created for Aspirin – Protein – Disease, multiple target proteins and associated diseases are generally shown. Selecting a particular disease (using the check boxes) narrows the results to show only proteins connected with the selected disease.
You can also filter the results by the source. For instance, if you want to look at data for the Protein - Disease connection only from NCI, selecting NCI in the source box displays results from NCI.
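The two kinds of filtering described in this answer can be sketched in Python as simple list filters. This is only an illustration: the row layout, the function names and every value except the source name NCI are placeholders chosen for the example, not DistilBio's actual result format.

```python
# Toy result rows for a Protein - Disease connection.
# Protein and disease names are placeholders, not real DistilBio data.
ROWS = [
    {"protein": "protein_1", "disease": "disease_a", "source": "NCI"},
    {"protein": "protein_2", "disease": "disease_a", "source": "Swissprot"},
    {"protein": "protein_3", "disease": "disease_b", "source": "NCI"},
]


def filter_by_selection(rows, selected_diseases):
    """Keep only rows whose disease was ticked in the Disease facet."""
    return [row for row in rows if row["disease"] in selected_diseases]


def filter_by_source(rows, source):
    """Keep only rows contributed by the chosen source database."""
    return [row for row in rows if row["source"] == source]


selected = filter_by_selection(ROWS, {"disease_a"})
print([row["protein"] for row in selected])   # ['protein_1', 'protein_2']

from_nci = filter_by_source(ROWS, "NCI")
print([row["protein"] for row in from_nci])   # ['protein_1', 'protein_3']
```

The two filters are independent, so they can also be chained to narrow the results by both disease and source.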
- 6. I’d like to see the query graph in the results page. Where is it?
Click the “Query” Tab to view the query graph and also to modify your query.
- 7. Why can’t I find any data for my query?
DistilBio currently covers specific databases and the data that you are looking for may not be available in the data source. We are constantly adding new datasets. Please use the feedback form to let us know what you were looking for and also if you would like to add any databases. Thank you for taking the time to give us your feedback.
- 8. How do I see more results than the ones displayed in the facet?
If more data is available for your query DistilBio displays “Get more results” at the top of the result page. In queries with a single search term, you can click on the “More” button for each facet to view more results for the particular facet.
- 9. How do I find evidence for a triple?
You can find the evidence for a triple by clicking on the connecting link between two facets. This displays the source of the data in a new pop-up facet. Clicking on "Show evidence" displays the publication evidence for the connection in a new pop-up window. This is curated evidence from the source databases.
- 10. Why can’t I find publication evidence for some of the connections?
The publications displayed are curated evidence from the source databases. If no publications are displayed, either they are not available from the source database or we are in the process of integrating the data.
- 11. What are the various connections available?
Check all the connections here
- 12. Why don’t I see some of the results I found previously?
New versions of source databases are released and DistilBio updates data on a regular basis. Newer versions may have changes in data leading to new connections or loss of some connections.
- 13. How are the publications shown in the Evidence Card relevant?
The publications in the tab “Evidence for Connection” are curated publications that discuss a possible link between the 2 concepts that the user has queried for. These publications have been derived from the source databases from which the data has been integrated. The source database for a connection is mentioned in the box to the right of the page. The papers have not been curated by DistilBio.
- 14. Can I view the structure of a protein?
You can view the structure of a protein by going to the “Structure” tab in the right panel and selecting the structure you would like to view. This opens a new window with the structure that you have selected.
- 15. Can I export the results I have obtained?
You can export the results in CSV format by clicking on the “Export” button. This opens a new window from which you can copy the contents into a spreadsheet or text editor, or click on “Create download link” to create a downloadable link.
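Once exported, the CSV text can be processed with any standard CSV reader. The minimal Python sketch below uses the standard-library `csv` module; the column names (`drug`, `protein`, `source`) and the sample rows are assumptions made up for the example, since the actual export layout may differ.

```python
import csv
import io

# Hypothetical export contents; the real column names may differ.
EXPORTED = """drug,protein,source
drug_x,protein_1,source_a
drug_x,protein_2,source_b
"""


def load_export(text):
    """Parse exported CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_export(EXPORTED)
proteins = sorted({row["protein"] for row in rows})
print(proteins)  # ['protein_1', 'protein_2']
```

For a downloaded file, the same `csv.DictReader` call can be applied to a file opened with `open(path, newline="")` instead of the in-memory string.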
- 16. Can I save my searches?
Yes, you can save your searches by registering on DistilBio (registration is free). Your searches can be retrieved any time later.
- 1. What are the databases currently covered by DistilBio?
Check all the databases here
- 2. What are the relationships currently covered by DistilBio?
Check all the relationships here
- 3. How often do you update the data?
Release dates of our source databases vary, and we try to keep our data as up to date as possible.
- 4. Does DistilBio curate data?
Currently, DistilBio only curates data for the drug - disease connection from DrugBank.
- 5. What does “DistilBio inferred” mean?
In the Drug – Disease connection, DistilBio infers data by connecting the drug to its target proteins and the proteins to the diseases associated with them. This forms the “DistilBio inferred” connection for Drug – Disease via the protein targets. Similar results are available for Pathway-related genes, which are inferred via the proteins related to the pathway.
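The inference described in this answer amounts to a two-step join: drug to target proteins, then proteins to associated diseases. The Python sketch below illustrates that idea under stated assumptions; the function name, the dictionary layout and all drug, protein and disease names are placeholders, and it is not DistilBio's actual implementation.

```python
def infer_drug_disease(drug_targets, protein_diseases):
    """Link each drug to the diseases associated with its target proteins."""
    inferred = {}
    for drug, proteins in drug_targets.items():
        diseases = set()
        for protein in proteins:
            # Proteins with no known disease association contribute nothing.
            diseases |= protein_diseases.get(protein, set())
        inferred[drug] = diseases
    return inferred


# Placeholder data, not real DistilBio content.
DRUG_TARGETS = {"drug_x": {"protein_1", "protein_2"}}
PROTEIN_DISEASES = {
    "protein_1": {"disease_a"},
    "protein_2": {"disease_a", "disease_b"},
}

result = infer_drug_disease(DRUG_TARGETS, PROTEIN_DISEASES)
print(sorted(result["drug_x"]))  # ['disease_a', 'disease_b']
```

The same pattern would apply to Pathway-related genes, with pathway-to-protein and protein-to-gene mappings in place of the two dictionaries.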
- 6. What does the connection "associated" mean?
This is a "generic" connection and is used across several concepts. The "associated" connection type does not specify causality. "Associated" has been used to link concepts such as Gene-Disease, Gene-Pathway and Disease-Protein. For example, the gene BRCA2 is associated with the diseases breast cancer, prostate cancer, ovarian cancer etc.
- 7. What does the connection "related" mean?
This is a "generic" connection and is used across several concepts. In the Drug-Disease relationship, "related" refers to diseases where the drug may play a role either as a treatment for the disease or in the etiology of the disease, i.e., Drug X may be the cause of Disease Y. In the Drug-Protein connection, "related" refers to proteins that could be targets of the drug or may be associated with the drug in its metabolism or as a carrier, transporter etc. For example, the drug Ximelagatran has drug-induced abnormalities, embolism and thrombosis as related diseases: Ximelagatran causes drug-induced abnormalities, whereas it is used in the treatment of thrombosis. The drug Omeprazole has 25 related proteins, including the targets ATP4A and ATP4B and the cytochrome P450 enzymes (CYP1A1, CYP1A2, CYP2C9 etc.) that play a role in the metabolism of the drug.
- 8. Where can I find the source of the data?
Sources for the results are displayed at the left of the page in the box named “Source”. Selecting the protein/drug etc. displays its data sources, such as DrugBank or UniProt.
- 9. What if I find an error in the data?
Please use the feedback form to let us know the details of the error. Thank you for taking the time to inform us about the error.
- 10. Can I suggest a database to be included in DistilBio?
Yes, absolutely! Send us an email at firstname.lastname@example.org