Categories: All - integration - resources - redundancy - tools

by xien low, 10 years ago

Assignment 2: NAR database

Databases are crucial in the field of genomics as they store vast amounts of genome data, which can be used to enhance human health and disease treatment. By preventing resource loss and ensuring data is current, these databases support comparative genomics across diverse organisms.

SQBU3723 (01) Biocomputation and Bioinformatics
Assignment 2: Nucleic Acids Research (NAR) Database
Date: 6-3-2014
Group 7
CHAN ZHI XIN A11QB0054
LAI CHONG PAU A11QB0073
LOO FONG FONG A11QB0014
LOW HUI XIEN A11QB0082
NG NAN SEE A11QB0001

Nucleic Acids Research (NAR) Database

Organisation of databases and their major groupings

Cell Biology
Eg. NCBI Bookshelf
Eg. MethylomeDB
Eg. ExoCarta
Eg. CloneDB
Immunological Databases
Eg. The Immune Epitope Database (IEDB)
Eg. Protegen
Eg. Epitome
Eg. AntigenDB
Plant Databases
Eg. Chloroplast Genome Database
Other plants
Rice
Arabidopsis thaliana
General plant databases

Eg. MetaCrop

Eg. GeneFarm

Organelle Databases
Eg. Plant Organelles Databases
Eg. Organelle genomes
Eg. Organelle DB
Mitochondrial genes and proteins

Eg. MitoGenesisDB

Eg. MitoDrome

Eg. Human MtDB

Other Molecular Biology Databases
Eg. PubMed
Eg. BioModels
Molecular probes and primers

RTPrimerDB

probeBase

PrimerBank

Human OligoGenome Resource

Drugs and drug design

Eg. SuperNatural

Eg. BacMet

Proteomics Databases
Eg. PeptideAtlas
Eg. 2D-PAGE
Microarray Data and other Gene Expression Databases
Eg. GeneTrap
Eg. GeneNote
Eg. Gene Expression Barcode
Eg. CycleBase
Human Genes and Diseases
Gene-, system-, disease-specific databases
Cancer gene databases
General polymorphism databases
General human genetics databases
Human and other Vertebrate Genomes
Human ORFs

Eg. PReMod

Eg. PlasmID

Eg. GeneSpeed

Human genome databases, maps and viewers

Eg. GeneLoc

Eg. GeneAnnot

Model organisms, comparative genomics

Eg. TreeFam

Eg. Mouse Phenome Database

Eg. Animal Genome Size Database

Metabolic and Signaling Pathways
Signalling pathways

Eg. SPIKE

Eg. Quorumpeps

Eg. NetworKIN

Protein-protein interaction

Eg. SynSysNet

Eg. VirusMINT

Eg. GeneNet

Eg. EndoNet

Metabolic pathways

Eg. Reactome

Eg. Bionemo

Enzymes and enzyme nomenclature

Eg. MultiTaskDB

Eg. FunTree

Eg. NMPDR - National Microbial Pathogen Data Resource

Genomics Databases (non-vertebrate)
Invertebrate genome databases

Eg. NEMBASE

Fungal genome databases

Eg. YEASTRACT

Eg. YeastNet

Unicellular eukaryotes genome databases

Eg. TBestDB

Eg. Comparasite

Prokaryotic genome databases

Eg. MicroScope

Eg. AlterORF

Viral genome databases

Eg. HIV Drug Resistance Database

Eg. HepSeq

General genomics databases

Eg. GenoList

Eg. BacMap

Taxonomy and identification

Eg. MetaRef

Eg. GeneTrees

Genome annotation terms, ontologies and nomenclature

Eg. MetaBase

Eg. IUPAC Nomenclature database

Eg. BioThesaurus

Structural Databases
Protein structure

Eg. SitesBase

Eg. IDEAL

Eg. DSDBASE

Nucleic acid structure

Eg. RNA FRABASE

Eg. 3DNALandscapes

Carbohydrates

Eg. Monosaccharide Browser

Eg. GlycoMapsDB

Small molecules

Eg. SuperToxic

Eg. SuperDrug

Eg. DrugBank

Eg. ChemBank

Protein Sequence Databases
Databases of individual protein families

Eg. SuperCYP

Eg. TransportDB

Eg. Histone Database

Eg. BACTIBASE

Protein domain databases; protein classification

Eg. OrthoDB

Eg. FunShift

Protein sequence motifs and active sites

Eg. PHOSIDA

Eg. eBLOCKS

Protein localization and targeting

Eg. PeroxisomeDB

Eg. CentrosomeDB

Eg. REFOLD

Eg. BindingDB

General sequence databases

Eg. UniProt

Eg. Patome

RNA Sequence Databases
The Small Subunit rRNA Modification Database
RNA Modification Database
Ribosomal Database Project (RDP-II)
HIV Sequence Database
Database for Bacterial Group II Introns
5S Ribosomal RNA Database
3D rRNA modification maps
16S and 23S Ribosomal RNA Mutation Database
Nucleotide Sequence Databases
Transcriptional regulator sites and transcription factors

Eg. TFClass

Eg. QuadBase

Gene structure, introns and exons, splice sites

Eg. Spliceosome Database

Eg. GeneTack

Coding and non-coding DNA

Eg. TranspoGene

Eg. Plant repeat database

Eg. MethDB

International Nucleotide Sequence Database Collaboration

Eg. The Sequence Read Archive (SRA)

Eg. NCBI Biosample/ BioProject

Eg. GenBank

Eg. DDBJ-DNA Data Bank of Japan

Why are some databases dropped from the collection?

Limited budgets
Have taken the commercial route and no longer offer a free version
Redundancy
Obsolete and non-responsive

Why are databases created and shared?

To ensure the available data are up to date
As tools for genome analysis
To prevent loss of resources
For integration and cross-reference
Comparative genomics purposes between diverse organisms and species
Emphasis on genome data to improve human health and disease treatment

Why do we need to group the databases?

To update the changes (deletion and insertion of databases)
To exploit the related resources or specific information easily and effectively
To track databases, methods and tools in specific field
To utilise the databases effectively when necessary
To narrow down the search results
To manage an increasing amount of information in a systematic way
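
The grouping idea above can be illustrated with a minimal sketch. It is only an assumption about how one might model the collection, not how NAR stores it: a nested mapping from major group to subcategory to database names, using a few entries copied from the outline. The names `NAR_EXCERPT`, `databases_in` and `find_database` are hypothetical.

```python
from typing import Dict, List, Tuple

# A small excerpt of the collection, copied from the outline:
# major group -> subcategory -> example databases.
NAR_EXCERPT: Dict[str, Dict[str, List[str]]] = {
    "Structural Databases": {
        "Small molecules": ["SuperToxic", "SuperDrug", "DrugBank", "ChemBank"],
        "Carbohydrates": ["Monosaccharide Browser", "GlycoMapsDB"],
    },
    "Genomics Databases (non-vertebrate)": {
        "Fungal genome databases": ["YEASTRACT", "YeastNet"],
        "Prokaryotic genome databases": ["MicroScope", "AlterORF"],
    },
}


def databases_in(group: str, subcategory: str) -> List[str]:
    """Return the databases filed under one group and subcategory."""
    return NAR_EXCERPT.get(group, {}).get(subcategory, [])


def find_database(name: str) -> List[Tuple[str, str]]:
    """Return every (group, subcategory) pair under which a database is filed."""
    wanted = name.lower()
    return [
        (group, sub)
        for group, subs in NAR_EXCERPT.items()
        for sub, names in subs.items()
        if wanted in (n.lower() for n in names)
    ]
```

Looking up one subcategory narrows the search to a handful of databases, and `find_database("drugbank")` traces a database back to its place in the hierarchy.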

Criteria for selection into NAR databases

Consideration of so-called ‘boutique’ databases, covering relatively narrow topics
Avoid accepting new EST databases, particularly those dealing with individual species
as these data already have a home in the DDBJ, GenBank and European Nucleotide Archive databases
Avoid accepting databases on gene expression
as the underlying data must be submitted to ArrayExpress or GEO
Supplement with convenient search tools and easy-to-use visualization
Providing a convenient one-stop source of disparate data not available elsewhere
Data warehouses, portals, cross-platform search tools and visualization tools
Freely available online as well as selected databases published elsewhere
Web-accessible databases that offer carefully curated data that are not available elsewhere
Degree of value added (usually in the form of manual curation) in the production of the database
Comprehensiveness of coverage
The general utility of the database to the scientific community

Number of databases available

online molecular biology database collection
1552 online databases

41 subcategories

15 major groups

Different types of databases useful for searching information on haemoglobin

Protein sequence databases
Protein properties

BindingDB

1. From the full search results for haemoglobin, various haemoglobin-related assays are available.

2. It provides a deep search for various detailed data related to those assays.

3. For example, this database curates measured binding affinities for the association of haemoglobin with various small organic molecules, along with their respective inhibitory effects.

General sequence database

NCBI Protein database

1. NCBI provides a global cross-database search.

2. The search results for haemoglobin show the various data types available, neatly grouped into categories such as literature, organisms, proteins, chemicals and pathways.

3. In the proteins category, searching for haemoglobin returns approximately 32656 records related to sequences of various haemoglobins, 898 records related to structures, 167 records associated with sequence similarity-based protein clusters, and 16 records related to conserved protein domains.
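
The record counts above were read from the NCBI web interface. As a hedged sketch of how a similar count could be obtained programmatically, the code below swaps in the NCBI E-utilities `esearch` endpoint. The assignment does not use it, and the helper names `build_esearch_url` and `parse_count` are hypothetical. The sample response is made up for illustration, and live counts will differ from the 2014 figures.

```python
import json
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db: str, term: str) -> str:
    """Build an ESearch URL asking for a JSON reply for one database and term."""
    params = {"db": db, "term": term, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def parse_count(response_text: str) -> int:
    """Extract the total number of matching records from an ESearch JSON reply."""
    return int(json.loads(response_text)["esearchresult"]["count"])


# Illustrative only: a hand-written reply in the shape ESearch returns,
# not real data from NCBI.
sample_reply = '{"esearchresult": {"count": "42", "idlist": []}}'
url = build_esearch_url("protein", "haemoglobin")
count = parse_count(sample_reply)
```

To query the live service, the text fetched from `url` (for example with `urllib.request.urlopen`) would be passed to `parse_count` in place of `sample_reply`.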