Nucleic Acids Research Database
Organization
NAR Database Curation
Semi-automated identification of dormant, non-functioning databases
Communication with the owner and/or author to determine their commitment to continuing and updating the database
Curation to update current databases’ URLs, present recent activity, merge with other databases, or withdraw antiquated databases, depending on the owners’ responsiveness
Categorization of active, functioning databases into eight sections to facilitate navigation by grouping related databases
Establishment of intuitive interconnections between the databases based on their submitted uniform-format attributes
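The first curation step, flagging dormant databases for follow-up, might be sketched as below. This is only an illustration, not NAR's actual procedure: the record fields (`http_status`, `last_updated`), the two-year idle threshold and the example entries are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DatabaseRecord:
    name: str
    http_status: Optional[int]    # None: the URL could not be reached at all
    last_updated: Optional[date]  # None: no update date is known


def classify(record: DatabaseRecord, today: date, max_idle_days: int = 730) -> str:
    """Return 'unreachable', 'dormant' or 'active' for one database record."""
    if record.http_status is None or record.http_status >= 400:
        return "unreachable"
    if record.last_updated is None:
        return "dormant"
    if (today - record.last_updated).days > max_idle_days:
        return "dormant"
    return "active"


# Hypothetical entries, for illustration only.
records = [
    DatabaseRecord("ExampleDB-A", 200, date(2015, 3, 1)),
    DatabaseRecord("ExampleDB-B", 200, date(2009, 6, 15)),
    DatabaseRecord("ExampleDB-C", 404, None),
]
today = date(2016, 1, 1)

# Anything not 'active' goes to a curator, who then contacts the owner.
flagged = [r.name for r in records if classify(r, today) != "active"]
print(flagged)
```

The automated part only produces a shortlist; the decision to update, merge or withdraw a database still depends on the owner's reply, which is why the process is semi-automated.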
Cooperation: NAR with Bioinformatics Links Directory
Develop a community-driven public repository of databases, with contextual annotation and organization by functional relevance, to establish a biologist-friendly bioinformatics ‘resourceome’
Management improved by community efforts: suggest a (new) database link; submit a database; rate and review; comment and discuss; disseminate
Cataloguing databases by function and feature by categorizing them under relevant, representative biological subjects; search results are retrieved by matching the queried phrase to similar words in a link’s title, description or tags
Database links are also grouped and ranked by scholarly citation count and social media sharing count
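The search and ranking behaviour described in the last two points could look roughly like this. It is a minimal sketch under stated assumptions: the `Link` fields, the case-insensitive substring match (a simplification of "similar word" matching) and the ordering by citations first and shares second are all hypothetical, and the example links and counts are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Link:
    title: str
    description: str
    tags: List[str] = field(default_factory=list)
    citations: int = 0  # scholarly citation count
    shares: int = 0     # social media sharing count


def matches(link: Link, query: str) -> bool:
    """True if the query occurs in the title, description or any tag."""
    q = query.lower()
    haystack = [link.title, link.description, *link.tags]
    return any(q in text.lower() for text in haystack)


def search(links: List[Link], query: str) -> List[Link]:
    """Return matching links, most cited first, ties broken by shares."""
    hits = [link for link in links if matches(link, query)]
    return sorted(hits, key=lambda l: (l.citations, l.shares), reverse=True)


# Hypothetical directory entries with invented counts.
links = [
    Link("Protein Atlas Example", "Annotated protein sequences",
         ["protein", "annotation"], citations=120, shares=10),
    Link("Genome Browser Example", "Genomes with protein-coding gene tracks",
         ["genome"], citations=300, shares=5),
    Link("Enzyme Catalogue Example", "Enzyme kinetics data",
         ["enzyme"], citations=50, shares=40),
]

for hit in search(links, "protein"):
    print(hit.title, hit.citations, hit.shares)
```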
Type of Database
Type I : Nucleic acid sequence, structure, and regulation
Identified database : GenBank (http://www.ncbi.nlm.nih.gov/genbank/)
Description : A database of nucleotide sequences obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects.
Why it is useful
Provides nucleotide sequences and their protein translations for various organisms.
Publicly available and free for anyone to use.
Minimal delay in accessing the latest information.
Exploring GenBank through Entrez maintains a history of your searches and allows you to combine and modify previous searches and to look at other resources provided by NCBI. Entrez is an important integrated database search system provided by NCBI, and it provides powerful tools to find and explore biotechnology information.
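The search history feature is also exposed programmatically through NCBI's E-utilities, where the `usehistory`, `WebEnv` and `term` parameters of ESearch play the role of the saved searches. The sketch below only builds the request URLs and does not contact NCBI; the search terms are examples, and the `WebEnv` value is a placeholder for the string the server would return after the first search.

```python
from urllib.parse import urlencode

# ESearch endpoint of the NCBI E-utilities.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="nuccore", webenv=None):
    """Build an ESearch URL that stores its result on the history server."""
    params = {"db": db, "term": term, "usehistory": "y"}
    if webenv is not None:
        # Reuse an existing history session so earlier searches can be referenced.
        params["WebEnv"] = webenv
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Two independent searches; each would be stored as #1 and #2 in the history.
first = esearch_url("pepsin")
second = esearch_url("Homo sapiens[Organism]", webenv="WEBENV_FROM_SERVER")

# Combine the two stored searches by their history numbers.
combined = esearch_url("#1 AND #2", webenv="WEBENV_FROM_SERVER")

print(first)
print(combined)
```

In real use the placeholder would be replaced by the `WebEnv` string found in the response to the first request, so that all three searches share one history session.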
Type II : Protein sequence and structure, motifs and domains
Identified database : UniProt (http://www.uniprot.org/)
Description : The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt Metagenomic and Environmental Sequences (UniMES) database is a repository specifically developed for metagenomic and environmental data.
Why it is useful
1. Central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation
2. Avoids redundancy from multiple copies of the same protein, within one database or across different databases, by storing each unique protein sequence under a stable and unique identifier (UPI)
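The idea behind the second point, one stable identifier per unique sequence regardless of how many sources report it, can be illustrated with a small sketch. This is not UniProt's implementation: the `SequenceArchive` class, the `SEQ000001`-style identifiers and the source names are hypothetical and stand in for the real UPI assignment.

```python
class SequenceArchive:
    """Toy non-redundant archive: each distinct sequence is stored once."""

    def __init__(self):
        self._id_by_sequence = {}  # sequence -> identifier
        self._sources_by_id = {}   # identifier -> sources that reported it

    def add(self, sequence, source):
        """Register a sequence from a source and return its stable identifier."""
        seq = sequence.upper()
        if seq not in self._id_by_sequence:
            # First time this sequence is seen: issue a new identifier.
            new_id = "SEQ{:06d}".format(len(self._id_by_sequence) + 1)
            self._id_by_sequence[seq] = new_id
            self._sources_by_id[new_id] = []
        seq_id = self._id_by_sequence[seq]
        if source not in self._sources_by_id[seq_id]:
            self._sources_by_id[seq_id].append(source)
        return seq_id

    def sources(self, seq_id):
        return list(self._sources_by_id[seq_id])

    def __len__(self):
        return len(self._id_by_sequence)


archive = SequenceArchive()
id_a = archive.add("MKTAYIAKQR", "source_A")  # hypothetical sequence and sources
id_b = archive.add("mktayiakqr", "source_B")  # same sequence from another source
id_c = archive.add("MSLLTEVETY", "source_A")  # a different sequence

print(id_a == id_b, len(archive), archive.sources(id_a))
```

The same sequence submitted twice maps to one entry that simply lists both sources, which is how duplicate copies are avoided.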
Type III : Metabolic and signalling pathways, enzymes
Identified database : BRENDA (http://www.brenda-enzymes.org)
Description
A signalling pathway is a series of chemical reactions initiated by a stimulus (first messenger) acting on a receptor; the signal is transduced to the cell interior through second messengers (which amplify the initial signal) and ultimately to effector molecules, resulting in a cell response to the initial stimulus.
Various controlling factors are involved in regulating cellular actions, which are affected by changing internal and external environments.
Why it is useful
The enzyme induces cleavage of gluten-derived peptides predigested by pepsin and pancreatic enzymes, exhibiting a detoxifying effect in the host's gut.
Type IV : Viruses, Bacteria, Protozoa and Fungi
Identified database : TrypanoCyc (www.metexplore.fr/trypanocyc)
Description : This database describes the generic and condition-specific metabolic network of Trypanosoma brucei, a parasitic protozoan responsible for human and animal African trypanosomiasis (commonly known as sleeping sickness).
Why it is useful : This database gives the following information about the compound acidin-pepsin (also known as glycine betaine):
Its molecular group
Its chemical formula
Molecular weight
Monoisotopic molecular weight
Metabolic reactions of this compound
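The difference between the two weights listed above can be shown with a short calculation. This is a worked illustration, not output from TrypanoCyc: the formula C5H11NO2 for glycine betaine and the atomic masses are standard chemistry values supplied here as assumptions, not figures copied from the database entry.

```python
# Average atomic masses (weighted over natural isotope abundance).
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Masses of the most abundant isotope of each element (12C, 1H, 14N, 16O).
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of glycine betaine, C5H11NO2.
GLYCINE_BETAINE = {"C": 5, "H": 11, "N": 1, "O": 2}


def formula_mass(composition, mass_table):
    """Sum the atomic masses over all atoms in the formula."""
    return sum(mass_table[element] * count for element, count in composition.items())


molecular_weight = formula_mass(GLYCINE_BETAINE, AVERAGE_MASS)
monoisotopic_weight = formula_mass(GLYCINE_BETAINE, MONOISOTOPIC_MASS)

print("Molecular weight:    {:.2f}".format(molecular_weight))     # 117.15
print("Monoisotopic weight: {:.4f}".format(monoisotopic_weight))  # 117.0790
```

The molecular weight averages over all naturally occurring isotopes, while the monoisotopic weight uses only the most abundant isotope of each element, which is the value relevant when matching peaks in mass spectrometry.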
Type V : Human genome, model organisms, comparative genomics
Identified database : neXtProt (http://www.nextprot.org/)
Description : A database that provides various types of information on human proteins.
Why it is useful
It shows various types of information about a protein, including its definition, its function, and the processes in which the protein is involved.
Very easy to use.
Provides a large amount of information.
Type VI : Genomic variation, diseases and drugs
Identified database : CancerPPD (http://crdd.osdd.net/raghava/cancerppd/)
Why it is useful
1. Peptide sequence search, which allows researchers to search a given peptide sequence against the sequences of all peptides available in CancerPPD
2. Simple search, which provides a basic facility for retrieving data from the database and allows users to perform a keyword search on any field
3. Complex query, which helps users who wish to perform a complex search to extract the desired information from CancerPPD
Type VII : Plant databases
Identified database : PLAZA 3.0 (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/a)
Description
Designed to make comparative genomics data for plants available through a user-friendly web interface.
Structural and functional annotation, gene families, protein domains, phylogenetic trees and detailed information about genome organization can easily be queried and visualized.
Why it is useful
Pepsin is part of the cleavage process. Pepsin also cleaves nonheme iron, which is found in plant foods such as cereals, fruits and vegetables, from protein to facilitate absorption.
Type VIII : Other Databases
Identified database : HAMAP (High-quality Automated and Manual Annotation of Proteins) (http://hamap.expasy.org)
Description : This database provides automatic classification and annotation of protein sequences. It also allows precise annotation of individual functional variants within large homologous protein families.
Why it is useful : This database provides:
Determination of the family membership of a protein sequence
Determination of the applicable annotation rule
Annotation of protein and gene names
Description of protein function, including catalytic activity