Nucleic Acids Research Database

Organization

NAR Database Curation

1. Semi-automated identification of dormant, non-functioning databases

2. Communication with the owner and/or author to determine their commitment to continuing and updating the database

3. Curation to update current databases’ URLs, present recent activity, merge with other databases, or withdraw antiquated databases, depending on the owners’ responsiveness

4. Categorization of active, functioning databases into eight sections to facilitate navigation by grouping related databases

5. Establishment of intuitive interconnections between the databases based on their submitted uniform-format attributes
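The first curation step can be pictured with a minimal sketch. The criteria here are assumptions, since the slides do not say how dormancy is detected: an entry is flagged when its URL was unreachable in an earlier check or when its last known update is older than a chosen threshold. The class `DatabaseEntry`, its fields, and the two-year default are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatabaseEntry:
    name: str
    url_reachable: bool  # outcome of an earlier URL check (hypothetical field)
    last_updated: date   # last known update date (hypothetical field)


def flag_dormant(entries, today, max_age_days=730):
    """Return the names of entries that look dormant and need human follow-up."""
    flagged = []
    for entry in entries:
        age_days = (today - entry.last_updated).days
        # Assumed rule: unreachable URL OR no update within the threshold.
        if not entry.url_reachable or age_days > max_age_days:
            flagged.append(entry.name)
    return flagged
```

The flagged names would then go to step 2 (contacting the owner), which is why the process is described as semi-automated rather than fully automated.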

Cooperation: NAR with Bioinformatics Links Directory

1. Develop a community-driven public repository of databases, with contextual annotation and organization by functional relevance, to establish a biologist-friendly bioinformatics ‘resourceome’

2. Management improved by community efforts: suggest a (new) database link; submit a database; rate and review; comment and discuss; disseminate

3. Catalogue databases by function and feature under relevant, representative biological subjects; retrieve search results by matching the queried phrase to similar words in a link’s title, description or tags

4. Database links are also grouped and ranked by scholarly citation count and social media sharing count
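Points 3 and 4 can be sketched as a small search-and-rank routine. This is an illustration only, not how the Bioinformatics Links Directory is implemented: plain substring matching stands in for "similar word" matching, and the record fields and sample entries below are invented.

```python
def search_links(links, phrase):
    """Return links whose title, description or tags contain any query word,
    ranked by citation count and then by social sharing count."""
    words = phrase.lower().split()
    hits = []
    for link in links:
        text = " ".join([link["title"], link["description"], *link["tags"]]).lower()
        if any(word in text for word in words):
            hits.append(link)
    # Highest citation count first; sharing count breaks ties.
    return sorted(hits, key=lambda l: (l["citations"], l["shares"]), reverse=True)


# Invented sample entries, for illustration only.
example_links = [
    {"title": "Example enzyme DB", "description": "enzyme kinetics",
     "tags": ["enzyme"], "citations": 10, "shares": 5},
    {"title": "Example protein DB", "description": "protein sequences",
     "tags": ["protein", "enzyme"], "citations": 50, "shares": 1},
    {"title": "Example plant DB", "description": "plant genomes",
     "tags": ["plant"], "citations": 99, "shares": 9},
]

ranked = search_links(example_links, "Enzyme")
```

Here `ranked` holds the two entries that mention "enzyme", with the more frequently cited one first; the plant entry is excluded despite its higher counts because it does not match the query.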

Type of Database

Type I : Nucleic acid sequence, structure, and regulation

Identified database : GenBank (http://www.ncbi.nlm.nih.gov/genbank/)

Description : A database of nucleotide sequences obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects.

Why it is useful

1. Provides nucleotide sequences and their protein translations for various organisms.

2. Publicly available and free for anyone to use.

3. Minimal delay in accessing the latest information.

4. Exploring GenBank through Entrez keeps a history of your searches, lets you combine and modify previous searches, and gives access to other resources provided by NCBI. Entrez is an important integrated database provided by NCBI, with powerful tools for finding and exploring biotechnology information.
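The search history described in point 4 is also available programmatically through NCBI's E-utilities, where `usehistory=y` stores a search on the History server and a later query can refer to stored searches as `#1`, `#2`, and so on. The sketch below only builds the request URLs and sends nothing; the helper name `esearch_url` and the placeholder `WebEnv` value are assumptions made for illustration.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="nucleotide", web_env=None):
    """Build an ESearch URL that stores its result on the Entrez History server."""
    params = {"db": db, "term": term, "usehistory": "y"}
    if web_env is not None:
        # WebEnv identifies the history session returned by an earlier search.
        params["WebEnv"] = web_env
    return ESEARCH + "?" + urlencode(params)


# First search: nucleotide records for one organism.
first = esearch_url("Homo sapiens[Organism]")

# Later search: combine stored searches #1 and #2 within the same session.
# The WebEnv string is a placeholder; a real one comes from the first response.
combined = esearch_url("#1 AND #2", web_env="WEBENV_FROM_EARLIER_RESPONSE")
```

Fetching these URLs returns XML containing the `WebEnv` and `QueryKey` values needed for follow-up requests.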

Type II : Protein sequence and structure, motifs and domains.

Identified Database: UniProt (http://www.uniprot.org/)

Description : The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt Metagenomic and Environmental Sequences (UniMES) database is a repository specifically developed for metagenomic and environmental data.

Why it is useful

1. Central hub for the collection of functional information on proteins, with accurate, consistent and rich annotation

2. Avoids redundancy from multiple copies of the same protein within one database or across different databases by storing each unique sequence under a stable and unique identifier (UPI)
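The idea behind point 2 can be shown with a toy archive that stores each distinct sequence once and records every source that submitted it. This is a sketch of the general deduplication technique, not UniProt's implementation; the class name `SequenceArchive` and the `SEQ000001`-style identifiers are made up and do not follow the real UPI format.

```python
class SequenceArchive:
    """Toy non-redundant archive: one identifier per distinct sequence."""

    def __init__(self):
        self._id_by_sequence = {}
        self._sources_by_id = {}

    def add(self, sequence, source):
        """Register a sequence from a source and return its stable identifier."""
        key = sequence.strip().upper()
        if key not in self._id_by_sequence:
            # Hypothetical identifier scheme, assigned on first sight only.
            new_id = f"SEQ{len(self._id_by_sequence) + 1:06d}"
            self._id_by_sequence[key] = new_id
            self._sources_by_id[new_id] = []
        seq_id = self._id_by_sequence[key]
        self._sources_by_id[seq_id].append(source)
        return seq_id

    def sources(self, seq_id):
        """Return the sources that submitted the sequence with this identifier."""
        return list(self._sources_by_id[seq_id])
```

Adding the same sequence from two source databases returns the same identifier both times, so the sequence is stored once while both origins remain traceable.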

Type III : Metabolic and signalling pathways, enzymes.

Identified database : BRENDA (http://www.brenda-enzymes.org)

Description

A signalling pathway is a series of chemical reactions initiated by a stimulus (first messenger) acting on a receptor. The signal is transduced to the cell interior through second messengers (which amplify the initial signal) and ultimately to effector molecules, resulting in a cell response to the initial stimulus.

Various controlling factors regulate cellular actions in response to changing internal and external environments.

Why it is useful

The enzyme induces cleavage of gluten-derived peptides predigested by pepsin and pancreatic enzymes, exhibiting a detoxifying effect in the host's gut.

Type IV : Viruses, Bacteria, Protozoa and Fungi

Identified database- TrypanoCyc (www.metexplore.fr/trypanocyc)

Description- This database describes the generic and condition-specific metabolic network of Trypanosoma brucei (a parasitic protozoan), which is responsible for human and animal African trypanosomiasis (commonly known as sleeping sickness).

Why it is useful- This database gives information about the protein (acidin-pepsin), also known as glycine betaine:

1. Its molecular group

2. Its chemical formula

3. Molecular weight

4. Monoisotopic molecular weight

5. Metabolic reactions of this compound
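The difference between items 3 and 4 can be made concrete with a short calculation. The sketch assumes the formula C5H11NO2 for glycine betaine (not given in the slides) and uses rounded reference atomic masses: average masses for the molecular weight, and the masses of the most abundant isotopes for the monoisotopic weight. The function names are illustrative.

```python
import re

# Rounded reference values; only the elements needed here are listed.
AVERAGE_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def parse_formula(formula):
    """Parse a simple formula such as 'C5H11NO2' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def formula_mass(formula, table):
    """Sum the per-element masses from the given mass table."""
    return sum(table[el] * n for el, n in parse_formula(formula).items())


glycine_betaine = "C5H11NO2"  # assumed formula, not taken from the slides
molecular_weight = formula_mass(glycine_betaine, AVERAGE_MASS)
monoisotopic_weight = formula_mass(glycine_betaine, MONOISOTOPIC_MASS)
```

With these tables the molecular weight comes to about 117.15 and the monoisotopic weight to about 117.079. The second value is the one to compare against a mass spectrometry peak.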

Type V : Human genome, model organisms, comparative genomics.

Identified database : neXtProt (http://www.nextprot.org/)

Description : A database that provides various types of information on human proteins.

Why it is useful

1. It shows various types of information about a protein, including its definition, its function, and the processes in which the protein is involved.

2. Very easy to use.

3. Provides a lot of information.

Type VI : Genomic variation, diseases and drugs

Identified Database: CancerPPD (http://crdd.osdd.net/raghava/cancerppd/)

Why it is useful?

1. Peptide sequence search, which allows researchers to search a given peptide sequence against the sequences of all peptides available in CancerPPD

2. Simple search, which provides a basic facility to retrieve data from the database. It allows users to perform a keyword search on any field of the database

3. Complex query, which helps users who wish to perform complex searches to extract the desired information from CancerPPD
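The three search modes can be contrasted with a toy in-memory example. Nothing here reflects CancerPPD's actual schema or search engine: the field names, the records, and the matching rules (substring match for sequences, keyword match over every field, exact match on several fields combined with AND) are all invented for illustration.

```python
def sequence_search(records, query):
    """Peptide sequence search: records whose sequence contains the query."""
    q = query.upper()
    return [r for r in records if q in r["sequence"].upper()]


def simple_search(records, keyword):
    """Simple search: keyword match against any field of a record."""
    k = keyword.lower()
    return [r for r in records if any(k in str(v).lower() for v in r.values())]


def complex_query(records, conditions):
    """Complex query: every field/value condition must match (logical AND)."""
    return [
        r for r in records
        if all(str(r.get(f, "")).lower() == str(v).lower()
               for f, v in conditions.items())
    ]


# Invented records with hypothetical fields.
toy_records = [
    {"id": "P1", "sequence": "AKKLLKA", "tissue": "breast", "length": 7},
    {"id": "P2", "sequence": "GLFKKLL", "tissue": "lung", "length": 7},
    {"id": "P3", "sequence": "WWRRWW", "tissue": "breast", "length": 6},
]
```

For example, `sequence_search(toy_records, "kkll")` matches P1 and P2, `simple_search(toy_records, "breast")` matches P1 and P3, and `complex_query(toy_records, {"tissue": "breast", "length": 6})` narrows the result to P3.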

Type VII : Plant databases

Description

Designed to make comparative genomics data for plants available through a user-friendly web interface.

Structural and functional annotation, gene families, protein domains, phylogenetic trees and detailed information about genome organization can easily be queried and visualized.

Why it is useful

Pepsin is part of the cleavage process. Pepsin also cleaves nonheme iron, which is found in plant foods such as cereals, fruits and vegetables, from proteins to facilitate absorption.

Type VIII : Other Databases

Identified database- HAMAP (High-Quality Automated and Manual Annotation of Proteins) (http://hamap.expasy.org)

Description- This database provides automatic classification and annotation of protein sequences. It also allows precise annotation of individual functional variants within large homologous protein families.

Why it is useful- This database provides:

1. Determination of the family membership of a protein sequence

2. Determination of the annotation rule

3. Annotation of protein and gene names

4. Description of protein function, including catalytic activity