
FTP Directory Structure

Introduction

The Ensembl FTP site provides biological sequence databases suitable for large-scale local sequence similarity search approaches, as well as MySQL table dumps of all underlying Ensembl databases. These table dumps are suitable for import into relational database management systems and allow installation of complete Ensembl mirror sites.

Please note: Ensembl supports downloading of many correlation tables via the highly customisable BioMart data mining tool. You may find exploring this web-based data mining tool easier than extracting information from our normalised database dumps.

The URL ftp://ftp.ensembl.org/pub/ is the root of the directory structure outlined below. The structure is also described in the FTP site README. The latest data sets are available via directories prefixed with 'current_'. For example, 'current_embl' always points to the latest data release files in EMBL format.
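
The 'current_' naming convention can be illustrated with a short Python sketch. It only builds the URL string and does not contact the server. The helper name current_dir_url is hypothetical, and the trailing slash on the result is an assumption made for readability.

```python
# Base URL of the Ensembl FTP site, as given in the text above.
BASE_URL = "ftp://ftp.ensembl.org/pub/"


def current_dir_url(data_type: str, base_url: str = BASE_URL) -> str:
    """Build the URL of the 'current_' directory for a data type.

    data_type is a top-level directory name such as 'embl'.
    This helper is illustrative only; it is not part of any Ensembl tool.
    """
    # Normalise the base so exactly one slash separates the parts.
    return f"{base_url.rstrip('/')}/current_{data_type}/"


if __name__ == "__main__":
    print(current_dir_url("embl"))
```

Whether a 'current_' directory exists for every data type is not stated here, so the FTP site README remains the authority on which names are valid.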

FTP Site Directories

The FTP directory has the following basic structure, although not all information is necessarily available for each species.


   |-- embl	      Gene predictions annotated on genomic DNA slices of 1 Mb in EMBL format.
   |   | 
   |   |-- species
   |
   |
   |-- emf     	      Alignment dumps in EMF format
   |   | 
   |   |-- pecan                  * Pecan whole genome multiple alignments
   |   |                            with conservation scores for selected sets
   |   |-- ensembl_compara        * protein trees and protein multiple alignments
   |   |                            underlying orthologue/paralogue predictions
   |   |-- species_variation      * resequencing data
   |
   | 
   |-- fasta          Gene predictions in FASTA database format
   |   |
   |   |-- species
   |      |
   |      |-- cdna         * Transcript (cDNA) predictions
   |      |-- dna          * Genomic DNA in assembled entities
   |      |-- pep          * Translation (peptide) predictions
   |      |-- rna          * Non-coding RNA predictions
   |
   |            
   |-- genbank	      Gene predictions annotated on genomic DNA slices of 1 Mb in GenBank format.
   |   | 
   |   |-- species
   |
   |
   |-- gtf	      Gene annotation in GTF format
   |
   |-- mysql          MySQL database table text dumps
       |
       |-- core       General genome annotation information
       |
       |                * Genome sequence assembly
       |                * Ensembl gene predictions
       |                * Ab initio gene predictions
       |                * Marker information
       |                * ...
       |
       |-- otherfeatures  Additional genome annotation
       |
       |                * Gene predictions based on EST information
       |                * ...
       |
       |-- variation  Genetic variation information
       |-- vega       Manually curated gene sets
       |-- cdna       cDNA to genome alignments based on the latest EMBL database
       |
       |-- ensembl_compara      Cross-species comparative genomics data:
       |
       |                          * Orthologue/paralogue predictions
       |                          * Protein families
       |                          * Whole genome alignments
       |                          * Synteny information
       |
       |-- ensembl_go           Gene Ontology database
       |
       |-- ensembl_web_user_db  SQL table definition for server-side user config database
       |
       |-- ensembl_website        Ensembl web site database:
       |                            * Context-sensitive help articles
       |                            * News articles
       |                            * Mini-ads
       | 
       |-- ensembl_mart              Cross-species data mining tables
       |
       |-- genomic_features_mart     Clone data sets
       |
       |-- ontology_mart
       |
       |-- sequence_mart             Genome sequences
       |
       |-- snp_mart                  Genetic variation information
       |
       |-- vega_mart                 Manually curated gene sets
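

As a rough illustration of how the tree above maps to paths, the following Python sketch builds the relative path of a sequence-type directory under 'fasta'. The four subdirectory names come from the tree. The function name fasta_dir and the species directory name 'homo_sapiens' are assumptions used only for the example, since the tree gives 'species' as a placeholder.

```python
# Sequence-type subdirectories listed under fasta/<species>/ in the tree above.
FASTA_TYPES = {
    "cdna": "Transcript (cDNA) predictions",
    "dna": "Genomic DNA in assembled entities",
    "pep": "Translation (peptide) predictions",
    "rna": "Non-coding RNA predictions",
}


def fasta_dir(species: str, seq_type: str) -> str:
    """Return the path, relative to the FTP root, of a FASTA subdirectory.

    species is the species directory name (its exact format is an assumption).
    seq_type must be one of the keys of FASTA_TYPES.
    """
    if seq_type not in FASTA_TYPES:
        valid = ", ".join(sorted(FASTA_TYPES))
        raise ValueError(f"unknown sequence type {seq_type!r}; expected one of: {valid}")
    return f"fasta/{species}/{seq_type}/"


if __name__ == "__main__":
    # 'homo_sapiens' is a hypothetical directory name used for illustration.
    print(fasta_dir("homo_sapiens", "pep"))
```

Because not all information is available for every species, a path built this way may not exist on the server and should be checked before use.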


 

© 2024 Inserm. Hosted by genouest.org. This product includes software developed by Ensembl.

                
GermOnline based on Ensembl release 50 - Jul 2008