The EMBL-EBI search and sequence analysis tools APIs in 2019

Table 1.

New and updated bioinformatics tools available through Job Dispatcher in 2019. The OpenAPI user interface for these tools is available from: https://www.ebi.ac.uk/Tools/common/tools/help

Category	Tools
Multiple Sequence Alignment (https://www.ebi.ac.uk/Tools/msa/)	Clustal Omega, Kalign, MAFFT, MUSCLE, T-Coffee, MView and WebPRANK
Pairwise Sequence Alignment (https://www.ebi.ac.uk/Tools/psa/)	Needle, Stretcher, Water, Matcher, LALIGN, and GeneWise
Phylogeny Analysis (https://www.ebi.ac.uk/Tools/phylogeny/)	Simple Phylogeny
Protein Functional Analysis (https://www.ebi.ac.uk/Tools/pfa/)	InterProScan 5, PfamScan, Phobius, Pratt, RADAR, HMMER3 phmmer and HMMER3 hmmscan
RNA Analysis (https://www.ebi.ac.uk/Tools/rna/)	Infernal cmscan and MapMi
Sequence Format Conversion (https://www.ebi.ac.uk/Tools/sfc/)	Seqret and MView
Sequence Operation (https://www.ebi.ac.uk/Tools/so/)	Seqcksum
Sequence Similarity Search (https://www.ebi.ac.uk/Tools/sss/)	NCBI BLAST+, PSI-BLAST, FASTA, SSEARCH, FASTM/S/F, GGSEARCH, GLSEARCH, PSI-Search and PSI-Search2
Sequence Statistics (https://www.ebi.ac.uk/Tools/seqstats/)	SAPS, Pepinfo, Pepstats, Pepwindow, Cpgplot, Newcpgreport, Isochore, Dotmatcher, Dottup, Dotpath and Polydot
Sequence Translation (https://www.ebi.ac.uk/Tools/st/)	Transeq, Sixpack, Backtranseq and Backtranambig

New sequence databases have been added to Job Dispatcher and are available for sequence similarity searches. These include the MEROPS (24) databases, ChEMBL (25) targets, UniProtKB Reference Proteomes (8) and the protein family databases used by HMMER3 (4). The IPD-MHC, IPD-KIR and IPD-IMGT/HLA (26) databases were also updated so that nucleotide sequences are split into CDS (coding sequence) and genomic sequences. Table 2 lists all the databases currently provided in Job Dispatcher.

Table 2.

New and updated data resources available through Job Dispatcher in 2019

Category	Datasets
UniProtKB protein sequences	UniProtKB, SwissProt, SwissProt Isoforms, TrEMBL, UniProtKB Taxonomic Subsets (13 subgroups, including: bacteria, archaea, eukaryota, etc.), Reference Proteomes, Representative Proteomes (15, 35, 55, 75), UniProt Reference (UniRef 50, 90 and 100), UniParc, Unimes and UniProtKB-PDB
Patent protein sequences	EPO, JPO, KIPO, USPTO
Structures protein sequences	PDBe and PSI structure targets
Protein families	Pfam, TIGRFAM, Superfamily, Gene3D, PIRSF and TreeFam
Other protein sequences	Enzyme Portal, IntAct, IPD-IMGT/HLA, IPD-KIR, IPD-MHC, MEROPS (MP, MPEP and MPRO), ChEMBL and Quest for Orthologs
ENA nucleotide sequences	ENA sequence releases and updates for Coding, Non-coding, Barcode, Geospatial and others (10 subgroups, including: Expressed Sequence Tag, Genome Survey Sequence, etc.)
Ensembl Genomes sequences	Genomes from Bacteria, Fungi, Plants, Metazoa and Protists
Structures of nucleotide sequences	PDBe
Other nucleotide sequences	IMGT/LIGM-DB, IMGT/HLA (CDS and genomic), IPD-KIR (CDS and genomic) and IPD-MHC (CDS and genomic)

The data resources available in EBI Search are grouped by biological category (Table 3). Since the last update, Europe PMC (27) (an enriched replacement for MEDLINE), BioSamples (28), Rfam (29), reviewed ChEMBL (25), OLS (30), dbGaP (31), EVA (32), InterPro 7 (12) and bio.tools (33) (a replacement for the ELIXIR registry) have been added as new resources. PomBase (34), MEDLINE and the ELIXIR registry have been retired.

Table 3.

Data resources available through EBI Search in 2019

Category	Data resources
Genomes and metagenomes	Ensembl Genomes, Ensembl, HGNC, DGVa, EGA, LRG, WormBase ParaSite, MGnify
Nucleotide sequences	ENA, RNAcentral, Rfam, NRNL1, NRNL2, IMGT/HLA, IPD-KIR, IPD-MHC
Protein sequences	UniProtKB, UniParc, UniRef, EPO, JPO, KIPO, USPTO, NRPL1, NRPL2
Macromolecular structures	PDBe, EMDB
Bioactive molecules	ChEBI, ChEMBL, Ligands
Gene expression	ArrayExpress, Expression Atlases, GEO, dbGaP
Molecular interactions	IntAct
Reactions, pathways	Rhea, Reactome, BioModels, MetaboLights, MetabolomeExpress, Metabolomics Workbench
Protein families	InterPro, TreeFam, Pfam, MEROPS, GPCRDB
Protein expression data	PRIDE, GNPS, GPMdb, MassIVE, PeptideAtlas, LINCS, Paxdb, jPOST
Enzymes	IntEnz, Enzyme Portal
Literature	Europe PMC, Patent families
Samples and ontologies	Taxonomy, GO, EFO, SBO, MESH, BioSamples, Identifiers.org registry, ORCID data claims, OLS, bio.tools
Diseases	OMIM, Human diseases

Examples of using the APIs

A number of clients and Docker images that simplify access to the services are available from https://github.com/ebi-wp/webservice-clients. Below are some basic examples using the Job Dispatcher APIs for the Clustal Omega tool:

Submit a job to the Clustal Omega service using dbfetch to obtain sequences:

curl -X POST --header 'Content-Type: application/x-www-form-urlencoded' -d 'stype=protein&sequence=uniprot:wap_rat,uniprot:wap_mouse,uniprot:wap_rabit&email=<your email here>' 'https://www.ebi.ac.uk/Tools/services/rest/clustalo/run'

The above returns a string, the job identifier (jobID), which has the following form: clustalo-I20190408-092628-0974-9944177-p1m

Check the status of a job:

curl 'https://www.ebi.ac.uk/Tools/services/rest/clustalo/status/<jobID>'

Check which result types are available:

curl --header 'Accept: application/xml' 'https://www.ebi.ac.uk/Tools/services/rest/clustalo/resulttypes/<jobID>'

Get the alignment output:

curl --header 'Accept: text/x-clustalw-alignment' 'https://www.ebi.ac.uk/Tools/services/rest/clustalo/result/<jobID>/aln-clustal_num'
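The calls above can be chained into a single script. The following is a minimal sketch rather than an official client: the helper names (jd_url, jd_next_action, jd_submit, jd_wait_and_fetch), the five-second polling interval and the status strings RUNNING, PENDING, QUEUED and FINISHED are assumptions made for illustration and are not taken from this article. The maintained clients at https://github.com/ebi-wp/webservice-clients remain the reference implementations.

```shell
#!/bin/sh
# Sketch of the submit / poll / fetch cycle for a Job Dispatcher tool.
# Assumed (not from the article): helper names, polling delay, status strings.

JD_BASE_URL='https://www.ebi.ac.uk/Tools/services/rest'

# Build an endpoint URL: jd_url <tool> <operation> [extra path segments...]
jd_url() {
    url="$JD_BASE_URL/$1/$2"
    shift 2
    for segment in "$@"; do
        url="$url/$segment"
    done
    printf '%s\n' "$url"
}

# Map a status string to the next step: wait, fetch or fail.
jd_next_action() {
    case "$1" in
        RUNNING|PENDING|QUEUED) echo wait ;;   # assumed "still working" states
        FINISHED)               echo fetch ;;  # assumed "results ready" state
        *)                      echo fail ;;   # anything else is treated as an error
    esac
}

# Submit a job and print the job identifier returned by the service.
# Usage: jd_submit <tool> <email> <stype> <sequence>
jd_submit() {
    curl --silent --fail -X POST \
        --header 'Content-Type: application/x-www-form-urlencoded' \
        --data-urlencode "email=$2" \
        --data-urlencode "stype=$3" \
        --data-urlencode "sequence=$4" \
        "$(jd_url "$1" run)"
}

# Poll the status endpoint until the job is ready, then print one result type.
# Usage: jd_wait_and_fetch <tool> <jobID> <resultType> [delay in seconds]
jd_wait_and_fetch() {
    delay="${4:-5}"
    while :; do
        status=$(curl --silent --fail "$(jd_url "$1" status "$2")") || return 1
        case "$(jd_next_action "$status")" in
            wait)  sleep "$delay" ;;
            fetch) break ;;
            *)     echo "job $2 ended with status: $status" >&2; return 1 ;;
        esac
    done
    curl --silent --fail "$(jd_url "$1" result "$2" "$3")"
}

# Example (performs network requests, so it is left commented out):
#   job=$(jd_submit clustalo '<your email here>' protein \
#         'uniprot:wap_rat,uniprot:wap_mouse,uniprot:wap_rabit')
#   jd_wait_and_fetch clustalo "$job" aln-clustal_num
```

Keeping the URL construction and the status decision in small functions that make no network calls means they can be checked offline. Only jd_submit and jd_wait_and_fetch contact the service.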

Similarly, below are some simple examples of using the EBI Search API:

Do a search across all domains for ‘BRCA1’:

curl --header 'Accept: application/xml' 'https://www.ebi.ac.uk/ebisearch/ws/rest/?query=brca1'

Find all domains having associations with a given UniProt entry (BRCA1_HUMAN):

curl --header 'Accept: application/xml' 'https://www.ebi.ac.uk/ebisearch/ws/rest/uniprot/entry/brca1_human/xref'

Given the UniProt entry for the BRCA1 product, find and display the associated entry or entries in Ensembl:

curl --header 'Accept: application/xml' 'https://www.ebi.ac.uk/ebisearch/ws/rest/uniprot/entry/brca1_human/xref/ensembl_gene'

Because cross-references are bi-directional, the UniProt entry can also be found starting from the associated ensembl_gene entry:

curl --header 'Accept: application/xml' 'https://www.ebi.ac.uk/ebisearch/ws/rest/ensembl_gene/entry/ENSG00000012048/xref/uniprot'
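The EBI Search URLs used above follow a regular pattern, which the sketch below captures in a few helper functions. The function names (ebisearch_query_url, ebisearch_xref_url, ebisearch_get) are hypothetical and introduced only for illustration. The query text is assumed to need no URL escaping, and the https scheme is assumed to be accepted by the service.

```shell
#!/bin/sh
# Sketch: build EBI Search REST URLs for queries and cross-reference lookups.
# Assumed (not from the article): function names, https scheme, no URL escaping.

EBISEARCH_URL='https://www.ebi.ac.uk/ebisearch/ws/rest'

# URL for a search across all domains.
# Usage: ebisearch_query_url <query>
ebisearch_query_url() {
    printf '%s/?query=%s\n' "$EBISEARCH_URL" "$1"
}

# URL listing cross-references of an entry, optionally limited to one target domain.
# Usage: ebisearch_xref_url <domain> <entry id> [target domain]
ebisearch_xref_url() {
    url="$EBISEARCH_URL/$1/entry/$2/xref"
    if [ -n "${3:-}" ]; then
        url="$url/$3"
    fi
    printf '%s\n' "$url"
}

# Fetch a URL as XML, as in the curl examples above.
ebisearch_get() {
    curl --silent --fail --header 'Accept: application/xml' "$1"
}

# Example (performs network requests, so it is left commented out):
#   ebisearch_get "$(ebisearch_xref_url uniprot brca1_human ensembl_gene)"
#   ebisearch_get "$(ebisearch_xref_url ensembl_gene ENSG00000012048 uniprot)"
```

The two commented calls mirror the bi-directional lookup: the same function serves both directions, with only the source and target domains swapped.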

Further details and examples about how to use these APIs can be found at: https://www.ebi.ac.uk/Tools/common/tools/help and https://www.ebi.ac.uk/ebisearch/swagger.ebi

Usage statistics

Over 2017 and 2018, usage of the Job Dispatcher services increased continuously. In 2017, ∼140 million jobs were run from over 900 000 unique IPs worldwide. In 2018, the number of jobs increased to 146 million. The website accounted for 7.6% of usage, REST programmatic access for 88.2% and the SOAP interface for 4.2%.

Similarly, there has been continuous growth in traffic to EBI Search. The number of requests was about 282 million in 2017 and 550 million in 2018. In 2017 and 2018, ∼295 000 and 311 000 unique IPs, respectively, accessed the search system from across the globe.
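To put the year-on-year figures above on a common scale, the relative growth can be computed directly from the quoted totals. The helper name pct_change is hypothetical, and because the 2017 job count is itself approximate (∼140 million), the first result is only indicative.

```shell
#!/bin/sh
# Sketch: percentage change between two yearly totals, to one decimal place.
# Usage: pct_change <old> <new>
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_change 140 146   # Job Dispatcher jobs (millions), 2017 to 2018: prints 4.3
pct_change 282 550   # EBI Search requests (millions), 2017 to 2018: prints 95.0
```

On these figures, Job Dispatcher job numbers grew by roughly 4% while EBI Search requests almost doubled.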

DISCUSSION

Job Dispatcher and EBI Search are core services used extensively by other resources at EMBL-EBI, and collaboration between the two projects is essential. The cross-references provided in the results of the sequence similarity search tools are a good example of this collaboration. Further work is underway to add EBI Search functions to the Job Dispatcher sequence similarity search tools, such as faceting, which will let users filter matches by, e.g. taxonomy status, keywords and GO terms. Importantly, all the teams using these services are working to improve the synchronisation between data releases. Future plans for the Job Dispatcher framework include extending the use of CWL to improve integration of the services into analysis pipelines and workflows, which will enhance interoperability between the various tools as well as the data. In addition, development will focus on overhauling the entire frontend to use modern JavaScript frameworks. Interactive graphics will also be included to improve the display of common tool outputs, such as multiple sequence alignments, phylogenetic trees and three-dimensional protein structures.

For EBI Search, the focus is on continuing to add features in response to user feedback and on further relaxing existing constraints to give users more search power. Further investment is being made to maintain the scalability and stability of the search system. Upgrading major software libraries and retiring legacy software and data are high on the agenda, as are migrating to the latest storage and compute infrastructures and improving service development and delivery using modern technologies (e.g. GitLab (https://gitlab.com), Docker (https://docker.com) and Kubernetes (https://kubernetes.io/)).

DATA AVAILABILITY

The Bioinformatics Tools are accessible from https://www.ebi.ac.uk/services. EBI Search is available from https://www.ebi.ac.uk/ebisearch or from many pages on the EMBL-EBI web site. Sample Web Service Clients and CWL workflows are available in the GitHub repositories https://github.com/ebi-wp/webservice-clients and https://github.com/ebi-wp/webservice-cwl, respectively.

ACKNOWLEDGEMENTS

The authors wish to acknowledge Simone Badoer, Ijaz Ahmad, Philip Lewis and Ravi Mahankali for web administration support. We would like to also thank all EMBL-EBI teams for their invaluable help in providing biological data, applications and expertise.

FUNDING

European Molecular Biology Laboratory (EMBL); THOR H2020-EINFRA-2014-2, project number 654039. Funding for open access charge: EMBL.

Conflict of interest statement. None declared.

REFERENCES

1. Tarkowska A., Carvalho-Silva D., Cook C.E., Turner E., Finn R.D., Yates A.D. Eleven quick tips to build a usable REST API for life sciences. PLoS Comput. Biol. 2018; 14:e1006542.

2. Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., Madden T.L. BLAST+: architecture and applications. BMC Bioinformatics. 2009; 10:421.

3. Pearson W.R., Lipman D.J. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. U.S.A. 1988; 85:2444–2448.

4. Potter S.C., Luciani A., Eddy S.R., Park Y., Lopez R., Finn R.D. HMMER web server: 2018 update. Nucleic Acids Res. 2018; 46:W200–W204.

5. Jones P., Binns D., Chang H.Y., Fraser M., Li W., McAnulla C., McWilliam H., Maslen J., Mitchell A., Nuka G. et al. InterProScan 5: Genome-scale protein function classification. Bioinformatics. 2014; 30:1236–1240.

6. Chojnacki S., Cowley A., Lee J., Foix A., Lopez R. Programmatic access to bioinformatics tools from EMBL-EBI update: 2017. Nucleic Acids Res. 2017; 45:W550–W553.

7. Park Y.M., Squizzato S., Buso N., Gur T., Lopez R. The EBI search engine: EBI search as a service - Making biological data accessible for all. Nucleic Acids Res. 2017; 45:W545–W549.

8. Bateman A., Martin M.-J., Orchard S., Magrane M., Alpi E., Bely B., Bingley M., Britto R., Bursteinas B., Busiello G. et al. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2018; 47:D506–D515.

9. Silvester N., Alako B., Amid C., Cerdeño-Tarrága A., Clarke L., Cleland I., Harrison P.W., Jayathilaka S., Kay S., Keane T. et al. The European Nucleotide Archive in 2017. Nucleic Acids Res. 2018; 46:D36–D40.

10. Kersey P.J., Allen J.E., Allot A., Barba M., Boddu S., Bolt B.J., Carvalho-Silva D., Christensen M., Davis P., Grabmueller C. et al. Ensembl Genomes 2018: An integrated omics infrastructure for non-vertebrate species. Nucleic Acids Res. 2018; 46:D802–D808.

11. Burley S.K., Berman H.M., Bhikadiya C., Bi C., Chen L., Costanzo L. Di, Christie C., Duarte J.M., Dutta S., Feng Z. et al. Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Res. 2018; 47:D520–D528.

12. Mitchell A.L., Attwood T.K., Babbitt P.C., Blum M., Bork P., Bridge A., Brown S.D., Chang H.-Y., El-Gebali S., Fraser M.I. et al. InterPro in 2019: improving coverage, classification and access to protein sequence annotations. Nucleic Acids Res. 2018; 47:D351–D360.

13. Dana J.M., Gutmanas A., Tyagi N., Qi G., O’Donovan C., Martin M., Velankar S. SIFTS: updated Structure Integration with Function, Taxonomy and Sequences resource allows 40-fold increase in coverage of structure-based annotations for proteins. Nucleic Acids Res. 2018; 47:482–489.

14. Sweeney B.A., Petrov A.I., Burkov B., Finn R.D., Bateman A., Szymanski M., Karlowski W.M., Gorodkin J., Seemann S.E., Cannone J.J. et al. RNAcentral: a hub of information for non-coding RNA sequences. Nucleic Acids Res. 2018; 47:D221–D229.

15. Perez-Riverol Y., Bai M., da Veiga Leprevost F., Squizzato S., Park Y.M., Haug K., Carroll A.J., Spalding D., Paschall J., Wang M. et al. Discovering and linking public omics data sets using the Omics Discovery Index. Nat. Biotechnol. 2017; 35:406–409.

16. Amstutz P., Crusoe M.R., Tijanić N., Chapman B., Chilton J., Heuer M., Kartashov A., Leehr D., Ménager H., Nedeljkovich M. et al. Common Workflow Language, v1.0. 2016; doi:10.6084/m9.figshare.3115156.v2.

17. Rice P., Longden L., Bleasby A. EMBOSS: The European molecular biology open software suite. Trends Genet. 2000; 16:276–277.

18. Waterhouse A.M., Procter J.B., Martin D.M.A., Clamp M., Barton G.J. Jalview Version 2—a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009; 25:1189–1191.

19. Ettwiller L., Paten B., Souren M., Loosli F., Wittbrodt J., Birney E. The discovery, positioning and verification of a set of transcription-associated motifs in vertebrates. Genome Biol. 2005; 6:R104.

20. Jareborg N., Birney E., Durbin R. Comparative analysis of noncoding regions of 77 orthologous mouse and human gene pairs. Genome Res. 1999; 9:815–824.

21. de Castro E., Sigrist C.J.A., Gattiker A., Bulliard V., Langendijk-Genevaux P.S., Gasteiger E., Bairoch A., Hulo N. ScanProsite: Detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Res. 2006; 34:W362–W365.

22. Katoh K., Standley D.M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 2013; 30:772–780.

23. Berger S.A., Krompass D., Stamatakis A. Performance, accuracy, and web server for evolutionary placement of short sequence reads under maximum likelihood. Syst. Biol. 2011; 60:291–302.

24. Rawlings N.D., Barrett A.J., Thomas P.D., Huang X., Bateman A., Finn R.D. The MEROPS database of proteolytic enzymes, their substrates and inhibitors in 2017 and a comparison with peptidases in the PANTHER database. Nucleic Acids Res. 2018; 46:D624–D632.

25. Gaulton A., Hersey A., Nowotka M.L., Patricia Bento A., Chambers J., Mendez D., Mutowo P., Atkinson F., Bellis L.J., Cibrian-Uhalte E. et al. The ChEMBL database in 2017. Nucleic Acids Res. 2017; 45:D945–D954.

26. Robinson J., Halliwell J.A., Hayhurst J.D., Flicek P., Parham P., Marsh S.G.E. The IPD and IMGT/HLA database: allele variant databases. Nucleic Acids Res. 2015; 43:D423–D431.

27. Gou Y., Graff F., Kilian O., Kafkas S., Katuri J., Kim J.H., Marinos N., McEntyre J., Morrison A., Pi X. et al. Europe PMC: A full-text literature database for the life sciences and platform for innovation. Nucleic Acids Res. 2015; 43:D1042–D1048.

28. Faulconbridge A., Burdett T., Brandizi M., Gostev M., Pereira R., Vasant D., Sarkans U., Brazma A., Parkinson H. Updates to BioSamples database at European Bioinformatics Institute. Nucleic Acids Res. 2014; 42:D50–D52.

29. Kalvari I., Argasinska J., Quinones-Olvera N., Nawrocki E.P., Rivas E., Eddy S.R., Bateman A., Finn R.D., Petrov A.I. Rfam 13.0: Shifting to a genome-centric resource for non-coding RNA families. Nucleic Acids Res. 2018; 46:D335–D342.

30. Jupp S. et al. A new Ontology Lookup Service at EMBL-EBI. In: Malone J. (ed.). Proceedings of SWAT4LS International Conference 2015. 2015.