CloudPhylo: Phylogeny reconstruction on large-scale datasets


Phylogeny reconstruction is a routine analysis in most evolution-related studies; it determines and depicts the evolutionary relationships among genes or species. However, most existing tools for phylogeny reconstruction are based on a single-process model or on traditional parallel paradigms such as Pthreads and OpenMP, and therefore do not scale well with the rapidly increasing size of input datasets. To tackle this challenge, BIGD (Big Data Center) presents CloudPhylo, a Spark-based tool for fast and scalable phylogeny reconstruction on large datasets. Spark is a recently proposed cloud computing framework that incorporates the MapReduce paradigm and efficiently caches intermediate results, which significantly boosts the performance of CloudPhylo and enables its use for large-scale phylogenetic tree inference.

CloudPhylo is not only the world's first phylogeny reconstruction tool available for large-scale datasets, but also the first Spark-based bioinformatics software in China. According to the comparison results, CloudPhylo achieves high efficiency and good scalability and is well suited for large-scale phylogenetic tree inference.
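
The compute-once, reuse-many pattern described above (a map step whose intermediate results are kept in memory for later steps) can be illustrated with a minimal sketch in plain Java. This is not CloudPhylo's code and does not use the Spark API. The class name `MapReduceSketch`, the k-mer count profile, and the toy profile distance are hypothetical stand-ins chosen only to make the pattern concrete.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical illustration only: not CloudPhylo's algorithm, no Spark calls.
public class MapReduceSketch {

    // "Map" step: turn one sequence into its k-mer count profile.
    static Map<String, Integer> kmerProfile(String seq, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + k <= seq.length(); i++) {
            counts.merge(seq.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    // Toy distance (an assumption): sum of absolute k-mer count differences.
    static int profileDistance(Map<String, Integer> a, Map<String, Integer> b) {
        int d = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            d += Math.abs(e.getValue() - b.getOrDefault(e.getKey(), 0));
        }
        for (Map.Entry<String, Integer> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                d += e.getValue();
            }
        }
        return d;
    }

    static int[][] distanceMatrix(List<String> seqs, int k) {
        // Compute every profile once, in parallel, and keep the results in
        // memory so that all pairwise comparisons reuse them.
        List<Map<String, Integer>> cached = seqs.parallelStream()
                .map(s -> kmerProfile(s, k))
                .collect(Collectors.toList());

        int n = seqs.size();
        int[][] matrix = new int[n][n];
        // Each row is filled by an independent task; tasks never write to
        // the same row, so no synchronization is needed.
        IntStream.range(0, n).parallel().forEach(i -> {
            for (int j = 0; j < n; j++) {
                matrix[i][j] = profileDistance(cached.get(i), cached.get(j));
            }
        });
        return matrix;
    }

    public static void main(String[] args) {
        int[][] m = distanceMatrix(List.of("ACGTAC", "ACGTTC", "TTTTTT"), 2);
        for (int[] row : m) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In a distributed framework the cached profiles would live across the memory of many machines rather than in a single list, but the reuse of intermediate results across many pair comparisons follows the same idea.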


  1. CloudPhylo: a fast and scalable tool for phylogeny reconstruction
    Xu, Xingjian; Ji, Zhaohua; Zhang, Zhang, 2016/10/11 - Bioinformatics


  1. Xingjian Xu

    Beijing Institute of Genomics (BIG), BIGD, China

Tool Type: Application
Technologies: Java, Spark
User Interface: Terminal Command Line
Latest Release: 1.0 (June 26, 2017)
Download Count: 285
Submitted By: Xingjian Xu