Isyslab is a research group composed of several professors from Computer Science, Electronics, Software Engineering and Biology. We aim to design and implement intelligent software systems for multidisciplinary and cross-disciplinary fields. Currently, we are interested in text mining, image and video processing, medical and biological data processing, protein structure and function prediction, and techniques in software R&D.

Research


People


Zhidong XUE Professor zdxue at isyslab.org
Yan WANG Associate Professor wangyan at isyslab.org
Yan Fu Associate Professor fuyan at isyslab.org
Zehua Lua Associate Professor lvzehua at isyslab.org
Yujiang Zeng Associate Professor zengyujiang at isyslab.org
Shiqi OU Assistant Professor smartvision at sina.com
Weiya CHEN Assistant Professor weiya_chen at hust.edu.cn
Qiang SHI Post-doc Fellow shiqiang at isyslab.org
Jingxiang LU Project Manager lujingxiang at isyslab.org

Selected Publications


  1. Yan Wang, Huang Zhang, Haolin Zhong, Zhidong Xue*. Protein domain identification methods and online resources. Computational and Structural Biotechnology Journal, 19:1145-1153, 2021
  2. Qiang Shi, Weiya Chen, Siqi Huang, Yan Wang and Zhidong Xue*. Deep learning for mining the protein data. Briefings in Bioinformatics, 22(1):194-218, 2021
  3. Weiya Chen, Chun Yao, Yingzhong Guo, Yan Wang, Zhidong Xue*. pmTM-align: scalable pairwise and multiple structure alignment with Apache Spark and OpenMP. BMC Bioinformatics, 21(1):426, 2020
  4. Yan Wang#, Qiang Shi#, Pengshuo Yang#, Chengxin Zhang#, S. M. Mortuza, Zhidong Xue*, Kang Ning*, Yang Zhang*, Fueling ab initio folding with marine metagenomics enables structure and function predictions of new protein families. Genome Biology, 20: 229 (2019)
  5. Qiang Shi, Weiya Chen, Siqi Huang, Fanglin Jin, Yinghao Dong, Yan Wang* and Zhidong Xue*. DNN-Dom: predicting protein domain boundary from sequence alone by deep neural network. Bioinformatics, doi: 10.1093/bioinformatics/btz464 (2019) [Server]
  6. Qiang Shi#, Weiya Chen#, Ye Pan, Shan Yin, Yan Fu, Jiacai Mei* and Zhidong Xue*. An Automatic Classification Method on Chronic Venous Insufficiency Images. Scientific Reports, doi: 10.1038/s41598-018-36284-5 (2018) [Dataset]
  7. Jian Wang#, Tailang Yin#, Xuwen Xiao, Dan He, Zhidong Xue, Xinnong Jiang*, Yan Wang*. StraPep: a Structure Database of Bioactive Peptides. Database (Oxford), 2018: bay038 (2018) [Server]
  8. Yan Wang, Jian Wang, Ruiming Li, Qiang Shi, Zhidong Xue*, Yang Zhang*. ThreaDomEx: a unified platform for predicting continuous and discontinuous protein domains by multiple-threading and segment assembly. Nucleic Acids Research, doi: 10.1093/nar/gkx410 (2017). [Server]
  9. Yan Wang, Jouko Virtanen, Zhidong Xue*, Yang Zhang*. I-TASSER-MR: automated molecular replacement for distant-homology proteins using iterative fragment assembly and progressive sequence truncation. Nucleic Acids Research, doi: 10.1093/nar/gkx349 (2017)
  10. Yan Wang, Jouko Virtanen, Zhidong Xue, John J. G. Tesmer, Yang Zhang. Using iterative fragment assembly and progressive sequence truncation to facilitate phasing and crystal structure determination of distantly related proteins. Acta Crystallographica Section D, 72: 616-28 (2016)
  11. Jianyi Yang, Yan Wang, Yang Zhang*. ResQ: An approach to unified estimation of B-factor and residue-specific error in protein structure prediction. Journal of Molecular Biology, 428: 693-701 (2016).
  12. Zhidong Xue*, Richard Jang, Brandon Govindarajoo, Yichu Huang, Yan Wang*. Extending Protein Domain Boundary Predictors to Detect Discontinuous Domains. PLOS One, doi: 10.1371/journal.pone.0141541
  13. Richard Jang, Yan Wang, Zhidong Xue*, Yang Zhang*. NMR data-driven structure determination using NMR-I-TASSER in the CASD-NMR experiment. Journal of Biomolecular NMR, 2015(62): 511-525
  14. Yan Wang, Mingxia Wang, Sanwen Yin, Richard Jang, Jian Wang, Zhidong Xue*, Tao Xu*. NeuroPep: a Comprehensive Resource of Neuropeptides. Database (Oxford). 2015 Apr 29. doi: 10.1093/database/bav038
  15. Z Xue, D Xu, Y Wang, Y Zhang. ThreaDom: Extracting Protein Domain Boundary Information from Multiple Threading Alignments. Bioinformatics, 2013, 29: i247-i256
  16. Wang, Y; Xue, Z; Shen, G; Xu, J. PRINTR: Prediction of RNA binding sites in proteins using SVM and profiles. Amino Acids, 2008, 2:295-302

Software


Bioinformatics tools and database

ThreaDomEx
Protein domains are subunits that can fold and evolve independently. The identification of protein domains is essential for protein structure determination and functional annotation. ThreaDom and DomEx are two methods recently developed for protein domain boundary recognition and, in particular, discontinuous domain prediction. ThreaDomEx combines ThreaDom and DomEx into a unified online server for more accurate and user-friendly domain prediction on sequences with either continuous or discontinuous domain structures. ThreaDomEx takes the amino acid sequence of the query protein as input. It first creates multiple threading alignments to recognize homologous and analogous template structures, from which a domain conservation score is calculated to deduce the domain boundaries. Next, a boundary clustering method is used to optimize the selection of domain models. For discontinuous domain structures, a symmetric alignment algorithm is applied to further integrate and refine the domain assignments. The output of the server consists of: (a) the predicted domain boundaries and discontinuous domains; (b) a visualization of the distribution of the domain conservation score, predicted secondary structure and solvent accessibility; (c) the threading templates used by ThreaDomEx. The server allows users to interactively edit, save, or re-detect the domain models of the proteins.
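
The idea of deducing boundaries from a per-residue score profile and then clustering nearby candidates can be illustrated with a toy sketch. This is not the ThreaDomEx algorithm: the function names, the local-minimum rule, the threshold and the gap size are all assumptions made for illustration, and the score values are invented.

```python
from typing import List


def candidate_boundaries(scores: List[float], threshold: float) -> List[int]:
    """Return positions that are local minima of the profile and below threshold.

    Hypothetical rule: a low conservation score is treated as a boundary hint.
    """
    hits = []
    for i in range(1, len(scores) - 1):
        is_low = scores[i] < threshold
        is_local_min = scores[i] <= scores[i - 1] and scores[i] <= scores[i + 1]
        if is_low and is_local_min:
            hits.append(i)
    return hits


def cluster_boundaries(positions: List[int], max_gap: int) -> List[int]:
    """Merge candidates within max_gap residues; report each cluster's middle member."""
    clusters: List[List[int]] = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return [c[len(c) // 2] for c in clusters]


# Invented example profile (one value per residue position).
profile = [0.9, 0.8, 0.3, 0.2, 0.3, 0.8, 0.9, 0.85, 0.4, 0.35, 0.35, 0.5, 0.9]
candidates = candidate_boundaries(profile, threshold=0.5)
print(candidates)                                 # [3, 9, 10]
print(cluster_boundaries(candidates, max_gap=3))  # [3, 10]
```

In this toy profile the two adjacent candidates at positions 9 and 10 collapse into a single boundary, which is the kind of redundancy a clustering step is meant to remove.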

ThreaDom
ThreaDom is a protein domain prediction tool based on several protein threading programs. It reports protein domain boundaries more accurately than most protein domain predictors, especially for medium and hard targets, and it can also detect discontinuous domains based on boundary clustering.

DomEx
DomEx enables domain boundary predictors to detect discontinuous domains by assembling continuous domain segments. Discontinuous domains are predicted by matching the sequence profile of concatenated continuous domain segments with the profiles from a single-domain library derived from SCOP, CATH, and Pfam. The matches are then filtered by similarity to library templates, a symmetric index score and a profile-profile alignment score. In a benchmark with 29 discontinuous-domain chains, DomEx recalled 26.7% of the discontinuous domains with 72.7% precision, whereas ThreaDom failed to predict any discontinuous domains. The source code and datasets are available here.
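
As a reminder of how the two benchmark figures relate, precision is the fraction of predicted discontinuous domains that are correct, and recall is the fraction of true discontinuous domains that are found. The sketch below uses hypothetical counts (8 correct out of 11 predicted, with 30 true domains) chosen only because they round to the reported percentages; the actual benchmark counts are not stated here.

```python
from typing import Tuple


def precision_recall(true_positives: int, predicted: int, actual: int) -> Tuple[float, float]:
    """Return (precision, recall) from raw counts."""
    precision = true_positives / predicted  # correct predictions / all predictions
    recall = true_positives / actual        # correct predictions / all true domains
    return precision, recall


# Hypothetical counts, not taken from the DomEx benchmark itself.
p, r = precision_recall(true_positives=8, predicted=11, actual=30)
print(f"precision = {p:.1%}, recall = {r:.1%}")  # precision = 72.7%, recall = 26.7%
```

High precision with low recall means that the predicted discontinuous domains are mostly correct, but many true ones are still missed.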

NeuroPep
Neuropeptides are involved in a number of physiological processes and serve as potential therapeutic targets for the treatment of some nervous-system disorders. We have developed a comprehensive neuropeptide database, NeuroPep, which is currently the most complete neuropeptide database.
The current release of the database (version 1.0, 2014-11-26) holds 5949 non-redundant neuropeptide entries originating from 493 organisms and belonging to 65 neuropeptide families. The data were collected from various resources, including MEDLINE abstracts, full papers, UniProt, the database at www.neuropeptides.nl, and Neuropedia. All the entries in NeuroPep have been manually checked.

StraPep
StraPep is a database dedicated to the annotation of bioactive peptides with known structure, which often function as modulators in physiological processes by interacting with specific receptors. There are 1199 bioactive peptides and 3536 PDB chains in CBPS. All the peptides are classified into 6 categories: Toxin and Venom peptide (441), Antimicrobial peptide (365), Cytokine and Growth factor (199), Hormone (132), Neuropeptide (37) and Others (25). The structure information of the peptides was collected from the Protein Data Bank, and the sequence information was taken mainly from UniProt. Users can easily find peptides of interest through a user-friendly browser and search engine. For each entry, the detail page provides both primary information and structure information. Tools such as BLAST, Map and Secondary structure composition are also provided.
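
The six category counts listed above can be checked against the stated total of 1199 peptides with a few lines of arithmetic. The dictionary below simply restates the numbers from the description; the variable names are illustrative.

```python
# Category counts as quoted in the StraPep description.
category_counts = {
    "Toxin and Venom peptide": 441,
    "Antimicrobial peptide": 365,
    "Cytokine and Growth factor": 199,
    "Hormone": 132,
    "Neuropeptide": 37,
    "Others": 25,
}

total = sum(category_counts.values())
print(total)  # 1199

# Share of each category, largest first.
for name, count in sorted(category_counts.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {count / total:.1%}")
```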

I-TASSER-MR
I-TASSER-MR has been developed to test whether the success rate of structure determination for distant-homology proteins can be improved by combining iterative fragment-based structure-assembly simulations with progressive sequence truncation designed to trim regions with high variation. The pipeline was tested on two independent protein sets consisting of 61 proteins from CASP8 and 100 high-resolution proteins from the PDB. After excluding homologous templates, I-TASSER generated full-length models with an average TM-score of 0.773, which is 12% higher than that of the best threading templates. Using these as search models, I-TASSER-MR found correct MR solutions for 95 of 161 targets, as judged by a TFZ of >8 or by the final structure being closer to the native than the initial search model. The success rate was 16% higher than when using the best threading templates. I-TASSER-MR was also applied to 14 protein targets from structural genomics centers, seven of which were successfully solved. These results confirm that advanced structure assembly and progressive structural editing can significantly improve the success rate of MR for targets with distant homology to proteins of known structure.
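
Progressive truncation can be pictured as generating a series of search models that keep fewer and fewer residues, always dropping the most variable ones first. The sketch below is a simplified illustration under stated assumptions, not the I-TASSER-MR implementation: the per-residue variation values, the keep fractions and the function name are all hypothetical.

```python
from typing import List


def truncation_series(variation: List[float], keep_fractions: List[float]) -> List[List[int]]:
    """For each keep fraction, return the residue indices retained after trimming.

    Residues are ranked by an assumed per-residue variation estimate, and the
    most variable residues are removed first.
    """
    # Residue indices ordered from least to most variable.
    order = sorted(range(len(variation)), key=lambda i: variation[i])
    series = []
    for frac in keep_fractions:
        n_keep = max(1, round(frac * len(variation)))
        series.append(sorted(order[:n_keep]))
    return series


# Invented variation estimates for a six-residue toy model.
variation = [4.0, 0.5, 0.7, 3.0, 0.6, 2.0]
for kept in truncation_series(variation, keep_fractions=[1.0, 0.8, 0.5]):
    print(kept)
# [0, 1, 2, 3, 4, 5]
# [1, 2, 3, 4, 5]
# [1, 2, 4]
```

Each truncated index list would correspond to one candidate search model, so a molecular replacement search could be attempted on the whole series.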

Medical Images

An image dataset for CVI (Chronic Venous Insufficiency) [download]

Research Position


Excellent post-docs, graduate students and undergraduate students interested in the research and projects of Isyslab are always encouraged to apply by sending an email to Prof. Zhidong XUE at zdxue@isyslab.org.

For excellent post-docs, the annual salary starts at RMB 200,000.00 (about USD 30,000.00). We also welcome excellent graduate students and undergraduate students to join us.