
vClean: assessing virus sequence contamination in viral genomes

Summary

Viral metagenomics and single-virus genome analysis have made it possible to obtain individual genomes of unknown environmental viruses. However, there have been concerns that these viral genomes may be contaminated with sequences from other viruses, and until now no tool existed to detect such contamination. In this study, we applied machine learning to develop a tool called "vClean," which learns the nucleotide sequence characteristics and genetic patterns of viral genomes and detects contamination in a target genome.

With this tool, contamination can be detected in viral genomes obtained through viral metagenomics or single-virus genome analysis, so that highly reliable genomes can be selected.


This research was conducted primarily by Mr. Wagatsuma, a doctoral student in the Takeyama Laboratory at Waseda University.


vClean: assessing viral sequence contamination in viral genomes.

Wagatsuma R, Nishikawa Y, Hosokawa M, Takeyama H.

NAR Genom Bioinform. 2025 Jan 7;7(1):lqae185. doi: 10.1093/nargab/lqae185.

