Daehwan Kim is an Assistant Professor in the Lyda Hill Department of Bioinformatics, a UT Southwestern Endowed Scholar in Medical Research, and a Scholar of the Cancer Prevention and Research Institute of Texas (CPRIT). Dr. Kim’s research focuses on developing computer algorithms and statistical methods that enable accurate and rapid analysis of biological data, in particular sequencing data. Among his first-author papers, the paper on the TopHat2 software program (published in 2013 in Genome Biology) and the paper on HISAT (published in 2015 in Nature Methods) have been cited over 3,470 and 320 times, respectively.
Currently, one important obstacle facing analyses of sequencing data is their reliance on the human reference genome to align sequencing reads. The human reference genome was assembled using only a few samples and thus does not reflect genetic diversity across individuals and populations. Relying on a single reference genome can introduce significant biases in downstream analyses, and those analyses can miss important disease-related genetic variants that occur in regions absent from the reference genome.
To address these challenges, Dr. Kim recently developed a novel graph-based indexing scheme that captures a wide range of genetic variants and has low memory requirements. He has built a new alignment system, HISAT2 (ccb.jhu.edu/software/hisat2), that enables fast search through the index. HISAT2 is the first and only practical method available for aligning sequencing reads to a graph at the human genome scale while requiring only the small amount of memory typically available on a conventional desktop. The graph-based alignment approach enables much higher alignment sensitivity and accuracy than linear reference-based alignment approaches, especially for highly polymorphic genomic regions such as HLA genes, DNA fingerprinting loci, and LINEs. The system also has the potential to perform unbiased alignment irrespective of which individual genome is sequenced.
Building on HISAT2, Dr. Kim plans to develop a practical software solution that can accurately analyze an individual’s genome and its >20,000 genes within a few hours on a desktop computer. The availability of an individual’s genetic information made possible by this proposed work is essential to promoting personalized medicine. The software will enable researchers to perform unbiased analyses of next-generation sequencing experiments more efficiently, furthering our understanding of tumorigenesis and helping to find personalized treatments for cancer patients. Anyone who has access to sequencing data will be able to perform these analyses easily using a single software package.
- (2017), Computer Sciences
- Graduate School
- DNA, RNA, and bisulfite sequence alignment
- Graph alignment to population of genomes and genotyping
- Personalized medicine with a focus on cancer diagnosis
- Department of Bioinformatics (2017)