|11/17||4:00pm||Jacob Barger||UMSL||How regression can be effectively used in place of classification in the context of protein folding||Link|
|10/29||3:00pm||Reza Tourani||SLU||Towards a Data-Centric Security-as-a-Service model at the Network Edge||Link|
|10/23||10:00am||Ted Ahn||SLU||Advanced Computational Techniques for Genomics and Clinical Research||Link|
Title: How regression can be effectively used in place of classification in the context of protein folding
Abstract: Protein contact prediction, a binary classification problem in bioinformatics, lies at the heart of the six-decade-old problem of protein folding. Dozens of methods, based on almost all kinds of machine learning and deep learning algorithms, have been published over the last two decades for predicting contacts. Recently, many groups, including Google DeepMind, have demonstrated that reformulating the problem as a multi-class classification problem is a more promising direction to pursue. As yet another alternative, we recently proposed to formulate and attack the problem as a regression problem, the way the information exists in nature. Nuances of protein three-dimensional structures make this formulation a unique and tempting regression problem. In this work, we discuss novel ways of output label engineering (as distinct from feature engineering) through the use of a variety of data transformation functions, and demonstrate, for the first time, that deep learning methods for real-valued protein distance prediction can deliver distances as precise as those from binary classification methods. We also demonstrate how the more granular information contained in our real-valued distances can be used to build significantly more accurate three-dimensional protein models. We believe that our work will stand as a milestone marking the dawn of real-valued distance prediction.
Bio: Jacob Barger is a Master's student under the supervision of Dr. Badri Adhikari. This is his thesis defense.
Title: Towards a Data-Centric Security-as-a-Service model at the Network Edge
Abstract: The prevailing network security measures are often implemented on proprietary appliances that are deployed at fixed network locations with constant capacity. Such a rigid deployment is sometimes necessary, but it undermines the flexibility of security services in meeting the demands of emerging applications, such as augmented/virtual reality, autonomous driving, and 5G for Industry 4.0, which are driven by the evolution of connected and smart devices, their heterogeneity, and their integration with cloud and edge computing infrastructures. This talk shares the vision of a data-centric SECurity-as-a-Service (SECaaS) framework for elastic deployment and provisioning of security services at the edge of the network. This vision employs the novel Named-Data Networking architecture to loosen the rigid deployments of security services. In particular, we discuss two security services that are suitable for edge deployment: a Distributed Denial of Service (DDoS) countermeasure and an access control enforcement system.
Bio: Reza Tourani is an assistant professor with the computer science department at Saint Louis University. He received his Ph.D. in computer science from New Mexico State University in 2018. His research areas include security and privacy, Future Internet Architecture, smart grid communication, and cyber-physical systems. He has authored more than 25 peer-reviewed IEEE/ACM journal articles and conference proceedings.
Title: Advanced Computational Techniques for Genomics and Clinical Research
Abstract: Bioinformatics is an interdisciplinary field spanning multiple research areas in computer science and biology to research, develop, and apply computational tools and approaches to manage and process large sets of biological data. Exponential advances in sequencing technologies and in informatics tools for generating and processing large biological data sets are promoting a paradigm shift in the way we approach biomedical problems. Next-generation sequencing (NGS) platforms have great potential to sequence many samples in parallel at low cost. These cost-effective technologies have had a tremendous effect on genomics and clinical research, contributing immeasurably to uncovering the processes underlying human health and disease. The vast amount of resulting sequencing data can aid diagnosis and precision medicine, but it poses a challenge: accurately characterizing a sample in a computationally feasible fashion, despite specimen diversity. In this talk, I will describe how to solve bioinformatics and biomedical research problems using advanced computational techniques for near real-time biological sequence analyses. First, I will present metagenomic sequence analysis, classifying metagenomic samples using microbiota information and machine learning techniques to identify and predict disease samples precisely. Second, I will briefly introduce scalable overlap-graph reduction algorithms that support computationally expensive genome assembly using Apache Spark on the cloud or a local cluster. Last, I will discuss how to apply statistical and deep learning methods to immune cell sequencing analysis for diagnosis of vaccination and viral infection with high accuracy.
Bio: Dr. Tae-Hyuk (Ted) Ahn received the B.S. degree in Electrical Engineering from Yonsei University in 2000 and worked for several years at Samsung SDS in South Korea. He received the M.S. degree in Electrical and Computer Engineering from Northwestern University in 2007 and the Ph.D. degree in Computer Science from Virginia Tech in 2012. After working for three years at Oak Ridge National Laboratory, he joined the Department of Computer Science at Saint Louis University in 2015, where he is currently an associate professor. He is also a core faculty member of the MS Program in Bioinformatics and Computational Biology. His research interests include bioinformatics, biomedical informatics, high-performance computing, big data analytics, and machine/deep learning.