|5/5||11:00am||Matthew Bernardini||UMSL||DISTFOLD: Distance based protein folding||Link|
|4/21||3:00pm||Sharlee Climer||UMSL||Linear Programming Meets AI||Link|
|3/10||3:00pm||Christopher Cruzen||UMSL||Gamepad Aiming: Reverse Engineering the Algorithm that Revolutionized the Gaming Industry||Link|
|11/17||4:00pm||Jacob Barger||UMSL||How regression can be effectively used in place of classification in the context of protein folding||Link|
|10/29||3:00pm||Reza Tourani||SLU||Towards a Data-Centric Security-as-a-Service model at the Network Edge||Link|
|10/23||10:00am||Ted Ahn||SLU||Advanced Computational Techniques for Genomics and Clinical Research||Link|
Title: DISTFOLD: Distance based protein folding
Abstract: Protein structure prediction and its associated key sub-problems, such as distance map prediction, are of significant importance in biology and bioinformatics. The inter-residue distance prediction problem, or distance prediction for short, is to predict the physical distance between amino acids in three-dimensional (3D) space, given a protein's one-dimensional sequence information. While there exist many methods to predict distance maps, there are currently no methods that can take those predicted distance maps and build 3D models from them in an ab initio way, i.e., without using any other information. This work aims to fill this gap by: a) developing a method that accepts predicted distance maps (2D information) as input and builds 3D models, and b) investigating the prospects and limitations of distance-guided 3D modeling (reconstruction). DISTFOLD is a Perl and Python based script that wraps around a well-established 3D modeling tool known as CNS-Suite. To test our DISTFOLD implementation, we first benchmarked it on a small subset of the SCOPe dataset representative of the entire Protein Data Bank. In addition to developing DISTFOLD, we also investigated (a) how various distance thresholds for selecting distance restraints impact the reconstruction accuracy, (b) how secondary structure information influences the reconstruction accuracy, and (c) how the reconstruction accuracy changes when predicted distances are used instead of true distances. Using two representative sets consisting of 1583 proteins and 259 proteins, we show that our method, DISTFOLD, is capable of building accurate models in an array of settings. Our results also show that the threshold chosen to filter out or keep distances can drastically affect reconstruction accuracy. We also show that including secondary structures, when available, can benefit reconstruction in the absence of local distance information.
When predicted distances are used instead of true distances, we found that the reconstruction accuracy drops significantly and that distances predicted at thresholds higher than 11 or 12 angstroms are not significantly useful for reconstruction. DISTFOLD is publicly available at https://github.com/ba-lab/distfold/.
Bio: Matthew Bernardini is a Master's student under the supervision of Dr. Badri Adhikari. This is his thesis defense.
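The DISTFOLD abstract above describes selecting distance restraints from a distance map using a distance threshold. The following is a minimal Python sketch of that idea only. It is not DISTFOLD's code: the function name `select_restraints`, the `min_seq_sep` parameter, and the toy distance map are all hypothetical and chosen for illustration.

```python
def select_restraints(dist_map, threshold, min_seq_sep=1):
    """Return (i, j, distance) for residue pairs with i < j whose
    distance is at most `threshold` angstroms.

    `min_seq_sep` (hypothetical parameter) skips pairs that are closer
    than this many positions apart in the sequence.
    """
    n = len(dist_map)
    restraints = []
    for i in range(n):
        for j in range(i + min_seq_sep, n):
            d = dist_map[i][j]
            if d <= threshold:
                restraints.append((i, j, d))
    return restraints


# Made-up symmetric distance map (angstroms) for a 4-residue toy chain.
toy_map = [
    [0.0, 3.8, 6.5, 13.0],
    [3.8, 0.0, 3.8, 9.2],
    [6.5, 3.8, 0.0, 3.8],
    [13.0, 9.2, 3.8, 0.0],
]

for threshold in (8.0, 12.0):
    kept = select_restraints(toy_map, threshold)
    print(f"threshold {threshold} A keeps {len(kept)} restraints")
```

Raising the threshold keeps more, longer-range pairs. The abstract reports that, for predicted distances, pairs beyond roughly 11 or 12 angstroms add little to reconstruction accuracy.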
Title: Linear Programming Meets AI
Abstract: Combinatorial problems are ubiquitous across diverse fields in the sciences and industry. Despite rapid advances in computational power, most problems of interest stubbornly remain intractable and approximate solutions are typically employed. This presentation will briefly overview the use of network models for approximately solving combinatorial problems, then introduce our emerging research for optimally solving combinatorial problems using a unique combination of linear programming and AI. We have applied our approximate and exact methods in diverse domains, from psoriasis to climatype maps, but primarily focus on Alzheimer disease. Most recently, we have turned our efforts to search for biomarker patterns predictive of the enigmatic outcomes of COVID-19.
Bio: Sharlee Climer is an Assistant Professor in the Department of Computer Science at the University of Missouri - St. Louis. She is also a faculty member of the Center for Neurodynamics, at the University of Missouri - St. Louis, and the Hope Center for Neurological Disorders, at Washington University in St. Louis. Dr. Climer was awarded UMSL's Junior Faculty Investigator of the Year Award in 2020, UMSL's Department of Mathematics and Computer Science Outstanding Research Award in 2019, ACM's Gordon Bell Prize in 2018, Henning Andersen Prize in 2012, National Defense Science and Engineering Graduate (NDSEG) Fellowship in 2001, and Olin Fellowship in 2001. She is the founder of WomenCAN, a student group dedicated to promoting recruitment and retention of women in computer science, and has served on the program committees for the Association for the Advancement of Artificial Intelligence (AAAI) and Platform for Advanced Scientific Computing (PASC) for more than five years. She has published more than 30 refereed research manuscripts, as well as a book, Limit Crossing, in which she presents strategies for utilizing upper and lower bounds in combinatorial optimization. Dr. Climer's current research focuses on the development of approximate and exact methods for identifying combinatorial patterns in big data, with a focus on genetic patterns associated with complex diseases.
Title: Gamepad Aiming: Reverse Engineering the Algorithm that Revolutionized the Gaming Industry
Abstract: For the first decade of their existence, it was near universally accepted that first-person shooter (FPS) games would never be satisfying to play on a controller. The limited movement of a toggle stick pales in comparison to the sweeping but precise motion a mouse provides. Yet 30 years later, the gamepad has never been more ubiquitous. Even modern titles built for PC are expected to feature robust controller support. Ultimately, the input method's surprising success story traces back to one algorithm buried deep in the source code of Halo: Combat Evolved. This talk consists of a detailed examination of that algorithm and an exploration of the challenges of rebuilding it in a modern game engine.
Bio: Christopher Cruzen earned his Bachelor's degree in Computer Science from the University of Missouri in St. Louis. A graduate of the Launch Code program, he spent four years developing enterprise-level mobile applications at Express Scripts before transitioning into the Game Development space. His areas of interest include 3D modeling, computer graphics, and procedural data visualization.
Title: How regression can be effectively used in place of classification in the context of protein folding
Abstract: Protein contact prediction, a binary classification problem in bioinformatics, lies at the heart of the six-decade-old problem of protein folding. Dozens of methods, based on almost all kinds of machine learning and deep learning algorithms, have been published over the last two decades for predicting contacts. Recently, many groups, including Google DeepMind, have demonstrated that reformulating the problem as a multi-class classification problem is a more promising direction to pursue. As yet another alternative approach, we recently proposed to formulate and attack the problem as a regression problem, the way the information exists in nature. Nuances of protein three-dimensional structures make this formulation a unique and tempting regression problem. In this work, we discuss novel ways of output label engineering (different from feature engineering) through the use of a variety of data transformation functions, and demonstrate, for the first time, that deep learning methods for real-valued protein distance prediction can deliver distances as precise as the binary classification methods. We also demonstrate how the more granular information contained in our real-valued distances can be used to build significantly more accurate three-dimensional protein models. We believe that our work will stand as a milestone marking the dawn of real-valued distance prediction.
Bio: Jacob Barger is a Master's student under the supervision of Dr. Badri Adhikari. This is his thesis defense.
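The abstract above contrasts three formulations of the same target: binary contacts, multi-class distance bins, and real-valued distances with transformed output labels. The sketch below shows how one distance value might be turned into each kind of label. Everything in it is an assumption for illustration: the 8 angstrom contact cutoff is a common convention, not something the abstract states; the bin edges are made up; and the reciprocal transform is only one plausible example of a "data transformation function", since the abstract does not name the ones used.

```python
CONTACT_THRESHOLD = 8.0  # angstroms; assumed convention, not from the abstract


def to_contact(distance):
    """Binary classification label: 1 if the pair is 'in contact', else 0."""
    return 1 if distance < CONTACT_THRESHOLD else 0


def to_bin(distance, edges=(4.0, 8.0, 12.0, 16.0)):
    """Multi-class label: index of the first bin edge the distance falls below.

    Distances at or beyond the last edge get index len(edges).
    The edges are hypothetical.
    """
    for k, edge in enumerate(edges):
        if distance < edge:
            return k
    return len(edges)


def reciprocal_label(distance, eps=1e-6):
    """Regression label under a hypothetical reciprocal transform (1 / d)."""
    return 1.0 / max(distance, eps)


def inverse_reciprocal(label, eps=1e-6):
    """Map a predicted reciprocal label back to a distance in angstroms."""
    return 1.0 / max(label, eps)


for d in (6.5, 9.2, 20.0):
    print(d, to_contact(d), to_bin(d), round(reciprocal_label(d), 4))
```

The binary and binned labels discard how far apart two residues actually are within a class, while the transformed real-valued label can be inverted back to a distance.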
Title: Towards a Data-Centric Security-as-a-Service model at the Network Edge
Abstract: The prevailing network security measures are often implemented on proprietary appliances that are deployed at fixed network locations with constant capacity. Such a rigid deployment is sometimes necessary, but undermines the flexibility of security services in meeting the demands of emerging applications, such as augmented/virtual reality, autonomous driving, and 5G for Industry 4.0, which are provoked by the evolution of connected and smart devices, their heterogeneity, and integration with cloud and edge computing infrastructures. This talk shares the vision of a data-centric SECurity-as-a-Service (SECaaS) framework for elastic deployment and provisioning of security services at the edge of the network. This vision employs the novel Named-Data Networking architecture to loosen the rigid deployments of security services. In particular, we discuss two security services that are suitable for edge deployment: a Distributed Denial of Service (DDoS) countermeasure and an access control enforcement system.
Bio: Reza Tourani is an assistant professor with the computer science department at Saint Louis University. He received his Ph.D. in computer science from New Mexico State University in 2018. His research areas include security and privacy, Future Internet Architecture, smart grid communication, and cyber-physical systems. He has authored more than 25 peer-reviewed IEEE/ACM journal articles and conference proceedings.
Title: Advanced Computational Techniques for Genomics and Clinical Research
Abstract: Bioinformatics is an interdisciplinary field spanning multiple research areas in computer science and biology to research, develop, and apply computational tools and approaches to manage and process large sets of biological data. The exponential advances in sequencing technologies and informatics tools for generating and processing large biological data sets are promoting a paradigm shift in the way we approach biomedical problems. Next-generation sequencing (NGS) platforms have great potential to sequence many samples in parallel at low cost. These cost-effective technologies have had a tremendous effect on genomics and clinical research, contributing greatly to uncovering the processes underlying human health and disease. The vast amount of resulting sequencing data can aid diagnosis and precision medicine, but poses a challenge: accurately characterizing a sample in a computationally feasible fashion, despite specimen diversity. In this talk, I will describe how to solve bioinformatics and biomedical research problems using advanced computational techniques for near real-time biological sequence analyses. First, I will present metagenomic sequence analysis by classifying metagenomic samples using microbiota information and machine learning techniques to identify and predict disease samples precisely. Second, I will briefly introduce scalable overlap-graph reduction algorithms to help with computationally expensive genome assembly using Apache Spark on the cloud or a local cluster. Last, I will discuss how to apply statistical and deep learning methods for immune cell sequencing analysis for diagnosis of vaccination and viral infection with high accuracy.
Bio: Dr. Tae-Hyuk (Ted) Ahn received the B.S. degree in Electrical Engineering from Yonsei University in 2000, and worked for several years at Samsung SDS in South Korea. He received the M.S. degree in Electrical and Computer Engineering from Northwestern University in 2007, and the Ph.D. degree in Computer Science from Virginia Tech in 2012. After working for three years at Oak Ridge National Laboratory, he joined the Department of Computer Science at Saint Louis University in 2015, where he is currently an associate professor. He is also a core faculty member of the MS Program in Bioinformatics and Computational Biology. His research interests include bioinformatics, biomedical informatics, high-performance computing, big data analytics, and machine/deep learning.