Discrete models for understanding mechanisms of
RNA-mediated DNA break repair
Margherita Ferrari 1,2 & Youngkyu Jeon 1,3
Southeast Center for Mathematics and Biology 1
University of South Florida 2
Georgia Institute of Technology 3
ABSTRACT: Precise DNA repair is required to maintain genome integrity. In particular, repair of DNA double-strand breaks (DSBs) must be efficient and accurate, not only to avoid mutations in DNA but also to prevent gross chromosome rearrangements. RNA had long been considered an intermediate molecule between DNA and proteins, but a variety of new functions of RNA have been uncovered in recent decades. Recently, we found that RNA can be used directly as a template for DNA DSB repair in yeast. We are developing and optimizing a system to study RNA-mediated DNA repair in mammalian cells. Our biological systems are built on plasmid DNA molecules and are combined with CRISPR/Cas genome-editing tools to induce DSBs efficiently and at specific locations on the plasmid DNA. This experimental design makes our system modular. We show sequencing results of RNA-mediated DNA repair after a DSB, and show that transcript RNA may help to recover the functional properties of a gene after a break. We use graph-theoretic properties to study the sequence variants that appear during the DNA repair process. Relationships among sequences are displayed in a graph whose vertices are sequences, with an edge between two vertices if the two sequences differ by one nucleotide mutation. By studying these graphs, we can better understand the molecular processes that play a role during DNA break repair.
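The variant graph described in this abstract can be illustrated with a minimal sketch. It assumes that "one nucleotide mutation" means a single substitution between equal-length sequences (insertions and deletions are ignored here), and the reads, `variant_graph`, and `connected_components` are hypothetical names and toy data, not the authors' pipeline.

```python
from itertools import combinations


def hamming(a, b):
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def variant_graph(sequences):
    """Return an adjacency dict with an edge between two distinct sequences
    when they differ by exactly one substitution (assumption: no indels)."""
    unique = sorted(set(sequences))
    graph = {s: set() for s in unique}
    for a, b in combinations(unique, 2):
        if len(a) == len(b) and hamming(a, b) == 1:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group vertices into connected components with a depth-first search."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# Toy reads (hypothetical data, for illustration only).
reads = ["ACGT", "ACGA", "ACCA", "TTTT", "ACGT"]
graph = variant_graph(reads)
components = connected_components(graph)
```

Graph-theoretic summaries such as vertex degree or the number and size of connected components can then be read off `graph`; which properties are most informative for repair outcomes is not specified in the abstract.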
Applications of independent set algorithms to
benchmarking with sequence data
NSF-Simons Center for Mathematical and Statistical Analysis of Biology
ABSTRACT: We use independent set algorithms to split biological sequence data into training and test sets for benchmarking. In typical machine learning applications, one assigns the data to training and test sets at random, trains a model using the training data, and evaluates the accuracy of the model on the test data. Biological sequence datasets frequently include many sequences that are very similar, and so a random division of the data would result in some test sequences that are very similar to training sequences. In order to meaningfully predict the accuracy of a model on data it has not yet seen, it is necessary for the training and test sequences to be sufficiently far apart. Inspired by algorithms that find independent sets in graphs, we design algorithms to split sequence data into training and test sets such that no training-test sequence pair is too close. Finally, we evaluate the performance of our algorithm on the protein sequence families in the Pfam database.
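One way to picture such a split is the greedy sketch below. It is only an illustration of the constraint (no training-test pair closer than a threshold), not the algorithm evaluated in the talk. It assumes equal-length toy sequences and uses fractional identity as the similarity measure; `identity`, `split_sequences`, `threshold`, and `train_fraction` are hypothetical names and parameters.

```python
import random


def identity(a, b):
    """Fraction of matching positions (assumes equal-length sequences)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def split_sequences(seqs, threshold=0.5, train_fraction=0.7, seed=0):
    """Greedily assign sequences to train or test so that no train-test
    pair has identity above `threshold`; conflicting sequences are discarded."""
    rng = random.Random(seed)
    order = list(seqs)
    rng.shuffle(order)
    train, test, discarded = [], [], []
    for s in order:
        close_to_train = any(identity(s, t) > threshold for t in train)
        close_to_test = any(identity(s, t) > threshold for t in test)
        if close_to_train and close_to_test:
            discarded.append(s)   # would violate the constraint either way
        elif close_to_train:
            train.append(s)       # safe: not close to any test sequence
        elif close_to_test:
            test.append(s)        # safe: not close to any training sequence
        elif rng.random() < train_fraction:
            train.append(s)
        else:
            test.append(s)
    return train, test, discarded


# Toy sequences (hypothetical data, for illustration only).
seqs = ["AAAAAAAA", "AAAAAAAT", "CCCCCCCC", "CCCCCCCG",
        "GGGGTTTT", "GGGGTTTA", "ACGTACGT"]
train, test, discarded = split_sequences(seqs)
```

In graph terms, sequences are vertices and an edge joins any pair above the threshold; the split above guarantees that no edge runs between the training and test sets, at the cost of possibly discarding some sequences.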
Agent-Based Modeling of Emergent Patterning Within Stem Cell Colonies
Southeast Center for Mathematics and Biology
Georgia Institute of Technology
ABSTRACT: The differentiation of stem cell colonies into specified tissue types is possible through local and long-distance intercellular communication; however, it is unclear which mechanisms take priority in context-specific situations. Here we consider human induced pluripotent stem cells (hiPSCs) whose therapeutic potential arises from their ability to differentiate into all germ lineages. Prior work in the literature suggests that both cell-autonomous and non-autonomous (e.g. positional) mechanisms determine cell fate during the differentiation of hiPSCs, producing patterns and other system-level features in the process. Informed by experimental data, we develop a collection of agent-based models (ABMs) whose agents (i.e. cells) are each equipped with local rules that govern how the agents interact with their environment and with each other. The purpose of each ABM is to simulate the early differentiation of hiPSCs according to a different set of biological assumptions, with some ABMs using a Boolean network to model the FGF/ERK pathway as a potential mechanism for intercellular communication. We also extend an existing mathematical framework by M. Yereniuk and S.D. Olson which formalizes ABMs in order to estimate long-term model behavior with respect to time. Our extensions introduce the birth and death of agents into the framework, and our estimates aim to establish connections between local interactions and certain system-level observations. Thus, we study both the emergent behaviors of our ABMs and the dynamics of the local rules governing each agent in order to ascertain which modes of intercellular communication determine cell fate.
Zebrafish airinemes optimize between ballistic search and diffusive search
Center for Multiscale Cell Fate Research
University of California, Irvine
ABSTRACT: In addition to diffusive signals, cells in tissue also communicate via long, thin cellular protrusions, such as airinemes in zebrafish. Before establishing communication, cellular protrusions must find their target cell. Here we demonstrate that airinemes in zebrafish are mathematically consistent with a finite persistent random walk model. The probability of contacting the target cell is maximized for a balance between ballistic (straight) search and diffusive (highly curved, random) search. We find that the curvature of airinemes in zebrafish, extracted from live cell microscopy, is approximately the same value as the optimum in the simple mathematical model. We also explore the ability of the target cell to infer the direction of the airineme's source, finding the experimentally observed parameters to be at a Pareto optimum balancing directional sensing with contact initiation.
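A finite persistent random walk of this kind can be sketched with a small Monte Carlo simulation. The sketch assumes a two-dimensional walker of fixed maximum length whose heading diffuses with coefficient `d_theta` per unit arclength, and a circular target cell; the geometry and all parameter values are hypothetical and are not taken from the experimental data.

```python
import math
import random


def hits_target(d_theta, rng, max_length=10.0, step=0.05,
                target=(5.0, 0.0), target_radius=1.0):
    """Simulate one protrusion: constant step size, heading perturbed by
    Gaussian noise each step. Returns True if the tip reaches the target."""
    x = y = 0.0
    theta = rng.uniform(0.0, 2.0 * math.pi)    # random initial direction
    sigma = math.sqrt(2.0 * d_theta * step)    # heading noise per step
    for _ in range(int(max_length / step)):
        x += step * math.cos(theta)
        y += step * math.sin(theta)
        if math.hypot(x - target[0], y - target[1]) <= target_radius:
            return True
        theta += rng.gauss(0.0, sigma)
    return False


def contact_probability(d_theta, trials=2000, seed=0):
    """Estimate the probability of contact by repeated simulation."""
    rng = random.Random(seed)
    hits = sum(hits_target(d_theta, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # Scan from ballistic (d_theta = 0) toward diffusive (large d_theta).
    for d_theta in (0.0, 0.1, 1.0, 10.0):
        print(d_theta, contact_probability(d_theta))
```

Setting `d_theta = 0` gives a purely ballistic search and large `d_theta` gives a highly curved, diffusive one; scanning intermediate values is one way to look for a maximum in contact probability under these toy assumptions.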
Understanding the shape of movement in C. elegans
Ashleigh Thomas1,2 & Shivesh Chaudhary1,3
Southeast Center for Mathematics and Biology1
University of Florida2
Georgia Institute of Technology3
ABSTRACT: One way to think about movements of an animal is as a sequence of poses. This data has two distinct parts: the poses (the shape of the animal’s body at each point in time) and the temporal information that connects those poses (the order in which the poses occur). We construct a movement space where each point in the space contains both pose and temporal information from the movement of C. elegans. The movement space’s shape is analyzed using persistent homology to inform on an organism’s behavior and physical abilities.
We use this technique to study the effects of aging on C. elegans. Our analysis is guided by the goal of answering questions like “What kinds of behavior and ability changes do C. elegans go through as they age?” and, “Can we quantify the ‘youthfulness’ of a population of C. elegans?” We are currently able to differentiate between populations of different ages and identify various characteristic behaviors of individual organisms. We hope this analysis can be a tool for comparing chronological age (how old you are) and biological age (how old you act) as researchers seek ways to slow biological aging and its associated deterioration.
Topological data analysis of zebrafish skin patterns
NSF-Simons Center for Quantitative Biology
ABSTRACT: Wild-type zebrafish (Danio rerio) are characterized by black and yellow stripes, which form on their body and fins due to the self-organization of thousands of pigment cells. Mutant zebrafish and sibling species in the Danio genus, on the other hand, feature altered, variable patterns, including spots and labyrinth curves. The long-term goal of my work is to better link genotype, cell behavior, and phenotype by helping to identify the specific alterations to cell interactions that lead to these different fish patterns. Using a phenomenological approach, we develop agent-based models to describe the behavior of individual cells and simulate pattern formation on growing domains. In this talk, I will highlight how topological techniques can be used to quantitatively describe our simulated patterns and in vivo images.
Algebraic Systems Biology
University of Oxford
ABSTRACT: Signalling pathways in molecular biology can be modelled by polynomial dynamical systems. I will present models describing two biological systems involved in development and cancer. I will give an overview of approaches to analyse these models with data using computational algebraic geometry, differential algebra and statistics. Finally, I will present how topological data analysis can provide additional information to distinguish wild-type and mutant molecules in one pathway. These case studies showcase how computational geometry, topology and dynamics can provide new insights into biological systems, specifically how changes at the molecular scale (e.g. molecular mutations) result in kinetic differences that are observed as phenotypic changes (e.g. mutations in fruit fly wings).