The Tech
The L++ computing language
The Kim lab is developing a complete computer programming language, called L++, that lets researchers write code describing biosystems precisely and unambiguously at very detailed resolution. Computers will be able to parse this code and translate it into a virtual organism. L++ is designed specifically for biologists, with intuitive syntax for describing biochemical pathways and chemical processes. Using L++, the lab has begun to write 100+ biochemical pathways of E. coli, in an estimated 500 pages of code, as a reusable template that integrates research on E. coli from an array of heterogeneous databases and resources. Researchers can also use and repurpose this code through simulations and visualizations, allowing them to efficiently create biochemical pathways, reprogram cells, and design and eventually create new bacterial strains.
Essence Neural Networks
Essence Neural Networks (ENNs) are a new style of artificial neural network inspired by cognitive theory on symbolic reasoning. In contrast to conventional feed-forward deep learning networks, which are trained by gradient descent, ENN parameters are trained hierarchically using support vector machines (SVMs). In practice, ENNs perform on par with deep learning approaches on standard benchmarks such as image classification. ENNs also overcome fundamental limitations of deep learning: they are interpretable at the single-neuron level, and they generalize well beyond their training data (i.e., they can construct mechanistic models that continue to perform well outside the training distribution). This allows an end-to-end pipeline in which labeled data is directly transformed into rules or algorithms that describe how the data map to the labels, a process for automated discovery of scientific models from high-dimensional data.
P.J. Blazek, M.M. Lin (2021). Explainable neural networks that simulate reasoning. Nature Computational Science 1: 607-618. https://doi.org/10.1038/s43588-021-00132-w
P.J. Blazek, K. Venkatesh, M.M. Lin (2021). Deep Distilling: automated code generation using interpretable deep learning. arXiv. https://doi.org/10.48550/arXiv.2111.08275
Titratable CRISPRi
The Reynolds lab has developed a new titratable CRISPR-interference (CRISPRi) approach for quantifying, in high throughput, the relationship between gene expression and E. coli growth rate. These measurements can be used to assay the effects of combinatorial variation in the expression of multiple genes and in environmental conditions, with an efficiency approaching 10^4 measurements per experiment. The resulting data can be used to train an interpretable machine learning model that both (1) provides summary statistics quantifying gene-gene and gene-environment couplings and (2) allows prediction of growth rates for combinatorial perturbations in gene expression and environmental conditions. We envision applying these tools to the optimization of gene abundance and experimental media conditions for enhanced biosynthetic pathway yield.
A.D. Mathis, R.M. Otto, K.A. Reynolds (2021). A simplified strategy for titrating gene expression reveals new relationships between genotype, environment, and bacterial growth. Nucleic Acids Research 49: e6. https://doi.org/10.1093/nar/gkaa1073
R.M. Otto, A. Turska-Nowak, P.M. Brown, K.A. Reynolds (2022). A continuous epistasis model for predicting growth rate given combinatorial variation in gene expression and environment. bioRxiv. https://doi.org/10.1101/2022.08.19.504444
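To make the idea of a gene-gene coupling concrete, the sketch below computes a simple pairwise coupling score from relative growth rates. It is a minimal illustration only. It uses a standard multiplicative null model (expected double-perturbation growth equals the product of the single-perturbation growth rates), not the continuous epistasis model from the Reynolds lab. The gene names, the growth values, and the function name `pairwise_couplings` are all hypothetical.

```python
from itertools import combinations


def pairwise_couplings(single, double):
    """Score each gene pair as observed minus expected relative growth.

    single: dict mapping gene -> relative growth rate when that gene alone
            is knocked down (unperturbed strain = 1.0).
    double: dict mapping (gene_a, gene_b), with names in sorted order, ->
            relative growth rate when both genes are knocked down.

    Assumption: a multiplicative null model, where independent genes give
    expected = single[a] * single[b]. A score near zero suggests no coupling.
    """
    couplings = {}
    for a, b in combinations(sorted(single), 2):
        if (a, b) in double:
            expected = single[a] * single[b]
            couplings[(a, b)] = double[(a, b)] - expected
    return couplings


# Hypothetical, made-up relative growth rates for illustration only.
single = {"geneA": 0.8, "geneB": 0.5, "geneC": 0.9}
double = {
    ("geneA", "geneB"): 0.40,  # matches 0.8 * 0.5, so no coupling
    ("geneA", "geneC"): 0.60,  # below 0.8 * 0.9 = 0.72, negative coupling
    ("geneB", "geneC"): 0.45,  # matches 0.5 * 0.9, so no coupling
}

couplings = pairwise_couplings(single, double)
for pair, score in couplings.items():
    print(pair, round(score, 3))
```

In this toy data set only the geneA/geneC pair departs from the null expectation. A model trained on titrated expression levels would estimate such couplings across a continuous range of knockdown strengths rather than at a single point.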
High throughput bacterial genome editing
The Saunders lab has developed a new toolkit for genetically modifying E. coli and other bacteria, called ORBIT (Oligo Recombineering followed by Bxb-1 Integrase Targeting). ORBIT was originally pioneered in Mycobacteria; the Saunders lab has now adapted and extended these tools for E. coli. ORBIT enables the fundamental operations of genomic deletion and insertion with unprecedented efficiency and speed. Mutations are specified using only commercially available DNA oligonucleotides that are transformed directly into bacterial cells, eliminating the need for molecular cloning to target different genomic sites. This process works effectively for creating individual strains or mutant libraries and is uniquely capable of creating both large deletions (>134 kb) and large insertions (>11 kb), which makes ORBIT an extremely flexible tool for both basic biology and bioengineering.
S.H. Saunders, A.M. Ahmed (2023). ORBIT for E. coli: Kilobase-scale oligonucleotide recombineering at high throughput and high efficiency. bioRxiv. https://doi.org/10.1101/2023.06.28.546561