A simulating winter!

That’s a graph automatically generated by the cell signalling pathway simulation automation software I developed over the winter break. It’s a bit ugly — a quick and dirty CDF rendered in R of activation times of ERK in the TCR (T-Cell Receptor) pathway model for a particular endogenous pMHC ligand dosage. Fancy lingo, eh?

The past several weeks I participated in CMACS, an NSF-funded program intended to get undergrads to go on to grad school. You mean I’m not in grad school yet? Oh, right. More on CMACS later. I joined the Treespace research group at Lehman, where I’ll be expanding on my limited knowledge of phylogenetics and drawing some pretty graphs with Java, Python, and Cytoscape. I continue to be involved at Hunter with the EvoBioLab where I’m reading up on bacterial recombination and mutation and working on genomic visualization tools. The hackerspace I started back in September is thriving, with fifteen paying members, and we just filled up a 100-seat event we’re hosting at the NYPL. I’m working on organizing an iGEM team at Hunter and looking into starting some of that work at Genspace; we already have a rather nice team lineup. I hope to continue the work I started this past summer with my esteemed colleague James Estill. Yes, I’ll be taking some classes, too. Organic chemistry included. What, me nervous? Back to CMACS…

We looked at cell signalling pathways in cancer using stochastic simulations in BioNetGen. I got a bit carried away automating experiments and created ScanBatch. This new tool I whipped up runs arbitrary numbers of simulations over arbitrary numbers of models, species (molecule types), and concentrations of those species. To my delight, I’ve been asked to contribute ScanBatch to the BioNetGen project and continue work on it to expand the automated simulation capabilities. I’ll be working on accepting more complex parameter ranges, templating the simulation code, doing more complex combinations of parameters, validating model output, and making large batches of simulations scale. It’s an honor to be asked to help take such a cool piece of cutting edge research software to the next level. I look forward to working with Dr. Jim Faeder and co. on this. I’m digging Systems Biology!
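To give a feel for what "arbitrary numbers of models, species, and concentrations" means in practice, here is a minimal Python sketch of how such a batch could be enumerated. It is not the ScanBatch code itself: the function name `build_jobs`, the model file name, and the species names and concentration values are all made up for illustration, and nothing here actually calls BioNetGen.

```python
from itertools import product

def build_jobs(models, species_levels, repeats=1):
    """Enumerate one job per (model, concentration combination, repeat).

    models         -- list of model file names (hypothetical)
    species_levels -- dict mapping species name -> list of concentrations to try
    repeats        -- how many stochastic runs to do per combination
    """
    names = sorted(species_levels)  # fixed order so the job list is reproducible
    jobs = []
    for model in models:
        # Cartesian product over every species' concentration list
        for levels in product(*(species_levels[n] for n in names)):
            for run in range(repeats):
                jobs.append({
                    "model": model,
                    "concentrations": dict(zip(names, levels)),
                    "run": run,
                })
    return jobs

# Hypothetical example: one model, two species, two repeats each
jobs = build_jobs(
    ["tcr.bngl"],
    {"pMHC": [10, 100, 1000], "ERK": [1e4, 1e5]},
    repeats=2,
)
print(len(jobs))  # 1 model x (3 x 2) combinations x 2 repeats = 12
```

Each entry in `jobs` describes one simulation to launch, so the number of runs grows multiplicatively with every species added, which is why making large batches scale matters.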

If everything I’m working on pans out, I can expect to see my name on a few papers next year, which would be concrete proof that this whole scientist thing is really happening. I’ve been getting paid to do science-related work since June 2011, and it’s a good feeling that this trend is ongoing. It was only months before that I threw down the gauntlet and challenged myself to do only science work. I’m feeling a strong pull towards systems and synthetic biology, but I may be biased since I’m a computer programmer, and the field seems to be especially welcoming towards my kind right now. I hope you, my friends, family, and colleagues, have had a wonderful and imagination-catalyzing winter so far.

PS. Convergence of ideas: Bacterial evolution simulation at EvoBioLab, Bacterial recombination as engine for computation in-vivo for combinatorial problems. Bacterial evolution of whole genomes plus gene expression simulation via BioNetGen. Visualization of genomic data, visualization of phylogenetic data, visualization of cell pathway data. DNA fingerprinting, DNA computing, cryptography and digital security. Okay, now I can sleep.

Posted in Uncategorized | Tagged , , , , , | Leave a comment

Tentative steps towards genome visualization

Genomic data is rather beautiful in that the stirrings and mysteries of life can be glimpsed within it. I’ve been dabbling in making functional genomic art — mostly with spiral representations of bacterial DNA. Here’s a rendering of Borrelia burgdorferi plasmid cp26. We’re studying Borrelia in the lab.

My first experiments (here and here) are simplistic, but pleasant, I think. My initial goal was to squeeze a nice big chunk of DNA into a small space. To that end, I more or less succeeded with the spiral approach. It turns out (unsurprisingly) that the circular visualization of genomic data isn’t new, though I don’t know if anyone has used my particular approach. Visualization is an exceedingly fun thing to poke at and learn, and I look forward to coming up with more ways of making functional bio-art.

On the practical side of things, Dr. Qiu and my colleagues at the evolutionary bioinformatics lab have been enlightening me to the professional needs of biologists in the realm of visualization. Some of the things I’ve been advised to look into are: six frame translation, SNPs, GC base percentages, amino acid frequency, synteny (gene order), genetic drift, real time simulation visualization, and phylogeny. That’s quite a laundry list. One must be careful when asking for ideas and suggestions in the lab.

Six frame translation refers to the six ways to look at a DNA sequence — three forward and three backward. There are three in each direction because the DNA is broken up into codons that each have three bases. It’s impossible to know which base a gene starts on without trying all three in either direction. So, you look for long uninterrupted sequences of codons (an open reading frame, or ORF), at each of the six reading frames. Those could quite possibly be genes, and finding a gene is like striking gold.
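The idea can be sketched in a few lines of Python. This is a simplified illustration, not my renderer and not how NCBI's ORF Finder works: it only measures the longest stop-free run of codons in each frame and ignores start codons, and the "backward" frames are read from the reverse complement strand. The function names are made up for this example; the three stop codons are those of the standard genetic code.

```python
# Complement each base; reversing afterwards gives the opposite strand 5'->3'
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def codons(seq, offset):
    """Yield complete three-base codons starting at the given offset (0, 1, or 2)."""
    for i in range(offset, len(seq) - 2, 3):
        yield seq[i:i + 3]

def longest_open_run(seq, offset):
    """Length, in codons, of the longest stretch with no stop codon in one frame."""
    best = current = 0
    for codon in codons(seq, offset):
        if codon in STOP_CODONS:
            current = 0          # a stop codon ends the open stretch
        else:
            current += 1
            best = max(best, current)
    return best

def six_frame_scan(seq):
    """Return the longest stop-free run for frames +1..+3 and -1..-3."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    result = {}
    for offset in range(3):
        result[f"+{offset + 1}"] = longest_open_run(seq, offset)
        result[f"-{offset + 1}"] = longest_open_run(rc, offset)
    return result

print(six_frame_scan("ATGAAATAGCCC"))
```

On a real plasmid, the frames with unusually long runs are the ones worth a closer look as candidate genes.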

Here’s my rendering of reading frame 1 on the same Borrelia plasmid. I’m nearly 100% sure that I got something wrong, because when I checked ORF Finder, I got different results. This is a demonstration of the concept for a tool for finding potential genes, like the NCBI ORF Finder. The coloring of the segments is exponentially scaled based on size, and they become delimited by a blue border when they reach a maximum threshold. In other words, the red areas with a blue border are most likely to be genes. It’s basically a heat map for genes. Well, so goes the theory, in my limited understanding. I’m hoping to fix this up so that I get results consistent with NCBI for all reading frames.

I’m rendering these in 100% homegrown convoluted OpenGL/C++ code. Ultimately I suspect that any general purpose genome visualization tool I might end up with will want to live on the web, perhaps utilizing WebGL. Even if this turns out to be a command line tool, it could live on the web as an image generator. For the moment, I’m enjoying developing natively on my laptop as an idea scratchpad. Before I go implementing any more features, I’ll make sure the sequence data is coming through accurately so that the ORFs match NCBI. Then I’ll tackle six frame alignments, etc. At that point, hopefully I’ll have something worth sharing!

Any thoughts on genome visualization? Suggestions? Please leave a comment.


Eloquent EEG explanation:

One of my latest distractions is development of an EEG and biosensors workshop at Hack Manhattan, an NYC-based hackerspace I founded in September. I wanted to share this really great explanation of brainwaves from the NeuroSky website.

The first law of thermodynamics states that energy can only change forms but cannot be created or destroyed. The energy used to power the pumps becomes stored as the charge differential between the inside and outside of the cell. When that differential is removed by the flow of positive ions into the cell, the stored energy is released in the form of small waves. Like little waves combining to create big waves in the ocean, as thousands of neurons fire, the little waves come together to create the larger waves known as brain waves. It is these dominant brainwaves that are measured by NeuroSky devices.

Here I am at Hack Manhattan viewing brainwave data from a NeuroSky MindSet EEG device on my ThinkPad running Ubuntu. The open source Puzzlebox software provides a good starting point for hacking EEG technology.


Nucleotides are nice, but codons are capital!

I’ve updated my OpenGL DNA renderer to support amino acids, rather than just single nucleotides. I find these colorful DNA representations oddly aesthetically pleasing, so I’m pursuing it a bit further. It’s always possible I’ll find utility in being able to do some unique sequence rendering. I borrowed the color scheme for amino acids from RasMol software. According to RasMol, the colors were chosen “according to traditional amino acid properties”. This is the same arbitrarily chosen Borrelia plasmid as last time, except since we’re looking at a coding sequence of amino acids (and stop codons in black), there are about 1/3 as many entities to represent (3157 codons), so each one is much clearer. This is a first run, so there may be some artifacts/errors, of course.


Pronounced Plasmids

Several months ago I posted a maze I generated in C++ with OpenGL. Being preoccupied with molecular biology lately, I took my old code out and adapted the concept for rendering a bacterial plasmid. Specifically, this little beauty is: Borrelia valaisiana VS116 plasmid VS116_cp9, complete sequence. The generative sequence renders quads around in a spiral, trying to accommodate the 9400+ bases in the sequence in fewer than 725 lines of resolution. This is a circular plasmid, so the physical reality is that it loops back on itself, but rendering it as a single circle makes it impossible to discern any bases. I think you can almost make out the sequence, where the nucleotides A/T/G/C are red/blue/green/yellow. Click image for higher res.
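For the curious, the layout idea can be sketched in Python. This is an illustration only, not the C++/OpenGL code behind the image: it assumes an Archimedean spiral (radius growing linearly with angle) and a rough equal-spacing rule, and the function name, default parameters, and the gray fallback color for unexpected letters are all assumptions. Only the A/T/G/C color mapping comes from the rendering described above.

```python
import math

# Color per nucleotide, as in the rendering described above
BASE_COLORS = {"A": "red", "T": "blue", "G": "green", "C": "yellow"}

def spiral_layout(sequence, spacing=1.0, turn_gap=1.0):
    """Return one (x, y, color) tuple per base along a spiral r = b * theta.

    spacing  -- approximate distance between neighboring bases along the curve
    turn_gap -- how much the radius grows per full turn
    """
    b = turn_gap / (2 * math.pi)
    theta = 2 * math.pi            # start one turn out so the center is not crowded
    points = []
    for base in sequence.upper():
        r = b * theta
        color = BASE_COLORS.get(base, "gray")  # gray for N or other ambiguity codes
        points.append((r * math.cos(theta), r * math.sin(theta), color))
        theta += spacing / r       # arc length ~ r * d(theta), so step by spacing / r
    return points

for x, y, color in spiral_layout("ATGC"):
    print(f"{x:7.3f} {y:7.3f} {color}")
```

Because the angular step shrinks as the radius grows, bases stay roughly evenly spaced along the curve, which is what lets thousands of them fit in a small square image. Each point would then be drawn as a small colored quad.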

 

  • The Native Inhabitant:

    Web and IT pro turned novice scientist. Currently studying computer science and bioinformatics at Hunter College.

    Here be: dragons, bio + engineering + medicine + ethics, vegan eats and fashion, music and words, gadgets and software, photography, design, DIY/maker/hacker culture, NYC, running/fitness, cyborg anthropology, et cetera.

    dp at danielpacker dot org