Blog

Perl for the Impatient

Maybe you want something a little more purpose-built than Bash to mine some text and need a refresher. Perhaps you're in bioinformatics and want to extract some patterns from DNA (the original weird programming language). Maybe the eccentric, neck-bearded regex wizard in your office just retired and you've inherited a couple of scripts. Whatever the reason, you need to get (re)acquainted with Perl. I've got you covered with a slew of in-context examples of working code. Perl was developed in 1987 (just like yours truly) by Larry Wall (unlike yours truly). He originally set out to automate the processing of text reports on Unix systems, and Perl accordingly boasts a lot of support for text processing. Perl espouses the "There's more than one way to do it"…
Read More

Python Imports Sample Project + Explanations

Salutations! A friend of mine recently asked me for some pointers on Python import statements, and I thought I would share what I came up with. The import statement allows us to use code from other libraries, directories, source files, etc. It's a fundamental tool that's been around since way before Python (think "#include <stdio.h>", and further back). The structure is easier to demonstrate if we break from our go-to Jupyter format, so have a look at the project here. Start with the code in utils.py and OtherUtils/utils_one_dir_down.py, and then read the mains!
Read More

A Handshake with Scala

Scala is a JVM language that seeks to marry functional and object-oriented programming styles. It was designed by Martin Odersky and first appeared in 2004. While I think many projects that aspire to "best of both worlds" status end up ugly and chimeric, my explorations with Scala have revealed a really thoughtful and lovely creation that elegantly merges different paradigms. While it isn't one of the top-tier languages for production-code job hunting (think Java, C++, Python, JavaScript, etc.), it recently made the GitHub top 15 list of most common languages. I suspect much of this is because it's becoming popular in data science as a happy medium between the breezy-to-write-but-relatively-slow Python and the incumbent production veteran Java that is faster but comes with…
Read More

Some light-hearted OOP (or, Some Circumstantial Evidence That I am Still Alive)

Hello there! I am writing to let you know that I have cleared the summit of a couple of beastly projects and will be back to working on the site at the usual semi-regular pace soon. I am also working on some pieces that are a little larger and more detailed than usual, but the delays should (hopefully) be offset by quality. In the meantime, enjoy some easy-going code humor that doesn't want to hurt no one as evidence I am still around.
Read More

Dict-based Find and Replace Deluxe

Using a dict to perform find and replace tasks makes a lot of sense: dicts are simple to work with, and they let us store each target alongside its desired replacement in one spot. There are a few hidden traps to be aware of, but they're easily avoided, and with just a little forethought we can be back on track reaping the benefits.
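As a rough illustration of the idea (not the snippet from the post itself), here is a minimal sketch of a dict-driven find and replace. The function name `replace_all` is hypothetical, and the two traps it guards against (replacements that feed into each other, and short keys shadowing longer ones) are my assumptions about the kind of pitfalls meant above.

```python
import re


def replace_all(text, replacements):
    """Replace every key of `replacements` found in `text` in a single pass.

    `replace_all` is a hypothetical helper written for illustration.
    """
    if not replacements:
        return text
    # Longest keys first, so "cats" wins over "cat" when both could match.
    keys = sorted(replacements, key=len, reverse=True)
    # re.escape keeps characters such as "." or "+" literal.
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # One pass over the text: replaced output is never scanned again,
    # so a swap like {"cat": "dog", "dog": "cat"} does not undo itself.
    return pattern.sub(lambda match: replacements[match.group(0)], text)


print(replace_all("cat and dog", {"cat": "dog", "dog": "cat"}))
```

Chaining `str.replace` calls in a loop over the dict would be shorter, but each call rescans text that earlier calls already changed, which is why this sketch compiles all the keys into one pattern.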
Read More

ASH: A Prototype Tool For Inspecting Potential Antigens/Epitopes

Hello! I have been interested in antigens/epitopes for some time now, and this is my prototype of a tool to help an informed user take a closer look at targets of interest for projects that involve them. It works using a simple scale that compares kmers on a residue-by-residue basis and scores them for distinctness based on the presence of hydrophilic or hydrophobic residues and residues of structural complexity, which are widely known factors in immunogenicity. Still, in the name of diligence, the sources that informed these decisions are listed at the bottom of the post. The GitHub repository for the project is here. There is a walkthrough, a readme, and a test package if you want to give it a spin. I'm actively working on this…
Read More

In Honor of MLK: “I Have A Dream” Wordcloud Example in R

Hello everyone! I would like to wish everyone a happy MLK Day. It isn't much, but as an act of respect and commemoration, I thought I would share an example of how a beautiful speech can, with relative ease, be turned into a striking visual. We will, of course, be working with the monolithic "I Have a Dream" speech. I used the R programming language for this because I have worked with text more extensively in R than in Python, and because I don't use R as often and want to keep up my skills. If you'd like to see the rundown, here is the workbook; if not, the final product is below! Enjoy, and once again, happy Martin Luther King Day! Note: This post was published on Jan 15 at approximately 9pm.…
Read More

Twitter Metadata Classifier: Trump or Clinton?

Hello all! I did a project for a data science class in which I classified tweets as being from @realDonaldTrump vs @HillaryClinton. I found a dataset for this on Kaggle. Because words and context trip up even the most clever programs on occasion, I decided to see if I could write an entirely numerical classifier. It reads features such as word length, average words per sentence, that sort of thing. I wrote a Python function that engineers these features and a script that applies it and adds them to the original dataset. Below is the video: https://www.youtube.com/watch?v=ILCTxK-ZXJ0&t=16s Please feel free to get in touch if you'd like to see a copy of the PDF.
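To give a feel for what purely numerical features might look like, here is a minimal sketch. It is not the function from the project: the name `numeric_features`, the feature names, and the simple regex-based word and sentence splitting are all assumptions made for illustration.

```python
import re


def numeric_features(text):
    """Return a few purely numerical features for one piece of text.

    Hypothetical helper: the feature set and tokenization are illustrative
    assumptions, not the ones used in the original project.
    """
    # Treat runs of letters, digits, and apostrophes as words.
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Split on sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    n_sentences = len(sentences)
    return {
        "char_count": len(text),
        "word_count": n_words,
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "avg_words_per_sentence": n_words / n_sentences if n_sentences else 0.0,
    }


print(numeric_features("Hello world. Bye now!"))
```

Each tweet becomes a small dict of numbers, so the results can be added as new columns next to the original text and handed to any standard classifier.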
Read More