Tuesday, 25 March 2014

Monocle is an R toolkit for analyzing single-cell expression experiments

---------- Forwarded message ----------
From: Cole Trapnell <cole cs.umd.edu>
Date: Mon, Mar 24, 2014 at 11:51 PM
Subject: [Bowtie-bio-announce] Monocle 0.99.0
To: bowtie-bio-announce lists.sourceforge.net

I am proud to announce the first release of the Monocle analysis toolkit for single-cell RNA-Seq and qPCR.  Monocle performs differential expression and time series analysis for single-cell expression experiments.  You can read about Monocle and its “pseudo time” analysis of biological processes in the paper, which just appeared on Nature Biotech’s AOP list:


The Monocle source code and support site is also live:


I hope you will consider using Monocle for your single-cell expression analysis workflow!  Please report any issues on the new Monocle google group:


Note that Monocle is currently considered an ALPHA release - new features and interface changes will be coming in future releases, along with significant new functionality. The release announcement for Monocle v 0.99.0 is below.



0.99.0 release - 3/23/2014

The first public release of Monocle is now available for download. Monocle is a toolkit for analyzing single cell expression experiments. It runs on the R statistical computing platform.

Monocle takes as input a matrix of gene or transcript expression values, along with information about each cell in the experiment and some annotation about each gene/transcript, all provided as simple tables. It is designed for single-cell RNA-Seq experiments, but in principle Monocle can be used with other data types such as qPCR. For RNA-Seq, you can use Cufflinks to estimate your expression values.
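To make that input layout concrete, here is a minimal sketch in plain Python (for illustration only - this is not Monocle's actual API, and all table and column names here are hypothetical):

```python
# Sketch of the three tables Monocle expects, using plain Python structures.
# Column names ("collection_time", "biotype") are made-up examples.

# Expression matrix: rows = genes/transcripts, columns = cells
expr_matrix = {
    "GENE_A": {"cell_1": 12.4, "cell_2": 0.0, "cell_3": 3.1},
    "GENE_B": {"cell_1": 0.0,  "cell_2": 8.7, "cell_3": 5.2},
}

# Per-cell metadata ("information about each cell in the experiment")
cell_info = {
    "cell_1": {"collection_time": "0h"},
    "cell_2": {"collection_time": "24h"},
    "cell_3": {"collection_time": "48h"},
}

# Per-gene/transcript annotation
gene_info = {
    "GENE_A": {"biotype": "protein_coding"},
    "GENE_B": {"biotype": "lincRNA"},
}

# The three tables must agree: every gene in the matrix needs an
# annotation row, and every column needs a cell-metadata row.
for gene, values in expr_matrix.items():
    assert gene in gene_info
    assert set(values) == set(cell_info)
```

The point is simply that the cell identifiers tie the expression matrix to the cell table, and the gene identifiers tie it to the annotation table.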

This software is a work in progress - it is a beta release, and new features will continue to be added over the next couple of weeks. To suggest a feature or report a bug, please post your comments on the Monocle user's group.

Cole Trapnell


Wednesday, 5 March 2014

Google Genomics -- Google Developers

https://developers.google.com/genomics/ The Google Genomics limited preview is out.
I had been wondering when and how they might do genomics.
Now it suddenly feels scarier to link all your Google accounts together.

Maybe we will see gene association studies with different online surfing habits very soon! *chuckles*

Thursday, 27 February 2014

It's been a while ... Python 3 print is now a function

Gosh it's definitely telling that I haven't been coding in Python 3 for a while.

I didn't know that they had changed the print statement to a function, so now I need parentheses.
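For anyone else who has been away from Python 3 for a while, the change looks like this:

```python
# Python 2: print was a statement, no parentheses needed
#   print "hello", "world"
# Python 3: print is a built-in function, so parentheses are required
print("hello", "world")

# Being a function also means it accepts keyword arguments,
# which the old statement never could:
print("a", "b", "c", sep="-", end="!\n")   # prints: a-b-c!
```

The upside of the new form is that `print` can now be passed around like any other function (e.g. as a callback), and `sep`/`end` replace the old trailing-comma tricks.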

for details see

Tuesday, 28 January 2014

Growth slows in cloud business

Has the cloud hype died down? Seagate reportedly missed analysts' estimates as growth slowed in its cloud storage business. With the exception of Galaxy or BaseSpace, it's hard to get researchers to buy into the cloud analysis paradigm, even though their sporadic usage pattern fits nicely with the economics of cloud computing. Maybe the cost of storage space and of moving files into the cloud is the biggest hurdle. My two cents.

Wednesday, 13 November 2013

I have been coding in C with no luck [joke]

Consumer Grade HDD are OK for Data backup

Ah storage, who doesn't need more of it? Cheaply I might add.
The folks at Backblaze published their own field report on HDD failure rates, which makes interesting reading for anyone running a data center.
Earlier I had read about Google's study on how temperature doesn't affect HDD failure rate and promptly removed the noisy HDD cooling fans in my Linux box.
Their latest blog post at http://blog.backblaze.com/2013/11/12/how-long-do-disk-drives-last/ has me thinking that some of my colleagues elsewhere who are running Backblaze-like setups should switch to consumer-grade HDDs to save on cost.
I do have an 80 GB Seagate HDD that has survived the years. Admittedly, I am not sure what to do with it anymore, as it is too small (80 GB) to be useful and too big (3.5") to be portable. It was used as a main HDD until its size rendered it obsolete, so it now sits in a USB HDD dock that I use occasionally.
Maybe you can find out the age by looking up the serial number, but I use the SMART data that you can see in the Disk Utility in Ubuntu.
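If you'd rather skip the GUI, the same attribute can be read from the command line with `smartctl` (from the smartmontools package) and a few lines of Python. This is a hedged sketch: the sample output below is made up for illustration, and on a real system you would feed in the text from `smartctl -A /dev/sda` instead.

```python
# Sketch: compute powered-on days from SMART attribute 9 (Power_On_Hours),
# as reported by `smartctl -A`. SAMPLE_SMARTCTL_OUTPUT is fabricated
# example text in the usual smartctl attribute-table layout.

SAMPLE_SMARTCTL_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       7536
"""

def power_on_days(smart_text):
    """Return powered-on days from the Power_On_Hours attribute, or None."""
    for line in smart_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "Power_On_Hours":
            hours = int(fields[-1])   # RAW_VALUE is the last column
            return hours // 24
    return None

print(power_on_days(SAMPLE_SMARTCTL_OUTPUT))  # 7536 hours -> 314 days
```

Note the raw value is in hours (on most drives), which is why Disk Utility shows the age as an estimate of powered-on days rather than calendar age.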

My ancient 3.5" HDD

However, the age you can see in the screenshot is an estimate of the number of days the drive has been powered on.
Powered on for only 314 days!

Hmm, pretty low mileage for an 80 GB HDD, eh?
Check out this 320 GB IDE HDD:

Even lower mileage! I'm not too sure of the history of this drive, so I can't really comment here.

Completely anecdotal, but I had three Seagate 1 TB HDDs die within a year in a software 5x HDD RAID array under CentOS. When I checked the powered-on days, SMART said they had been running for 3 years. So I am
1) confused about how SMART data records HDD age,
2) in agreement with Backblaze that HDDs have specific failure phases (where usage patterns perhaps play less of a role), and
3) guessing that most of the data on Backblaze is archival in nature, i.e. write once and forget until disaster strikes. So it would be great if Backblaze could 'normalize' HDD lifespan by per-drive data access patterns, to make the numbers more relevant for a crowd whose usage differs from pure data archival.

That said, I think it's an excellent piece of reading if you are concerned about using consumer-grade HDDs. Kudos to the Backblaze team, who managed to 'shuck' 5.5 petabytes of raw HDD capacity to weather the Thailand crisis (I wonder how that affected the economics of using consumer-grade HDDs).

As usual, YMMV applies here. Feel free to use consumer-grade HDDs for your archival needs, but be sure to build redundancy and resilience into your system like the folks at Backblaze do.
