
ggheat : a ggplot2 style heatmap function

on Fri, 03/11/2011 - 14:52

I hope the code here is fairly self-explanatory with the inset annotations. I feel this is just a bit 'prettier' than heatmap.2 and, for me, has the right balance of options and extensibility. I have also found it difficult to produce high-quality plots with heatmap.2, whereas ggplot2 plots, especially with RStudio's help in resizing PNGs, turn out better IMHO.

## m=matrix(data=sample(rnorm(100,mean=0,sd=2)), ncol=10)
## this function makes a graphically appealing heatmap (no dendrogram) using ggplot
## whilst it contains fewer options than gplots::heatmap.2 I prefer its style and flexibility
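
The full ggheat code is not reproduced in this excerpt, so here is only a minimal sketch of the general idea: reshape a matrix to long format and draw it with geom_tile. The row and column labels, the colour scale and the object names (long, p) are my assumptions for illustration, not the original function.

```r
library(ggplot2)

## example matrix, as in the comment above (seed fixed for reproducibility)
set.seed(1)
m <- matrix(rnorm(100, mean = 0, sd = 2), ncol = 10)
rownames(m) <- paste0("row", 1:10)  # hypothetical labels
colnames(m) <- paste0("col", 1:10)  # hypothetical labels

## reshape to long format: one row per matrix cell
long <- as.data.frame(as.table(m))
names(long) <- c("row", "col", "value")

## one tile per cell, with a diverging colour scale centred on zero
p <- ggplot(long, aes(x = col, y = row, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0) +
  labs(x = NULL, y = NULL, fill = "value") +
  theme_minimal()

print(p)
```

Because the result is an ordinary ggplot object, further layers or theme changes can be added with + in the usual way.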

I love RStudio

on Tue, 03/01/2011 - 16:22

Typically if I am working with R I will have a lot of different windows open all at once. I'll have the R Console (for Mac) of course and a text editor (Xcode or TextEdit), plus there will usually be some help windows to examine particular functions or a vignette in a PDF window, then there will be the plots I am working with and often a browser (Safari) pointing at the Bioconductor mailing lists or ggplot2 website for examples. It can be worse still if I'm using Adobe to edit graphics or interfacing R with another programming language...... Euuuurgggh.

(more on) Pattern Matching for Transcription Factor Binding Sites

on Wed, 02/02/2011 - 11:22

I published some initial script scribblings on this task about a week ago. After another week I'm posting some better formed and annotated code. The Biostrings and BSgenome packages are new to me and I've gone through many, many iterations and experiments to arrive at the as yet incomplete code below. Whilst the packages seem powerful, I have yet to get the hang of the object structures and methods for the various DNAStrings, DNAStringSets, Views, etc., and to work out how best to program functionally with them.

DataMarket

on Mon, 01/31/2011 - 16:11

I have just discovered yet another public data site, www.datamarket.com. Most of the data are time series. It collects sources such as the World Bank, Eurostat and Gapminder in one place. It also allows you to download data as CSV files, or to create a graph and a link for embedding in a web page, as below.

 

 

Occasionally it doesn't quite come out right, but it's a nice tool to try...

 

Pattern Matching for Transcription Factor Binding Sites

on Mon, 01/24/2011 - 15:31

I'm trying to search for binding sites for the transcription factor MAF (i.e. TFBS for MAF) in the promoter regions of various genes. I initially started out looking at MAPPER, a precomputed database of binding sites. However, the TFBS models it uses from TRANSFAC or JASPAR are quite unlike the human binding sites shown in the recent literature (those sites are palindromic sequences called MAREs).

Unfortunately the palindromic sites shown in the literature do not come with alignments or position weight matrices (PWMs), and they are quite variable.
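
Without a PWM, one simple fallback is to search for a consensus string written with IUPAC ambiguity codes and to allow a mismatch or two. The sketch below uses Biostrings::matchPattern. The consensus TGCTGASTCAGCA (S = C or G) and the toy promoter sequence are assumptions made purely for illustration, not the sites or the code from this post.

```r
library(Biostrings)

## assumed, illustrative MARE-like consensus; S stands for C or G
mare <- DNAString("TGCTGASTCAGCA")

## the consensus is its own reverse complement, so searching one strand is enough
is_palindrome <- as.character(reverseComplement(mare)) == as.character(mare)

## toy promoter with two sites planted by hand (made-up sequence)
promoter <- DNAString(paste0("AAA", "TGCTGACTCAGCA", "TTTGG",
                             "TGCTGAGTCAGCA", "CC"))

## fixed = FALSE makes matchPattern honour the IUPAC ambiguity code
exact <- matchPattern(mare, promoter, fixed = FALSE)

## tolerate one mismatch to allow for the variability of real sites
loose <- matchPattern(mare, promoter, max.mismatch = 1, fixed = FALSE)

start(exact)  # start positions of the exact consensus hits
```

Raising max.mismatch finds more variable sites at the cost of more false positives, which is the trade-off a proper PWM would otherwise handle.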

gnmplot

on Thu, 01/13/2011 - 16:32

I'm writing a new package that will create nice publication quality graphics of genome information. It's really an adaptor sitting between the biomaRt and ggplot2 packages. Here is the code so far:

## this function integrates 3 steps to creating a genome plot
## 1 query bioMart to get a GFF like data-frame from ensembl
## 2 add new elements to the gff to make it plot with ggplot
## 3 plot it with ggplot2
## the reason to split it like this is so that basic users may use this simple function
## and advanced users can try custom queries or altered plotting themes
 
gnmplot= function(filters=
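
The function itself is cut off in this excerpt. As a rough sketch of steps 2 and 3 only, the example below starts from a small hand-made GFF-like data frame standing in for the biomaRt query of step 1. The column names, coordinates and plotting choices are all hypothetical and are not taken from gnmplot.

```r
library(ggplot2)

## step 1 (stand-in): a GFF-like data frame; values are made up for illustration
gff <- data.frame(
  feature = c("exon", "exon", "exon"),
  start   = c(100, 400, 900),
  end     = c(250, 600, 1200)
)

## step 2: add the extra columns needed for plotting
gff$track <- 1                          # y position of the gene track
gff$width <- gff$end - gff$start + 1    # exon length in bases

## step 3: draw a backbone line for the gene with exons as boxes on top
p <- ggplot(gff) +
  annotate("segment", x = min(gff$start), xend = max(gff$end), y = 1, yend = 1) +
  geom_rect(aes(xmin = start, xmax = end,
                ymin = track - 0.2, ymax = track + 0.2),
            fill = "steelblue") +
  scale_y_continuous(limits = c(0, 2), breaks = NULL) +
  labs(x = "genomic position (bp)", y = NULL) +
  theme_minimal()

print(p)
```

Keeping the query, the data preparation and the plot as separate steps is what lets advanced users swap in a custom query or a different theme.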

prettyR

on Thu, 01/13/2011 - 13:50

I have just remembered a package called 'prettyR' that pretty much does what it says: it makes R code more readable. Here is an example from my forthcoming genomeplot package (see forthcoming blog entry):

 

## this function integrates 3 steps to creating a genome plot
## 1 query bioMart to get a GFF like data-frame from ensembl
## 2 add new elements to the gff to make it plot with ggplot2
## 3 plot it with ggplot2
## the reason to split it like this is so that basic users may use this simple function
## and advanced users can try custom queries or altered plotting themes with the sep functions

survival curves for Leonid

on Fri, 01/07/2011 - 13:18

Leonid asked me to do a quick survival analysis of two different types of mouse (m430 and m210) with surgically implanted tumours (or something like that). The data was in the wrong format but after transforming it looked like this:

In my opinion there are far too few samples to detect any significant difference between the mouse types in anything but the most extreme case. Nevertheless it is useful for him to go through the motions and publish what he has... or rather what I do for him.

I used the survival package for R. Here is the script and results.

##attach column names to the
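
The script and results are cut off in this excerpt. For orientation, a two-group analysis with the survival package usually looks something like the sketch below. The data frame is invented purely for illustration (these are not Leonid's data) and the column names are my assumptions.

```r
library(survival)

## made-up illustrative data: survival time in days and an event flag
## (1 = died, 0 = censored) for five mice of each type
mice <- data.frame(
  strain = rep(c("m430", "m210"), each = 5),
  days   = c(12, 15, 18, 21, 30, 14, 19, 25, 28, 30),
  died   = c(1, 1, 1, 1, 0, 1, 1, 1, 0, 0)
)

## Kaplan-Meier curves, one per mouse type
fit <- survfit(Surv(days, died) ~ strain, data = mice)
plot(fit, lty = 1:2, xlab = "days", ylab = "proportion surviving")
legend("topright", legend = names(fit$strata), lty = 1:2)

## log-rank test for a difference between the two curves
lr <- survdiff(Surv(days, died) ~ strain, data = mice)
p_value <- 1 - pchisq(lr$chisq, df = 1)
print(lr)
```

With so few animals per group the log-rank test has very little power, which is the point made above about sample size.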

A fracturing of NHS cancer services.

on Fri, 01/07/2011 - 11:55

I published the following on another blog Left Foot Forward.

----------

Recently the newspapers have been filled with apprehension at plans for a revolutionary decentralization of NHS management. Yet for NHS cancer services a similar experiment is well underway. The Interim Cancer Drug Fund (ICDF) is £50m of extra money (over 3 months) to purchase drugs that have been rejected or not yet approved by NICE. In contrast to the current system decisions on which treatments to offer are being made by local panels of clinicians within each Strategic Health Authority (SHA).

The MacMillan Cancer

About the website

on Thu, 12/30/2010 - 17:29

I am starting this website:

  1. to get practice with authoring blogs, websites, html and so on.
  2. to keep all the writing that I have begun in the same place.
  3. perhaps to keep a record of what I am doing at work, as I don't think I have ever kept good records.

 

 

Note that most of the regularly updated material is over on the blog section. Also, as this is a pre-cooked website, it has a few dead features (forums, mailing list) that I'm not sure how to remove.