Software and Web servers
The following is a list of software that I have developed or contributed to. Unless mentioned otherwise, the software below is released under the GNU General Public License (GPL). Feel free to contact me if you would like clarification. If you are interested in something that is not listed here, please ask me.
As a courtesy, please let me know if you find any bugs or have any comments or suggestions. Thank you!
As of October 2014, some of these programs are no longer available for download here; they have been moved to my GitHub account, rwanwork.
The following information is available:
- Major projects
- Minor software (e.g., scripts)
A program and library for compressing quality scores.
Version 1.00.0, November 6, 2011
- FragSort (Fragment Sort)
A tool for sorting next-generation sequencing data (BIBM 2010 Workshop).
Version 1.00.0, December 27, 2010
- HAMSTER (Helpful Abstraction using Minimum Spanning Trees for Expression Relations) (SCFBM 2009 Journal)
A tool for depicting experiments in a microarray data set as a set of minimum spanning trees. Both a web-based version and a downloadable version are available. Note that I have not maintained the web server since October 1, 2011.
Version 1.3.0, August 26, 2011
[Formerly at http://hamster.cbrc.jp; no longer accessible]
Data Compression / Information Retrieval
- Probabilistic latent semantic analysis (PLSA) (AIRS 2009 paper)
- Re-Store phrase browsing system (PhD thesis and related work -- includes the Re-Pair compression algorithm)
Small software tools / scripts
- A Perl script that takes a set of sequences and a set of quality scores and combines them into a new data set in which the sequences are permuted based on the quality scores. This script is meant to be used with synthetic sequences. Generating such sequences is not part of this script -- other software, including RMAP, does that.
Version 1.0, January 14, 2011
- Map human transcription factor binding sites to another species using data from UCSC's Genome Browser. Details are in the accompanying PDF file.
Version 1.0, December 28, 2005
- A Perl script that takes one or more LaTeX files and a directory of BibTeX files and selects only those entries that are relevant to the LaTeX files. This is useful if your set of BibTeX files is getting large and either processing it with bibtex takes too much time or the publisher wants your .bib file and it is too large to send. The archive contains only one file.
Version 1.0, January 7, 2010
- A Perl script that recursively compares the files in two directories, using external calls to md5sum to calculate MD5 signatures. Files that have different signatures, or that exist in one directory but not the other, are reported. The archive contains only one file.
Version 1.1, August 3, 2010
- KDE Kate syntax highlighting file for the language of the Graphical Models Toolkit (GMTK), a toolkit developed by Prof. Jeff Bilmes and others at the University of Washington for constructing graphical models (typically for speech processing). See the attached README for further details. Distributed under the LGPL.
Version 1.1, June 10, 2010
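For readers curious how the directory-comparison script listed above works in principle, here is a minimal sketch of the same idea. It is not the Perl script itself: it is written in Python, it uses the standard hashlib module instead of external calls to md5sum, and the function names (md5_of_file, collect_signatures, compare_directories) are hypothetical names chosen for this illustration.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_signatures(root):
    """Map each file's path (relative to `root`) to its MD5 digest."""
    signatures = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            relative = os.path.relpath(full_path, root)
            signatures[relative] = md5_of_file(full_path)
    return signatures


def compare_directories(first, second):
    """Return (only_in_first, only_in_second, different) as sorted lists.

    Paths are relative to the directory roots. `different` lists files
    present in both directories whose MD5 digests do not match.
    """
    sig_first = collect_signatures(first)
    sig_second = collect_signatures(second)
    only_in_first = sorted(set(sig_first) - set(sig_second))
    only_in_second = sorted(set(sig_second) - set(sig_first))
    different = sorted(
        path
        for path in set(sig_first) & set(sig_second)
        if sig_first[path] != sig_second[path]
    )
    return only_in_first, only_in_second, different
```

Calling compare_directories("dir_a", "dir_b") on two hypothetical directories returns three lists: files found only in the first directory, files found only in the second, and files found in both whose signatures differ.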