Dfam

An open database of repetitive and transposable genomic elements.

The Dfam project is an open database of repetitive and transposable genomic elements maintained by researchers from the Hood Lab at the Institute for Systems Biology and the Wheeler Lab at the University of Arizona.

My role has been to maintain and optimize the website, the back-end API, and our server-side scripts, and to design and implement new features and subsystems as the need arises. I have redesigned the partitioning and export tools that let users work with our data in offline environments, maintained the Docker container that bundles our tools and their dependencies, built a genome annotation storage and retrieval system, and upgraded our Nextflow pipeline to make our tools easier to use in HPC environments. I have also responded to user questions and bug reports and handled many smaller projects besides.

There is never a shortage of things to work on, from bug reports to longer-term research projects, and there are always more improvements to make. The database portal is at Dfam.org, and our project’s software can be found in our GitHub repo.

ISB UA

Related Blog Posts

Annotation Indexing


March 9, 2025

It had been years since I’d used a compiled, strongly typed language, but we needed a newer, faster subsystem. It was a perfect excuse to expand my toolbox.

Partitioning The Taxonomic Tree


November 28, 2023

Before I started working for Dfam, the entire database was less than a hundred gigabytes. Just before I joined, it doubled in size, and by the time I’d found my feet it was over 800 gigabytes and growing.


Website built with Jekyll and GitHub Pages