We hosted the International Fish Genomes meeting once more this weekend,
bringing researchers from Europe, America and Asia to give talks about their
current scientific progress and to discuss possible collaborations.
The amount of work reported by the zebrafish people at the Sanger, led by
Dr. Derek Stemple, was a surprise to me, but only because I wasn't following
their efforts closely before (a quick summary is at the bottom). They are in
some ways ahead of the human genetics people in developing methods and
protocols to take full advantage of the NextGen sequencing technologies. The
ZFIN consortium has recently released Zv8, the latest version of the Danio
rerio assembly, which will appear, with a brand new gene prediction build,
in the upcoming Ensembl v54 in April.
One recurring theme in most talks was how to deal with heterozygosity and
polyploidization issues when trying to assemble fish genomes. Different
techniques have been developed to create artificial individuals with doubled
haploid genomes, which makes this task easier. Another recurring theme was
that NextGen sequencing has so far been focused on assembly, but there is a
lot of low-hanging fruit to be picked in SNP and cDNA analyses that will
squeeze more biological results out of the same data. Squeezing data doesn't
seem very trendy nowadays with the second deluge we are facing, but I don't
think it's something we should feel too personal about, because in essence it
is only a change in the scale of data set size, with the same imagination and
ingenuity at hand as before.
One of the most interesting talks to me was Dr. Hugues Roest Crollius's demo
of their Dyogen Synteny Browser, which nicely complements what we have in the
EnsemblCompara GeneTrees in terms of visualization and spotting prediction
errors. This browser is especially relevant for the fish clade, because it
brings out very clear patterns of gene conservation and loss after the whole
genome duplication in the Teleosts.
The extraterrestrial talk about skate cell biology and development brought
pictures of baby skates that look very much like the alien stuck to the face
of the guy in Alien, the film. Cool and disturbing in equal amounts.
Another conclusion from the meeting is that we probably need to join forces
to write a strong proposal to NHGRI for a 2x-mammalian-like sequencing spree
in the fish clade. As the mammalian genomics work is now showing, you get a
more complete picture by sequencing lots of ``millions of years'' than by
getting stuck in repetitive and haplotypic knots in one single genome. Still,
no one stepped forward on Saturday to coordinate the proposal, and a bit more
haggling and ambushing may be needed :-).
----
Derek Stemple -- Zebrafish Genome Project
Heterozygosity and haplotype issues in the six related individuals
used first
Better DNA to make libraries --> Fertilized eggs with UV-inactivated
sperm -- melt the spindle in the first division -- doubled haploid
fish (DH)
Better genetic maps -- radiation hybrid, heat shock, meiotic map
Gene Annotation using Solexa Sequencing -- how to do it
Q: SNPs from the Solexa data?
A: Not focusing on that now, reference already has 0.6M, 6M with other
line
Q: 3UTR repetitive element problems in Solexa?
A: Use read pairs, filter out in the alignment, it's difficult. Will
probably do bigger library inserts soon.
Q: Depth and false 5prime ends?
A: A third of reads cross a boundary; at 5prime ends there is a bias
towards one strand in the sequencing
Matthew Clark
SJD is homozygous but nobody uses it because of reproduction issues in
the lab
Doing CNV analyses, comparing with repeats using a HMM (Jared Simpson)
Read pair aberrations -- lots of deletions found -- badly assembled
bits of the genome? (Klaudia Walter)
Doing a WGSA Affy for SNP chip -- will help BAC mapping
De novo map -- mapmaker, joinmap, RECORD, SMOOTH, Combin, MSTmap
MSTmap works on Turing -- combine genetic map with sequencing
Zebrafish and Danio genus polymorphism rates -- conserved elements