Publication/Letter for SEQanswers

From SEQwiki
Revision as of 15:06, 31 October 2011 by Andreas.sjodin (talk | contribs)

Trimmed letter: Draft #2

The current state of the trimmed letter can be found in a Google document.

Background

During the review of the SEQwiki paper [1], the reviewers raised an important point: the SEQanswers forum itself has yet to be described in a publication, and it deserves a good one.

Let's write a short Science letter (< 300 words) [1] or Nature Correspondence (< 350 words) [2][3] about SEQanswers!

SEQanswers has already been 'informally' cited dozens of times in the literature, so why not write a nice summary for everyone to cite?

Please contribute (and sign the letter) below! The final list of authors will be ranked according to (democratically determined) contribution to the final text.

Meta paper discussion should stay on the forum thread here.

Ideas to convey

  • SEQanswers as a community for interactions among NGS users and between users and developers (UNIQUE)
  • SEQ* is a firmly established community, as reflected by papers that cite SEQ*


Content (Long version)

In recent years, dramatic advances in sequencing technology have created a complex and rapidly moving field of research. These new technologies have given us the capability to answer biological questions that were previously out of reach. However, the pace of these advances has outstripped the speed of peer-reviewed publication and other traditional forms of information sharing in a burgeoning research field rapidly becoming known for 'big data'.

SEQanswers (http://SEQanswers.com) was founded to address this gap. The community-focused format facilitates the rapid dissemination of both wet-lab techniques and bioinformatic analyses. Over 20,000 registered users, a diverse blend of bioinformaticians, geneticists, and molecular biologists, meet and share their experiences and tools. SEQwiki (http://seqanswers.com/wiki/SEQanswers), a user-maintained semantic wiki hosted by SEQanswers, provides an organizational structure for hundreds of publications, methods, data formats, and bioinformatics tools.

Previously, user feedback was sent almost exclusively to the authors of the bioinformatics software. Only a few disparate, well-established, or well-funded groups provided publicly archived mailing lists for community discussion. Software reviews by bloggers were likewise posted independently and relatively rarely. Newcomers to the fast-evolving sequencing research field faced a seemingly overwhelming challenge in determining the most effective way to analyze their data.

SEQanswers has fostered lively review of both pre-publication and post-publication tools. Tools are usually announced on SEQanswers and tested extensively within the community long before peer-reviewed publication. Post-publication improvement and benchmarking among developers is encouraged by discussions in the SEQanswers forums. Moreover, many avid bloggers and individuals from well-established groups contribute to the forum discussions and provide insights that might otherwise only have been found scattered throughout the Internet, if publicly available at all.

As a supplement to traditional forms of scientific communication, SEQanswers offers instantaneous sharing of ideas and review of findings between peers at the cutting edge of high-throughput genome biology. The site has become an important resource for worldwide collaboration and education in the modern genomics era. The Bioinformatics forum in particular has acted as a venue for discussing problems with bioinformatics tools. These discussions also help tool developers by highlighting gaps in documentation, suggesting new functionality, and often yielding useful bug reports. In a relatively short time, SEQanswers has thereby become an extensive and firmly established knowledge base for next-generation sequencing and its bioinformatics applications. The rapid response times of experts and developers in various fields to all sorts of questions, ranging from biological questions to the discussion or usage of bioinformatic tools, make SEQanswers a first port of call for bench scientists and computational biologists alike. Some overlap exists between the Bioinformatics forum at SEQanswers and the community-driven bioinformatics question-and-answer website BioStar (http://biostar.stackexchange.com). SEQanswers, however, aims to foster discussion beyond the question-and-answer style of BioStar, and targets a wider audience than just those researchers with a bioinformatics-related question.

Moreover, as many people involved in sequencing facilities are already part of the SEQanswers community, common-sense decisions on standards such as data formats or best-practice bioinformatic analyses may be reached much more easily. In the modern Internet era, it has been demonstrated that transactive memory is becoming an increasingly important mechanism for learning and answering difficult questions [Sparrow et al., 2011]. SEQanswers consistently shows up in Google searches for terms related to next-generation sequencing, genomics and bioinformatics, suggesting that it plays a major role in the rapid dissemination of information to scientists all over the world. In the near future, SEQanswers aims to provide summary pages, with links, on various important topics, and it continues to provide a platform for discourse among isolated groups of experts across different countries.

Summary of forum contents

Here is a short summary of the contents covered by the forum. (This may help in the writing process.)

  • General discussion
  • Core facilities
  • Literature
  • Conferences
  • Bioinformatics help discussions (installation, troubleshooting)
  • Jobs forum (Industry/academic/non-profit)
  • Sequencing technologies
  • Scientific applications (Sample prep, resequencing, de novo, metagenomics, epigenetics, RNA sequencing)
  • Application/tool announcements
  • Regional communities

Future Directions

The SEQanswers forum is uniquely poised to lead in developing unbiased, relevant community operating standards on a global scale, ably incorporating all of the myriad platforms currently in use to generate massive sequence datasets. It can offer a panel of authoritative insight to national science funding agencies, both formally and informally, maintaining a global perspective on the strategic use of scarce public funds in support of the NGS infrastructure and avoiding duplication of effort.

Other statements:

1) While wet-lab protocols are hosted at other sites, there are many posts on the SEQanswers forums that promptly address emerging wet-lab issues, including reagent batch problems, shearing techniques, sample choices, etc. Much time and money is potentially saved in the course of regular wet-lab operations in support of NGS thanks to the expertise and experience on tap here.
2) The wide range of posts encompasses larger issues that need to be addressed by the global NGS community, including ethics, regulatory issues, public/private funding sources for research, and availability of the resulting NGS data.
3) The forum provides a useful interface with members of the general public interested in pursuing popular applications of next-generation sequencing techniques. It is a source of information for beginners of all types and backgrounds, from high school interns to MBAs.

References

[1] SEQwiki paper in NAR 2012
[2] Sparrow B, Liu J, Wegner DM. (2011) Google effects on memory: cognitive consequences of having information at our fingertips. Science, 333(6043), 776-778. PMID: 21764755

Signers (to be ordered by most significant contribution)

  • Eric C. Olivares, SEQanswers.com
  • Jing-Woei Li, The Chinese University of Hong Kong
  • Dan M. Bolser, University of Dundee
  • Peter Ulz, Medical University of Graz
  • Andreas Sjödin, Swedish Defence Research Agency
  • Peter J. A. Cock, James Hutton Institute
  • Lex Nederbragt, University of Oslo
  • Felix Krueger, The Babraham Institute
  • Surya Saha, Cornell University
  • Colin F. Davenport, Hannover Medical School
  • Joann C. Delenick, Woodbridge, CT
  • Michael James Clark, Stanford University
  • Robert Schmieder, San Diego State University
  • Rachel Glover, Food and Environment Research Agency
  • Lorena Pantano, Institut de Medicina Predictiva i Personalitzada del Càncer

Trimmed letter: Draft #1

Rapid technological advancement has outpaced the speed of peer-reviewed publication and other traditional forms of information sharing in a burgeoning research field rapidly becoming known for 'big data'. The SEQanswers community (http://SEQanswers.com/) was founded to facilitate rapid dissemination of wet-lab techniques and bioinformatic analyses related to high-throughput sequencing (HTS). Here, over 20,000 registered users [ECO may be able to provide numbers of queries per day], a diverse blend of bioinformaticians and molecular biologists, meet and share their technical know-how. Meanwhile, SEQwiki (http://SEQwiki.org/), a user-maintained semantic wiki hosted by SEQanswers, provides an organized structure for hundreds of bioinformatics tools, including their publications, methods and data formats [1].

SEQanswers is uniquely positioned in the HTS field and has become a firmly established knowledge platform. Many researchers and experts contribute extensively to the forum discussions and provide insights that might otherwise only have been found scattered throughout the Internet, if publicly available at all. The SEQanswers community targets a wide audience by providing a platform for debates on trending HTS-related issues, reviews of both pre-publication and post-publication tools, and a literature watch. Moreover, the insightful Q&A on biological and informatics problems makes the community a first port of call for both wet-lab scientists and computational biologists [2-12]. SEQanswers also plans to seek a partnership with the reputable community-driven bioinformatics Q&A website BioStar (http://biostar.stackexchange.com). In parallel, SEQanswers is inviting a network of high-profile bloggers to bring even more content to the site, thereby stimulating more vibrant discussion.

In the Internet era, transactive memory is becoming an increasingly important mechanism for learning and answering difficult questions [13]. SEQanswers ranks highly in Google searches related to HTS, genomics and bioinformatics, suggesting that it is an active, strong and sizeable community playing a major role in the rapid dissemination of information to scientists worldwide. The most common search terms leading to SEQanswers include [ECO might be able to put together a list/table of the most common search terms that lead from Google to SEQanswers using Google Analytics, if he has it set up here], and the site is regularly accessed from X countries [ECO might be able to provide a short table with typical hit counts from specific countries]. SEQanswers will continue to provide a platform for open discourse, allowing knowledge and emerging issues to reach scientists everywhere quickly.

References

[1] SEQwiki paper in NAR 2012.
[2] Dassanayake, M., Oh, D.H., Haas, J.S., Hernandez, A., Hong, H., Ali, S., Yun, D.J., Bressan, R.A., Zhu, J.K., Bohnert, H.J. et al. (2011) The genome of the extremophile crucifer Thellungiella parvula. Nat Genet, 43, 913-918.
[3] Oshlack, A., Robinson, M.D. and Young, M.D. (2010) From RNA-seq reads to differential expression results. Genome Biology, 11, 220.
[4] Zheng, Z., Advani, A., Melefors, O., Glavas, S., Nordstrom, H., Ye, W., Engstrand, L. and Andersson, A.F. (2010) Titration-free massively parallel pyrosequencing using trace amounts of starting material. Nucleic Acids Res, 38, e137.
[5] Huttenhower, C. and Hofmann, O. (2010) A quick guide to large-scale genomic data mining. PLoS Comput Biol, 6, e1000779.
[6] Li, H. and Homer, N. (2010) A survey of sequence alignment algorithms for next-generation sequencing. Brief Bioinform, 11, 473-483.
[7] Huss, M. (2010) Introduction into the analysis of high-throughput-sequencing based epigenome data. Brief Bioinform, 11, 512-523.
[8] Nielsen, C.B., Cantor, M., Dubchak, I., Gordon, D. and Wang, T. (2010) Visualizing genomes: techniques and challenges. Nat Methods, 7, S5-S15.
[9] Flicek, P. and Birney, E. (2009) Sense from sequence reads: methods for alignment and assembly. Nat Methods, 6, S6-S12.
[10] McPherson, J.D. (2009) Next-generation gap. Nat Methods, 6, S2-S5.
[11] Trapnell, C. and Salzberg, S.L. (2009) How to map billions of short reads onto genomes. Nat Biotech, 27, 455-457.
[12] Horner, D.S., Pavesi, G., Castrignano, T., De Meo, P.D.O., Liuni, S., Sammeth, M., Picardi, E. and Pesole, G. (2010) Bioinformatics approaches for genomics and post genomics applications of next-generation sequencing. Brief Bioinform, 11, 181-197.
[13] Sparrow, B., Liu, J. and Wegner, D.M. (2011) Google effects on memory: cognitive consequences of having information at our fingertips. Science, 333, 776-778.