Difference between revisions of "Publication/Letter for SEQanswers"


Revision as of 10:02, 10 October 2011

During the review of the SEQwiki paper [1], the reviewers raised an important point: the SEQanswers forum has yet to be described in a publication, and it deserves a good one.

Let's write a short Science letter (< 300 words) [1] or Nature Correspondence (< 350 words) [2][3] about SEQanswers!

SEQanswers has already been 'informally' cited dozens of times in the literature, so why not write a nice summary for everyone to cite?

Please contribute (and sign the letter) below! The final list of authors will be ranked according to (democratically determined) contribution to the final text.

Meta paper discussion should stay on the forum thread here.

Ideas to convey

  • SEQanswers as a community for interactions among NGS users and between users and developers (UNIQUE)
  • SEQ* is a firmly established community, as reflected by papers that cite SEQ*


Content

In recent years, dramatic advances in sequencing technology have created a fast-moving and complex field of research. These new technologies have given us the capability to answer biological questions that were previously out of reach. However, these advances have outpaced peer-reviewed publication and other traditional forms of information sharing in a burgeoning research field that is rapidly becoming known for 'big data'.

SEQanswers (http://SEQanswers.com) was founded to address this gap. The community-focused format facilitates the rapid dissemination of both wet-lab techniques and bioinformatic analyses. Over 20,000 registered users, a diverse blend of bioinformaticians, geneticists, and molecular biologists, meet and share their experiences and tools. SEQwiki (http://seqanswers.com/wiki/SEQanswers), a user-maintained semantic wiki hosted by SEQanswers, provides an organizational structure for hundreds of publications, methods, data formats, and bioinformatics tools.

Previously, user feedback was sent almost exclusively to the authors of the bioinformatics software. Only a few disparate, well-established, or well-funded groups provided publicly archived mailing lists for community discussion. Software reviews by bloggers were similarly posted independently and relatively rarely. Newcomers to the fast-evolving sequencing research field faced a seemingly overwhelming challenge in determining the most effective way to analyze their data.

SEQanswers has fostered lively review of both pre-publication and post-publication tools. Tools are usually announced on SEQanswers and tested extensively within the community long before peer-reviewed publication. Post-publication improvement and benchmarking among developers is encouraged by discussions in the SEQanswers forums. Moreover, many avid bloggers and individuals from well-established groups contribute to the forum discussions and provide insights that might otherwise only have been found scattered throughout the Internet, if available publicly at all.

As a supplement to traditional forms of scientific communication, SEQanswers offers instantaneous sharing of ideas and review of findings between peers at the cutting edge of high-throughput genome biology. The site has become an important resource for worldwide collaboration and education in the modern genomics era. The Bioinformatics forum in particular has acted as a venue for discussing problems with bioinformatics tools, which also helps tool developers by highlighting gaps in their documentation, suggesting new functionality, and often yielding useful bug reports. In a relatively short time, SEQanswers has thereby become an extensive and firmly established knowledge base for next-generation sequencing and its bioinformatics applications. The rapid responses of experts and developers in various fields to all sorts of questions, ranging from biological questions to the discussion or usage of bioinformatics tools, make SEQanswers a first port of call for bench scientists and computational biologists alike. Some overlap exists between the Bioinformatics forum at SEQanswers and the community-driven Bioinformatics Question and Answer website (http://biostar.stackexchange.com). SEQanswers, however, aims to foster discussion beyond the question-and-answer style of Biostar, and targets a wider audience than just those researchers with a bioinformatics-related question.

Moreover, as many people involved in sequencing facilities are already part of the SEQanswers community, common-sense decisions on standards such as data formats or best-practice bioinformatic analyses may be reached much more easily. In the modern Internet era, it has been demonstrated that transactive memory is becoming an increasingly important mechanism for learning and answering difficult questions [Sparrow et al., 2011]. SEQanswers consistently shows up in Google searches for terms related to the fields of next-generation sequencing, genomics and bioinformatics, suggesting that it plays a major role in the rapid dissemination of information to scientists all over the world. The most common search terms leading to SEQanswers include [ECO might be able to put together a list/table of the most common search terms that lead from Google to SeqAnswers using Google Analytics if he has it set up here] and the site is regularly accessed from X countries all over the world [ECO might be able to provide a short table with typical hit counts from specific countries].

Summary of forum contents

Here is a short summary of the contents covered by the forum. (It could help in the writing process.)

  • General discussion
  • Core facilities
  • Literature
  • Conferences
  • Bioinformatics help discussions (installation, troubleshooting)
  • Jobs forum (Industry/academic/non-profit)
  • Sequencing technologies
  • Scientific applications (Sample prep, resequencing, de novo, metagenomics, epigenetics, RNA sequencing)
  • Application/tool announcements
  • Regional communities

Future Directions

The SEQanswers forum is uniquely positioned to lead in developing unbiased, relevant community operating standards on a global scale, ably incorporating all of the myriad platforms currently in use to generate massive sequence datasets. It can offer a panel of authoritative insight to national science funding agencies, both formally and informally, maintaining a global perspective on the strategic use of scarce public funds in support of the NGS infrastructure and avoiding duplication of effort.

Other statements:

1) While wet-lab protocols are hosted at other sites, many posts on the SEQanswers forums promptly address emerging wet-lab issues, including reagent batch problems, shearing techniques, sample choices, etc. The expertise and experience on tap here can save much time and money in the regular wet-lab operations that support NGS.
2) The wide range of posts encompasses larger issues that need to be addressed by the global NGS community, including ethics, regulatory issues, public and private funding sources for research, and the availability of the resulting NGS data.
3) The forum provides a useful interface with members of the general public who are interested in pursuing popular applications of next-generation techniques. It is a source of information for beginners of all types and backgrounds, from high school interns to MBAs.

References

[1] SEQwiki paper in NAR 2012
[2] Sparrow B, Liu J, Wegner DM. (2011) Google effects on memory: cognitive consequences of having information at our fingertips. Science, 333(6043), 776-778. Epub 2011 Jul 14. PMID: 21764755

Signers (to be ordered by most significant contribution)

  • Eric C. Olivares, SEQanswers.Com.
  • Jing-Woei Li, The Chinese University of Hong Kong.
  • Dan M. Bolser, University of Dundee.
  • Peter Ulz, Medical University of Graz.
  • Andreas Sjödin, Swedish Defence Research Agency.
  • Peter J. A. Cock, James Hutton Institute.
  • Lex Nederbragt, University of Oslo.
  • Felix Krueger, The Babraham Institute.
  • Surya Saha, Cornell University.
  • Colin F. Davenport, Hannover Medical School.
  • Joann C. Delenick, Woodbridge, CT.
  • Michael James Clark, Stanford University.

Trimmed letter: Draft #1

High-throughput sequencing (HTS) has given scientists the capability to answer previously unimagined biological questions. However, this rapid technological advancement has outpaced peer-reviewed publication and other traditional forms of information sharing in a burgeoning research field that is rapidly becoming known for 'big data'.

We founded the SEQanswers community (http://SEQanswers.com/) to facilitate rapid dissemination of HTS-related wet-lab techniques and bioinformatic analyses. Here, over 20,000 registered users, a diverse blend of bioinformaticians and molecular biologists, meet and share their technical know-how. Meanwhile, SEQwiki (http://SEQwiki.org/), a user-maintained semantic wiki hosted by SEQanswers, provides an organized structure for hundreds of bioinformatics tools and their publications, methods and data formats [1].

We believe SEQanswers is uniquely positioned in the HTS field. Many researchers who work on sequencing and data analysis take part in the community. Therefore, in a short period of time, the community has become a firmly established HTS knowledge base. Many experts from well-established groups contribute extensively to the forum discussions and provide insights that might otherwise only have been found scattered throughout the Internet, if available publicly at all. Moreover, the insightful Q&A on biological and informatics problems makes the community a first port of call for both wet-lab scientists and computational biologists [2-12]. The SEQanswers community targets a wide audience by providing a platform for debates on trending HTS-related issues, review of both pre-publication and post-publication tools, and a literature watch. We also plan to seek a partnership with the reputable community-driven bioinformatics Q&A website (http://biostar.stackexchange.com). In parallel, SEQanswers is inviting a network of high-profile bloggers to bring even more content to the site, thereby stimulating more vibrant discussion.

In the Internet era, transactive memory is becoming an increasingly important mechanism for learning and answering difficult questions [13]. SEQanswers consistently shows up in Google searches for terms related to the fields of HTS, genomics and bioinformatics, suggesting that SEQanswers, as a strong, active, and sizable community, plays a major role in the rapid dissemination of information to scientists all over the world. The most common search terms leading to SEQanswers include [ECO might be able to put together a list/table of the most common search terms that lead from Google to SeqAnswers using Google Analytics if he has it set up here] and the site is regularly accessed from X countries all over the world [ECO might be able to provide a short table with typical hit counts from specific countries].

References

[1] SEQwiki paper in NAR 2012.
[2] Dassanayake, M., Oh, D.H., Haas, J.S., Hernandez, A., Hong, H., Ali, S., Yun, D.J., Bressan, R.A., Zhu, J.K., Bohnert, H.J. et al. (2011) The genome of the extremophile crucifer Thellungiella parvula. Nature genetics, 43, 913-918.
[3] Oshlack, A., Robinson, M.D. and Young, M.D. (2010) From RNA-seq reads to differential expression results. Genome Biology, 11, 220.
[4] Zheng, Z., Advani, A., Melefors, O., Glavas, S., Nordstrom, H., Ye, W., Engstrand, L. and Andersson, A.F. (2010) Titration-free massively parallel pyrosequencing using trace amounts of starting material. Nucleic Acids Res, 38, e137.
[5] Huttenhower, C. and Hofmann, O. (2010) A quick guide to large-scale genomic data mining. PLoS Comput Biol, 6, e1000779.
[6] Li, H. and Homer, N. (2010) A survey of sequence alignment algorithms for next-generation sequencing. Brief Bioinform, 11, 473-483.
[7] Huss, M. (2010) Introduction into the analysis of high-throughput-sequencing based epigenome data. Brief Bioinform, 11, 512-523.
[8] Nielsen, C.B., Cantor, M., Dubchak, I., Gordon, D. and Wang, T. (2010) Visualizing genomes: techniques and challenges. Nat Methods, 7, S5-S15.
[9] Flicek, P. and Birney, E. (2009) Sense from sequence reads: methods for alignment and assembly. Nat Methods, 6, S6-S12.
[10] McPherson, J.D. (2009) Next-generation gap. Nat Methods, 6, S2-S5.
[11] Trapnell, C. and Salzberg, S.L. (2009) How to map billions of short reads onto genomes. Nat Biotech, 27, 455-457.
[12] Horner, D.S., Pavesi, G., Castrignano, T., De Meo, P.D.O., Liuni, S., Sammeth, M., Picardi, E. and Pesole, G. (2010) Bioinformatics approaches for genomics and post genomics applications of next-generation sequencing. Brief Bioinform, 11, 181-197.
[13] Sparrow, B., Liu, J. and Wegner, D.M. (2011) Google Effects on Memory: Cognitive Consequences of Having Information at Our Fingertips. Science, 333, 776-778.