Difference between revisions of "Publication/Paper (NAR 2012)/Reply"

Revision as of 21:03, 3 October 2011

  • Above: Points and tasks
  • Below: Full comments for context

SUCCESS!

Accept with minor revision!

Thank you!

Congratulations! --Marcowanger 10:34, 3 October 2011 (PDT) Let's meet and celebrate some day .....


Reviewer 1

I am satisfied with the changes.

Thanks for your help improving SEQwiki!

Reviewer 2

I would not go so far in the second block of added text on page 4 as to claim that the lack on information in the wiki provides useful information. Perhaps in the long run, but in the current state, I disagree that "It is worth pointing out". I would delete the last two sentences in that paragraph.

Sounds reasonable. This was actually an attempt at a polite kick up the bum for lazy tool maintainers, but I agree it isn't very scientific. The phrase "It is worth pointing out, however, that a lack of information may also provide some information to the user. A less used tool will not be likely to have a comprehensive article, while important tools will be described in more detail." has been deleted from the manuscript.

Line 30 on page 6. "Registered member can submit" should use the plural "members" or "A registered member" or "Any registered member"

Changed to "Registered members can submit..."

I would not include the uploaded supplementary figure as supplementary data for the published version.

Right, this figure was just for the benefit of the reviewers, and has been removed.

Comments on *this* page

All part of the service... all reviews and replies should be done like this... Not sure what we need to do here. Need to re-read. P.S. Want to kick off a SEQanswers paper on the forum?

Referee: 3

This is a revised MS but I was unable to find a cover letter describing revisions or responses to reviewers. i presume the red blocks are new text.

This is unfortunate, as we submitted a detailed point-by-point rebuttal both in the "Respond to these comments" field (subtitled "Response to Decision Letter") in section 1 of the submission system (View and Respond to Decision Letter) and in the "Cover Letter" field in section 5 of the submission system (Details & Comments).


The MS now focuses on the core benefits of the wiki and presents the place of the system in the seqanswers ecosystem more clearly.

Cool

The figures 2 and 3, and the supplementary figure (?) are all very contingent process information - it is hardly a surprise that the wiki behaves like most human social activities, and of minor interest that the authors effected a speedup sometime this year - and should be removed.

I disagree
I also disagree. I think he should read the book "Predictably Irrational" and the related papers by Dan Ariely. Human behavior is not rational in the way traditional economic theory states (and as for "behaves like most human social activities", how do we define "most"?). Decisions are made against a contextual background. The outcome caused by this largely unknown background is what we observed and plotted in figures 2 and 3. I thus strongly suggest keeping these figures.

The future directions are laudable. One missing development would be automated 'scraping' of threads within seqanswers concerning different programmes - tagging threads with a vocabulary of programs and having these linked from the program page must be possible (unless the two systems are incompatible...).

Yeah, you missed the reply where we covered this.

In particular, it is not clear whether the community will work on wiki-coordinated xy-athons if these are generally being organised by major centres and performed in collaboration with vendors.

I don't follow this sentence.

Rebuttal (ROUND 1)

Reviewer 1

In their paper titled "The SEQanswers wiki: A wiki database of tools for high throughput sequencing analysis." the authors present a wiki based website that is meant to store information on a wide variety of bioinformatics tools.

I found the site to be of substantially lower utility than I initially expected.

We hope we have adequately addressed your major criticisms in our point-by-point reply below.

Point 1 (Our summary: Browsing is bad)

First of all the navigation appears to be ill defined. For example browsing for a software tools makes users repeatedly click on tags, that, after several steps and choices may lead to only one or two very generic tools like "CLC Genomic Workbench".

Currently, 'tag-based' navigation is the predominant way we allow users to browse the wiki. However, we do provide a tabular overview of all tools (http://seqanswers.com/wiki/Software/list), and of course, a free text search.
In response to this criticism, we have implemented an 'advanced search' form (http://seqanswers.com/wiki/Special:RunQuery/Bioinformatics_application/Query) that allows the user to more quickly perform a very specific query of the database. In addition, we are actively working on improving the tags to more accurately reflect the functionality of the tools (see below for more details).
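As a rough illustration of the kind of specific query the advanced search form is meant to support, the sketch below filters tool records by tag. It is not the wiki's actual query code: the function `find_tools`, the record layout, and the example tool titles are all hypothetical, and the tag names are simply borrowed from the discussion on this page.

```python
def find_tools(tools, methods=(), domains=()):
    """Return the titles of tools that carry every requested tag.

    `tools` is a list of dicts with "title", "methods" and "domains" keys;
    this layout is an assumption made for the example only.
    """
    wanted_methods = set(methods)
    wanted_domains = set(domains)
    return [
        tool["title"]
        for tool in tools
        if wanted_methods <= set(tool.get("methods", ()))
        and wanted_domains <= set(tool.get("domains", ()))
    ]


# Made-up records; neither the titles nor the tags describe real wiki entries.
TOOLS = [
    {"title": "ExampleAligner", "methods": ["Alignment"], "domains": ["Genomics"]},
    {
        "title": "ExampleCaller",
        "methods": ["Alignment", "Prediction"],
        "domains": ["Genomics", "Genotype and phenotype"],
    },
    {"title": "ExampleViewer", "methods": ["Plotting and rendering"], "domains": ["Genomics"]},
]

if __name__ == "__main__":
    print(find_tools(TOOLS, methods=["Alignment"], domains=["Genomics"]))
```

Requiring all requested tags at once (rather than clicking through them one by one) is what lets a single query replace several browsing steps.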

Comments

Could we base "Bioinformatics methods" and "Biological domains" on modified EDAM "Operation" and "Topic" tags? - Andreas

Andreas, not sure what you mean. Could you explain a bit more? Sorry, not enough sleep these days. Marcowanger 21:54, 12 September 2011 (PDT)

I thought we could look at the keywords used in their ontology when we are cleaning up the tags. It would be easier if we could show example tags, so that users know the difference between biological domains and bioinformatics methods. We could use some parts of their ontology. The EDAM ontology has keywords for the following sub-categories.

Biological domains (Topic)
Classification, Data handling, Genes, Genomics, Genotype and phenotype, Informatics, Laboratory resources, Literature and documentation, Microarrays, Nucleic acids, Ontologies and nomenclature, Organism, Pathways, networks and models, Phylogenetics, Proteins, Proteomics, Sequence, Sequence and mapping, Structure
Bioinformatics methods (Operation)
Alignment, Analysis and processing, Annotation, Classification, Comparison, Demonstration, Design, Mapping and assembly, Modelling and simulation, Optimisation and refinement, Plotting and rendering, Prediction, detection and recognition, Search and retrieval, Validation and standardisation

However, for now I think it is enough to go through the existing tagging. - Andreas

Looks great! This has been on the todo list for ages. I'll not get round to it in time for the 'reply' though :( --Dan 16:51, 20 September 2011 (PDT)

Can we say that we have curated the current tags, but will eventually convert them to EDAM terms? --Marcowanger 21:41, 22 September 2011 (PDT)

Point 2 (Our summary: The site is slow)

This suboptimal navigation when coupled to very slow load times and makes the aforementioned awkward navigation even more discouraging. There were instances where the page loaded over about 20 seconds! The problem of course is that at the end of suffering through slow load times, often a minute or two, there may be no useful information at all!

We are sorry that we caused you to be discouraged. Slow response time was an unfortunately timed technical issue with the server that we have now resolved. As evidence, please see the image of average page load times provided by Google webmaster tools.

Google crawl time graph.png

Point 3 (Our summary: The content is bad)

Looking for known and existing tools does not seem to be any better. Even the software tools linked from the front page lead to a overly generic descriptions of the tool with links to the homepage of the tool. Notably some of the information even on these well known tools such as Bowtie or MAQ were occasionally incorrect while being severely lacking

Reply: One of the wonderful things about wikis is that incorrect or insubstantial information, when discovered, can be easily corrected or amended by anyone, with minimal effort. Please provide details of any specific problems, and we will be happy to correct them for you, because just bitching about non-specific problems doesn't cut it in the open source community. Yes, I do feel better ;-)
Alternate reply: Like all wikis, the quality and quantity of information for each article is dependent on community participation. We agree that there is plenty of room for improvement in this regard. We have carefully reviewed many of the articles ourselves, adding and improving content wherever possible. Since the paper was submitted, there have been over 1000 edits, adding more than 260 kb of content. In addition, about 50 new tools and citations and about 100 additional web links were added.
Generally, we are of the opinion that a minimal entry for a given tool, for example, just listing the title and homepage, is better than no entry at all. However, to address this criticism, we have implemented an 'article score', reflecting how much information is available for a given tool. Short articles are prominently displayed, and can be sorted by size or 'score', allowing us (and hopefully the community) to focus on the tools with the least information. A paragraph describing this work, starting "The quality and quantity of information for each tool is dependent on..." has been added.
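To make the 'article score' idea concrete, here is a minimal sketch of one way such a score could be computed and used to surface the thinnest articles. The scoring rule (count the filled-in form fields, then break ties by article length) is an assumption for illustration, not a description of how SEQwiki actually calculates it; `ToolArticle`, `article_score`, `least_documented`, and the field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToolArticle:
    """A tool page reduced to its form fields and free text (hypothetical layout)."""
    title: str
    fields: dict = field(default_factory=dict)
    body: str = ""


def article_score(article):
    """Assumed scoring rule: one point per non-empty form field."""
    return sum(1 for value in article.fields.values() if value)


def least_documented(articles, limit=10):
    """Return up to `limit` articles, lowest score first, shortest body first on ties."""
    ranked = sorted(articles, key=lambda a: (article_score(a), len(a.body)))
    return ranked[:limit]


if __name__ == "__main__":
    # Made-up entries used only to exercise the functions.
    pages = [
        ToolArticle("FullEntry", {"homepage": "x", "summary": "y", "language": "C"}, "long text"),
        ToolArticle("StubEntry", {"homepage": "x", "summary": "", "language": ""}),
        ToolArticle("PartialEntry", {"homepage": "x", "summary": "y", "language": ""}, "some"),
    ]
    for page in least_documented(pages):
        print(page.title, article_score(page))
```

Sorting ascending puts minimal entries (for example, title and homepage only) at the top of the list, which matches the goal of focusing effort on the tools with the least information.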

Comments

This reply sounds a bit too "schoolmasterly". It's clear that this is a wiki, but that does not automatically make the content correct. Perhaps increasing participation should be a goal. I don't know if it helps, but one could add a message at the top of each page: "Found any errors? Please edit this page!" --Mmartin 01:31, 14 September 2011 (PDT)
lol, right, I was trying to wind him up. Yeah, I'm never motivated by constant invitations to 'participate!' (although I'm guilty of doing it myself all over the place). Not sure, but we could try. --Dan 16:51, 20 September 2011 (PDT)
I reckon since we really put in some effort to improve several tools we should also add ... --Usad 09:54, 19 September 2011 (PDT)
Agreed, added above. --Dan 16:51, 20 September 2011 (PDT)

Point 4 (Our summary: The content is bad)

The same is true on the 'How to' sections, the SNP detection is about two-three paragraphs long and the advice it gives is very basic; seemingly anecdotal.

Reply: We have improved the 'How to' sections by adding a general discussion where this was lacking, and have further improved the descriptions of individual tools. Furthermore, we added additional walk-through examples (e.g. scaffolding). In addition, we added citations to papers that compare different tools at the bottom of the articles, and summarize the outcome in bold. However, we do agree that SNP finding in particular seems to be a moving target, as novel tools are appearing almost on a weekly basis. Moreover, in other areas clearer recommendations can be given because HTS tool authors at least compare their tools against existing solutions (as biased as this may be), but this is not generally the case for SNP detection software. Therefore we try to stick to general recommendations for now.

Comments

I warned you about these sections ;-) Who wants to fix it?- Dan.

Please check the reply. SNP detection is really THE bugbear: nobody ever seems to even bother comparing tools. - bjoern

Closing comments

Unfortunately I think that even a beginner bioinformaticians would be much better served by reading review papers or asking questions on forums such as Seqanswers Forums than trying to find answers on this site.

We certainly wouldn't discourage anyone from reading review papers or asking questions on forums such as Seqanswers Forums. SEQwiki was designed to act as a companion database for such resources, providing standardized, queryable data and links for the tools discussed in those resources.

A resource of this type is a good idea in principle, but it would need to be substantially improved both content wise as well from the point of view of the user experience.

Thanks for your comments on SEQwiki, which we value, have taken on board, and have endeavoured to resolve so far as is possible in the time available.
The SEQwiki project is an evolving system. It is among the first community-maintained databases for scientific information. Rather than integrate this effort within Wikipedia, we took the step of closely aligning with a strong community built around an existing forum. We additionally differ from Wikipedia by providing a standard form-based interface to an underlying database of information. Using this model, we have rapidly built a large, detailed database of information about tools for high throughput sequencing. Unlike a static database, users can extend the scope of SEQwiki, adding data on service providers, technologies, formats, and perhaps, in future, performance benchmarks and other cool features.
We hope that you will agree with us that such a system is worth supporting and developing further along the lines you have suggested.

Reviewer 2

Seqanswers appears to be a useful resource that has already built a strong user community. It is clearly suitable for the NAR database issue, and I look forward to watching its further development.

Thanks for these encouraging words.

Point 1 (Our summary: Don't forget the forums!)

My main critique of the manuscript is that by focusing mainly on the wiki it doesn't really do justice to the website as a whole. Although Seqanswers has been cited by others this appears to be the first publication about Seqanswers itself. Therefore, the manuscript would benefit from describing both the seqanswers community forum and the Seqanswers wiki. The paper could really use a simple descriptive introductory sentence saying that the resource consists of those two components. This could be done simply by rearranging the last paragraph on page 2.

Thanks for pointing out this omission. In our enthusiasm for the wiki, we forgot that SEQanswers has not yet been described in the literature, and deserves more attention in the manuscript. The section title has been changed to "The SEQanswers forum and the SEQanswers wiki: A credible HTS community", and we have added the paragraph beginning "As a supplement to traditional forms of scientific communication, SEQanswers offers...". Finally, we made a small change to the last paragraph (previously) on page 2, as suggested.

Point 2 (Our summary: The introduction misses the point)

The Introduction would benefit from having:

  • less of the generic web 2.0/distributed science benefits (which I assume will be covered by the editorial from Bateman) and,
In fact there is only one paragraph explicitly on this topic (the paragraph starting, "As international collaboration in the sciences increases ..."), which we feel is a reasonable amount of context for this independent work, even if there is some redundancy with the editorial. However, we have re-drafted the introduction with this comment in mind, and we would value your opinion on the new version.
  • more of an emphasis on how the web-based paradigm shift using things like wikis is especially important for the specific area covered by Seqanswers. It would also be nice to see some description of a few examples of how the community has used the forum/wiki.
We feel that the main drive of the Introduction is in fact the application of wikis to rapidly developing fields such as HTS. The idea of adding specific examples of how the wiki has been used by the community is a good one, but such examples aren't suitable for the Introduction.

I described how the community has used the FORUM! The wiki part is hard to write because the discussion would be vague (not because the wiki is not used), and there would not be any links to support it. As the reviewer asks for forum/wiki, I would rather write about the forum in detail.

Please paste the following WITH examples into the system. The text in the manuscript is WITHOUT the examples, because there would be too many.

https://docs.google.com/document/d/1KvC6C-tU8lv_dPYnuRwfgMxE4TbTNdOj9ZhaPoNI6uY/edit?hl=en_US

Remember when I said, treat this document as the reply? Never mind ;-) - Dan

Point 3 (Our summary: The discussion misses the point)

The discussion of the wiki is a bit too strong on implementation details and weak on content. For example, one useful aspect of Seqanswers wiki pages that is not mentioned is the link back to forum threads. Again, it would benefit from some discussion of an example.

Thanks for pointing this out. We have added a short description of the wiki from the user's perspective in the paragraph starting "The resulting page presents ...", including mention of the links back to the forum provided for every tool.

Referee: 3

The seqanswers wiki is an inspired attempt to develop a crowd sourced wiki of information about next generation and other sequencing - mainly on the bioinformatics side but also including a providers hub. It is rather partial, perhaps reflecting its novelty.

Content is sparse

The tools sections have received the most attention, and are relatively up-to-date. For many tools the only content is a link to the source/home page, and there is in general no edited community comment (on experienced utility).

We have worked extensively on improving the content of the wiki, as described above. Importantly, the page for each tool incorporates a link back to the SEQanswers forum, allowing users to discover discussion threads where the tool has been mentioned. In future, we plan to display this information more prominently on the article page.

The potentially useful section, that of workflows and recommendations is extremely thin, and personal, and does not make much reference to the comparisons and workflows that have been published already.

We have now improved the workflows and recommendation sections. Several also contain links to comparisons, where the main information is given in bold. We hope to be able to improve this even further in the future ... I guess ...

It would be good to have a keyword-linked melding with the fora etc in seqanswers itself...

In fact we do have keyword based links back to the forums for each tool, although currently this feature isn't very prominent. A paragraph mentioning this feature has been added to the manuscript, specifically, "At the foot of each page we include a standard table of links that allow the user to search for the tool in a variety of resources, including the SEQanswers Forum. By linking back to the forum in this way, users are guided back to community discussion about the tool."
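The table of search links described above can be pictured as a small set of URL templates filled in with the tool's name. The sketch below is only an illustration under that assumption: the function `search_links` and the `example.org` URL templates are placeholders, not the resources or addresses the wiki actually links to.

```python
from urllib.parse import quote_plus

# Placeholder templates; the real table links to the SEQanswers Forum and
# other resources whose URL formats are not reproduced here.
SEARCH_TEMPLATES = {
    "Forum": "https://forum.example.org/search?q={query}",
    "Literature": "https://papers.example.org/?term={query}",
}


def search_links(tool_name, templates=SEARCH_TEMPLATES):
    """Build one search URL per resource for the given tool name."""
    query = quote_plus(tool_name)  # escape spaces and punctuation for use in a URL
    return {label: url.format(query=query) for label, url in templates.items()}


if __name__ == "__main__":
    for label, url in search_links("CLC Genomic Workbench").items():
        print(label, url)
```

Because the links are generated from the page title alone, every tool page gets them automatically, even when the article itself contains little else.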

The file formats section is brilliant. The extraction and de-complexing of these formats is the bane of many new starts' lives.... however it would be good to have definitions for all of them, and not just the core group's favourites.

Currently there are 99 file formats listed in SEQwiki. Of the 30 reviewed in response to this criticism, only 10 were found to have user-contributed information. A further 10 were annotated by us during review. It is clear that we can put more emphasis on community participation in collecting important information on file formats, but we feel that we already have a good base level of information in the wiki, which should help encourage contribution here. It should be pointed out that the page for each file format automatically includes links to resources where further information may be found. In future we would like to integrate more closely with resources such as Wikipedia, where possible, and, importantly, we can make use of the format section of the EDAM ontology (http://bioportal.bioontology.org/flex/FlexoViz.html?ontology=45846#).

Dan to add plenty more file format descriptions. (We could look at the format section of the EDAM ontology for a curated list of file formats used in biology. - Andreas) Looks awesome! How do we link to the entry for a file format in EDAM? --Dan 15:36, 22 September 2011 (PDT)

The tags are bad (again)

There are some nice visual search tools, but these rely strongly on the correctness of the tags ascribed to each software page, and do not (to this user) necessarily lead to correct suggestions.

We have overhauled the tags to check for accuracy... Dan and Marco to do this.

The site is slow (again)

Technically, the server (? perhaps semanticmediawiki itself) is rather slow to respond, annoyingly so when updating, and there are issues with cached pages overwriting the current one.

This was a server issue, not a software issue, and it has now been resolved. Please see the reply above.

Is the wiki fit for purpose in the long run?

The proposals for the future are interesting. The concept that the wiki could be used for bugreports and support for specified packages is nice, but I would guess that sourceforge/googlecode forums will be more suited to this.

Perhaps, but having reports for all the tools in one place, with associated discussion is quite nice.
BAMSeek is already using SEQanswers as its main homepage. I still suggest to remove that sentence from the paper since I think the reviewer is right: I (as a developer) would never use a wiki for wish lists or bug reports. I don't think this is SEQwiki's task. --Mmartin 06:51, 13 September 2011 (PDT)
  • The concept of bug reporting is deleted in the manuscript. --Marcowanger 22:51, 22 September 2011 (PDT)

Full comments

On 2 September 2011 20:52,  <nardatabase@gmail.com> wrote:
> 02-Sep-2011
> NAR-02028-DATA-E-2011
> The SEQanswers wiki: A wiki database of tools for high throughput sequencing analysis.
>
> Dear Dr. Bolser
>
> Thank you for giving us the opportunity to consider your manuscript.
>
> The referees have raised substantial criticisms with respect to the database content, the slow response of the server, and the manuscript itself, which are detailed below. We will consider publishing your manuscript only if you can accommodate their suggestions in a revised version or explain satisfactorily why their comments are invalid. The revised version will be subject to another cycle of reviews by the same and/or additional reviewers.
>
> Detailed instructions for submitting your revised manuscript are provided below. Please read these instructions carefully as they differ from the original submission instructions and any error will delay the review of your manuscript. When you submit your revised manuscript, you should provide a concise point-by-point response to the referees' comments. Any text in the manuscript that you change or add in response to referee or Editor comments should be marked in red.
>
> The revised version must be uploaded within 20 days of the date of this letter.
>
> We look forward to receiving your revised manuscript.
>
> Yours sincerely,
>
> Michael Y. Galperin, PhD
> Executive Editor,
> NAR Database Issue
>
>
> **************
> Reviewers' Comments to Author
>
> (Line numbers mentioned in a report may not coincide with the original line numbers.)
>
> Referee: 1
> Comments for the Author
> In their paper titled "The SEQanswers wiki: A wiki database of tools for high throughput sequencing analysis." the authors present a wiki based website that is meant to store  information on a wide variety of bioinformatics tools.
>
> I found the site to be of substantially lower utility than I initially expected. First of all the navigation appears to be ill defined. For example browsing for a software tools makes users repeatedly click on tags, that, after several steps and choices may lead to only one or two very generic tools like "CLC Genomic Workbench". This suboptimal navigation when coupled to very slow load times and makes the aforementioned awkward navigation even more discouraging. There were instances where the page loaded over about 20 seconds! The problem of course is that at the end of suffering through slow load times, often a minute or two, there may be no useful information at all!
>
> Looking for known and existing tools does not seem to be any better. Even the software tools linked from the front page lead to a overly generic descriptions of the tool with links to the homepage of the tool. Notably some of the information even on these well known tools such as Bowtie or MAQ were occasionally incorrect while being severely lacking. The same is true on the 'How to' sections, the SNP detection is about two-three paragraphs long and the advice it gives is very basic; seemingly anecdotal.
>
> Unfortunately I think that even a beginner bioinformaticians would be much better served by reading review papers or asking questions on forums such as Seqanswers Forums  than trying to find answers on this site.
>
> A resource of this type is a good idea in principle, but it would need to be substantially improved both content wise as well from the point of view of the user experience.
>
> Referee: 2
> Comments for the Author
> Seqanswers appears to be a useful resource that has already built a strong user community.  It is clearly suitable for the NAR database issue, and I look forward to watching its further development.  My main critique of the  manuscript is that by focusing mainly on the wiki it doesn't really do justice to the website as a whole.  Although Seqanswers has been cited by others this appears to be the first publication about Seqanswers itself.  Therefore, the manuscript would benefit from describing both the seqanswers community forum and the Seqanswers wiki.  The paper could really use a simple descriptive introductory sentence saying that the resource consists of those two components.  This could be done simply by rearranging the last paragraph on page 2.
>
> The Introduction would benefit from having less of the generic web 2.0/distributed science benefits (which I assume will be covered by the editorial from Bateman) and more of an emphasis on how the web-based paradigm shift using things like wikis is especially important for the specific area covered by Seqanswers.  It would also be nice to see some description of a few examples of how the community has used the forum/wiki.
>
> The discussion of the wiki is a bit too strong on implementation details and weak on content.  For example, one useful aspect of Seqanswers wiki pages that is not mentioned is the link back to forum threads.  Again, it would benefit from some discussion of an example.
>
> Referee: 3
> Comments for the Author
> The seqanswers wiki is an inspired attempt to develop a crowd sourced wiki of information about next generation and other sequencing - mainly on the bioinformatics side but also including a providers hub. It is rather partial, perhaps reflecting its novelty.
>
> The tools sections have received the most attention, and are relatively up-to-date. For many tools the only content is a link to the source/home page, and there is in general no edited community comment (on experienced utility). The potentially useful section, that of workflows and recommendations is extremely thin, and personal, and does not make much reference to the comparisons and workflows that have been published already. It would be good to have a keyword-linked melding with the fora etc in seqanswers itself...
>
> There are some nice visual search tools, but these rely strongly on the correctness of the tags ascribed to each software page, and do not (to this user) necessarily lead to correct suggestions.
>
> The file formats section is brilliant. The extraction and de-complexing of these formats is the bane of many new starts' lives.... however it would be good to have definitions for all of them, and not just the core group's favourites.
>
> Technically, the server (? perhaps semanticmediawiki itself) is rather slow to respond, annoyingly so when updating, and there are issues with cached pages overwriting the current one.
>
> The proposals for the future are interesting. The concept that the wiki could be used for bugreports and support for specified packages is nice, but I would guess that sourceforge/googlecode forums will be more suited to this.
>
> **************