Difference between revisions of "Publication/Paper (NAR 2012)/Reply"

From SEQwiki

Revision as of 23:49, 22 September 2011

  • Above: Points and tasks
  • Below: Full comments for context


Rebuttal

Reviewer 1

In their paper titled "The SEQanswers wiki: A wiki database of tools for high throughput sequencing analysis." the authors present a wiki based website that is meant to store information on a wide variety of bioinformatics tools.

I found the site to be of substantially lower utility than I initially expected.

We hope we have adequately addressed your major criticisms in our point by point reply, below.

Point 1 (Our summary: Browsing is bad)

First of all the navigation appears to be ill defined. For example browsing for a software tools makes users repeatedly click on tags, that, after several steps and choices may lead to only one or two very generic tools like "CLC Genomic Workbench".

Currently, 'tag-based' navigation is the predominant way we allow users to browse the wiki. However, we do provide a tabular overview of all tools (http://seqanswers.com/wiki/Software/list), and of course, a free text search.
In response to this criticism, we have implemented an 'advanced search' form (http://seqanswers.com/wiki/Special:RunQuery/Bioinformatics_application/Query) that allows the user to more quickly perform a very specific query of the database. In addition, we are actively working on improving the tags to more accurately reflect the functionality of the tools (see below for more details).

Comments

Could we base "Bioinformatics methods" and "Biological domains" on modified EDAM "Operation" and "Topic" tags? Andreas

Andreas, I'm not sure what you mean. Could you explain a bit more? Sorry, not enough sleep these days. Marcowanger 21:54, 12 September 2011 (PDT)

I thought we could look at the keywords used in the EDAM ontology when we are cleaning up the tags. It would be easier for users to see the difference between biological domains and bioinformatics methods if we showed example tags, and we could use some parts of the ontology for this. The EDAM ontology has keywords for the following sub-categories.

Biological domains (Topic)
Classification, Data handling, Genes, Genomics, Genotype and phenotype, Informatics, Laboratory resources, Literature and documentation, Microarrays, Nucleic acids, Ontologies and nomenclature, Organism, Pathways, networks and models, Phylogenetics, Proteins, Proteomics, Sequence, Sequence and mapping, Structure
Bioinformatics methods (Operation)
Alignment, Analysis and processing, Annotation, Classification, Comparison, Demonstration, Design, Mapping and assembly, Modelling and simulation, Optimisation and refinement, Plotting and rendering, Prediction, detection and recognition, Search and retrieval, Validation and standardisation

However, for now I think it is enough to go through existing tagging. Andreas

Looks great! This has been on the todo list for ages. I'll not get round to it in time for the 'reply' though :( --Dan 16:51, 20 September 2011 (PDT)

Point 2 (Our summary: The site is slow)

This suboptimal navigation when coupled to very slow load times and makes the aforementioned awkward navigation even more discouraging. There were instances where the page loaded over about 20 seconds! The problem of course is that at the end of suffering through slow load times, often a minute or two, there may be no useful information at all!

We are sorry that we caused you to be discouraged. The slow response times were caused by an unfortunately timed technical issue with the server, which we have now resolved. As evidence, please see the image of average page load times provided by Google webmaster tools.

Google crawl time graph.png

Point 3 (Our summary: The content is bad)

Looking for known and existing tools does not seem to be any better. Even the software tools linked from the front page lead to a overly generic descriptions of the tool with links to the homepage of the tool. Notably some of the information even on these well known tools such as Bowtie or MAQ were occasionally incorrect while being severely lacking

Reply One of the wonderful things about wikis is that incorrect or insubstantial information, when discovered, can be easily corrected or amended by anyone, with minimal effort. Please provide details of any specific problems, and we will be happy to correct them for you, because just bitching about non-specific problems doesn't cut it in the open source community. Yes, I do feel better ;-)
Alternate reply Like all wikis, the quality and quantity of information for each article is dependent on community participation. We agree that there is plenty of room for improvement in this regard. We have carefully reviewed many of the articles ourselves, adding and improving content wherever possible. Since the paper was submitted, there have been over 1000 edits, adding more than 260 kb of content.
Generally, we are of the opinion that a minimal entry for a given tool, for example, just listing the title and homepage, is better than no entry at all. However, to address this criticism, we have implemented an 'article score', reflecting how much information is available for a given tool. Short articles are prominently displayed, and can be sorted by size or 'score', allowing us (and hopefully the community) to focus on the tools with the least information. A paragraph describing this work, starting "The quality and quantity of information for each tool is dependent on..." has been added.
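
To make the 'article score' idea concrete for anyone helping with the clean-up, here is a minimal sketch of how such a score could be computed and used to list the thinnest articles first. It is not the actual SEQwiki implementation: the field names in FIELDS, the ToolArticle class and both functions are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical field names; the real form fields used on the wiki may differ.
FIELDS = ("homepage", "description", "tags", "references")


@dataclass
class ToolArticle:
    title: str
    size_bytes: int
    fields: dict = field(default_factory=dict)


def article_score(article: ToolArticle) -> int:
    """Count how many of the tracked fields hold a non-empty value."""
    return sum(1 for name in FIELDS if article.fields.get(name))


def least_documented(articles, limit=10):
    """Return the lowest-scoring articles first, breaking ties by page size."""
    ranked = sorted(articles, key=lambda a: (article_score(a), a.size_bytes))
    return ranked[:limit]
```

Sorting on the pair (score, size) means an entry with only a title and homepage surfaces before a longer article with the same number of filled fields, which matches the stated aim of focusing on the tools with the least information.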

Comments

This reply sounds a bit too "schoolmasterly". It's clear that this is a wiki, but that does not automatically make the content correct. Perhaps increasing participation should be a goal. I don't know if it helps, but one could add a message at the top of each page: "Found any errors? Please edit this page!" --Mmartin 01:31, 14 September 2011 (PDT)
lol, right, I was trying to wind him up. Yeah, I'm never motivated by constant invitations to 'participate!' (although I'm guilty of doing it myself all over the place). Not sure, but we could try. --Dan 16:51, 20 September 2011 (PDT)
I reckon since we really put in some effort to improve several tools we should also add ... --Usad 09:54, 19 September 2011 (PDT)
Agreed, added above. --Dan 16:51, 20 September 2011 (PDT)

Point 4 (Our summary: The content is bad)

The same is true on the 'How to' sections, the SNP detection is about two-three paragraphs long and the advice it gives is very basic; seemingly anecdotal.

I warned you about these sections ;-) Who wants to fix it? - Dan.

Working on it (offline). SNP detection is, however, really not my area of expertise, so any help is appreciated. bjoern

Closing comments

Unfortunately I think that even a beginner bioinformaticians would be much better served by reading review papers or asking questions on forums such as Seqanswers Forums than trying to find answers on this site.

We certainly wouldn't discourage anyone from reading review papers or asking questions on forums such as Seqanswers Forums. SEQwiki was designed to act as a companion database for such resources, providing standardized, queryable data and links for the tools discussed in those resources.

A resource of this type is a good idea in principle, but it would need to be substantially improved both content wise as well from the point of view of the user experience.

Thanks for your comments on SEQwiki, which we value, have taken on board, and have endeavoured to resolve so far as is possible in the time available.
The SEQwiki project is an evolving system. It is among the first community-maintained databases for scientific information. Rather than integrate this effort within Wikipedia, we took the step of closely aligning with a strong community built around an existing forum. We additionally differ from Wikipedia by providing a standard form-based interface to an underlying database of information. Using this model, we have rapidly built a large, detailed database of information about tools for high throughput sequencing. Unlike with a static database, users can extend the scope of SEQwiki, adding data on service providers, technologies, formats, and perhaps, in future, performance benchmarks and other features.
We hope that you will agree with us that such a system is worth supporting and developing further along the lines you have suggested.

Reviewer 2

Seqanswers appears to be a useful resource that has already built a strong user community. It is clearly suitable for the NAR database issue, and I look forward to watching its further development.

Thanks for these encouraging words.

Point 1 (Our summary: Don't forget the forums!)

My main critique of the manuscript is that by focusing mainly on the wiki it doesn't really do justice to the website as a whole. Although Seqanswers has been cited by others this appears to be the first publication about Seqanswers itself. Therefore, the manuscript would benefit from describing both the seqanswers community forum and the Seqanswers wiki. The paper could really use a simple descriptive introductory sentence saying that the resource consists of those two components. This could be done simply by rearranging the last paragraph on page 2.

Thanks for pointing out this omission. In our enthusiasm for the wiki, we forgot that SEQanswers has not yet been described in the literature, and deserves more attention in the manuscript. The section title has been changed to "The SEQanswers forum and the SEQanswers wiki: A credible HTS community", and we have added the paragraph beginning "As a supplement to traditional forms of scientific communication, SEQanswers offers...". Finally, we made a small change to the last paragraph (previously) on page 2, as suggested.

Point 2 (Our summary: The introduction misses the point)

The Introduction would benefit from having:

  • less of the generic web 2.0/distributed science benefits (which I assume will be covered by the editorial from Bateman) and,
In fact there is only one paragraph explicitly on this topic (the paragraph starting, "As international collaboration in the sciences increases ..."), which we feel is a reasonable amount of context for this independent work, even if there is some redundancy with the editorial. However, we have re-drafted the introduction with this comment in mind, and we would value your opinion on the new version.
  • more of an emphasis on how the web-based paradigm shift using things like wikis is especially important for the specific area covered by Seqanswers. It would also be nice to see some description of a few examples of how the community has used the forum/wiki.
We feel that the main drive of the Introduction is in fact the application of wikis to rapidly developing fields such as HTS. The idea of adding specific examples of how the wiki has been used by the community is a good one, although we feel it isn't suitable for the Introduction. Somewhere we have added a few examples of how the wiki has been used as 1) a catalogue of tools for a certain analysis, 2) a place to store information on a specific tool, 3) a repository for URLs and references... Marco is working on this.

Point 3 (Our summary: The discussion misses the point)

The discussion of the wiki is a bit too strong on implementation details and weak on content. For example, one useful aspect of Seqanswers wiki pages that is not mentioned is the link back to forum threads. Again, it would benefit from some discussion of an example.

Thanks for pointing this out. We have added a short description of the wiki from the user's perspective in the paragraph starting "The resulting page presents ...", including mention of the links back to the forum provided for every tool.

Referee: 3

The seqanswers wiki is an inspired attempt to develop a crowd sourced wiki of information about next generation and other sequencing - mainly on the bioinformatics side but also including a providers hub. It is rather partial, perhaps reflecting its novelty.

Content is sparse

The tools sections have received the most attention, and are relatively up-to-date. For many tools the only content is a link to the source/home page, and there is in general no edited community comment (on experienced utility).

Tough to answer.

The potentially useful section, that of workflows and recommendations is extremely thin, and personal, and does not make much reference to the comparisons and workflows that have been published already.

We have now improved the workflows and recommendations sections. Several also contain links to published comparisons, with the main information given in bold. We hope to be able to improve this even further in the future.

It would be good to have a keyword-linked melding with the fora etc in seqanswers itself...

We do have links to the forums (that we forgot to mention in the text), Dan to add this. See the para starting...

The file formats section is brilliant. The extraction and de-complexing of these formats is the bane of many new starts' lives.... however it would be good to have definitions for all of them, and not just the core group's favourites.

Dan to add plenty more file format descriptions. (We could look at the format section of the EDAM ontology for a curated list of file formats used in biology. - Andreas)

The tags are bad (again)

There are some nice visual search tools, but these rely strongly on the correctness of the tags ascribed to each software page, and do not (to this user) necessarily lead to correct suggestions.

We have overhauled the tags to check for accuracy... Dan and Marco to do this.

The site is slow (again)

Technically, the server (? perhaps semanticmediawiki itself) is rather slow to respond, annoyingly so when updating, and there are issues with cached pages overwriting the current one.

This was a server issue, not a software issue, and it has now been resolved. Please see reply above.

Is the wiki fit for purpose in the long run?

The proposals for the future are interesting. The concept that the wiki could be used for bugreports and support for specified packages is nice, but I would guess that sourceforge/googlecode forums will be more suited to this.

Perhaps, but having reports for all the tools in one place, with associated discussion is quite nice.
BAMSeek is already using SEQanswers as its main homepage. I still suggest removing that sentence from the paper, since I think the reviewer is right: I (as a developer) would never use a wiki for wish lists or bug reports. I don't think this is SEQwiki's task. --Mmartin 06:51, 13 September 2011 (PDT)

Full comments

On 2 September 2011 20:52,  <nardatabase@gmail.com> wrote:
> 02-Sep-2011
> NAR-02028-DATA-E-2011
> The SEQanswers wiki: A wiki database of tools for high throughput sequencing analysis.
>
> Dear Dr. Bolser
>
> Thank you for giving us the opportunity to consider your manuscript.
>
> The referees have raised substantial criticisms with respect to the database content, the slow response of the server, and the manuscript itself, which are detailed below. We will consider publishing your manuscript only if you can accommodate their suggestions in a revised version or explain satisfactorily why their comments are invalid. The revised version will be subject to another cycle of reviews by the same and/or additional reviewers.
>
> Detailed instructions for submitting your revised manuscript are provided below. Please read these instructions carefully as they differ from the original submission instructions and any error will delay the review of your manuscript. When you submit your revised manuscript, you should provide a concise point-by-point response to the referees' comments. Any text in the manuscript that you change or add in response to referee or Editor comments should be marked in red.
>
> The revised version must be uploaded within 20 days of the date of this letter.
>
> We look forward to receiving your revised manuscript.
>
> Yours sincerely,
>
> Michael Y. Galperin, PhD
> Executive Editor,
> NAR Database Issue
>
>
> **************
> Reviewers' Comments to Author
>
> (Line numbers mentioned in a report may not coincide with the original line numbers.)
>
> Referee: 1
> Comments for the Author
> In their paper titled "The SEQanswers wiki: A wiki database of tools for high throughput sequencing analysis." the authors present a wiki based website that is meant to store  information on a wide variety of bioinformatics tools.
>
> I found the site to be of substantially lower utility than I initially expected. First of all the navigation appears to be ill defined. For example browsing for a software tools makes users repeatedly click on tags, that, after several steps and choices may lead to only one or two very generic tools like "CLC Genomic Workbench". This suboptimal navigation when coupled to very slow load times and makes the aforementioned awkward navigation even more discouraging. There were instances where the page loaded over about 20 seconds! The problem of course is that at the end of suffering through slow load times, often a minute or two, there may be no useful information at all!
>
> Looking for known and existing tools does not seem to be any better. Even the software tools linked from the front page lead to a overly generic descriptions of the tool with links to the homepage of the tool. Notably some of the information even on these well known tools such as Bowtie or MAQ were occasionally incorrect while being severely lacking. The same is true on the 'How to' sections, the SNP detection is about two-three paragraphs long and the advice it gives is very basic; seemingly anecdotal.
>
> Unfortunately I think that even a beginner bioinformaticians would be much better served by reading review papers or asking questions on forums such as Seqanswers Forums  than trying to find answers on this site.
>
> A resource of this type is a good idea in principle, but it would need to be substantially improved both content wise as well from the point of view of the user experience.
>
> Referee: 2
> Comments for the Author
> Seqanswers appears to be a useful resource that has already built a strong user community.  It is clearly suitable for the NAR database issue, and I look forward to watching its further development.  My main critique of the  manuscript is that by focusing mainly on the wiki it doesn't really do justice to the website as a whole.  Although Seqanswers has been cited by others this appears to be the first publication about Seqanswers itself.  Therefore, the manuscript would benefit from describing both the seqanswers community forum and the Seqanswers wiki.  The paper could really use a simple descriptive introductory sentence saying that the resource consists of those two components.  This could be done simply by rearranging the last paragraph on page 2.
>
> The Introduction would benefit from having less of the generic web 2.0/distributed science benefits (which I assume will be covered by the editorial from Bateman) and more of an emphasis on how the web-based paradigm shift using things like wikis is especially important for the specific area covered by Seqanswers.  It would also be nice to see some description of a few examples of how the community has used the forum/wiki.
>
> The discussion of the wiki is a bit too strong on implementation details and weak on content.  For example, one useful aspect of Seqanswers wiki pages that is not mentioned is the link back to forum threads.  Again, it would benefit from some discussion of an example.
>
> Referee: 3
> Comments for the Author
> The seqanswers wiki is an inspired attempt to develop a crowd sourced wiki of information about next generation and other sequencing - mainly on the bioinformatics side but also including a providers hub. It is rather partial, perhaps reflecting its novelty.
>
> The tools sections have received the most attention, and are relatively up-to-date. For many tools the only content is a link to the source/home page, and there is in general no edited community comment (on experienced utility). The potentially useful section, that of workflows and recommendations is extremely thin, and personal, and does not make much reference to the comparisons and workflows that have been published already. It would be good to have a keyword-linked melding with the fora etc in seqanswers itself...
>
> There are some nice visual search tools, but these rely strongly on the correctness of the tags ascribed to each software page, and do not (to this user) necessarily lead to correct suggestions.
>
> The file formats section is brilliant. The extraction and de-complexing of these formats is the bane of many new starts' lives.... however it would be good to have definitions for all of them, and not just the core group's favourites.
>
> Technically, the server (? perhaps semanticmediawiki itself) is rather slow to respond, annoyingly so when updating, and there are issues with cached pages overwriting the current one.
>
> The proposals for the future are interesting. The concept that the wiki could be used for bugreports and support for specified packages is nice, but I would guess that sourceforge/googlecode forums will be more suited to this.
>
> **************