Clarification throughout the abstract and introduction #813

rasbt · 2018-01-23T21:07:40Z

Hi, all,

this is a really nice document listing a lot of interesting literature concerning deep learning and biomedicine. While I was reading it, I made some notes about certain wordings that I found a bit awkward and added suggested re-wordings to this PR in the hope that they will be helpful.

agitter · 2018-01-23T21:31:55Z

@rasbt Thanks for the suggestions and edits.

I'm going to leave this open for the time being. We submitted the current version for re-review at a journal and are discussing what we want to do next with the manuscript in #810. I'll return to this pull request once we make some progress in #810.

rasbt · 2018-01-23T22:15:17Z

That makes sense and sounds like a good plan! Hope the re-review goes smoothly!

agitter · 2018-03-01T15:50:11Z

@rasbt thanks again for your suggestions. The manuscript has been provisionally accepted (#820) so we are in somewhat uncharted territory regarding adding new contributors.

I propose that we merge these changes and add you to the acknowledgements for your contributions. Before proceeding, I wanted to confirm with you that this sounds fair. If so, @cgreene or I will review this pull request to comment on a couple of specific lines.

rasbt · 2018-03-01T15:56:01Z

Congratulations, that's great news! And sure, that's totally fine, I didn't expect to be e.g., a co-author for these relatively minor suggestions :)

cgreene

Some thoughts. I think there are a few revisions needed. But I didn't have good answers in many cases. I'm interested in @agitter's thoughts.

cgreene · 2018-03-01T18:22:39Z

content/01.abstract.md

+We examine applications of deep learning to a variety of biomedical problems---patient classification, fundamental biological processes, and treatment of patients---and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges.
+As a result from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art.
+Even though improvements over previous baselines have been modest, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation.
+However, deep learning models are still regarded as black box algorithms, and more work is needed to address the common concerns related to interpretability and how to best model each problem.


I'm a bit reluctant to use the terminology black box here. I feel it's overused and under-justified, when many other methods that don't get this terminology applied have similar issues. What about:

Though progress has been made in determining the primary factors that lead a specific deep neural network to make a specific prediction in a certain case, understanding how users should interpret these models to make specific mechanistic hypotheses remains an open challenge.

☝️ I'm not a huge fan of that, but I feel like the sentiment is what I'm going for so I'll put it there.

I like the sentiment as well, but "determining the primary factors that lead a specific deep neural network to make a specific prediction in a certain case" is quite verbose.

What about this:

Though progress has been made linking a specific neural networks prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge.

👍 but add ' in network's

cgreene · 2018-03-01T18:29:50Z

content/01.abstract.md

-We find that deep learning has yet to revolutionize or definitively resolve any of these problems, but promising advances have been made on the prior state of the art.
-Even when improvement over a previous baseline has been modest, we have seen signs that deep learning methods may speed or aid human investigation.
-More work is needed to address concerns related to interpretability and how to best model each problem.
+Deep learning, which describes a class of machine learning algorithms focussing on the training of deep artificial neural networks, has recently shown impressive results across a variety of domains.


I think we specifically left artificial neural network algorithms out of our definitions in many cases. What I think is more important is that these methods can work on raw data to produce intermediate features that are then used for some subsequent task. What about:

Deep learning describes a class of machine learning algorithms that are capable of learning to combine relatively raw inputs into layers of intermediate features, and such algorithms now perform impressively across a variety of domains.

@cgreene that version has good ideas, but I think it would work better as 2 sentences. How about:

Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms now perform impressively across a variety of domains.

I'm not sure why, but the longer "recently shown impressive results across a variety of domains" sounds more natural to me.

I agree with "recently shown impressive results across a variety of domains". Another suggestion would be "produces remarkable results across a variety of domains compared to more traditional methods", but I would prefer the former.

Let's go with the more compact "recently shown impressive results across a variety of domains".

cgreene · 2018-03-01T18:36:18Z

content/01.abstract.md

-Even when improvement over a previous baseline has been modest, we have seen signs that deep learning methods may speed or aid human investigation.
-More work is needed to address concerns related to interpretability and how to best model each problem.
+Deep learning, which describes a class of machine learning algorithms focussing on the training of deep artificial neural networks, has recently shown impressive results across a variety of domains.
+Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well-suited to solve problems of these fields.


I don't really like this (or our initial framing). What about:

Deep learning techniques may be particularly well suited for challenges in biology and medicine, which are data-rich disciplines where there is often a complex biological system between what we can measure and what we wish to know.

☝️ needs revision too, but I vaguely like the direction it's going in

I'm okay with the @rasbt version here but am happy to switch it if you want.

cgreene · 2018-03-01T18:36:42Z

content/02.intro.md

@@ -7,11 +7,11 @@ Automated algorithms that extract meaningful patterns could lead to actionable k

 The term _deep learning_ has come to refer to a collection of new techniques that, together, have demonstrated breakthrough gains over existing best-in-class machine learning algorithms across several fields.
 For example, over the past five years these methods have revolutionized image classification and speech recognition due to their flexibility and high accuracy [@doi:10.1038/nature14539].
-More recently, deep learning algorithms have shown promise in fields as diverse as high-energy physics [@doi:10.1038/ncomms5308], dermatology [@doi:10.1038/nature21056], and translation among written languages [@arxiv:1609.08144].
+More recently, deep learning algorithms have shown promise in fields as diverse as high-energy physics [@doi:10.1038/ncomms5308], computational chemistry [@doi:10.1002/jcc.24764], dermatology [@doi:10.1038/nature21056], and translation among written languages [@arxiv:1609.08144].


👍

Also noting that we cite this review later in the intro.

cgreene · 2018-03-01T18:36:54Z

content/02.intro.md

 Across fields, "off-the-shelf" implementations of these algorithms have produced comparable or higher accuracy than previous best-in-class methods that required years of extensive customization, and specialized implementations are now being used at industrial scales.

-Deep learning approaches grew from research in neural networks, which were first proposed in 1943 [@doi:10.1007/BF02478259] as a model for how our brains process information.
-The history of neural networks is interesting in its own right [@doi:10.1103/RevModPhys.34.135].
+Deep learning approaches grew from research on artificial neurons, which were first proposed in 1943 [@doi:10.1007/BF02478259] as a model for how the neurons in a biological brain process information.


cgreene · 2018-03-01T18:37:01Z

content/02.intro.md

@@ -26,11 +26,11 @@ In particular, deep learning approaches can be used both in *supervised* applica
 Deep learning methods may in fact combine both of these steps.
 When sufficient data are available and labeled, these methods construct features tuned to a specific problem and combine those features into a predictor.
 In fact, if the dataset is "labeled" with binary classes, a simple neural network with no hidden layers and no cycles between units is equivalent to logistic regression if the output layer is a sigmoid (logistic) function of the input layer.
-Similarly, for continuous outcomes, linear regression can be seen as a simple neural network.
-Thus, in some ways, supervised deep learning approaches can be seen as a generalization of regression models that allow for greater flexibility.
+Similarly, for continuous outcomes, linear regression can be seen as a single-layer neural network.


cgreene · 2018-03-01T18:37:08Z

content/02.intro.md

 Recently, hardware improvements and very large training datasets have allowed these deep learning techniques to surpass other machine learning algorithms for many problems.
 In a famous and early example, scientists from Google demonstrated that a neural network "discovered" that cats, faces, and pedestrians were important components of online videos [@url:http://research.google.com/archive/unsupervised_icml2012.html] without being told to look for them.
-What if, more generally, deep learning could solve the challenges presented by the growth of data in biomedicine? Could these algorithms identify the "cats" hidden in our data---the patterns unknown to the researcher---and suggest ways to act on them? In this review, we examine deep learning's application to biomedical science and discuss the unique challenges that biomedical data pose for deep learning methods.
+What if, more generally, deep learning take advantage of the growth of data in biomedicine to tackle challenges in this field? Could these algorithms identify the "cats" hidden in our data---the patterns unknown to the researcher---and suggest ways to act on them? In this review, we examine deep learning's application to biomedical science and discuss the unique challenges that biomedical data pose for deep learning methods.


agitter

Good changes overall. We usually wouldn't be so picky, but because this is the final version of the abstract let's continue to work on some of the phrasing together before merging. We also need to keep the total word count <= 200.

agitter · 2018-03-02T12:19:17Z

content/01.abstract.md

-We find that deep learning has yet to revolutionize or definitively resolve any of these problems, but promising advances have been made on the prior state of the art.
-Even when improvement over a previous baseline has been modest, we have seen signs that deep learning methods may speed or aid human investigation.
-More work is needed to address concerns related to interpretability and how to best model each problem.
+Deep learning, which describes a class of machine learning algorithms focussing on the training of deep artificial neural networks, has recently shown impressive results across a variety of domains.


@cgreene that version has good ideas, but I think it would work better as 2 sentences. How about:

Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms now perform impressively across a variety of domains.

I'm not sure why, but the longer "recently shown impressive results across a variety of domains" sounds more natural to me.

agitter · 2018-03-02T12:22:52Z

content/01.abstract.md

-Even when improvement over a previous baseline has been modest, we have seen signs that deep learning methods may speed or aid human investigation.
-More work is needed to address concerns related to interpretability and how to best model each problem.
+Deep learning, which describes a class of machine learning algorithms focussing on the training of deep artificial neural networks, has recently shown impressive results across a variety of domains.
+Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well-suited to solve problems of these fields.


I'm okay with the @rasbt version here but am happy to switch it if you want.

agitter · 2018-03-02T12:23:59Z

content/01.abstract.md

+Deep learning, which describes a class of machine learning algorithms focussing on the training of deep artificial neural networks, has recently shown impressive results across a variety of domains.
+Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well-suited to solve problems of these fields.
+We examine applications of deep learning to a variety of biomedical problems---patient classification, fundamental biological processes, and treatment of patients---and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges.
+As a result from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art.


Could switch "As a result from" to "Following" to help the word count.

agitter · 2018-03-02T12:25:53Z

content/01.abstract.md

+Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well-suited to solve problems of these fields.
+We examine applications of deep learning to a variety of biomedical problems---patient classification, fundamental biological processes, and treatment of patients---and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges.
+As a result from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art.
+Even though improvements over previous baselines have been modest, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation.


Perhaps "have been modest in general"? There have been a few impressive exceptions since we first drafted the abstract.

agitter · 2018-03-02T12:27:49Z

content/01.abstract.md

+We examine applications of deep learning to a variety of biomedical problems---patient classification, fundamental biological processes, and treatment of patients---and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges.
+As a result from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art.
+Even though improvements over previous baselines have been modest, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation.
+However, deep learning models are still regarded as black box algorithms, and more work is needed to address the common concerns related to interpretability and how to best model each problem.


I like the sentiment as well, but "determining the primary factors that lead a specific deep neural network to make a specific prediction in a certain case" is quite verbose.

agitter · 2018-03-02T12:29:53Z

content/citation-tags.tsv

@@ -80,6 +80,7 @@ Gerstein2016_scaling	doi:10.1186/s13059-016-0917-0
 Ghandi2014_enhanced	doi:10.1371/journal.pcbi.1003711
 Ghosh1992_sequence	doi:10.1117/12.140112
 Glorot2011_domain	url:http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3442
+Goh2017_compchemistry	doi:10.1002/jcc.24764


We can remove the tag if the DOI is being used in the reference above. The tag is only needed when we use it directly in the citation from Markdown.
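
For illustration, a check along these lines could flag tags that are defined but never cited. This is only a rough sketch, not part of the repository's tooling: it assumes the TSV has one `tag<TAB>citation` pair per line and that tags are cited from Markdown as `@tag:<name>`. The function name `unused_tags` and the sample Markdown sentence are hypothetical.

```python
import re


def unused_tags(tsv_text, markdown_text):
    """Return tags defined in the TSV that never appear as @tag:<name> in the Markdown.

    Assumes one "tag<TAB>citation" pair per line (hypothetical helper).
    """
    defined = []
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        tag = line.split("\t", 1)[0].strip()
        defined.append(tag)
    # Assumption: tags are referenced in Markdown with an "@tag:" prefix.
    cited = set(re.findall(r"@tag:([A-Za-z0-9_\-]+)", markdown_text))
    return [tag for tag in defined if tag not in cited]


# Two rows taken from the diff above; the Markdown sentence is made up for the example.
tsv_example = (
    "Ghosh1992_sequence\tdoi:10.1117/12.140112\n"
    "Goh2017_compchemistry\tdoi:10.1002/jcc.24764\n"
)
markdown_example = (
    "computational chemistry [@doi:10.1002/jcc.24764] "
    "and earlier work [@tag:Ghosh1992_sequence]."
)
result = unused_tags(tsv_example, markdown_example)
```

Here `Goh2017_compchemistry` is reported as unused because the text cites the DOI directly, which is the situation described in the comment above.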

agitter · 2018-03-02T12:30:54Z

content/02.intro.md

@@ -7,11 +7,11 @@ Automated algorithms that extract meaningful patterns could lead to actionable k

 The term _deep learning_ has come to refer to a collection of new techniques that, together, have demonstrated breakthrough gains over existing best-in-class machine learning algorithms across several fields.
 For example, over the past five years these methods have revolutionized image classification and speech recognition due to their flexibility and high accuracy [@doi:10.1038/nature14539].
-More recently, deep learning algorithms have shown promise in fields as diverse as high-energy physics [@doi:10.1038/ncomms5308], dermatology [@doi:10.1038/nature21056], and translation among written languages [@arxiv:1609.08144].
+More recently, deep learning algorithms have shown promise in fields as diverse as high-energy physics [@doi:10.1038/ncomms5308], computational chemistry [@doi:10.1002/jcc.24764], dermatology [@doi:10.1038/nature21056], and translation among written languages [@arxiv:1609.08144].


👍

Also noting that we cite this review later in the intro.

agitter · 2018-03-02T12:32:21Z

content/02.intro.md

-Similarly, for continuous outcomes, linear regression can be seen as a simple neural network.
-Thus, in some ways, supervised deep learning approaches can be seen as a generalization of regression models that allow for greater flexibility.
+Similarly, for continuous outcomes, linear regression can be seen as a single-layer neural network.
+Thus, in some ways, supervised deep learning approaches can be seen as a extension of regression models that allow for greater flexibility.


an extension

maybe as an add-on to that sentence: "and are especially well-suited for modeling non-linear relationships among the input features."
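
To make the regression analogy in this passage concrete, here is a minimal sketch in plain Python. It assumes a single input feature, and the toy data and all function names are hypothetical. One output unit with no hidden layers and an identity activation, trained by gradient descent on mean squared error, recovers the ordinary least-squares fit. Swapping in a sigmoid activation gives the prediction function of logistic regression.

```python
import math


def identity(z):
    return z


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def single_layer_unit(x, w, b, activation):
    """One output unit with no hidden layers: activation(w * x + b)."""
    return activation(w * x + b)


def fit_linear_unit(xs, ys, lr=0.05, steps=5000):
    """Train the identity-activation unit by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [single_layer_unit(x, w, b, identity) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fit_least_squares(xs, ys):
    """Closed-form simple linear regression, for comparison."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b = mean_y - w * mean_x
    return w, b


# Toy data (made up for the example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

net_w, net_b = fit_linear_unit(xs, ys)
ols_w, ols_b = fit_least_squares(xs, ys)

# With a sigmoid activation, the same unit outputs a probability,
# which is the form of a logistic regression prediction.
prob = single_layer_unit(0.0, 1.0, 0.0, sigmoid)
```

The two fits agree up to numerical tolerance. The extra flexibility mentioned in the manuscript sentence comes from adding hidden layers and non-linear activations between input and output.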

cgreene · 2018-03-02T19:54:28Z

@rasbt : do you want to make the final changes and then we'll squash merge? I am happy to approve after @agitter's revisions. 😄

rasbt · 2018-03-02T23:54:52Z

Sure, happy to do that and will get to it later this evening

rasbt · 2018-03-03T02:53:24Z

Just addressed all the comments in the last commit. You probably need to trim the abstract quite a bit if the limit is 200 rather than 250 words, but I think it was already over the limit before these changes ;)

agitter

Thanks for the revisions. Everything looks great to me now. You're right that we were already over the abstract word count. We haven't worried about going a few (thousand) words over our limits before, so no need to start now.

I pushed a new commit acknowledging @rasbt so that we can merge that at the same time.

@cgreene please merge if you approve these changes.

This build is based on 588cfa6.

This commit was created by the following Travis CI build and job:
https://travis-ci.org/greenelab/deep-review/builds/349261851
https://travis-ci.org/greenelab/deep-review/jobs/349261852

[ci skip]

The full commit message that triggered this build is copied below:

Clarification throughout the abstract and introduction (#813)

* suggested changes to the abstract
* suggested improv in intro sec
* fix line split
* address comments
* Add acknowledgement

rasbt added 3 commits January 23, 2018 15:50

suggested changes to the abstract

834462e

suggested improv in intro sec

d0a037e

fix line split

ac60661

cgreene mentioned this pull request Mar 1, 2018

Provisional acceptance, but some minor changes still to be made - quickly! #820

Closed

cgreene reviewed Mar 1, 2018

agitter requested changes Mar 2, 2018

address comments

a74ddec

agitter added 2 commits March 3, 2018 06:28

Merge branch 'master' into clarify-abstract-intro

207a27a

Add acknowledgement

4d6d20d

agitter approved these changes Mar 3, 2018

cgreene merged commit 588cfa6 into greenelab:master Mar 5, 2018
