Skip to main content

Extracting Core Claims from Scientific Articles

  • Conference paper
  • First Online:
BNAIC 2016: Artificial Intelligence (BNAIC 2016)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 765))

Included in the following conference series:

Abstract

The number of scientific articles has grown rapidly over the years and there are no signs that this growth will slow down in the near future. Because of this, it becomes increasingly difficult to keep up with the latest developments in a scientific field. To address this problem, we present here an approach to help researchers learn about the latest developments and findings by extracting in a normalized form core claims from scientific articles. This normalized representation is a controlled natural language of English sentences called AIDA, which has been proposed in previous work as a method to formally structure and organize scientific findings and discourse. We show how such AIDA sentences can be automatically extracted by detecting the core claim of an article, checking for AIDA compliance, and – if necessary – transforming it into a compliant sentence. While our algorithm is still far from perfect, our results indicate that the different steps are feasible and they support the claim that AIDA sentences might be a promising approach to improve scientific communication in the future.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

') var buybox = document.querySelector("[data-id=id_"+ timestamp +"]").parentNode var buyingOptions = buybox.querySelectorAll(".buying-option") ;[].slice.call(buyingOptions).forEach(initCollapsibles) // springerPlus roll out 10% starts here var springerPlusGroup = setLocalStorageSpringerPlus(); var rollOutSpringerPlus = springerPlusGroup === "B" function setLocalStorageSpringerPlus() { var selectUserKey = "springerPlusRollOut"; var springerPlusGroup = "X"; if (!window.localStorage) return springerPlusGroup; try { var selectUserValue = window.localStorage.getItem(selectUserKey) springerPlusGroup = selectUserValue || randomDistributionSpringerPlus(selectUserKey) } catch (err) { console.log(err) } return springerPlusGroup; } function randomDistributionSpringerPlus(selectUserKey) { var randomGroup = Math.random() < 0.9 ? "A" : "B" window.localStorage.setItem(selectUserKey, randomGroup) return randomGroup } if (rollOutSpringerPlus) { revealSpringerPlus(); } function revealSpringerPlus() { if(buybox) { document.querySelectorAll(".c-springer-plus").forEach(function(node) { node.style.display = "block" }) } } //springerPlus ends here var buyboxMaxSingleColumnWidth = 480 function initCollapsibles(subscription, index) { var toggle = subscription.querySelector(".buying-option-price") subscription.classList.remove("expanded") var form = subscription.querySelector(".buying-option-form") var priceInfo = subscription.querySelector(".price-info") var buyingOption = toggle.parentElement if (toggle && form && priceInfo) { toggle.setAttribute("role", "button") toggle.setAttribute("tabindex", "0") toggle.addEventListener("click", function (event) { var expandedBuyingOptions = buybox.querySelectorAll(".buying-option.expanded") var buyboxWidth = buybox.offsetWidth ;[].slice.call(expandedBuyingOptions).forEach(function(option) { if (buyboxWidth <= buyboxMaxSingleColumnWidth && option != buyingOption) { hideBuyingOption(option) } }) var expanded = toggle.getAttribute("aria-expanded") === "true" || false toggle.setAttribute("aria-expanded", !expanded) form.hidden = expanded if (!expanded) { buyingOption.classList.add("expanded") } else { buyingOption.classList.remove("expanded") } priceInfo.hidden = expanded }, false) } } function hideBuyingOption(buyingOption) { var toggle = buyingOption.querySelector(".buying-option-price") var form = buyingOption.querySelector(".buying-option-form") var priceInfo = buyingOption.querySelector(".price-info") toggle.setAttribute("aria-expanded", false) form.hidden = true buyingOption.classList.remove("expanded") priceInfo.hidden = true } function initKeyControls() { document.addEventListener("keydown", function (event) { if (document.activeElement.classList.contains("buying-option-price") && (event.code === "Space" || event.code === "Enter")) { if (document.activeElement) { event.preventDefault() document.activeElement.click() } } }, false) } function initialStateOpen() { var buyboxWidth = buybox.offsetWidth ;[].slice.call(buybox.querySelectorAll(".buying-option")).forEach(function (option, index) { var toggle = option.querySelector(".buying-option-price") var form = option.querySelector(".buying-option-form") var priceInfo = option.querySelector(".price-info") if (buyboxWidth > buyboxMaxSingleColumnWidth) { toggle.click() } else { if (index === 0) { toggle.click() } else { toggle.setAttribute("aria-expanded", "false") form.hidden = "hidden" priceInfo.hidden = "hidden" } } }) } initialStateOpen() if (window.buyboxInitialised) return window.buyboxInitialised = true initKeyControls() })()

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    http://www.nltk.org/.

  2. 2.

    https://github.com/emilmont/pyStatParser - accessed on 09/03/2017.

  3. 3.

    https://ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_bulk/ - accessed on 10/03/2017.

  4. 4.

    https://github.com/tkuhn/nanopubstudies-supplementary/blob/master/botstudy/extract_evalresults.ods - accessed on 15/03/2017.

  5. 5.

    https://github.com/TomJansen25/Extracting-Core-Claims/.

References

  1. Aggarwal, C.C., Zhai, C. (eds.): Mining text data. Springer Science & Business Media, New York (2012)

    Google Scholar 

  2. Barrera, A., Verma, R.: Combining syntax and semantics for automatic extractive single-document summarization. In: Gelbukh, A. (ed.) CICLing 2012, vol. 7182, pp. 366–377. Springer, Heidelberg (2012)

    Google Scholar 

  3. Chiticariu, L., Li, Y., Reiss, F.R.: Rule-based information extraction is dead! Long live rule-based information extraction systems! In: EMNLP, pp. 827–832, October 2013

    Google Scholar 

  4. Ferreira, R., de Souza Cabral, L., Lins, R.D., e Silva, G.P., Freitas, F., Cavalcanti, G., Lima, R., Simske, S.J., Favaro, L.: Assessing sentence scoring techniques for extractive text summarization. Expert Syst. Appl. 40(14), 5755–5764 (2013)

    Article  Google Scholar 

  5. Hong, B., Zhen, D.: An extended keyword extraction method. Phys. Procedia 24, 1120–1127 (2012)

    Article  Google Scholar 

  6. Kuhn, T.: A survey and classification of controlled natural languages. Comput. Linguist. 40(1), 121–170 (2014)

    Article  Google Scholar 

  7. Kuhn, T., Barbano, P.E., Nagy, M.L., Krauthammer, M.: Broadening the scope of nanopublications. In: Cimiano, P., Corcho, O., Presutti, V., Hollink, L., Rudolph, S. (eds.) ESWC 2013. LNCS, vol. 7882, pp. 487–501. Springer, Heidelberg (2013)

    Google Scholar 

  8. Larsen, P.O., Von Ins, M.: The rate of growth in scientific publication and the decline in coverage provided by Science Citation Index. Scientometrics 84(3), 575–603 (2010)

    Article  Google Scholar 

  9. Lloret, E., Romá-Ferri, M.T., Palomar, M.: COMPENDIUM: a text summarization system for generating abstracts of research papers. Data Knowl. Eng. 88, 164–175 (2013)

    Article  Google Scholar 

  10. Mihalcea, R., Tarau, P.: TextRank: Bringing Order into Texts. Association for Computational Linguistics, Barcelona (2004)

    Google Scholar 

  11. Mons, B., van Haagen, H., Chichester, C., den Dunnen, J.T., et al.: The value of data. Nat. Genet. 43(4), 281–283 (2011)

    Article  Google Scholar 

  12. Ramos, J.: Using tf-idf to determine word relevance in document queries. In: Proceedings of the First Instructional Conference on Machine Learning, December 2003

    Google Scholar 

  13. Rose, S., Engel, D., Cramer, N., Cowley, W.: Automatic keyword extraction from individual documents. Text Mining, pp. 1–20 (2010)

    Google Scholar 

  14. Saggion, H., Poibeau, T.: Automatic text summarization: past, present and future. In: Poibeau, T., Saggion, H., Piskorski, J., Yangarber, R. (eds.) Multi-Source, Multilingual Information Extraction and Summarization, pp. 3–21. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  15. Shah, P.K., Perez-Iratxeta, C., Bork, P., Andrade, M.A.: Information extraction from full text scientific articles: Where are the keywords? BMC Bioinform. 4(1), 20 (2003)

    Article  Google Scholar 

  16. Siddiqi, S., Sharan, A.: Keyword and keyphrase extraction techniques: a literature review. J. Comput. Appl. 109(2), 18–23 (2015)

    Google Scholar 

  17. Tan, A.H.: Text mining: the state of the art and the challenges. In: Proceedings of the PAKDD 1999 Workshop on Knowledge Discovery from Advanced Databases 8, pp. 65–70 (1999)

    Google Scholar 

  18. Turney, P.D.: Learning algorithms for keyphrase extraction. Inform. Retrieval 2(4), 303–336 (2000)

    Article  Google Scholar 

  19. De Waard, A., Schneider, J.: Formalising uncertainty: an ontology of reasoning, certainty and attribution (ORCA). In: Proceedings of the Joint 2012 International Conference on Semantic Technologies Applied to Biomedical Informatics and Individualized Medicine, vol. 930, pp. 10–17. CEUR-WS.org., November 2012

    Google Scholar 

  20. Wartena, C., Brussee, R., Slakhorst, W.: Keyword extraction using word co-occurrence. In: 2010 Workshop on Database and Expert Systems Applications (DEXA), pp. 54–58. IEEE, August 2010

    Google Scholar 

  21. Zweigenbaum, P., Demner-Fushman, D., Yu, H., Cohen, K.B.: Frontiers of biomedical text mining: current progress. Brief. Bioinform. 8(5), 358–375 (2007)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Tom Jansen .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Jansen, T., Kuhn, T. (2017). Extracting Core Claims from Scientific Articles. In: Bosse, T., Bredeweg, B. (eds) BNAIC 2016: Artificial Intelligence. BNAIC 2016. Communications in Computer and Information Science, vol 765. Springer, Cham. https://doi.org/10.1007/978-3-319-67468-1_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-67468-1_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67467-4

  • Online ISBN: 978-3-319-67468-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Navigation