PUBLICATIONS

2022

  • Saggion, H., Štajner, S., Ferrés, D., Sheang, K.C., Shardlow, M., North, K., Zampieri, M. 2022. Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification. In Proceedings of the EMNLP-2022 workshop on Text Simplification, Accessibility, and Readability (TSAR). To appear.

  • Štajner, S., Ferrés, D., Shardlow, M., North, K., Zampieri, M., Saggion, H. 2022. Lexical simplification benchmarks for English, Portuguese, and Spanish. Frontiers in Artificial Intelligence. (pdf)

  • Basile, V., Kozareva, Z., Štajner, S. (eds.) 2022. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations.

  • Štajner, S., Sheang, K.C., Saggion, H. 2022. Sentence Simplification Capabilities of Transfer-Based Models. In Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI), Vol. 36, No. 11, pp. 12172–12180 (pdf)

  • Samuelsson, C. and Štajner, S. 2022. Statistical methods: fundamentals. In Oxford Handbook of Computational Linguistics, 2nd edition, Oxford University Press. (link)

2021

  • Saggion, H., Štajner, S., Ferrés, D., Sheang, K.C. (eds.) 2021. Proceedings of the First Workshop on Current Trends in Text Simplification (CTTS 2021) co-located with the 37th Conference of the Spanish Society for Natural Language Processing (SEPLN 2021). CEUR Workshop Proceedings 2944. (link)

  • Štajner, S., Yenikent, S., Franco-Salvador, M. 2021. Five Psycholinguistic Characteristics for Better Interaction with Users. In Proceedings of the 8th International Conference on Behavioral and Social Computing (BESC), pp. 1-7. (link)

  • Štajner, S. 2021. Exploring Reliability of Gold Labels for Emotion Detection in Twitter. In Proceedings of the 13th international conference on Recent Advances in Natural Language Processing (RANLP), pp. 1350-1359. (pdf, dataset)

  • Štajner, S., Yenikent, S. 2021. How to Obtain Reliable Labels for MBTI Classification from Texts? In Proceedings of the 13th international conference on Recent Advances in Natural Language Processing (RANLP), pp. 1360-1368. (pdf)

  • Štajner, S. 2021. Automatic Text Simplification for Social Good: Progress and Challenges. In Findings of ACL, pp. 2637-2652. (pdf)

  • Štajner, S., Yenikent, S., Ghanem, B. and Franco-Salvador, M. 2021. What Motivates You? Benchmarking Automatic Detection of Basic Needs from Short Posts. In Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP), pp. 803-810. (pdf)

  • Štajner, S., Yenikent, S. 2021. Why Is MBTI Personality Detection from Texts a Difficult Task? In Proceedings of the 16th conference of the European Chapter of the Association for Computational Linguistics (EACL), pp. 3580-3589. (pdf, dataset)

2020

  • Štajner, S., Yenikent, S., Franco-Salvador, M. 2020. Benchmarking Automatic Detection of Psycholinguistic Characteristics for Better Human-Computer Interaction. arXiv:2012.09692. (pdf)

  • Štajner, S., Yenikent, S. 2020. A Survey of Automatic Personality Detection from Texts. In Proceedings of the 28th International Conference on Computational Linguistics (COLING), online, pp. 6284-6295. (pdf)

  • Štajner, S., Nisioi, S., Ibáñez, D. 2020. Is Simple English Wikipedia As Simple and Easy-to-Understand as We Expect It to Be? In Proceedings of the 9th International Conference on Software Development and Technologies for Enhancing Accessibility and Fighting Info-exclusion (DSAI), online, pp. 66-70. (pdf)

  • Štajner, S., Nisioi, S., Hulpuș, I. 2020. CoCo: A tool for automatically assessing conceptual complexity of texts. In Proceedings of the 12th Language Resources and Evaluation Conference (LREC), Marseille, France, pp. 7179-7186. (pdf, code)

  • Štajner, S., Hulpuș, I. 2020. When shallow is good enough: Automatic assessment of conceptual text complexity using shallow semantic features. In Proceedings of the 12th Language Resources and Evaluation Conference (LREC), Marseille, France, pp. 1414-1422. (pdf)

  • Theil, C.K., Štajner, S. and Stuckenschmidt, H. 2020. Explaining financial uncertainty through specialized word embeddings. ACM Transactions on Data Science, Vol. 1, Issue 1, Article no. 6. (pdf)

2019

  • Štajner, S., Popovic, M. 2019. Automated Text Simplification as a Preprocessing Step for Machine Translation into an Under-resourced Language. In Proceedings of Recent Advances in Natural Language Processing (RANLP), Varna, Bulgaria, pp. 1141-1150. Best Paper by a Young Researcher Award. (pdf)

  • Hulpus, I., Štajner, S. and Stuckenschmidt, H. 2019. A spreading activation framework for tracking conceptual complexity of texts. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), Florence, Italy, pp. 3878-3887. (pdf)

  • Štajner, S., Saggion, H. and Ponzetto, S.P. 2019. Improving lexical coverage of text simplification systems for Spanish. Expert Systems with Applications, Elsevier, Volume 118, pp. 80-91. Impact Factor: 3.768. (pdf)

  • Basile, A.*, Franco-Salvador, M.*, Pawar, N.*, Štajner, S.*, Rios, MC. and Benajiba, Y. 2019. SymantoResearch at SemEval-2019 Task 3: Combined Neural Models for Emotion Classification in Human-Chatbot Conversations. In Proceedings of the 13th International Workshop on Semantic Evaluation (SemEval), pp. 330-334. (*equally contributing first authors) (pdf)

  • Heurich, M., Štajner, S. 2019. Durch Technologie zu mehr Empathie in der Kundenansprache – Wie Text Analytics helfen kann, die Stimme des digitalen Verbrauchers zu verstehen. In Stützer, C. M., Wachenfeld-Schell, A., & Oglesby, S. (editors) (Intelligentes) Text Mining in der Marktforschung. DGOF-Kompendium der Online-Forschung, Band 1, Köln, pp. 12-15. (pdf)

2018

  • Štajner, S., Popovic, M. 2018. Improving Machine Translation of English Relative Clauses with Automatic Text Simplification. In Proceedings of the First Workshop on Automatic Text Adaptation (ATA), held in conjunction with International Conference on Natural Language Generation (INLG), Tilburg, Netherlands. (pdf)

  • Štajner, S. and Hulpus, I. 2018. Automatic Assessment of Conceptual Text Complexity. In Proceedings of the 27th International Conference on Computational Linguistics (COLING), Santa Fe, New Mexico, USA, pp. 318-330. (pdf)

  • Štajner, S. and Saggion, H. 2018. Data-Driven Text Simplification. Tutorial at COLING 2018, Santa Fe, New Mexico, USA. (pdf)

  • Theil, C. K., Štajner, S. and Stuckenschmidt, H. 2018. Word embeddings-based uncertainty detection in financial disclosures. In Proceedings of the ACL Workshop on Economics and Natural Language Processing (ECONLP), Melbourne, Australia, pp. 32-37. (pdf)

  • Yimam, S. M., Biemann, C., Malmasi, S., Paetzold, G. H., Specia, L., Štajner, S., Tack, A., Zampieri, M. 2018. A Report on the Complex Word Identification Shared Task 2018. In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications (BEA), at NAACL 2018, New Orleans, USA, pp. 66-78. (pdf)

  • Štajner, S. 2018. How to Make Troubleshooting Simpler? Assessing Differences in Perceived Sentence Simplicity by Native and Non-native Speakers. In Proceedings of the second LREC workshop on Improving Social Inclusion: Tools, Methods and Resources (ISI-NLP 2), Miyazaki, Japan. (pdf)

  • Štajner, S., Franco-Salvador, M., Rosso, P., Ponzetto, S. P. 2018. CATS: A Tool for Customised Alignment of Text Simplification Corpora. In Proceedings of the 11th Language Resources and Evaluation Conference, Miyazaki, Japan, pp. 3895-3903. (pdf, code)

  • Štajner, S.*, Nisioi, S.* 2018. A Detailed Evaluation of Neural Sequence-to-Sequence Models for In-domain and Cross-domain Text Simplification. In Proceedings of the 11th Language Resources and Evaluation Conference, Miyazaki, Japan, pp. 3026-3033. (*equally contributing first authors) (pdf)

2017

  • Yimam, S. M., Štajner, S., Riedl, M. and Biemann, C. 2017. CWIG3G2-Complex Word Identification Task across Three Text Genres and Two User Groups. In Proceedings of the 8th International Joint Conference on Natural Language Processing (IJCNLP), Taipei, Taiwan, pp. 401-407. (pdf)

  • Štajner, S., Ponzetto, S. P., Stuckenschmidt, H. 2017. Automatic Assessment of Absolute Sentence Complexity. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), Melbourne, Australia, pp. 4096-4102. (pdf)

  • Štajner, S., Franco-Salvador, M., Ponzetto, S. P., Rosso, P., Stuckenschmidt, H. 2017. Sentence Alignment Methods for Improving Text Simplification Systems. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), Vancouver, Canada, pp. 97-102 (Short Papers). acc.rate: 18%. (pdf)

  • Nisioi, S.*, Štajner, S.*, Ponzetto, S. P., Dinu, L. P. 2017. Exploring Neural Text Simplification Models. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), Vancouver, Canada, pp. 85-91 (Short Papers). (*equally contributing first authors) acc.rate: 18%. (pdf, code)

  • Štajner, S. and Glavaš, G. 2017. Leveraging Event-Based Semantics for Automated Text Simplification. Expert Systems With Applications, Elsevier, Vol. 82, pp. 383-395. Impact Factor: 3.928. (pdf)

  • Štajner, S., Yaneva, V., Mitkov, R. and Ponzetto, S. P. 2017. Effects of Lexical Properties on Viewing Time per Word in Autistic and Neurotypical Readers. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) at EMNLP 2017, Copenhagen, Denmark, pp. 271-281.

  • Yimam, S. M., Štajner, S., Riedl, M. and Biemann, C. 2017. Multilingual and Cross-Lingual Complex Word Identification. In Proceedings of the Recent Advances in Natural Language Processing (RANLP), Varna, Bulgaria, pp. 813-822. (pdf)

  • Theil, C. K., Štajner, S., Stuckenschmidt, H., Ponzetto, S. P. 2017. Automatic Detection of Uncertain Statements in the Financial Domain. In Proceedings of 18th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), Budapest, Hungary, pp. 642-654. (pdf)

  • Štajner, S., Glavaš, G., Ponzetto, S. P., Stuckenschmidt, H. 2017. Domain Adaptation for Automatic Detection of Speculative Sentences. In Proceedings of the 11th International Conference on Semantic Computing (IEEE ICSC), San Diego, USA, pp. 164-171. Best Paper Award (honorable mention). acc.rate: 20.15%. (pdf)

2016

  • Štajner, S., Popovic, M. 2016. Can Text Simplification Help Machine Translation? In Proceedings of the 19th Annual Conference of the European Association for Machine Translation (EAMT), Riga, Latvia. Baltic Journal of Modern Computing, Vol. 4, No. 2s, pp. 230-242.

  • Štajner, S., Baerg, N., Ponzetto, S.P. and Stuckenschmidt, H. 2016. Automatic Detection of Speculation in Policy Statements. In Proceedings of the WebSci'16 Workshop on Natural Language Processing and Computational Social Science (NLP+CSS). To appear.

  • Štajner, S., Popovic, M., Saggion, H., Specia, L. and Fishel, M. (eds.) 2016. Proceedings of the First International Workshop on Quality Assessment for Text Simplification (QATS) co-located with LREC 2016, Portoroz, Slovenia, May 28.

  • Štajner, S., Popovic, M., Saggion, H., Specia, L. and Fishel, M. 2016. Shared Task on Quality Assessment for Text Simplification. In Proceedings of the LREC Workshop on Quality Assessment for Text Simplification, Portoroz, Slovenia, May 28, pp. 22-31.

  • Štajner, S., Popovic, M. and Béchara, H. 2016. Quality Estimation for Text Simplification. In Proceedings of the LREC Workshop on Quality Assessment for Text Simplification, Portoroz, Slovenia, May 28, pp. 15-21.

  • Popovic, M. and Štajner, S. 2016. Machine Translation Evaluation Metrics for Quality Assessment of Automatically Simplified Sentences. In Proceedings of the LREC Workshop on Quality Assessment for Text Simplification, Portoroz, Slovenia, May 28, pp. 32-37.

  • Štajner, S., Querido, A., Rendeiro, N., Rodrigues, J., and Branco, A. 2016. Use of Domain-Specific Language Resources in Machine Translation. In Proceedings of the International Conference on Language Resources and Evaluation (LREC), Portoroz, Slovenia, May 25-27, pp. 592-598.

  • Rodrigues, J., Rendeiro, N., Querido, A., Štajner, S., and Branco, A. 2016. Bootstrapping a Hybrid MT System to a New Language Pair. In Proceedings of the International Conference on Language Resources and Evaluation (LREC), Portoroz, Slovenia, May 25-27, pp. 2762–2765.

  • Rodrigues, J., Gomes, L., Neale, S., Querido, A., Rendeiro, N., Štajner, S., Silva, J. and Branco, A. 2016. Domain-Specific Hybrid Machine Translation from English to Portuguese. In Proceedings of the International Conference on the Computational Processing of Portuguese (PROPOR), Tomar, Portugal, July 13-15, pp. 50-61.

2015

  • Štajner, S., Rodrigues, J., Gomes, L. and Branco, A. 2015. Machine Translation for Multilingual Troubleshooting in the IT Domain: A Comparison of Different Strategies. In Proceedings of the Deep Machine Translation Workshop (DMTW), Prague, Czech Republic, 3-4 September 2015, pp. 106-115.

  • Štajner, S., Béchara, H. and Saggion, H. 2015. A Deeper Exploration of the Standard PB-SMT Approach to Text Simplification and its Evaluation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP), Beijing, China, 26-31 July 2015, pp. 823-828. (pdf, bibtex) acc.rate: 22.3%

  • Glavaš, G. and Štajner, S. 2015. Simplifying Lexical Simplification: Do We Need Simplified Corpora? In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP), Beijing, China, 26-31 July 2015, pp. 63-68. acc.rate: 22.3%

  • Štajner, S., Calixto, I. and Saggion, H. 2015. Automatic Text Simplification for Spanish: Comparative Evaluation of Various Simplification Strategies. In Proceedings of the Recent Advances in Natural Language Processing (RANLP), Hissar, Bulgaria, 5-11 September 2015, pp. 618-626.

  • Štajner, S. and Saggion, H. 2015. Translating from Original to Simplified Sentences using Moses: When does it Actually Work? In Proceedings of the Recent Advances in Natural Language Processing (RANLP), Hissar, Bulgaria, 5-11 September 2015, pp. 611-617.

  • Saggion, H., Štajner, S., Bott, S., Mille, S., Rello, L. and Drndarevic, B. 2015. Making It Simplext: Implementation and Evaluation of a Text Simplification System for Spanish. ACM Transactions on Accessible Computing (TACCESS), Vol. 6, Issue 4, Article no. 14.

  • Štajner, S. 2015. "New Data-Driven Approaches to Text Simplification". PhD thesis. University of Wolverhampton, UK.

  • Štajner, S., Mitkov, R. and Corpas Pastor, G. 2015. 'Simple or not simple? A readability question'. In N. Gala, R. Rapp, and G. Bel-Enguix (eds), Recent Advances in Language Production, Cognition and the Lexicon, Springer, pp. 379-398. 

2014

  • Štajner, S. 2014. Translating sentences from 'original' to 'simplified' Spanish. Procesamiento del Lenguaje Natural, Vol. 53, pp. 61-68.

  • Štajner, S., Evans, R. and Dornescu, I. 2014. Assessing Conformance of Manually Simplified Corpora with User Requirements: the Case of Autistic Readers. In Proceedings of the COLING workshop on Automatic Text Simplification - Methods and Applications in the Multilingual Society (ATS-MA), Dublin, Ireland, 24 August 2014, pp. 53-63.

  • Mitkov, R. and Štajner, S. 2014. The Fewer, the Better? A Contrastive Study about Ways to Simplify. In Proceedings of the COLING workshop on Automatic Text Simplification - Methods and Applications in the Multilingual Society (ATS-MA), Dublin, Ireland, 24 August 2014, pp. 30-40.

  • Štajner, S., Mitkov, R. and Saggion, H. 2014. One Step Closer to Automatic Evaluation of Text Simplification Systems. In Proceedings of the EACL workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR), Gothenburg, Sweden, 27 April 2014, pp. 1-10.

  • Štajner, S., Mitkov, R. and Leech, G. 2014. Natural Language Processing Methodology for Tracking Diachronic Changes in the 20th Century English Language. Journal of Research Design and Statistics in Linguistics and Communication Science, Vol. 1, No. 1, pp. 71-112.

2013

  • Štajner, S. and Saggion, H. 2013. Readability Indices for Automatic Evaluation of Text Simplification Systems: A Feasibility Study for Spanish. In Proceedings of the 6th International Joint Conference on Natural Language Processing (IJCNLP), Nagoya, Japan, 14-18 October 2013, pp. 374-382. acc.rate: 23.4%

  • Glavaš, G. and Štajner, S. 2013. Event-Centered Simplification of News Stories. In Proceedings of the Student Research Workshop at the International Conference on Recent Advances in Natural Language Processing (RANLP), Hissar, Bulgaria, 9-11 September 2013. Best paper award. acc.rate: 11%

  • Štajner, S. and Saggion, H. 2013. Adapting Text Simplification Decisions to Different Text Genres and Target Users. Procesamiento del Lenguaje Natural, Vol. 51, pp. 135-142.

  • Štajner, S. and Evans, R. 2013. Can Statistical Tests Be Used for Feature Selection in Diachronic Text Classification? In Proceedings of the First International Conference on Statistical Language and Speech Processing (SLSP), Lecture Notes in Artificial Intelligence, Vol. 7978, Springer, pp. 273-283.

  • Štajner, S. and Zampieri, M. 2013. Stylistic Changes for Temporal Text Classification. In Proceedings of the 16th International Conference on Text, Speech and Dialogue (TSD2013). Lecture Notes in Artificial Intelligence, Vol. 8082, Springer, pp. 519-526.

  • Štajner, S., Drndarevic, B. and Saggion, H. 2013. Corpus-based Sentence Deletion and Split Decisions for Spanish Text Simplification. Computación y Sistemas, Vol. 17, No. 2, pp. 251-262, ISSN 1405-5546.

  • Drndarevic, B., Štajner, S., Bott, S., Bautista, S. and Saggion, H. 2013. Automatic Text Simplification in Spanish: A Comparative Evaluation of Complementing Modules. In Proceedings of the 14th International Conference on Intelligent Text Processing and Computational Linguistics (Part II). Lecture Notes in Computer Science, Vol. 7817, Springer, pp. 488-500. acc.rate: 26%.

2012

  • Drndarevic, B., Štajner, S. and Saggion, H. 2012. Reporting Simply: A Lexical Simplification Strategy for Enhancing Text Accessibility. In Proceedings of the Easy-to-read on the Web Symposium.

  • Štajner, S. and Mitkov, R. 2012. Diachronic Changes in Text Complexity in 20th Century English Language: An NLP Approach. In Proceedings of the International Conference on Language Resources and Evaluation (LREC) 2012. Istanbul, Turkey, May 21-27, pp. 1577-1584.

  • Štajner, S., Evans, R., Orasan, C. and Mitkov, R. 2012. What can readability measures really tell us about text complexity? In Proceedings of the Workshop on Natural Language Processing for Improving Textual Accessibility (NLP4ITA), held in conjunction with LREC 2012. Istanbul, Turkey, May 27, pp. 14-21. 

  • Štajner, S. and Mitkov, R. 2012. Using Comparable Corpora to Track Diachronic and Synchronic Changes in Lexical Density and Lexical Richness. In Proceedings of the 5th Workshop on Building and Using Comparable Corpora (5th BUCC), held in conjunction with LREC 2012. Istanbul, Turkey, May 26, pp. 88-97. 

  • Štajner, S. and Mitkov, R. 2012. Style of Religious Texts in 20th Century. In Proceedings of the Workshop on Language Resource and Evaluation for Religious Texts (LRE-Rel), held in conjunction with LREC 2012. Istanbul, Turkey, May 23, pp. 81-87.

  • Štajner, S. 2012. NLP Methodology for Investigating Language Change. Bulletin de Linguistique Appliquée et Générale (Bulag): Natural Language Processing and Human Language Technology 2011, No. 36. Presses universitaires de Franche-Comté, pp. 219-232.

2011

  • Štajner, S. and Mitkov, R. 2011. Diachronic Stylistic Changes in British and American Varieties of 20th Century Written English Language. In Proceedings of the International Workshop on Language Technologies for Digital Humanities and Cultural Heritage, held in conjunction with RANLP 2011. Hissar, Bulgaria, September 16, pp. 78-85.

  • Štajner, S. 2011. Towards a Better Exploitation of the Brown ‘Family’ Corpora in Diachronic Studies of British and American English Language Varieties. In Proceedings of the Student Research Workshop, held in conjunction with RANLP 2011. Hissar, Bulgaria, September 13, pp. 17-24. 
