NLLP Workshop 2021 took place on 10 November 2021, co-located with EMNLP 2021.

The workshop proceedings are available here.

The recording of the workshop is available here.

Important Dates:

  • Submission deadline ― 30 August 2021
  • Notification ― 22 September 2021
  • Camera ready ― 30 September 2021
  • Workshop ― 10 November 2021


All times are in the AST time zone (conversion tool).

08:00 - 08:10   Workshop opening
08:10 - 09:00   Invited Speaker: John Armour
Access to Caselaw Data in the UK: Constraints and Future Prospects
09:00 - 10:30   Session 1
09:00 - 09:15   A Corpus for Multilingual Analysis of Online Terms of Service
Kasper Drawzeski1, Andrea Galassi2, Agnieszka Jablonowska3, Francesca Lagioia3, Marco Lippi4, Hans Micklitz3, Giovanni Sartor3, Giacomo Tagiuri3, Paolo Torroni5
1BEUC, 2University of Bologna, 3European University Institute, 4University of Modena and Reggio Emilia, 5Alma Mater - Università di Bologna
09:15 - 09:30   Named Entity Recognition in the Romanian Legal Domain
Vasile Pais, Maria Mitrofan, Carol Luca Gasan, Vlad Coneschi, Alexandru Ianov
Research Institute for Artificial Intelligence, Romanian Academy
09:30 - 09:45   The Power of Legislatures in Hungary - A Text Reuse Analysis
Miklós Sebők, Anna Székely, István Járay
Centre for Social Sciences, Budapest
09:45 - 10:00   Swiss-Judgment-Prediction: A Multilingual Legal Judgment Prediction Benchmark
Joel Niklaus1, Ilias Chalkidis2, Matthias Stürmer1
1University of Bern, 2University of Copenhagen
10:00 - 10:15   LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
Ilias Chalkidis1, Abhik Jana2, Dirk Hartung3, Michael Bommarito4, Ion Androutsopoulos5, Daniel Martin Katz6, Nikolaos Aletras7
1University of Copenhagen, 2Universität Hamburg, 3Bucerius Law School, 4Stanford Law School, 5Athens University of Economics and Business, 6Illinois Tech - Chicago Kent College of Law, 7University of Sheffield
10:15 - 10:30   'Just What do You Think You're Doing, Dave?' A Checklist for Responsible Data Use in NLP
Anna Rogers1, Tim Baldwin2, Kobi Leins3
1Center for Social Data Science, University of Copenhagen, 2The University of Melbourne, 3King’s College London
10:30 - 10:45   Break
10:45 - 12:10   Session 2
10:45 - 11:00   Automated Extraction of Sentencing Decisions from Court Cases in the Hebrew Language
Mohr Wenger1, Tom Kalir1, Noga Berger2, Carmit Chalamish3, Renana Keydar1, Gabriel Stanovsky1
1The Hebrew University of Jerusalem, 2The Association of Rape Crisis Centers in Israel, 3The Association of Rape Crisis Centers in Israel, Bar-Ilan University, Israel.
11:00 - 11:15   A Multilingual Approach to Identify and Classify Exceptional Measures against COVID-19
Georgios Tziafas1, Eugenie de Saint-Phalle1, Wietse de Vries1, Clara Egger1, Tommaso Caselli2
1University of Groningen, 2Rijksuniversiteit Groningen
11:15 - 11:30   Multi-granular Legal Topic Classification on Greek Legislation
Christos Papaloukas1, Ilias Chalkidis2, Konstantinos Athinaios1, Despina Pantazi3, Manolis Koubarakis1
1National and Kapodistrian University of Athens, 2University of Copenhagen, 3Department of Informatics and Telecommunications, National and Kapodistrian University of Athens
11:30 - 11:40   Machine Extraction of Tax Laws from Legislative Texts
Elliott Ash1, Malka Guillot2, Luyang Han1
1ETH Zurich, 2Liege
11:40 - 11:50   jurBERT: A Romanian BERT Model for Legal Judgement Prediction
Mihai Masala1, Radu Cristian Alexandru Iacob1, Ana Sabina Uban2, Marina Cidota3, Horia Velicu4, Traian Rebedea1, Marius Popescu3
1University Politehnica of Bucharest, 2Universitat Politecnica de Valencia, University of Bucharest, 3University of Bucharest, 4BRD Groupe Societe Generale
11:50 - 12:00   JuriBERT: A Masked-Language Model Adaptation for French Legal Text
Stella Douka1, Hadi Abdine1, Michalis Vazirgiannis1, Rajaa El Hamdani2, David Restrepo Amariles2
1Ecole Polytechnique, 2HEC Paris
12:00 - 12:10   Few-shot and Zero-shot Approaches to Legal Text Classification: A Case Study in the Financial Sector
Rajdeep Sarkar1, Atul Kr. Ojha2, Jay Megaro3, John Mariano3, Vall Herard3, John P. McCrae4
1Data Science Institute, NUI Galway, 2Data Science Institute, Unit for Linguistic Data, National University of Ireland Galway, 3FMR LLC, 4Insight Center for Data Analytics, National University of Ireland Galway
12:10 - 13:00   Lunch & Virtual Town Hall
13:00 - 13:45   Invited Speaker: Sylvie Delacroix
Data Trusts as a Bottom-up Empowerment Tool
13:45 - 14:30   Session 3
13:45 - 14:00   AutoLAW: Augmented Legal Reasoning through Legal Precedent Prediction
Robert Zev Mahari
Human Dynamics Group, MIT Media Lab, Massachusetts Institute of Technology; Harvard Law School
14:00 - 14:10   A Free Format Legal Question Answering System
Soha Khazaeli1, Janardhana Punuru1, Chad Morris1, Sanjay Sharma1, Bert Staub1, Michael Cole1, Sunny Chiu-Webster2, Dhruv Sakalley3
1LexisNexis Legal & Professional, 2Facebook, 3Sensibill
14:10 - 14:20   Searching for Legal Documents at Paragraph Level: Automating Label Generation and Use of an Extended Attention Mask for Boosting Neural Models of Semantic Similarity
Li Tang and Simon Clematide
Institut für Computerlinguistik, Universität Zürich
14:20 - 14:30   GerDaLIR: A German Dataset for Legal Information Retrieval
Marco Wrzalik and Dirk Krechel
RheinMain University of Applied Sciences
14:30 - 14:45   Break
14:45 - 16:10   Session 4
14:45 - 15:00   SPaR.txt, a Cheap Shallow Parsing Approach for Regulatory Texts
Ruben Kruiper1, Ioannis Konstas1, Alasdair Gray1, Farhad Sadeghineko2, Richard Watson2, Bimal Kumar2
1Heriot-Watt University, 2Northumbria University
15:00 - 15:15   Capturing Logical Structure of Visually Structured Documents with Multimodal Transition Parser
Yuta Koreeda1 and Christopher Manning2
1Research & Development Group, Hitachi America Ltd., 2Stanford University
15:15 - 15:30   Legal Terminology Extraction with the Termolator
Nhi Pham, Lachlan Pham, Adam Meyers
New York University
15:30 - 15:45   Supervised Identification of Participant Slots in Contracts
Dan Simonson
BlackBoiler Inc
15:45 - 16:00   ContractNLI: A Dataset for Document-level Natural Language Inference for Contracts
Yuta Koreeda1 and Christopher Manning2
1Research & Development Group, Hitachi America Ltd., 2Stanford University
16:00 - 16:10   Named Entity Recognition in Historic Legal Text: A Transformer and State Machine Ensemble Method
Fernando Trias, Hongming Wang, Sylvain Jaume, Stratos Idreos
Harvard University
16:10 - 16:25   Break
16:25 - 17:20   Session 5
16:25 - 16:40   Summarization of German Court Rulings
Ingo Glaser1, Sebastian Moser1, Florian Matthes2
1Technical University of Munich, 2Technische Universität München
16:40 - 16:55   Privacy Policy Question Answering Assistant: A Query-Guided Extractive Summarization Approach
Moniba Keymanesh, Micha Elsner, Srinivasan Parthasarathy
The Ohio State University
16:55 - 17:10   Learning from Limited Labels for Long Legal Dialogue
Jenny Hong, Derek Chong, Christopher Manning
Stanford University
17:10 - 17:20   Automating Claim Construction in Patent Applications: The CMUmine Dataset
Ozan Tonguz1, Yiwei Qin1, Yimeng Gu1, Hyun Moon2
1Electrical and Computer Engineering at Carnegie Mellon University, 2School of Engineering at Carnegie Mellon University
17:20 - 18:00   Session 6
17:20 - 17:30   Effectively Leveraging BERT for Legal Document Classification
Nut Limsopatham
Microsoft AI+R
17:30 - 17:45   Semi-automatic Triage of Requests for Free Legal Assistance
Meladel Mistica1, Jey Han Lau2, Brayden Merrifield3, Kate Fazio3, Timothy Baldwin2
1The University of Queensland, 2The University of Melbourne, 3Justice Connect
17:45 - 18:00   Automatic Resolution of Domain Name Disputes
Wayan Vihikan1, Meladel Mistica2, Inbar Levy1, Andrew Christie1, Timothy Baldwin1
1The University of Melbourne, 2The University of Queensland
18:00 - 18:10   Workshop Closing


Organizing Committee

Program Committee

Invited Speakers

John Armour (Oxford Law)

Title: Access to Caselaw Data in the UK: Constraints and Future Prospects

Abstract: In this talk, we will consider the legal, regulatory and practical constraints on access to caselaw data in the UK for use in legal NLP applications, contrasting these with the position in other leading jurisdictions. We will analyse how the legal constraints can be managed and provide an overview of the future prospects for data access.

Bio: John Armour is Professor of Law and Finance at Oxford University and a Fellow of the British Academy and the European Corporate Governance Institute. He was previously a member of the Faculty of Law and the interdisciplinary Centre for Business Research at the University of Cambridge. He studied law (MA, BCL) at the University of Oxford and then at Yale Law School (LLM). He has held visiting posts at various institutions including the University of Auckland, the University of Chicago, Columbia Law School, the University of Frankfurt, the Max Planck Institute for Comparative Private Law in Hamburg, the University of Pennsylvania Law School and the University of Sydney. He is a member of the American Law Institute and an Academic Member of the Chancery Bar Association. Armour has published widely in the fields of company law, financial regulation, and corporate insolvency. His main research interest lies in the integration of legal and economic analysis, with particular emphasis on the impact on the real economy of changes in company law, corporate insolvency law and financial regulation. He serves as an Executive Editor of the Journal of Corporate Law Studies and the Journal of Law, Finance and Accounting, and has been involved in policy-related projects commissioned by the UK’s Department of Trade and Industry (now BEIS), Financial Services Authority (now FCA) and Insolvency Service, the Commonwealth Secretariat, and the World Bank. He served as a member of the European Commission’s Informal Company Law Expert Group from 2014-19.

Sylvie Delacroix (Birmingham Law & Alan Turing Institute)

Title: Data Trusts as a Bottom-up Empowerment Tool

Abstract: This presentation will focus on data trusts as a bottom-up empowerment tool. It will proceed from an analysis of the particular type of vulnerability concomitant with our 'leaking' data on a daily basis, to argue that data ownership is both unlikely and inadequate as an answer to the problems at stake. There are three key problems that bottom-up data trusts seek to address:

  • Lack of mechanisms to empower groups, not just individuals
  • We can (and should) do better than 'make belief' consent.
  • We can (and should) do better when it comes to intelligent data sharing.

Bio: Sylvie Delacroix focuses on the intersection between law and ethics, with a particular interest in Data and Machine Ethics, Agency and the role of habit within moral decisions (Habitual Ethics?, Bloomsbury / Hart Publishing, 2021). Her current research focuses on the design of computer systems meant for morally-loaded contexts. She is also considering the potential inherent in 'bottom-up' Data Trusts as a mechanism to address power imbalances between data-subjects and data-controllers. Professor Delacroix’s work has notably been funded by the Wellcome Trust, the NHS and the Leverhulme Trust, from whom she received the Leverhulme Prize. Professor Delacroix was one of three appointed commissioners on the Public Policy Commission on the use of algorithms in the justice system (Law Society of England and Wales), which released its report on 04 June 2019. She is a Fellow of the Alan Turing Institute and a Mozilla Fellow. Visit the Data Trusts website - a new site co-created by Sylvie that brings together information about data trusts, with the aim of helping advance debate about their use.