Shared Task on Readability-Controlled Text Simplification

We invite participation in the TSAR 2025 Shared Task on Readability-Controlled Text Simplification, aimed at generating simplifications that conform to a specified target readability level, balancing reduced linguistic complexity with meaning preservation and fluency.

Announcements

  • 7 July 2025: Shared task announced

Description

The task targets English-language paragraphs written at upper-intermediate or advanced levels and requires participants to simplify them according to a specified target readability level, defined using the Common European Framework of Reference for Languages (CEFR). Specifically, participants will be asked to simplify texts originally at B2 level or above to target levels of A1, A2, or B1.

Participants are expected to adjust linguistic complexity while preserving the core meaning and coherence of the original paragraph.
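
For concreteness, a naive baseline might simply prompt an instruction-following language model with the target level. The sketch below assumes the OpenAI Python client and a particular model name purely for illustration; participants may use any method or resource.

```python
# Naive baseline: prompt an instruction-tuned LLM to hit the target level.
# The client, model name, and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def simplify(paragraph: str, target_level: str) -> str:
    """Rewrite a B2-or-above paragraph at `target_level` (A1, A2, or B1)."""
    prompt = (
        f"Rewrite the following paragraph so that it reads at CEFR level "
        f"{target_level}. Keep the original meaning and coherence.\n\n"
        f"{paragraph}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice; any model would do
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()
```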

Important Dates

All deadlines are 11:59 PM UTC-12:00 (“anywhere on Earth”).

  • 16 July 2025 – Trial data and evaluation scripts released
  • 15 August 2025 – Test data released
  • 26 August 2025 – System outputs due
  • 2 September 2025 – Evaluation results published
  • 23 September 2025 – System description papers due
  • 30 September 2025 – Reviews and notifications sent
  • 7 October 2025 – Camera-ready papers due

Data

This is a fully open task in terms of system development. Participants are free to use any publicly available data or resources. No training data will be provided.

Trial Data

We will release a trial dataset including:

  • Input paragraphs with associated target CEFR levels
  • One or more reference simplifications
  • Official evaluation scripts

This release is intended to help participants understand the data format, expected output, and evaluation process.

Test Data

The final test set will consist of:

  • Paragraphs with associated target CEFR levels

No reference simplifications will be provided.

Participants must submit their outputs strictly following the prespecified format (see Submission Format below) for official evaluation.

Evaluation

Submissions will be evaluated using the following metrics:

  1. CEFR Compliance: A CEFR-level classifier will verify whether the simplified paragraph meets the specified target level.

  2. Meaning Preservation: Semantic similarity between the source paragraph and the system output.

  3. Similarity to References: Semantic similarity between the system output and references.

The official evaluation scripts will be released together with the trial data.
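
For orientation before those scripts are available, the sketch below shows one plausible shape for the three checks: sentence-transformers embeddings for the two similarity metrics, and a caller-supplied classifier standing in for the CEFR model. The embedding model and all other details are assumptions, not the official implementation.

```python
# Illustrative evaluation sketch; the official scripts may differ in
# models, aggregation, and thresholds.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice

def semantic_similarity(a: str, b: str) -> float:
    """Cosine similarity between the embeddings of two paragraphs."""
    emb = embedder.encode([a, b], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

def evaluate(source, output, references, target_level, cefr_classifier):
    """Score one output. `cefr_classifier` is a placeholder callable that
    maps text to a CEFR label (the official classifier is not public)."""
    return {
        "cefr_compliance": cefr_classifier(output) == target_level,
        "meaning_preservation": semantic_similarity(source, output),
        "similarity_to_references": max(
            semantic_similarity(output, ref) for ref in references
        ),
    }
```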

Participation

All participants must register in advance using the following form:

👉 [Registration Form]

Registered participants will receive announcements, updates, and submission instructions.

Submission Format

Submissions must follow a specific JSON format (details will be provided with the trial data; an illustrative sketch appears at the end of this section). Each entry should include:

  • A paragraph ID
  • The simplified paragraph
  • The target CEFR level

Each team may submit up to three runs for evaluation.
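
Purely as an illustration, a run file containing the three required fields might be written as below; the field names and file layout are assumptions, and the authoritative schema will be specified with the trial data.

```python
# Hypothetical submission writer; field names are assumptions, not the
# official schema (which ships with the trial data).
import json

run = [
    {
        "id": "p001",                                  # paragraph ID
        "target_cefr": "A2",                           # target CEFR level
        "simplified": "The cat sat on the mat. ...",   # simplified paragraph
    },
]

with open("team_run1.json", "w", encoding="utf-8") as f:
    json.dump(run, f, ensure_ascii=False, indent=2)
```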

Publication

Participants are invited to submit a system description paper to the TSAR 2025 Workshop. All papers will undergo peer review, and accepted papers will appear in the workshop proceedings.

Organizers

  • Fernando Alva-Manchego (Cardiff University, UK)
  • Regina Stodden (University of Bielefeld, Germany)
  • Kai North (Cambium Assessment, USA)
  • Joseph Marvin Imperial (National University Philippines and University of Bath, UK)
  • Abdullah Barayan (Cardiff University, UK)
  • Harish Tayyar Madabushi (University of Bath, UK)

Contact

For questions, please contact us at tsarworkshop@googlegroups.com, adding [Shared Task] to the email subject.

Useful Resources