pegasus-x-base-synthsumm_open-16k

This is a text-to-text summarization model fine-tuned from pegasus-x-base on a dataset of long documents from various sources/domains and their synthetic summaries.

It performs surprisingly well as a general summarization model for its size. More details, a larger model, and the dataset will be released (as time permits).

Usage

Beam search decoding is recommended for this model. If you'd rather not set that up yourself, the textsum util package abstracts most of it away:

pip install -U textsum

then:

from textsum.summarize import Summarizer

model_name = "BEE-spoke-data/pegasus-x-base-synthsumm_open-16k"
summarizer = Summarizer(model_name) # GPU auto-detected
text = "put the text you don't want to read here"
summary = summarizer.summarize_string(text)
print(summary)
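If you would rather call transformers directly, the beam search recommendation above could look roughly like the sketch below. The generation settings (`num_beams`, `max_new_tokens`, `no_repeat_ngram_size`) are illustrative assumptions, not values tuned or published for this model, and the 16384-token input limit is inferred from the "16k" in the model name.

```python
# Hypothetical generation settings; tune these for your own documents.
GENERATION_KWARGS = {
    "num_beams": 4,             # beam search instead of greedy decoding
    "max_new_tokens": 256,      # upper bound on summary length
    "no_repeat_ngram_size": 3,  # discourage repeated phrases
    "early_stopping": True,     # stop once enough finished beams exist
}

MODEL_NAME = "BEE-spoke-data/pegasus-x-base-synthsumm_open-16k"
MAX_INPUT_TOKENS = 16384  # assumption: taken from the "16k" in the model name


def summarize(text: str) -> str:
    """Summarize `text` with beam search (downloads the model on first call)."""
    # Imported lazily so the settings above can be inspected without torch installed.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).eval()

    # Truncate anything longer than the assumed context window.
    inputs = tokenizer(
        text,
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
        return_tensors="pt",
    ).to(device)

    with torch.no_grad():
        output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (needs network access and enough memory for the model):
# print(summarize("put the text you don't want to read here"))
```

Loading the model inside `summarize` keeps the sketch short; for repeated calls you would want to load the tokenizer and model once and reuse them.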

Architecture

Update May 2026:

The architecture of Pegasus-X is rather interesting and perhaps under-explored and rarely built upon. One small innovation here (more on this with the larger variant) is that the original activation function was replaced with swish. The model was then "healed" as part of the fine-tuning process and works fine.
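For reference, swish (also known as SiLU) is defined as swish(x) = x * sigmoid(x). The snippet below is a minimal plain-Python sketch of that formula only. The helper names are illustrative, and it does not show how the activation was actually swapped or healed in this model.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe: x < 0, so exp(x) cannot overflow
    return e / (1.0 + e)


def swish(x: float) -> float:
    """Swish / SiLU activation: x scaled by its own sigmoid."""
    return x * sigmoid(x)


# Swish is smooth, passes through zero, approaches the identity for large
# positive inputs, and approaches zero for large negative inputs.
for value in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"swish({value:+.1f}) = {swish(value):+.4f}")
```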

Here's a little glimpse of how this thing works and how it processes long sequences while being a small encoder-decoder:

Open BEE-spoke-data/pegasus-x-base-synthsumm_open-16k in hfviewer
Model size: 0.3B params (Safetensors, tensor type F32)