Zero-Training Context Extension for Transformer Encoders via Nonlinear Absolute Positional Embeddings Interpolation
Reason this release was yanked:
Unaccessible scripts
Project description
Official implementation of "Zero-Training Context Extension for Transformer Encoders via Nonlinear Absolute Positional Embeddings Interpolation". A preprint of the paper is coming soon.
This implementation currently supports only models that are compatible with the Sentence Transformers library.
Models
Models with an extended context are available on Hugging Face:
| Model | Context length | Language |
|---|---|---|
| idanylenko/e5-large-v2-ctx1024 | 1024 | English |
Installation
To install the package, use pip:
pip install "context-extension>=0.1.1"
Usage
After installing the package, you can interpolate a model's positional embeddings with either the extend-context-spline script (recommended) or the extend-context-linear script.
Spline Interpolation
Use this for smooth, nonlinear interpolation to support arbitrary context lengths:
extend-context-spline \
--model_name_or_path="intfloat/e5-large-v2" \
--max_seq_length=1024 \
--embeddings_attr_name="embeddings.position_embeddings" \
--offset=0 \
--output_dir="intfloat/e5-large-v2-ctx1024-spline"
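To make the idea concrete, here is a minimal pure-Python sketch of resampling a positional embedding table to a longer length with a smooth cubic curve. It is an illustration only, not the package's implementation: the function name `resample_positions_spline`, the choice of a Catmull-Rom spline, the endpoint-to-endpoint mapping of positions, and the linearly extrapolated ghost rows at the edges are all assumptions made for this example.

```python
def resample_positions_spline(emb, new_len):
    """Resample `emb` (a list of equal-length float rows) to `new_len` rows.

    Hypothetical helper: uses a uniform Catmull-Rom spline per dimension,
    which passes through every original row.
    """
    n = len(emb)
    if n < 2 or new_len < 2:
        raise ValueError("need at least two source rows and two target rows")

    def row(i):
        # Rows outside the table are ghost points built by linear extrapolation
        # (an assumption of this sketch).
        if i < 0:
            return [2.0 * a - b for a, b in zip(emb[0], emb[1])]
        if i >= n:
            return [2.0 * a - b for a, b in zip(emb[-1], emb[-2])]
        return emb[i]

    out = []
    for j in range(new_len):
        # Map the new position j onto a fractional position in the old table.
        t = j * (n - 1) / (new_len - 1)
        i = min(int(t), n - 2)  # index of the segment's left row
        u = t - i               # offset inside the segment, in [0, 1]
        p0, p1, p2, p3 = row(i - 1), row(i), row(i + 1), row(i + 2)
        out.append([
            0.5 * (2.0 * b
                   + (c - a) * u
                   + (2.0 * a - 5.0 * b + 4.0 * c - d) * u ** 2
                   + (3.0 * b - a - 3.0 * c + d) * u ** 3)
            for a, b, c, d in zip(p0, p1, p2, p3)
        ])
    return out


# Toy table: 4 positions, 2 dimensions, stretched to 7 positions.
table = [[0.0, 10.0], [1.0, 8.0], [2.0, 6.0], [3.0, 4.0]]
extended = resample_positions_spline(table, 7)
```

Because the curve passes through the original rows, the first and last positions of the extended table keep the values of the first and last original positions, and the target length can be any value of at least 2 rather than only a multiple of the original length.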
Linear Interpolation
Use this to double the model's positional embedding range by inserting the average of each pair of consecutive embeddings:
extend-context-linear \
--model_name_or_path="intfloat/e5-large-v2" \
--embeddings_attr_name="embeddings.position_embeddings" \
--offset=0 \
--output_dir="intfloat/e5-large-v2-ctx1024-linear"
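The averaging step can be sketched in a few lines of pure Python. This is a simplified illustration rather than the package's code: the name `double_positions_linear` is hypothetical, and the handling of the final row, which has no right-hand neighbour to average with, is an assumption (here it is extrapolated by half a step).

```python
def double_positions_linear(emb):
    """Return a table with twice as many rows as `emb`.

    `emb` is a list of equal-length float rows, one row per position.
    """
    n = len(emb)
    if n < 2:
        raise ValueError("need at least two position embeddings")
    out = []
    for i in range(n - 1):
        # The original row i lands at index 2 * i of the new table.
        out.append(list(emb[i]))
        # The new row between i and i + 1 is their element-wise average.
        out.append([(a + b) / 2.0 for a, b in zip(emb[i], emb[i + 1])])
    out.append(list(emb[-1]))
    # Assumption: the last new row is extrapolated half a step past the end.
    out.append([b + (b - a) / 2.0 for a, b in zip(emb[-2], emb[-1])])
    return out


table = [[0.0, 1.0], [2.0, 3.0], [4.0, 7.0]]
doubled = double_positions_linear(table)
```

In this toy example the three original rows become six: each original row is kept, and a midpoint row is placed after it.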
Both commands modify the positional embeddings of a model and save the updated model to the specified directory. You can then upload the resulting model to Hugging Face or use it locally for inference.
For models such as RoBERTa, which reserve the first few positions for special purposes, remember to set an appropriate --offset argument.
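One way to picture the role of an offset is shown below, under the assumption that it means "leave the first `offset` rows untouched and interpolate only the rest". The helpers `extend_with_offset` and `insert_midpoints` are hypothetical names for this sketch and are not part of the package. For context, the Hugging Face RoBERTa implementation starts position ids at `padding_idx + 1`, that is, at 2, so its first two rows are not ordinary positions; check which offset value your model actually needs.

```python
def insert_midpoints(rows):
    """Place the average of each consecutive pair between its two rows."""
    out = []
    for a, b in zip(rows, rows[1:]):
        out.append(list(a))
        out.append([(x + y) / 2.0 for x, y in zip(a, b)])
    out.append(list(rows[-1]))
    return out


def extend_with_offset(emb, offset, extend_fn):
    """Keep the first `offset` rows as they are and extend only the others."""
    if not 0 <= offset < len(emb):
        raise ValueError("offset must leave at least one row to extend")
    reserved = [list(r) for r in emb[:offset]]
    return reserved + extend_fn(emb[offset:])


# Toy table: two reserved rows followed by three real positions.
table = [[9.0], [9.0], [0.0], [2.0], [4.0]]
result = extend_with_offset(table, 2, insert_midpoints)
```

Without the offset, the reserved rows would be blended into the first real positions, which is the failure mode the --offset argument is meant to avoid.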