82 projects
pyprocessors-rf_consolidate
Consolidate RF annotations coming from different annotators
pyformatters-xml-rf
Groupe RF XML formatter
pyprocessors-afp_keywords
Processor based on AFP keywords extraction
pysegmenters-blingfire
Segmenter based on BlingFire
pyprocessors-iptc_mapper
Sherpa IPTC category mapper
pyconverters-whisperx
WhisperX converter for audio transcription with speaker diarization support.
pyconverters-cairn_xml
Cairn.info XML converter
pyprocessors-afp_entities
Consolidate AFP entity annotations coming from different annotators
pyannotators-entityfishing
Annotator based on entity-fishing
pyprocessors-segment_renseignor
Create segments from annotations based on Renseignor document structure
pyprocessors-deepl
DeepL processor plugin for pymultirole
pyconverters-openai_audio
OpenAIAudio converter
pysegmenters-syntok
syntok segmenter
pysegmenters-rules_segmenter
Rule-based segmenter
pysegmenters-pysdb
Rule-based segmenter
pysegmenters-md_splitter
Markdown splitter segmenter
pyprocessors-tag2segment
Create segments from annotations
pyprocessors-standoff2inline
Sherpa transform annotations to categories processor
pyprocessors-reconciliation
Sherpa reconciliation processor
pyprocessors-pseudonimizer
Processor based on Presidio anonymizer
pyprocessors-openai_completion
OpenAICompletion processor
pyprocessors-nameparser
Processor based on Nameparser
pyprocessors-document_fingerprint
Sherpa Consolidation processor
pyprocessors-chunk_sentences
Sherpa sentence chunking processor
pyprocessors-categories_from_annotations
Sherpa transform annotations to categories processor
pyprocessors-capitalizer
Replace document text with capitalized annotations
pyformatters-tabular
Tabular formatter for Sherpa
pyconverters-pyword
Convert DOCX to Markdown using mammoth
pyconverters-pypowerpoint
Convert PPTX to text using python-pptx
pyconverters-pyexcel
Convert XLSX to a document with one segment per row
pyconverters-pubmedfetcher
Fetch and convert Pubmed articles
pyconverters-paddleocr
Convert PDF to structured text using PaddleOCR
pyconverters-openai_vision
OpenAIVision converter
pyconverters-mistralocr
Convert PDF to structured text using MistralOCR
pyconverters-inscriptis
Convert HTML to text using inscriptis
pyannotators-spacyner
Annotator based on Spacy NER
pyannotators-spacymatcher
SpacyMatcher annotator using the spacy rule-matching engine
pyannotators-patterns
Annotator based on Presidio pattern recognizer
pyannotators-duckling
Annotator based on Facebook Duckling
pyannotators-acronyms
Annotator based on Facebook's Acronyms
pyconverters-grobid
Convert PDF to structured text using Grobid
pymultirole-plugins
Sherpa multirole plugins
pyconverters-newsml
NewsML converter (AFP news)
pyprocessors-opennre
Processor based on Huggingface transformers zero-shot classification pipeline
pyprocessors-afp_sports
Sherpa Consolidation processor
pyconverters-isako
Convert PDF to structured text using Isako
pyprocessors-consolidate
Sherpa Consolidation processor
pyprocessors-generative_augmenter
GenerativeAugmenter processor
pyprocessors-bel_entities
Sherpa Consolidation processor
pyprocessors-xcago_reconciliation
Sherpa xcago_reconciliation processor
pyformatters-bel_table
Sherpa BELTable formatter
pyconverters-xcago
X-CAGO converter.
pyformatters-afp_quality
Sherpa AFP Quality formatter
pyimporters-skos
Sherpa knowledge import plugins
pyimporters-json
Sherpa knowledge import plugins
pyimporters-csv
Sherpa knowledge import plugins
pyimporters-skos-rf
Sherpa knowledge import plugins
pyimporters-mesh
Sherpa knowledge import plugins
pyimporters-obo
Sherpa knowledge import plugins
pyimporters-plugins
Sherpa knowledge import plugins
pyconverters-ocrmypdf
Convert OCRed PDF to text using OCRmyPDF
pyprocessors-escrim_reconciliation
Sherpa ESCRIM processor
pyprocessors-rf_resegment
Sherpa sentence chunking processor
pyprocessors-restore_punctuation
Sherpa sentence chunking processor
pyprocessors-q_and_a
Processor based on Huggingface transformers Q&A pipeline
pyprocessors-readinggrid
Processor that generates a focused reading grid (keeps only the sentences containing annotations)
pyprocessors-normalizer
Sherpa normalizer
pyprocessors-mazars_table
Sherpa Mazars normalizer
pyformatters-textranksummarizer
Formatter/processor based on TextRank
pyformatters-summarizer
Formatter based on Huggingface transformers summarization pipeline
pyconverters-speech
Speech recognition converter based on Huggingface pipeline
pyconverters-deeptranscript
DeepTranscript converter.
pyannotators-trankitner
Annotator based on Trankit NER
pyannotators-trfclassifier
Classifier based on Huggingface Text Classification pipeline
pyannotators-zeroshotclassifier
Annotator based on Huggingface transformers zero-shot classification pipeline
pyprocessors-similar_segments
Similar segments processor
pyprocessors-search_segments
Similar segments processor
pyformatters-consolidate
Sherpa Consolidation formatter
pysegmenters-rules
Rule-based segmenter
pyprocessors-silero_te
text repunctuation and recapitalization for
pysegmenters-spacyrules
Segmenter based on Spacy
pyimporters-dummy
Sherpa knowledge import plugins