llama-index readers microsoft_sharepoint integration

Microsoft SharePoint Reader

pip install llama-index-readers-microsoft-sharepoint

This loader reads files from a folder in a SharePoint site, or loads SharePoint Site Pages.

It can also traverse sub-folders recursively.

Prerequisites

App Authentication using Microsoft Entra ID (formerly Azure AD)

Create an App Registration in Microsoft Entra ID (see Microsoft's documentation on registering an application).
Grant the app the following API permissions:
- Microsoft Graph → Application Permissions → Sites.Read.All (Grant Admin Consent) (Allows access to all sites in the tenant)
- OR Microsoft Graph → Application Permissions → Sites.Selected (Grant Admin Consent) (Allows access only to specific sites you select and grant permissions for)
- Microsoft Graph → Application Permissions → Files.Read.All (Grant Admin Consent)
- Microsoft Graph → Application Permissions → BrowserSiteLists.Read.All (Grant Admin Consent)

Note: If you use Sites.Selected, you must grant your app access to the specific SharePoint site(s) via the SharePoint admin center. See Grant access to a specific site for details.
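Besides the admin center, Microsoft Graph also exposes a "create permission" call on a site that can grant an application access. The sketch below only builds the request URL and JSON body for that call and does not send anything. The helper name build_site_permission_request and all placeholder values are assumptions for illustration, and the request would have to be sent with a token for an identity that is itself allowed to manage site permissions.

```python
import json


def build_site_permission_request(site_id, app_client_id, app_display_name, role="read"):
    """Return (url, body) for a Microsoft Graph 'create site permission' request.

    Hypothetical helper: it only assembles the request and performs no network I/O.
    """
    url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/permissions"
    body = {
        "roles": [role],
        "grantedToIdentities": [
            {"application": {"id": app_client_id, "displayName": app_display_name}}
        ],
    }
    return url, body


url, body = build_site_permission_request(
    site_id="<site-id>",
    app_client_id="<Client ID of the app>",
    app_display_name="<App display name>",
)
print(url)
print(json.dumps(body, indent=2))
```

The body would be sent as an HTTP POST to the printed URL. A read role is enough for a loader that only reads content.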

For more information, see the Microsoft Graph API documentation.

Usage

To use this loader, you need the client_id, client_secret, and tenant_id of the app registered in the Microsoft Azure Portal.
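One way to keep these three values out of source code is to read them from environment variables. The following is a minimal sketch and not part of the package: the helper load_sharepoint_credentials and the variable names SHAREPOINT_CLIENT_ID, SHAREPOINT_CLIENT_SECRET, and SHAREPOINT_TENANT_ID are assumptions chosen for illustration.

```python
import os


def load_sharepoint_credentials(env=None):
    """Collect the three app credentials from environment variables.

    The variable names are hypothetical; use whatever your deployment defines.
    """
    env = os.environ if env is None else env
    names = {
        "client_id": "SHAREPOINT_CLIENT_ID",
        "client_secret": "SHAREPOINT_CLIENT_SECRET",
        "tenant_id": "SHAREPOINT_TENANT_ID",
    }
    # Fail early with a clear message instead of passing empty credentials on.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

The returned dictionary uses the same keyword names as the constructor, so it can be unpacked directly, for example SharePointReader(**load_sharepoint_credentials()).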

Loading Files from SharePoint Drive

The loader reads the files in a specific SharePoint folder.

If the files are in the Test folder directly under the root directory of the SharePoint site, pass Test as sharepoint_folder_path.

from llama_index.readers.microsoft_sharepoint import SharePointReader

loader = SharePointReader(
    client_id="<Client ID of the app>",
    client_secret="<Client Secret of the app>",
    tenant_id="<Tenant ID of the Microsoft Azure Directory>",
)

documents = loader.load_data(
    sharepoint_site_name="<Sharepoint Site Name>",
    sharepoint_folder_path="<Folder Path>",
    recursive=True,
)

Using Sites.Selected Permission

If you have only been granted access to a specific site (using Sites.Selected), you can use the site host name and relative URL instead of the site name:

from llama_index.readers.microsoft_sharepoint import SharePointReader

loader = SharePointReader(
    client_id="<Client ID of the app>",
    client_secret="<Client Secret of the app>",
    tenant_id="<Tenant ID of the Microsoft Azure Directory>",
    sharepoint_host_name="contoso.sharepoint.com",
    sharepoint_relative_url="sites/YourSiteName",
)

documents = loader.load_data(
    sharepoint_folder_path="<Folder Path>",
    recursive=True,
)
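If you start from a full site URL, the host name and relative URL can be derived with the standard library. This is a small sketch, with split_site_url as a hypothetical helper. It returns the relative URL without a leading slash to match the example above; whether the reader also accepts a leading slash is not confirmed here.

```python
from urllib.parse import urlparse


def split_site_url(site_url):
    """Split a full SharePoint site URL into (host name, relative URL)."""
    parsed = urlparse(site_url)
    if not parsed.netloc:
        raise ValueError(f"Not an absolute URL: {site_url!r}")
    # Drop leading and trailing slashes so the result looks like "sites/YourSiteName".
    return parsed.netloc, parsed.path.strip("/")


host_name, relative_url = split_site_url(
    "https://contoso.sharepoint.com/sites/YourSiteName"
)
print(host_name)     # contoso.sharepoint.com
print(relative_url)  # sites/YourSiteName
```

The two values can then be passed as sharepoint_host_name and sharepoint_relative_url.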

Loading SharePoint Site Pages

You can also load SharePoint Site Pages as documents by setting sharepoint_type to SharePointType.PAGE:

from llama_index.readers.microsoft_sharepoint import (
    SharePointReader,
    SharePointType,
)

loader = SharePointReader(
    client_id="<Client ID of the app>",
    client_secret="<Client Secret of the app>",
    tenant_id="<Tenant ID of the Microsoft Azure Directory>",
    sharepoint_site_name="<Sharepoint Site Name>",
    sharepoint_host_name="<your-tenant>.sharepoint.com",
    sharepoint_relative_url="/sites/<YourSite>",
    sharepoint_type=SharePointType.PAGE,
)

# Load all pages
documents = loader.load_data()

# Or load a specific page by ID
loader.sharepoint_file_id = "<page_id>"
documents = loader.load_data()

Filtering Pages with Callbacks

You can filter which pages to process using the process_document_callback:

from llama_index.readers.microsoft_sharepoint import (
    SharePointReader,
    SharePointType,
)


def page_filter(page_name: str) -> bool:
    # Only process pages whose names don't start with "Draft"
    return not page_name.startswith("Draft")


loader = SharePointReader(
    client_id="<Client ID>",
    client_secret="<Client Secret>",
    tenant_id="<Tenant ID>",
    sharepoint_site_name="<Site Name>",
    sharepoint_type=SharePointType.PAGE,
    process_document_callback=page_filter,
)
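Because the callback is a plain function that takes a page name and returns a bool, it can be checked on sample names before it is wired into the reader. The sketch below assumes that same callback shape; make_prefix_filter and the sample page names are hypothetical.

```python
def make_prefix_filter(excluded_prefixes):
    """Build a callback that rejects names starting with any given prefix."""
    prefixes = tuple(excluded_prefixes)

    def page_filter(page_name: str) -> bool:
        # str.startswith accepts a tuple and matches any of its entries.
        return not page_name.startswith(prefixes)

    return page_filter


page_filter = make_prefix_filter(["Draft", "Archive"])
sample_names = ["Draft-roadmap.aspx", "Home.aspx", "Archive-2023.aspx"]
print([name for name in sample_names if page_filter(name)])  # ['Home.aspx']
```

The resulting function can be passed as process_document_callback in the same way as the single-prefix filter.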

Error Handling

Control error behavior with fail_on_error:

from llama_index.readers.microsoft_sharepoint import SharePointReader

loader = SharePointReader(
    client_id="<Client ID>",
    client_secret="<Client Secret>",
    tenant_id="<Tenant ID>",
    fail_on_error=False,  # Log errors and continue instead of raising
)
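The difference between the two settings can be illustrated with a generic loop. This is a sketch of the log-and-continue pattern in general, not the reader's actual implementation; process_all and parse_number are hypothetical names.

```python
import logging

logger = logging.getLogger(__name__)


def process_all(items, handler, fail_on_error=True):
    """Apply handler to each item; either re-raise or log and skip on failure."""
    results = []
    for item in items:
        try:
            results.append(handler(item))
        except Exception:
            if fail_on_error:
                raise  # Stop at the first failure.
            logger.exception("Skipping %r after an error", item)
    return results


def parse_number(text):
    return int(text)


print(process_all(["1", "x", "3"], parse_number, fail_on_error=False))  # [1, 3]
```

With fail_on_error=True, the same call would raise a ValueError at "x" and return nothing.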

Instrumentation Events

The SharePoint reader emits events during page processing for monitoring:

from llama_index.core.instrumentation import get_dispatcher
from llama_index.core.instrumentation.event_handlers import BaseEventHandler
from llama_index.readers.microsoft_sharepoint import (
    TotalPagesToProcessEvent,
    PageDataFetchCompletedEvent,
    PageFailedEvent,
)


class SharePointEventHandler(BaseEventHandler):
    def handle(self, event, **kwargs):
        if isinstance(event, TotalPagesToProcessEvent):
            print(f"Processing {event.total_pages} pages...")
        elif isinstance(event, PageDataFetchCompletedEvent):
            print(f"Completed: {event.page_id}")
        elif isinstance(event, PageFailedEvent):
            print(f"Failed: {event.page_id} - {event.error}")


dispatcher = get_dispatcher("llama_index.readers.microsoft_sharepoint.base")
dispatcher.add_event_handler(SharePointEventHandler())

Available events:

- TotalPagesToProcessEvent: Total number of pages to process
- PageDataFetchStartedEvent: Page processing started
- PageDataFetchCompletedEvent: Page successfully processed
- PageSkippedEvent: Page skipped (via callback)
- PageFailedEvent: Page processing failed
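For a run summary rather than per-event prints, the events can be tallied by type. The sketch below is independent of the package so that it runs on its own: EventTally is a hypothetical helper, and the two stand-in classes merely take the place of real event objects.

```python
from collections import Counter


class EventTally:
    """Count received events by their class name."""

    def __init__(self):
        self.counts = Counter()

    def record(self, event):
        self.counts[type(event).__name__] += 1

    def summary(self):
        return ", ".join(
            f"{name}={count}" for name, count in sorted(self.counts.items())
        )


# Stand-in event classes so the sketch runs without the package installed.
class _FakeCompleted:
    pass


class _FakeFailed:
    pass


tally = EventTally()
for event in [_FakeCompleted(), _FakeFailed(), _FakeCompleted()]:
    tally.record(event)
print(tally.summary())  # _FakeCompleted=2, _FakeFailed=1
```

In practice, tally.record(event) would be called from the handle method of an event handler such as SharePointEventHandler, and summary() printed once loading finishes.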
