MS365Sync
A Python library for syncing files between Microsoft 365 SharePoint and local storage.
Features
- 🔄 Change detection: Automatically detects files added, modified, or deleted in SharePoint since the last sync
- 📁 Hierarchical support: Maintains folder structures during sync
- 🔐 OAuth2 authentication: Secure authentication using Microsoft Graph API
- 📊 Detailed logging: Comprehensive sync reports and file trees
- 🚀 CLI and library: Use as a command-line tool or import as a Python library
- ⚡ Efficient: Only downloads changed files to minimize bandwidth usage
Installation
From PyPI
pip install ms365sync
From source
git clone https://github.com/yourusername/ms365sync.git
cd ms365sync
pip install -e .
Development installation
git clone https://github.com/yourusername/ms365sync.git
cd ms365sync
pip install -e ".[dev]"
Configuration
Create a .env file in your project directory with the following variables:
TENANT_ID=your-azure-tenant-id
CLIENT_ID=your-azure-app-client-id
CLIENT_SECRET=your-azure-app-client-secret
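As an illustration of how these three settings might be read and validated before a sync starts, here is a minimal standard-library sketch. The helper names `parse_env_text` and `load_config` are hypothetical and are not part of ms365sync; the rule that real environment variables take precedence over the .env file is also an assumption.

```python
from __future__ import annotations

import os
from pathlib import Path

# The three settings this README lists as required.
REQUIRED_VARS = ("TENANT_ID", "CLIENT_ID", "CLIENT_SECRET")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_config(env_path: str = ".env") -> dict[str, str]:
    """Merge .env values with os.environ (assumption: environment wins)."""
    path = Path(env_path)
    file_values = parse_env_text(path.read_text(encoding="utf-8")) if path.exists() else {}
    config = {name: os.environ.get(name, file_values.get(name, "")) for name in REQUIRED_VARS}
    missing = [name for name, value in config.items() if not value]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    return config
```

Failing early with the list of missing names tends to be easier to debug than an authentication error later in the run.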
Azure App Registration
- Go to the Azure Portal
- Navigate to "Azure Active Directory" (now named "Microsoft Entra ID") → "App registrations"
- Click "New registration"
- Set application type to "Web"
- Under "API permissions", add:
  - Sites.Read.All (to read SharePoint sites)
  - Files.Read.All (to read files)
  - Files.ReadWrite.All (if you need write access)
- Generate a client secret under "Certificates & secrets"
- Copy the Application (client) ID, Directory (tenant) ID, and client secret
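To show where the three copied values end up, the sketch below builds (but does not send) an OAuth2 client-credentials token request against the Microsoft identity platform. It assumes the library authenticates app-only with the client secret, in which case the permissions above would be granted as application permissions with admin consent. `build_token_request` is a hypothetical helper, not an ms365sync function.

```python
from urllib.parse import urlencode
from urllib.request import Request

# The ".default" scope requests whatever Graph permissions the app was granted.
GRAPH_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build the POST request for the client-credentials grant (not sent here)."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": GRAPH_SCOPE,
        }
    ).encode("ascii")
    return Request(
        url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the exchange; a successful JSON response carries an `access_token` that is then sent to Graph as a bearer token.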
Usage
Command Line Interface
# Basic sync
ms365sync
# Verbose output
ms365sync --verbose
# Dry run (see what would be synced)
ms365sync --dry-run
# Use custom config file
ms365sync --config /path/to/your/.env
Python Library
from ms365sync import SharePointSync
# Initialize the sync client
syncer = SharePointSync()
# Perform sync and get changes
changes = syncer.sync()
print(f"Added: {len(changes['added'])} files")
print(f"Modified: {len(changes['modified'])} files")
print(f"Deleted: {len(changes['deleted'])} files")
Advanced Usage
from ms365sync import SharePointSync
import os
# Custom configuration
os.environ['TENANT_ID'] = 'your-tenant-id'
os.environ['CLIENT_ID'] = 'your-client-id'
os.environ['CLIENT_SECRET'] = 'your-client-secret'
syncer = SharePointSync()
# Get SharePoint files without syncing
sp_files = syncer.get_sharepoint_files()
print(f"Found {len(sp_files)} files in SharePoint")
# Get local files
local_files = syncer.get_local_files()
print(f"Found {len(local_files)} local files")
# Compare without syncing
added, modified, deleted = syncer.compare_files(sp_files, local_files)
print(f"Would add: {len(added)}, modify: {len(modified)}, delete: {len(deleted)}")
Configuration Options
The library uses the following configuration variables (set in .env or environment):
| Variable | Description | Required |
|---|---|---|
| TENANT_ID | Azure Active Directory tenant ID | Yes |
| CLIENT_ID | Azure app registration client ID | Yes |
| CLIENT_SECRET | Azure app registration client secret | Yes |
The following constants can be modified in the code:
SHAREPOINT_HOST = "your-sharepoint-site.sharepoint.com"
SITE_NAME = "Your Site Name" # Display name as seen in SharePoint
DOC_LIBRARY = "Your Document Library" # Display name
LOCAL_ROOT = pathlib.Path("download") # Local destination folder
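For orientation, this is one way a file's path inside the document library could be mapped to a destination under `LOCAL_ROOT` so that the folder hierarchy is preserved. It is a sketch only: `local_path_for` is a hypothetical helper, and rejecting `..` segments is an added safety assumption rather than documented library behavior.

```python
import pathlib

LOCAL_ROOT = pathlib.Path("download")  # Same default as the constant above


def local_path_for(relative_path: str, root: pathlib.Path = LOCAL_ROOT) -> pathlib.Path:
    """Map a library-relative path such as 'Reports/2024/q1.xlsx' under root."""
    # Normalize separators and drop empty or "." segments.
    parts = [p for p in relative_path.replace("\\", "/").split("/") if p not in ("", ".")]
    if ".." in parts:
        # Assumption: never allow a remote name to escape the sync folder.
        raise ValueError(f"Refusing path outside the sync root: {relative_path!r}")
    return root.joinpath(*parts)
```

Calling `local_path_for("Reports/2024/q1.xlsx")` yields a path under `download/Reports/2024/`, whose parent directories would need to be created before writing the file.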
File Structure
ms365sync/
├── __init__.py # Package initialization
├── sharepoint_sync.py # Main sync logic
└── cli.py # Command-line interface
download/ # Default local sync folder
sync_logs/ # Sync change logs (JSON)
Sync Process
- Authentication: Connects to Microsoft Graph API using OAuth2
- Discovery: Recursively scans SharePoint document library
- Comparison: Compares SharePoint files with local files by size and modification date
- Sync: Downloads new/modified files, deletes files removed from SharePoint
- Logging: Saves a detailed change log to sync_logs/sync_changes_TIMESTAMP.json
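The discovery, comparison, and logging steps above can be sketched with the standard library alone. This is not the library's implementation: the per-file record shape (`size`, `modified` as epoch seconds), the one-second tolerance, the rule that only a newer remote copy counts as modified, and the timestamp format in the log file name are all assumptions made for illustration.

```python
from __future__ import annotations

import json
from datetime import datetime, timezone
from pathlib import Path


def scan_local(root: Path) -> dict[str, dict]:
    """Recursively index local files as {relative_path: {"size", "modified"}}."""
    files: dict[str, dict] = {}
    for path in root.rglob("*"):
        if path.is_file():
            stat = path.stat()
            files[path.relative_to(root).as_posix()] = {
                "size": stat.st_size,
                "modified": stat.st_mtime,
            }
    return files


def compare_files(remote: dict[str, dict], local: dict[str, dict], tolerance: float = 1.0):
    """Return (added, modified, deleted) relative paths, each sorted."""
    added = sorted(p for p in remote if p not in local)
    deleted = sorted(p for p in local if p not in remote)
    modified = sorted(
        p
        for p in remote
        if p in local
        and (
            remote[p]["size"] != local[p]["size"]
            # Assumption: a remote copy newer than the local one means "modified".
            or remote[p]["modified"] - local[p]["modified"] > tolerance
        )
    )
    return added, modified, deleted


def write_change_log(changes: dict, log_dir: Path = Path("sync_logs")) -> Path:
    """Write the change summary to sync_logs/sync_changes_<timestamp>.json."""
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")  # Assumed format
    log_path = log_dir / f"sync_changes_{stamp}.json"
    log_path.write_text(json.dumps(changes, indent=2), encoding="utf-8")
    return log_path
```

Comparing size and modification time avoids downloading file contents just to detect changes, at the cost of missing an edit that preserves both values.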
Error Handling
The library includes comprehensive error handling:
- Authentication errors: Clear messages for invalid credentials
- Network errors: Retry logic for temporary connection issues
- File system errors: Graceful handling of permission issues
- API errors: Proper handling of SharePoint/Graph API limitations
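The retry behavior mentioned for network errors could look roughly like the following exponential-backoff wrapper. It is a generic sketch, not the library's code: `with_retries`, its defaults, and the choice of which exception types count as temporary are assumptions.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying listed exceptions with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # Out of retries: surface the last error to the caller.
            sleep(base_delay * 2 ** (attempt - 1))  # 1x, 2x, 4x, ... base_delay
    raise AssertionError("unreachable")
```

A download call would be wrapped as `with_retries(lambda: download(item))`, where `download` stands for whatever function performs the request. Errors outside `retry_on`, such as invalid credentials, propagate immediately since retrying them would not help.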
Development
Setting up development environment
git clone https://github.com/yourusername/ms365sync.git
cd ms365sync
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -e ".[dev]"
Running tests
pytest
Code formatting
black ms365sync/
isort ms365sync/
Type checking
mypy ms365sync/
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Changelog
Version 0.1.0
- Initial release
- Basic SharePoint to local sync functionality
- Command-line interface
- Comprehensive logging and error handling
Roadmap
- Implement dry-run mode
- Add configuration file support (YAML/JSON)
- Implement upload functionality (local to SharePoint)
- Add filtering options (file types, patterns)
- Add scheduled sync support
- Implement incremental sync optimization
- Add progress bars for large syncs
- Support for multiple SharePoint sites