
A Phased Implementation Framework for Moroccan Digital and Knowledge Sovereignty


🇲🇦 Moroccan Digital–Physical Factory (MDPF) v1.0.0


From Regional Cloud Hosting to National Knowledge Sovereignty



A Phased, Low-Capital-First Pathway to Moroccan Digital and Knowledge Sovereignty, Grounded in the Maroc Digital 2030 / Cloud First Policy Landscape

Independent research paper · June 2026




๐ŸŒ Overview

The Moroccan Digital–Physical Factory (MDPF) is a conceptual and practical framework proposing how Morocco's pursuit of digital and knowledge sovereignty can be built incrementally rather than all at once. Sovereignty in this domain rests on a pyramid of five interdependent capacities (physical infrastructure, processing, semantic representation, cultural identity, and governance) that cannot realistically be constructed simultaneously without prohibitive capital expenditure.

This project proposes a phased, low-capital-first implementation pathway: instead of beginning with nationally owned data-center construction, independent researchers and small institutions can start with regionally hosted or Morocco-resident cloud computing, deferring sovereign physical infrastructure to a later maturation stage. The highest-leverage, lowest-cost entry points are the processing layer (classification, clustering, retrieval) and the cultural-identity layer (Moroccan-language and dialectal corpora, particularly Darija), both of which require methodological and linguistic expertise rather than capital.

🧠 Core argument: The conventional bottom-up build order (infrastructure before processing before representation before culture before governance) should be inverted in execution while preserved as a conceptual hierarchy. Demonstrated value at the processing and cultural-identity layers is what subsequently attracts infrastructure partnerships and institutional attention, not the reverse.

The framework is grounded in Morocco's existing institutional landscape: the national high-performance computing capacity already operated by Mohammed VI Polytechnic University (UM6P), the government's 2025–2030 "Cloud First" roadmap under the Digital Morocco 2030 strategy, and emerging sovereign data-center investment such as the EcoDar facility in Dakhla.


🎯 Why This Framework

Digital sovereignty has moved from an abstract policy aspiration to an operational requirement for national governments. For Morocco, this is codified in the Maroc Digital 2030 strategy (launched September 2024 under the Ministry of Digital Transition and Administrative Reform), which sets explicit targets for AI adoption, public-service digitalization, and specialized digital infrastructure. A central pillar, formalized in December 2025, is the national "Cloud First" roadmap for 2025–2030, which frames cloud adoption as a lever of national sovereignty rather than a purely technical choice.

A practical gap exists between this national-level policy ambition and the operational reality faced by independent researchers, small laboratories, and emerging technology teams who want to contribute to Moroccan knowledge infrastructure but cannot deploy capital at the scale of state-backed data-center programs. MDPF addresses that gap directly.


๐Ÿ—๏ธ The Five-Layer Framework

| # | Layer | Approach | Who Leads |
|---|-------|----------|-----------|
| 1 | Data Infrastructure | Deferred; rented, not built | State / quasi-state institutions (long term) |
| 2 | Processing (classification, clustering, retrieval) | Recommended entry point: rented compute, open-source tooling | Independent researchers, small teams |
| 3 | Digital / Semantic Representation | Fine-tune open-weight models on Moroccan corpora | Researchers, university partnerships (ENSIAS, INPT) |
| 4 | Cultural & Knowledge Identity | Structure Moroccan legal, historical, cultural data into machine-usable datasets | Best suited to individuals / small teams |
| 5 | Governance & Sovereignty | National education, research, and policy integration | Ministry of Digital Transition, ANRT, ADD |

Layer 1: Data Infrastructure (Deferred, Rented, Not Built)

Entry-level implementation rents compute from Morocco-resident commercial cloud regions (e.g., Oracle's Casablanca and Settat regions), regional North African/European cloud zones with low-latency connectivity to Morocco, or general-purpose international providers, with an explicit migration plan toward in-country hosting once available. This removes the largest capital barriers: energy, GPU/TPU procurement, and physical security.

Layer 2: Processing (Classification, Clustering, Retrieval)

The recommended primary entry point. Classification (sorting data by Moroccan dialectal and administrative context), clustering (building knowledge groups across law, culture, education), and retrieval (activating stored knowledge on demand) require methodological expertise and curated data rather than capital. A working prototype can be built on rented compute with open-source tooling.
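As an illustration of how little is needed to prototype the retrieval part of this layer, the following is a minimal sketch of TF-IDF retrieval with cosine similarity, using only the Python standard library. It is not the project's pipeline: the function names (`tokenize`, `build_index`, `retrieve`) and the three English placeholder documents are assumptions made for this example, and a real prototype would use curated Moroccan corpora and more careful tokenization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # \w+ matches Unicode word characters, so Arabic-script and
    # Latin-script tokens are both captured.
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Return (idf, vectors): smoothed IDF weights and one TF-IDF dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenized
    ]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, idf, vectors, k=3):
    """Return up to k (doc_index, score) pairs with non-zero similarity, best first."""
    counts = Counter(t for t in tokenize(query) if t in idf)
    q_vec = {t: tf * idf[t] for t, tf in counts.items()}
    scored = sorted(
        ((cosine(q_vec, v), i) for i, v in enumerate(vectors)), reverse=True
    )
    return [(i, score) for score, i in scored[:k] if score > 0.0]


# Placeholder documents (hypothetical, for illustration only).
DOCS = [
    "procedure for renewing the national identity card",
    "family code provisions on marriage contracts",
    "history of the old medina of Fez",
]
IDF, VECTORS = build_index(DOCS)
top = retrieve("renewing an identity card", IDF, VECTORS)
print(top[0][0])  # index of the best-matching document
```

The same sparse vectors could feed a simple classifier or clustering step, which is why a single small index is a reasonable first artifact for this layer.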

Layer 3: Digital Representation (Semantic Models)

Fine-tuning existing open-weight language models on Moroccan-specific corpora (Darija text, administrative/legal documents, regional news) is computationally lighter than foundation-model training and runs on the same rented compute as Layer 2.

Layer 4: Cultural and Knowledge Identity

Structuring and documenting Moroccan legal, historical, and cultural knowledge into machine-usable datasets, published on open-science platforms (Zenodo, OSF, HuggingFace) under DOI and preregistration practices, creates durable, citable infrastructure without requiring institutional permission or capital.
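One way to picture a "machine-usable dataset" is a flat record schema serialized as JSON Lines, a format that open-science platforms can host as an ordinary file. The sketch below is an assumption made for illustration, not the project's schema: the `CorpusRecord` fields, the record identifiers, and the sample values are all hypothetical. The only external fact used is that `ary` is the ISO 639-3 code for Moroccan Arabic.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class CorpusRecord:
    # All field names are hypothetical; adapt them to the actual corpus.
    record_id: str
    text: str
    language: str  # ISO 639-3 code, e.g. "ary" for Moroccan Arabic (Darija)
    domain: str    # e.g. "legal", "historical", "cultural"
    source: str    # provenance note for the original document
    license: str   # license under which the record is released
    tags: list = field(default_factory=list)


def to_jsonl(records):
    """Serialize records to JSON Lines, keeping non-ASCII text readable."""
    return "\n".join(
        json.dumps(asdict(r), ensure_ascii=False, sort_keys=True) for r in records
    )


def from_jsonl(payload):
    """Parse JSON Lines back into CorpusRecord objects, skipping blank lines."""
    return [
        CorpusRecord(**json.loads(line))
        for line in payload.splitlines()
        if line.strip()
    ]


# Hypothetical sample records.
records = [
    CorpusRecord("rec-0001", "placeholder legal text", "ary", "legal",
                 "example source", "CC-BY-4.0", ["sample"]),
    CorpusRecord("rec-0002", "placeholder cultural text", "fra", "cultural",
                 "example source", "CC-BY-4.0"),
]
payload = to_jsonl(records)
restored = from_jsonl(payload)
```

Keeping provenance and license on every record, rather than only in a repository-level file, means any subset of the dataset remains citable and reusable on its own.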

Layer 5: Governance and Sovereignty

Embedding this work into national education, research, and policy structures requires engagement with state institutions: the Ministry of Digital Transition, ANRT, and the Agence de Développement du Digital (ADD). This is explicitly out of scope for direct execution by independent researchers; demonstrated output at Layers 2 and 4 is the credible route to institutional attention.


๐Ÿ—บ๏ธ Phased Implementation Roadmap

| Phase | Name | Description |
|-------|------|-------------|
| 0 | Data & corpus assembly | Collect and clean Moroccan-language and domain-specific datasets (Darija text, legal/administrative documents, cultural archives); local hardware only |
| 1 | Rented processing prototype | Deploy classification, clustering, retrieval pipelines on regional/international cloud compute; validate and publish methodology |
| 2 | Semantic fine-tuning | Fine-tune an open-weight LLM on assembled corpora; benchmark against general-purpose models on Moroccan-specific tasks |
| 3 | Open publication & dataset release | Publish datasets, model weights (where licensing permits), and methodology papers with DOIs |
| 4 | Institutional partnership | Approach UM6P or in-country cloud regions (Oracle Casablanca/Settat, future sovereign facilities like EcoDar) for scaled compute |
| 5 | Policy engagement | Engage ANRT, the Ministry of Digital Transition, or ADD regarding inclusion in national education, research, or policy frameworks |
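The "collect and clean" step of Phase 0 can run on a laptop. The following is a minimal sketch, assuming the corpus arrives as a list of text lines: it applies Unicode NFKC normalization, removes the Arabic tatweel (elongation) character and combining diacritics, collapses whitespace, and drops duplicates. The function names and the `min_chars` threshold are assumptions for this example; whether diacritics should be stripped at all is a corpus-design decision the framework does not specify.

```python
import re
import unicodedata

TATWEEL = "\u0640"  # Arabic elongation character, purely typographic


def normalize_line(text):
    """Normalize one line of Arabic-script or Latin-script text."""
    text = unicodedata.normalize("NFKC", text)
    text = text.replace(TATWEEL, "")
    # Drop combining marks (category Mn), which includes Arabic short-vowel
    # diacritics. NFKC keeps precomposed Latin letters such as "é" intact.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    return re.sub(r"\s+", " ", text).strip()


def clean_corpus(lines, min_chars=3):
    """Normalize lines, then drop very short lines and exact duplicates."""
    seen = set()
    cleaned = []
    for line in lines:
        norm = normalize_line(line)
        if len(norm) < min_chars or norm in seen:
            continue
        seen.add(norm)
        cleaned.append(norm)
    return cleaned


raw = [
    "\u0633\u064e\u0644\u064e\u0627\u0645",  # "salam" written with diacritics
    "\u0633\u0640\u0644\u0627\u0645",        # same word with a tatweel
    "  Caf\u00e9   ouvert ",                 # Latin-script line, extra spaces
    "ok",                                    # too short, dropped
]
cleaned = clean_corpus(raw)
```

After normalization the first two lines become identical, so only one copy survives deduplication.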

๐Ÿ›๏ธ Institutional and Policy Context

National Cloud Policy: The Cloud First roadmap, presented to the House of Representatives in late 2025, mandates a phased transition of government digital platforms toward cloud architectures, tying data residency and processing location to sovereignty. It is accompanied by forthcoming "Digital X.0" legislation on data flows, AI ethics, and interoperability.

Emerging Sovereign and Commercial Cloud Capacity: Between 2024 and 2026, approximately $1.1 billion has been allocated to the broader Digital Morocco 2030 strategy. Oracle announced a $140 million investment for cloud regions in Casablanca and Settat; Microsoft, AWS, and Google Cloud are reportedly evaluating Moroccan footholds. On the sovereign side, the EcoDar green data center in Dakhla is intended to keep sensitive national data resident within Moroccan jurisdiction.

Existing HPC Capacity: UM6P's African Supercomputing Center (inaugurated 2021), home to the Toubkal supercomputer, provides 3.15 petaflops of compute and ranks among the global TOP500 systems. UM6P has formalized cloud/AI partnerships with Oracle (first Oracle Lab in Africa, Casablanca 2022) and Microsoft's Africa Transformation Office, a precedent for academic–commercial cloud collaboration inside Morocco.


โš ๏ธ Risks and Limitations

| Risk | Description | Mitigation |
|------|-------------|------------|
| Data residency | Renting non-Moroccan cloud compute, even temporarily, may be in tension with sovereignty objectives | Encryption at rest/in transit, contractual data-deletion guarantees, avoidance of adverse jurisdictions, clear migration plan to in-country hosting |
| Vendor dependency | Reliance on commercial cloud providers introduces lock-in and pricing exposure | Build on portable, open-source tooling rather than provider-proprietary services |
| Legitimacy | Work produced outside institutional channels may face slower state adoption | Phase 4–5 sequencing treats institutional/policy engagement as a consequence of demonstrated output, not a precondition |
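The data-residency mitigations listed above can be turned into a checklist that a small team runs before provisioning anything. The sketch below is one hypothetical way to encode it; the `Deployment` fields, the `residency_issues` function, and the sample values are assumptions for illustration and do not correspond to any provider's API. Country codes follow ISO 3166-1 alpha-2, where `MA` is Morocco.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    # Hypothetical description of one rented environment.
    name: str
    region_country: str        # ISO 3166-1 alpha-2 code of the cloud region
    encrypted_at_rest: bool
    encrypted_in_transit: bool
    deletion_clause: bool      # contract includes a data-deletion guarantee
    migration_plan: bool       # documented plan to move to in-country hosting


def residency_issues(dep, home_country="MA", disallowed=frozenset()):
    """Return a list of unmet mitigations; an empty list means none found."""
    issues = []
    if dep.region_country in disallowed:
        issues.append("region is in a disallowed jurisdiction")
    if not dep.encrypted_at_rest:
        issues.append("data is not encrypted at rest")
    if not dep.encrypted_in_transit:
        issues.append("data is not encrypted in transit")
    if not dep.deletion_clause:
        issues.append("no contractual data-deletion guarantee")
    # A migration plan only matters while data sits outside the home country.
    if dep.region_country != home_country and not dep.migration_plan:
        issues.append("no migration plan to in-country hosting")
    return issues


prototype = Deployment(
    name="phase1-prototype",
    region_country="FR",       # placeholder: a European region
    encrypted_at_rest=True,
    encrypted_in_transit=True,
    deletion_clause=True,
    migration_plan=False,
)
issues = residency_issues(prototype)
```

Which jurisdictions count as adverse is a policy judgment, so the `disallowed` set is left empty by default rather than guessed at here.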

๐Ÿ—‚๏ธ Project Structure

mdpf/
│
├── README.md                          # This file
├── README_ar.md                       # Arabic version
├── LICENSE                            # MIT License
├── CONTRIBUTING.md                    # Contribution guidelines
├── CHANGELOG.md                       # Version history
│
├── paper/                             # Source paper and references
│   ├── moroccan_digital_factory_paper.docx
│   └── references.bib
│
├── docs/                              # Documentation
│   ├── index.md
│   ├── framework/                     # Per-layer documentation
│   │   ├── layer1_infrastructure.md
│   │   ├── layer2_processing.md
│   │   ├── layer3_representation.md
│   │   ├── layer4_cultural_identity.md
│   │   └── layer5_governance.md
│   └── roadmap/                       # Per-phase documentation
│       ├── phase0_corpus.md
│       ├── phase1_prototype.md
│       ├── phase2_finetuning.md
│       ├── phase3_publication.md
│       ├── phase4_partnership.md
│       └── phase5_policy.md
│
├── corpora/                           # Moroccan-language datasets (Phase 0)
│   ├── darija/
│   ├── legal_administrative/
│   └── cultural_archives/
│
├── pipelines/                         # Processing pipelines (Phase 1)
│   ├── classification/
│   ├── clustering/
│   └── retrieval/
│
└── models/                            # Fine-tuned semantic models (Phase 2)

🔬 Research Agenda

Empirical validation of this framework is an open research agenda, including:

  • Benchmarking classification/clustering/retrieval pipelines on Darija and Moroccan administrative corpora against general-purpose multilingual baselines
  • Quantifying the cost differential between rented regional cloud compute and projected in-country sovereign hosting
  • Documenting reuse and citation metrics for openly published Moroccan-language datasets as a proxy for "demonstrated value" preceding institutional partnership
  • Formalizing data-residency and data-deletion contractual templates suitable for independent researchers operating on commercial cloud infrastructure
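For the cost-differential item, the comparison reduces to simple arithmetic once unit prices are known. The sketch below shows the shape of that calculation only: the `HostingScenario` class is hypothetical, and every rate and workload figure is an arbitrary placeholder in unspecified currency units, not a measured or quoted price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostingScenario:
    name: str
    gpu_hour_rate: float          # cost per GPU-hour
    storage_gb_month_rate: float  # cost per GB stored per month
    egress_gb_rate: float         # cost per GB transferred out

    def monthly_cost(self, gpu_hours, storage_gb, egress_gb):
        return (
            gpu_hours * self.gpu_hour_rate
            + storage_gb * self.storage_gb_month_rate
            + egress_gb * self.egress_gb_rate
        )


def cost_differential(baseline, alternative, **workload):
    """Compare two scenarios on the same monthly workload."""
    base = baseline.monthly_cost(**workload)
    alt = alternative.monthly_cost(**workload)
    return {
        "baseline": base,
        "alternative": alt,
        "difference": alt - base,
        "ratio": alt / base if base else float("inf"),
    }


# Placeholder rates, chosen only to make the arithmetic easy to follow.
rented = HostingScenario("rented regional cloud", 2.0, 0.02, 0.05)
in_country = HostingScenario("projected in-country hosting", 3.0, 0.03, 0.0)

result = cost_differential(
    rented, in_country, gpu_hours=100, storage_gb=500, egress_gb=200
)
print(result["difference"])
```

With real quotes substituted for the placeholders, the same function can be evaluated across several workload sizes to see where, if anywhere, the two scenarios cross over.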

๐Ÿค Contributing

We welcome contributions from NLP researchers, Moroccan dialectology and Darija specialists, cloud/infrastructure engineers, and policy researchers.

# 1. Fork and clone
git clone https://gitlab.com/YOUR_USERNAME/mdpf.git
cd mdpf

# 2. Create a feature branch
git checkout -b feature/your-feature-name

# 3. Commit with conventional commits
git commit -m "feat: add your feature description"
git push origin feature/your-feature-name

# 4. Open a Merge Request on GitLab

Priority contribution areas:

  • Darija and Moroccan dialectal corpus collection and cleaning
  • Open-weight model fine-tuning recipes for Moroccan administrative/legal text
  • Classification, clustering, and retrieval pipeline implementations (Layer 2)
  • Cultural and historical knowledge dataset structuring (Layer 4)
  • Documentation translation (Arabic, French, English)

📖 Citation

@misc{Baladi2026MDPF,
  title   = {A Phased Implementation Framework for the Moroccan Digital--Physical
             Factory: From Regional Cloud Hosting to National Knowledge Sovereignty},
  author  = {Baladi, Samir},
  year    = {2026},
  month   = {June},
  note    = {Independent research paper}
}

👤 Author

| Name | Role | Affiliation |
|------|------|-------------|
| Samir Baladi | Author · Framework design · Analysis | Independent Researcher, Ronin Institute / Rite of Renaissance |

Corresponding author: Samir Baladi · gitdeeper@gmail.com · ORCID: 0009-0003-8903-0029


📄 License

This project is licensed under the MIT License; see LICENSE for details.


🇲🇦 MDPF: Building Moroccan digital sovereignty layer by layer, starting where capital isn't the constraint.

From rented compute to national infrastructure: an inverted, achievable build order.


