Made updates to high-level overview on main page

Diego Ripley
2026-01-26 11:30:34 -05:00
parent 4ccbe9f5b9
commit e1fb86a98f
+23 -14
@@ -4,12 +4,15 @@ toc: false
--- ---
## Mission ## Mission
Data for Canada exists to bridge the gap between open data availability and data usability. We curate, clean, and re-engineer high-value Canadian datasets into high-performance, analysis-ready formats for researchers, developers, and systems. Data for Canada exists to bridge the gap between open data availability and data usability. We curate, clean, and re-engineer high-value Canadian datasets into high-performance, analysis-ready formats for researchers, developers, and systems.
## The Problem ## The Problem
Canada creates incredible amounts of open data, from foundational road networks to federal census statistics. However, these datasets are often locked in legacy formats, fragmented portals, or structures that require significant engineering effort to normalize before they can be used. For a researcher or a system developer, the "time-to-insight" is often bottlenecked by data preparation. Canada creates incredible amounts of open data, from foundational road networks to federal census statistics. However, these datasets are often locked in legacy formats, fragmented portals, or structures that require significant engineering effort to normalize before they can be used. For a researcher or a system developer, the "time-to-insight" is often bottlenecked by data preparation.
## The Solution ## The Solution
We act as the transformation layer. We aggregate datasets with permissive licenses and process them into "digestible" standards optimized for modern downstream applications. We act as the transformation layer. We aggregate datasets with permissive licenses and process them into "digestible" standards optimized for modern downstream applications.
* **For Researchers:** Skip the cleaning phase. Access normalized, documented data ready for analysis. * **For Researchers:** Skip the cleaning phase. Access normalized, documented data ready for analysis.
@@ -26,13 +29,13 @@ flowchart TD
classDef consumer fill:#f3e5f5,stroke:#8e24aa,stroke-width:2px classDef consumer fill:#f3e5f5,stroke:#8e24aa,stroke-width:2px
subgraph "Data Sources" subgraph "Data Sources"
StatCan[("Statistical Products")]:::source StatProducts[("Statistical Products")]:::source
Orthoimagery[("Orthoimagery")]:::source Orthoimagery[("Orthoimagery")]:::source
end end
subgraph "Processing Pipeline" subgraph "Processing Pipeline"
Raw[Raw Data Ingestion<br/>CSVs, Shapefiles, ECW]:::process Raw[Raw Data Ingestion<br/>CSVs, Shapefiles, ECW]:::process
Transform[Transformation Engine<br/>Open & Closed Source]:::process Transform[Transformation Engine]:::process
Opt[Optimization]:::process Opt[Optimization]:::process
end end
@@ -43,35 +46,41 @@ flowchart TD
end end
subgraph "Distribution Infrastructure" subgraph "Distribution Infrastructure"
S3_COMPLIANT_STORAGE[S3 Compliant Storage]:::storage ObjectStorage[Object Storage]:::storage
Decentralized_Distribution[Decentralized Distribution]:::storage DecentralizedDistribution[Decentralized Distribution]:::storage
Serverless[Cloudflare Worker<br/>API & Serving]:::storage Serverless[Serverless Worker<br/>API & Serving]:::storage
end end
subgraph "Consumption / End Users" subgraph "Consumption / End Users"
DataSci[DuckDB, Python, QGIS, Jupyter]:::consumer DataSci[DuckDB, Python, QGIS, Jupyter]:::consumer
WebApps[Web Applications]:::consumer WebApps[Web Applications]:::consumer
DataSci[Python]:::consumer DataSci[Python, R, Julia]:::consumer
Systems[Systems]:::consumer Systems[Systems]:::consumer
end end
%% Relationships %% Relationships
StatCan --> Raw StatProducts --> Raw
Raw --> Transform Raw --> Transform
Transform --> Opt Transform --> Opt
Opt --> Parquet Opt --> Parquet
Opt --> PMTiles Opt --> PMTiles
Opt --> FlatGeoBuf Opt --> FlatGeoBuf
Parquet --> S3_COMPLIANT_STORAGE Parquet --> ObjectStorage
PMTiles --> S3_COMPLIANT_STORAGE PMTiles --> ObjectStorage
FlatGeoBuf --> S3_COMPLIANT_STORAGE FlatGeoBuf --> ObjectStorage
S3_COMPLIANT_STORAGE --> Decentralized_Distribution ObjectStorage --> DecentralizedDistribution
S3_COMPLIANT_STORAGE --> Serverless ObjectStorage --> Serverless
Decentralized_Distribution --> Systems DecentralizedDistribution --> Systems
Serverless --> WebApps Serverless --> WebApps
Serverless --> DataSci Serverless --> DataSci
%% Concept Annotations %% Concept Annotations
Transform -.->|"Join Spatial & Tabular"| Parquet Transform -.->|"Join Spatial & Tabular"| Parquet
PMTiles -.->|"Stream tiles"| WebApps PMTiles -.->|"Stream"| WebApps
FlatGeoBuf -.->|"Stream"|DataSci
FlatGeoBuf -.->|"Stream"|WebApps
click StatProducts "https://www.dataforcanada.org/docs/processes/statistical_products/" _blank
click Orthoimagery "https://www.dataforcanada.org/docs/processes/statistical_products/" _blank
click DecentralizedDistribution "https://www.dataforcanada.org/docs/dissemination/" _blank
``` ```
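
The relationships in the updated flowchart can be checked mechanically. The following is a minimal Python sketch, not part of the commit: it copies the solid edges from the new side of the hunks shown here into a dictionary and walks the graph to see which consumer nodes each source reaches. The names `EDGES`, `CONSUMERS`, and `reachable` are illustrative assumptions, and the dotted annotation edges are left out.

```python
from collections import deque

# Solid edges transcribed from the new side of the diff (assumption: only the
# hunks shown are considered; dotted annotation edges are omitted).
EDGES = {
    "StatProducts": ["Raw"],
    "Raw": ["Transform"],
    "Transform": ["Opt"],
    "Opt": ["Parquet", "PMTiles", "FlatGeoBuf"],
    "Parquet": ["ObjectStorage"],
    "PMTiles": ["ObjectStorage"],
    "FlatGeoBuf": ["ObjectStorage"],
    "ObjectStorage": ["DecentralizedDistribution", "Serverless"],
    "DecentralizedDistribution": ["Systems"],
    "Serverless": ["WebApps", "DataSci"],
}

# Node IDs from the "Consumption / End Users" subgraph.
CONSUMERS = {"DataSci", "WebApps", "Systems"}


def reachable(start, edges=EDGES):
    """Return every node reachable from `start` via breadth-first search."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    for source in ("StatProducts", "Orthoimagery"):
        print(source, "->", sorted(reachable(source) & CONSUMERS))
```

Under these assumptions, `StatProducts` reaches all three consumer nodes, while `Orthoimagery` has no outgoing solid edge in the lines shown, so its reachable set is empty.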