---
title: Welcome to Data for Canada
toc: false
---

## Mission

Data for Canada exists to bridge the gap between open data availability and data usability. We curate, clean, and re-engineer high-value Canadian datasets into high-performance, analysis-ready formats for researchers, developers, and systems.

## The Problem

Canada publishes an enormous amount of open data, from foundational road networks to federal census statistics and orthoimagery. However, these datasets are often locked in legacy formats, scattered across fragmented portals, or structured in ways that require significant engineering effort to normalize. For a researcher or system developer, "time-to-insight" is often bottlenecked by data preparation.

## The Solution

We act as the transformation layer. We aggregate datasets with permissive licenses and process them into "digestible" standards optimized for modern downstream applications.

* **For Data Engineers, Researchers/Scientists, and Developers:** Skip the cleaning phase. Access normalized, documented data ready for analysis.
* **For Systems:** Standardized data structures designed to feed directly into pipelines, data warehouses, and downstream services.

**Our Stewardship:** Data for Canada takes ownership of the datasets we create, from start to finish. We ensure that data structures remain consistent, allowing for reliable analysis across **time and space**.

## What Guides Us

We prioritize our work in a utilitarian manner, aiming to provide the greatest good to the greatest number of people, though we remain open to making exceptions where necessary.

Our approach is informed by the following:

* [Guidance on assessing readiness to manage data according to Findable, Accessible, Interoperable, Reusable (FAIR) principles](https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/information-management/guidance-assessing-readiness-manage-data-according-findable-accessible-interoperable-reusable-principles.html)
* [GC White Paper: Data Sovereignty and Public Cloud](https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/cloud-services/digital-sovereignty/gc-white-paper-data-sovereignty-public-cloud.html)

## High-Level Overview

```mermaid
flowchart TD
    subgraph ds [Data Sources]
        Statistical@{ shape: lean-l}
        Foundation@{ shape: lean-l}
        Orthoimagery@{ shape: lean-l}
        FieldImagery@{ shape: lean-l, label: "Field Imagery"}
        EnvironmentClimate@{ shape: lean-l, label: "Environmental & Climate"}
        Elevation@{ shape: lean-l}
        WebCorpus@{ shape: lean-l, label: "Web Corpus"}
    end
    subgraph pp [Processing Pipeline]
        Raw@{ shape: rect, label: "Raw Data Ingestion"}
        Transform@{ shape: rect, label: "Transform and Optimize"}
    end
    subgraph df [Dissemination Formats]
        Parquet@{ shape: lean-l}
        FlatGeoBuf@{ shape: lean-l}
        MVT@{ shape: lean-l}
        MLT@{ shape: lean-l}
        PMTiles@{ shape: lean-l}
        COG@{ shape: lean-l}
        Zarr@{ shape: lean-l}
        WebP@{ shape: lean-l}
        JPEGXL@{ shape: lean-l, label: "JPEG XL"}
        AV1@{ shape: lean-l, label: "AV1"}
    end
    subgraph di [Distribution Infrastructure]
        ObjectStorage@{ shape: bow-rect, label: "Object Storage"}
        Metadata@{ shape: rect}
        HTTP@{ shape: rect, label: "Static Files"}
        DecentralizedDistribution@{ shape: rect, label: "Decentralized Distribution"}
    end
    subgraph ei [Experimental Infrastructure]
        GeoSpatialServices@{ shape: rect, label: "Geospatial Services"}
        %%Martin@{ shape: rect}
        %%GeoServer@{ shape: rect}
        %%ZOOProject@{ shape: rect, label: "ZOO-Project"}
        %%BBOXServer@{ shape: rect, label: "BBOX Server"}
        %%Panoramax@{ shape: rect}
        %%Pelias@{ shape: rect}
    end
    subgraph "Consumption"
        DataSci@{ shape: rect, label: "Researchers & Developers"}
        Systems@{ shape: rect, label: "Systems"}
    end

    %% Relationships
    Statistical a1@--> Raw
    a1@{animate: true, animation: slow}
    Foundation a2@--> Raw
    a2@{animate: true, animation: slow}
    Orthoimagery a3@--> Raw
    a3@{animate: true, animation: slow}
    FieldImagery a4@--> Raw
    a4@{animate: true, animation: fast}
    EnvironmentClimate a5@--> Raw
    a5@{animate: true, animation: fast}
    Elevation a6@--> Raw
    a6@{animate: true, animation: slow}
    WebCorpus a7@--> Raw
    a7@{animate: true, animation: fast}
    Raw a8@--> Transform
    a8@{animate: true, animation: slow}
    Transform a9@--> df
    a9@{animate: true, animation: slow}
    Parquet a10@--> FlatGeoBuf
    a10@{animate: true, animation: slow}
    FlatGeoBuf a11@--> MVT
    a11@{animate: true, animation: slow}
    FlatGeoBuf a91@--> MLT
    a91@{animate: true, animation: slow}
    MVT a90@--> PMTiles
    a90@{animate: true, animation: slow}
    MLT a92@--> PMTiles
    a92@{animate: true, animation: slow}
    Zarr a12@--> WebP
    a12@{animate: true, animation: slow}
    df a13@--> di
    a13@{animate: true, animation: slow}
    COG a14@--> WebP
    a14@{animate: true, animation: slow}
    WebP a93@--> PMTiles
    a93@{animate: true, animation: slow}
    ObjectStorage a15@--> Metadata
    a15@{animate: true, animation: slow}
    Metadata a16@--> HTTP
    a16@{animate: true, animation: slow}
    HTTP a17@--> ei
    a17@{animate: true, animation: slow}
    HTTP a18@--> DecentralizedDistribution
    a18@{animate: true, animation: slow}
    HTTP a19@--> DataSci
    a19@{animate: true, animation: slow}
    DecentralizedDistribution a20@--> Systems
    a20@{animate: true, animation: fast}
    DecentralizedDistribution a21@--> DataSci
    a21@{animate: true, animation: fast}
    Systems a22@--> DataSci
    a22@{animate: true, animation: fast}
    ei a23@--> DataSci
    a23@{animate: true, animation: slow}

    %% URLs
    click Foundation "https://github.com/dataforcanada/process-foundation-labs/" _blank
    click Statistical "https://github.com/dataforcanada/process-statistical-labs/" _blank
    click Orthoimagery "https://github.com/dataforcanada/process-orthoimagery-labs/" _blank
    click FieldImagery "https://github.com/dataforcanada/process-field-imagery-labs/" _blank
    click EnvironmentClimate "https://github.com/dataforcanada/process-environmental-climate-labs/" _blank
    click Elevation "https://www.dataforcanada.org/docs/dissemination/" _blank
    click WebCorpus "https://github.com/dataforcanada/process-web-corpus-labs/" _blank
    click Parquet "https://github.com/apache/parquet-format/" _blank
    click FlatGeoBuf "https://flatgeobuf.org/" _blank
    click MVT "https://github.com/mapbox/vector-tile-spec/" _blank
    click MLT "https://github.com/maplibre/maplibre-tile-spec/" _blank
    click COG "https://cogeo.org/" _blank
    click Zarr "https://github.com/zarr-developers/geozarr-spec/" _blank
    click WebP "https://developers.google.com/speed/webp/" _blank
    click PMTiles "https://github.com/protomaps/PMTiles/blob/main/spec/v3/spec.md" _blank
    click JPEGXL "https://jpeg.org/jpegxl/" _blank
    click AV1 "https://aomedia.org/specifications/av1/" _blank
    click DecentralizedDistribution "https://www.dataforcanada.org/docs/dissemination/" _blank
    click Metadata "https://stac-utils.github.io/stac-geoparquet/latest/spec/stac-geoparquet-spec/" _blank
    click GeoSpatialServices "https://github.com/dataforcanada/geo-services-labs/" _blank
    click Martin "https://martin.maplibre.org/" _blank
    click GeoServer "https://geoserver.org/" _blank
    click ZOOProject "https://zoo-project.org/" _blank
    click BBOXServer "https://www.bbox.earth/" _blank
    click Panoramax "https://gitlab.com/panoramax" _blank
    click Pelias "https://pelias.io" _blank
```
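
As a rough illustration of the format relationships drawn in the diagram, the sketch below encodes the arrows between dissemination formats as a small graph and lists the chains that end in a given format. Reading each arrow as "can be derived into" is an assumption about the diagram's intent. The names `DERIVATIONS` and `derivation_paths` are hypothetical and not part of any Data for Canada tooling.

```python
from collections import deque

# Format-to-format arrows copied from the "Dissemination Formats" subgraph.
# Assumption: each arrow means the left format can be derived into the right one.
DERIVATIONS = {
    "Parquet": ["FlatGeoBuf"],
    "FlatGeoBuf": ["MVT", "MLT"],
    "MVT": ["PMTiles"],
    "MLT": ["PMTiles"],
    "Zarr": ["WebP"],
    "COG": ["WebP"],
    "WebP": ["PMTiles"],
}


def derivation_paths(source, target):
    """Return every chain of formats leading from source to target."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            paths.append(path)
            continue
        # Extend the chain by one arrow; formats with no outgoing arrows end here.
        for next_format in DERIVATIONS.get(path[-1], []):
            queue.append(path + [next_format])
    return paths


if __name__ == "__main__":
    for path in derivation_paths("Parquet", "PMTiles"):
        print(" -> ".join(path))
```

Under this reading, vector data reaches PMTiles through either MVT or MLT, while raster data reaches it through WebP. JPEG XL and AV1 have no arrows between formats in the diagram, so they yield no chains here.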