theme: seriph
layout: cover
title: Data for Canada / the Universe Background and Strategy
author: Diego Ripley
class: text-center
drawings:
  persist: false
transition: slide-up
comark: true
duration: 35min
hideInToc: true
favicon: https://www.dataforcanada.org/favicon.svg
routerMode: hash

Data for Canada / the Universe

Background and Strategy

Presented By: Diego Ripley

Date: April 10, 2026

<style> .slidev-layout.cover { background-image: url('/datafortheuniverse-background.webp') !important; background-size: cover !important; background-position: center !important; } h1, h2, p { text-shadow: 1px 1px 4px rgba(0,0,0,0.9); } </style>

layout: cover hideInToc: true

"Space is big. You just won't believe how vastly, hugely, mind-bogglingly big it is. I mean, you may think it's a long way down the road to the chemist's, but that's just peanuts to space."

Douglas Adams, The Hitchhiker's Guide to the Galaxy, #1

<style> .slidev-layout.cover { background-image: url('/dont-panic-background-01.webp') !important; background-size: cover !important; background-position: center !important; } p { text-shadow: 1px 1px 4px rgba(0,0,0,0.9); } </style>

layout: cover transition: slide-left hideInToc: true

<style> .slidev-layout.cover { background-image: url('/dont-panic-background-02.webp') !important; background-size: cover !important; background-position: center !important; } a { text-shadow: 1px 1px 4px rgba(0,0,0,0.9); } </style>

layout: two-cols-header zoom: 1.2 transition: slide-left hideInToc: true

Notes

::left::

  • Please hold questions until after the presentation

hideInToc: true transition: slide-left zoom: 0.8

Table of Contents


layout: center hideInToc: true

Guided By


layout: center transition: slide-left hideInToc: true

In Plain Words

  • Make data last as long as humanly possible, and present all perspectives, by creating efficient data and processes for long-term archival.
  • I want what I am building to be in libraries.
  • Create processes, tools, infrastructure, and datasets that empower everyday citizens, make their lives a little easier, and filter out the noise.
  • Yes, ethics is at the core of everything.

layout: center


layout: center

flowchart TD
    classDef linkNode stroke:#0000EE,color:#0000EE,stroke-width:2px;
    subgraph mirrors [Mirrors & Preservation]
        SourceCoop[Source Cooperative]
        Tigris[Tigris]
        Community[Community]
        Cloudflare
        Zenodo[Zenodo]
        InternetArchive[Internet Archive]
        Metadata[FAIR Data Catalogue]
    end

    Sources[Open Data Sources]
    Processes[Data Packages]
    Artifacts[Systems-Ready Data]
    P2P["P2P Technology"]
    
    subgraph Consumers [Consumption]
        Users[Data People & Developers]
        Systems[Systems]
    end

    %% Flow with Animations
    Sources a1@<--> Processes
    a1@{animate: true, animation: slow}
    
    Processes a2@<--> Artifacts
    a2@{animate: true, animation: slow}
    
    Artifacts a3@<--> Metadata
    a3@{animate: true, animation: fast}

    Metadata a20@<--> SourceCoop
    a20@{animate: true, animation: slow}
    Metadata a21@<--> Tigris
    a21@{animate: true, animation: fast}
    Metadata a22@<--> Community
    a22@{animate: true, animation: fast}
    Metadata a23@<--> Zenodo
    a23@{animate: true, animation: slow}
    Metadata a24@<--> Cloudflare
    a24@{animate: true, animation: fast}
    Metadata a25@<--> InternetArchive
    a25@{animate: true, animation: slow}
    
    %% Mirror Connections
    mirrors a12@<--> Consumers
    a12@{animate: true, animation: slow}
    

    %% Hint, the FAIR Data Catalogue can also be decentralized 🤯
    %%Metadata a30@<--> P2P
    %%a30@{animate: true, animation: fast}
    mirrors a9@<--> P2P
    a9@{animate: true, animation: fast}

    %% P2P Connections
    P2P a10@<--> Consumers
    a10@{animate: true, animation: fast}
    
    style Sources fill:#FFB74D,stroke:#EF6C00,color:#000000
    style Artifacts fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    %% Opera concertmaster
    style Metadata fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    class Metadata Metadata
    style Processes fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    class Processes Processes
    style SourceCoop fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    style Tigris fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    style Cloudflare fill:#FFB74D,stroke:#EF6C00,color:#000000
    style Zenodo fill:#FFB74D,stroke:#EF6C00,color:#000000
    style Community fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    style P2P fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    style InternetArchive fill:#66BB6A,stroke:#2E7D32,color:#000000
    style Users fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    style Systems fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    
    %% Click Actions
    click P2P "https://libp2p.io/" _blank
    click Tigris "https://d4c-pkgs.t3.storage.dev/" _blank
    click Sources "https://www.dataforcanada.org/#high-level-overview" _blank
    click Processes "https://www.dataforcanada.org/docs/d4c-pkgs/" _blank
    click Metadata "https://stac-utils.github.io/stac-geoparquet/latest/spec/stac-geoparquet-spec/" _blank
    click Zenodo "https://zenodo.org/communities/dataforcanada/" _blank
    click SourceCoop "https://source.coop/dataforcanada/" _blank
    click InternetArchive "https://archive.org/details/@diegoripley/uploads/" _blank

    %% APPLY STYLES TO LINKED NODES
    class Sources linkNode

layout: center level: 1

Solutions


layout: center level: 2

Assess

Mapping data portals and all data assets in Canada.


layout: iframe-unscaled url: https://directory.opendatasociety.ca/directory level: 2


layout: center level: 2

Rank

  • Rank datasets according to their impact on Canadians (e.g., COVID-19 deaths by dissemination block).

layout: iframe-unscaled url: https://dataindex.us/collections/ level: 2


layout: iframe-unscaled url: ./98-301-x2021001-eng.pdf level: 2


layout: center level: 2

Archive

  • Download datasets and convert them into efficient long-term storage file formats.
  • Make them available to the community via something like Backblaze B2 Overdrive, which has a throughput ranging from 100 Gbps up to 1 Tbps (minimum 1 PB commitment).
    • $15 USD per TB per month
    • At the 1 PB minimum: $15K USD per month, $180K USD per year
  • Assign unique identifiers to the datasets.
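The storage figures above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the $15 USD per TB price is billed monthly and that 1 PB is counted as 1,000 TB; the function and constant names are illustrative only.

```python
# Figures taken from the slide; the monthly billing period is an assumption.
PRICE_USD_PER_TB_MONTH = 15
MIN_COMMITMENT_TB = 1_000  # 1 PB in decimal units (1 PB = 1,000 TB)


def storage_cost(tb: float, price_per_tb_month: float = PRICE_USD_PER_TB_MONTH):
    """Return (monthly, yearly) storage cost in USD for `tb` terabytes."""
    monthly = tb * price_per_tb_month
    yearly = monthly * 12
    return monthly, yearly


monthly, yearly = storage_cost(MIN_COMMITMENT_TB)
print(f"${monthly:,.0f} per month, ${yearly:,.0f} per year")
```

Under those assumptions the minimum commitment works out to the $15K per month and $180K per year quoted on the slide.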

layout: center level: 2

  • Download them with something like geoparquet-io, which can download from Esri data portals and WFS servers. It supports both vector data and raster data.

layout: full level: 2 zoom: 0.9

File Formats

flowchart TD
    classDef high fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    classDef med fill:#FBC02D,stroke:#F9A825,color:#000000
    classDef low fill:#66BB6A,stroke:#2E7D32,color:#000000
    classDef medOrange fill:#FFCC80,stroke:#FB8C00,color:#000000
    classDef darkOrange fill:#EF6C00,stroke:#E65100,color:#000000
    classDef highLight fill:#FFCDD2,stroke:#E57373,color:#000000
    classDef white fill:#fff,stroke:#388E3C,color:#000000

    subgraph sot [Long-Term Storage]
        Parquet["Parquet"]:::highLight
        Lance:::high
        FlatCityBuf:::high
        Zarr:::highLight
        GeoTIFF:::medOrange
        JPEGXL["JPEG XL"]:::highLight
        AV1:::highLight
        FAIRCat["FAIR Data Catalogue"]:::high
    end

    FlatGeoBuf:::med

    subgraph vt [Vector Tiles]
        VectorTiles["Mapbox Vector Tiles"]:::low
        NextGenVT["Next-Gen Vector Tiles"]:::high
        GLB["glTF GLB"]:::high
    end

    subgraph visuals [Imagery]
        AVIF:::high
        WebP:::medOrange
        JPG:::low
        PNG:::low
    end

    subgraph pkg [Portable Databases]
        PMTiles:::medOrange
        SQLite:::darkOrange
    end

    subgraph ent [Enterprise]
        FileGDB["File Geodatabase"]:::white
    end

    sot <--> FlatGeoBuf
    FlatGeoBuf --> vt
    sot <--> visuals
    vt <--> pkg
    visuals <--> pkg
    sot <--> ent
    visuals --> ent


    style sot fill:#EF9A9A,stroke:#C62828,color:#000000
    style vt fill:#FBC02D,stroke:#F9A825,color:#000000
    style visuals fill:#FBC02D,stroke:#F9A825,color:#000000
    style pkg fill:#FFB74D,stroke:#EF6C00,color:#000000
    style ent fill:#66BB6A,stroke:#2E7D32,color:#000000

    click FlatCityBuf "https://github.com/cityjson/flatcitybuf" _blank
    click Parquet "https://github.com/apache/parquet-format/" _blank
    click FlatGeoBuf "https://flatgeobuf.org/" _blank
    click SQLite "https://www.geopackage.org/" _blank
    click FileGDB "https://gdal.org/en/stable/drivers/vector/openfilegdb.html" _blank
    click VectorTiles "https://github.com/mapbox/vector-tile-spec/" _blank
    click NextGenVT "https://github.com/maplibre/maplibre-tile-spec/" _blank
    click GLB "https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html#glb-file-format-specification" _blank
    click Lance "https://docs.lancedb.com/lance" _blank
    click GeoTIFF "https://cogeo.org/" _blank
    click Zarr "https://github.com/zarr-developers/geozarr-spec/" _blank
    click WebP "https://developers.google.com/speed/webp/" _blank
    click PMTiles "https://github.com/protomaps/PMTiles/blob/main/spec/v3/spec.md" _blank
    click JPEGXL "https://jpeg.org/jpegxl/" _blank
    click AV1 "https://aomedia.org/specifications/av1/" _blank
    click FAIRCat "https://stac-utils.github.io/stac-geoparquet/latest/spec/stac-geoparquet-spec/" _blank

layout: center level: 2

Standard Interfaces


layout: center level: 2 hideInToc: true

S3


layout: center level: 2 hideInToc: true

P2P

(BitTorrent, IPFS, libp2p)

SETI@home

layout: center level: 2 hideInToc: true

Other

SSH, etc.


layout: center level: 2 hideInToc: true

Discrete Global Grid Systems (DGGS)


layout: iframe-unscaled level: 2 hideInToc: true url: https://aidggs-pilot.hartis.org/


layout: center level: 2

Unique Identifiers

  • ARKs are open, mainstream, non-paywalled, decentralized persistent identifiers that you can start creating in under 48 hours. They identify anything digital, physical, or abstract.
  • Archival Resource Key (ARK) - Spec, Overview
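To make the shape of an ARK concrete, here is a minimal parsing sketch. It assumes the common `ark:/NAAN/Name[Qualifier]` layout (with or without the slash after `ark:`) and uses a deliberately simplified pattern, not the full grammar from the specification. The identifier in the usage note is a made-up example, and `parse_ark` and `Ark` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Ark(NamedTuple):
    naan: str                 # Name Assigning Authority Number
    name: str                 # identifier assigned by that authority
    qualifier: Optional[str]  # optional sub-part or variant, if present


# Simplified pattern (an assumption, not the full ARK grammar):
# accepts "ark:/NAAN/Name" and "ark:NAAN/Name", optionally inside a URL.
_ARK_RE = re.compile(
    r"ark:/?(?P<naan>[0-9a-z]+)/(?P<name>[^/.?#]+)(?P<qualifier>[/.][^?#]*)?",
    re.IGNORECASE,
)


def parse_ark(text: str) -> Ark:
    """Extract the NAAN, name, and optional qualifier from an ARK string."""
    match = _ARK_RE.search(text)
    if match is None:
        raise ValueError(f"no ARK found in {text!r}")
    return Ark(match["naan"], match["name"], match["qualifier"])
```

For example, `parse_ark("https://n2t.net/ark:/12345/x54xz321/s3/f8.05v.tiff")` splits the string into the NAAN `12345`, the name `x54xz321`, and the qualifier `/s3/f8.05v.tiff`.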

layout: center level: 2

Ledger

  • Unique identifier
  • Added/Updated/Deleted
  • File hash
  • Location
  • Reputation - across time by stakeholders
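One way to picture a ledger record with the fields listed above is the sketch below. Every name in it is an assumption made for illustration: the choice of SHA-256 for the file hash, the flat stakeholder-to-score mapping for reputation, and the example identifier and location are not taken from the project.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Action(Enum):
    ADDED = "added"
    UPDATED = "updated"
    DELETED = "deleted"


@dataclass(frozen=True)
class LedgerEntry:
    identifier: str        # unique identifier, e.g. an ARK
    action: Action         # added / updated / deleted
    file_hash: str         # hex digest of the file contents
    location: str          # where this copy of the file lives
    recorded_at: datetime  # when the entry was written
    # Simplification: one score per stakeholder; tracking reputation
    # across time would need a history per stakeholder instead.
    reputation: dict = field(default_factory=dict)


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of `data` (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


entry = LedgerEntry(
    identifier="ark:/12345/example",                 # hypothetical identifier
    action=Action.ADDED,
    file_hash=sha256_hex(b"example dataset bytes"),
    location="s3://example-bucket/example.parquet",  # hypothetical location
    recorded_at=datetime.now(timezone.utc),
    reputation={"example-stakeholder": 1.0},
)
print(entry.action.value, entry.file_hash[:12])
```

Freezing the dataclass stops an entry's fields from being reassigned after it is written, which suits an append-only ledger where a change is a new entry, not an edit.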

layout: center level: 2

Data Packages

flowchart TD
    subgraph ds [Data Sources]
        Statistical@{ shape: lean-l}
        Foundation@{ shape: lean-l}
        EnvClimate@{ shape: lean-l, label: "Environment, Climate, & Health"}
        Orthoimagery@{ shape: lean-l}
        FieldImagery@{ shape: lean-l, label: "Field Imagery"}
        Elevation@{ shape: lean-l}
        WebCorpus@{ shape: lean-l, label: "Web Corpus"}
    end

    DataPkgs@{ shape: rect, label: "Data Packages"}

    Statistical e1@<--> DataPkgs
    e1@{animate: true, animation: slow}
    Foundation e2@<--> DataPkgs
    e2@{animate: true, animation: slow}
    EnvClimate e4@<--> DataPkgs
    e4@{animate: true, animation: fast}
    Orthoimagery e3@<--> DataPkgs
    e3@{animate: true, animation: slow}
    FieldImagery e7@<--> DataPkgs
    e7@{animate: true, animation: fast}
    Elevation e5@<--> DataPkgs
    e5@{animate: true, animation: slow}
    WebCorpus e6@<--> DataPkgs
    e6@{animate: true, animation: fast}

    style EnvClimate fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    style Orthoimagery fill:#FBC02D,stroke:#F9A825,color:#000000
    style FieldImagery fill:#FBC02D,stroke:#F9A825,color:#000000
    style WebCorpus fill:#66BB6A,stroke:#2E7D32,color:#000000
    style Elevation fill:#66BB6A,stroke:#2E7D32,color:#000000
    style Statistical fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    style Foundation fill:#B71C1C,stroke:#7F0000,color:#FFFFFF
    style DataPkgs fill:#B71C1C,stroke:#7F0000,color:#FFFFFF

    classDef linkNode stroke:#333333,color:#333333,stroke-width:1.5px
    class FieldImagery linkNode

    click DataPkgs "https://github.com/dataforcanada/d4c-pkgs" _blank
    click Foundation "https://github.com/dataforcanada/d4c-datapkg-foundation" _blank
    click Statistical "https://github.com/dataforcanada/d4c-datapkg-statistical" _blank
    click Orthoimagery "https://github.com/dataforcanada/d4c-datapkg-orthoimagery" _blank
    click FieldImagery "https://github.com/dataforcanada/d4c-datapkg-field-imagery" _blank
    click EnvClimate "https://github.com/dataforcanada/d4c-datapkg-environment-climate-health" _blank
    click Elevation "https://github.com/dataforcanada/d4c-datapkg-elevation" _blank
    click WebCorpus "https://github.com/dataforcanada/d4c-datapkg-web-corpus" _blank

layout: center hideInToc: true

All datasets can be downloaded at https://source.coop/dataforcanada


layout: center level: 3

Statistical

  • This is how our governments see the world and what they are supposed to use when making decisions.
  • We need to request that statistical data be tied to individual authors, so that we can start to trust institutions. If someone's credibility in the community becomes a factor, I believe that individuals will fight to keep it.
  • Open processes.

layout: center level: 3


layout: center level: 4 hideInToc: true

Statistical Tables

  • I processed 7,918 Statistics Canada data tables in 2025.
  • Started with 3,314.57 GB of CSVs and reduced them to 25.73 GB.
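The size reduction can be expressed as a ratio and a percentage. A minimal sketch using only the two sizes quoted above; the variable names are illustrative.

```python
# Sizes quoted on the slide, in gigabytes.
csv_gb = 3314.57
processed_gb = 25.73

# How many times smaller the processed output is than the CSV input.
ratio = csv_gb / processed_gb

# Share of the original size that was eliminated.
reduction_pct = (1 - processed_gb / csv_gb) * 100

print(f"{ratio:.1f}x smaller, {reduction_pct:.2f}% reduction")
```

That is roughly a 129-fold reduction, or a little over 99% of the original size removed.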

layout: iframe-unscaled hideInToc: true level: 4 url: https://www.diegoripley.ca/blog/2025/what-i-learned-from-processing-all-statcan-tables/


layout: center hideInToc: true level: 4

https://source.coop/dataforcanada/d4c-datapkg-statistical/processed/tables


layout: center level: 4 hideInToc: true

Census Data


layout: iframe-unscaled hideInToc: true level: 4 url: https://docs.google.com/spreadsheets/d/14FmFGaqU7EDZ19zRZXBNX4La4VeIDXa7kbgP_g7ai9s/edit?usp=sharing


layout: iframe-unscaled hideInToc: true level: 4 url: https://static-01.dataforcanada.org/processed/ca_statcan_2021A000011124_d4c-datapkg-statistical_census_pop_dissemination_areas_digital_2021_v0.1.0-beta/#12.2/45.4294/-75.74374/0/60


layout: iframe-unscaled hideInToc: true level: 4 url: https://static-01.dataforcanada.org/processed/ca_statcan_2021A000011124_d4c-datapkg-statistical_census_pop_federal_electoral_districts_2013_representation_order_digital_2021_v0.1.0-beta/#4.93/56.91/-111.54


layout: center

2021 Census Data


layout: center level: 3

Foundation

  • Minimum information that a civilization needs to start from scratch.
  • Buildings, roads, address points.
  • Placenames

layout: iframe-unscaled hideInToc: true level: 4 url: https://pmtiles.io/#url=https%3A%2F%2Fdata.source.coop%2Fdataforcanada%2Fd4c-datapkg-foundation%2Fprocessed%2Fca_statcan_2021A000011124_d4c-datapkg-foundation_open_database_of_buildings_2025-04-15_v0.1.0-beta.pmtiles&map=15.17/45.402295/-75.691511


layout: iframe-unscaled hideInToc: true level: 4 url: https://pmtiles.io/#url=https%3A%2F%2Fdata.source.coop%2Fdataforcanada%2Fd4c-datapkg-foundation%2Fprocessed%2Fca_statcan_2021A000011124_d4c-datapkg-foundation_road_network_2021_v0.1.0-beta.pmtiles&map=15.17/45.402295/-75.691511


layout: iframe-unscaled hideInToc: true level: 4 url: https://pmtiles.io/#url=https%3A%2F%2Fdata.source.coop%2Fdataforcanada%2Fd4c-datapkg-foundation%2Fprocessed%2Fca_statcan_2021A000011124_d4c-datapkg-foundation_national_address_register_2025-07_v0.1.0-beta.pmtiles&map=15.17/45.402295/-75.691511
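The viewer links on these slides follow a visible pattern: the archive URL is percent-encoded into the `url` parameter of the fragment, followed by a `map=zoom/lat/lon` parameter. The sketch below reproduces that pattern. The fragment layout is inferred from the links above and is not taken from pmtiles.io documentation, and `pmtiles_viewer_url` is a hypothetical helper.

```python
from urllib.parse import quote


def pmtiles_viewer_url(archive_url: str, zoom: float, lat: float, lon: float) -> str:
    """Build a pmtiles.io viewer link for a remote PMTiles archive.

    The fragment layout (#url=...&map=zoom/lat/lon) is inferred from the
    links used in this deck and should be treated as an assumption.
    """
    # safe="" also encodes ":" and "/" so the whole URL fits in one parameter.
    encoded = quote(archive_url, safe="")
    return f"https://pmtiles.io/#url={encoded}&map={zoom}/{lat}/{lon}"


road_network = (
    "https://data.source.coop/dataforcanada/d4c-datapkg-foundation/processed/"
    "ca_statcan_2021A000011124_d4c-datapkg-foundation_road_network_2021_v0.1.0-beta.pmtiles"
)
print(pmtiles_viewer_url(road_network, 15.17, 45.402295, -75.691511))
```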


layout: center level: 3

Environment, Climate and Health


layout: center hideInToc: true level: 4

Source, Internet Archive Snapshot


layout: center hideInToc: true level: 4


layout: center hideInToc: true level: 4


layout: center hideInToc: true level: 4


layout: center hideInToc: true level: 4


layout: center hideInToc: true level: 4

And now citizens can all create their own air quality stations.

See opensensor.space for more information.


layout: center level: 3

Orthoimagery


layout: center level: 3

https://github.com/dataforcanada/d4c-datapkg-orthoimagery/issues


layout: iframe-unscaled url: https://pmtiles.io/#url=https%3A%2F%2Fdata.source.coop%2Fdataforcanada%2Fd4c-datapkg-orthoimagery%2Fprocessed%2Fca-on_province_of_ontario-2024A000235_drape_eastern_ontario_orthoimagery_2024_16cm_v0.1.0-beta.pmtiles&map=8.02/45.196/-76.357


layout: center level: 3 hideInToc: true

  • Currently working on downloading 100 TB of orthoimagery for Quebec, Canada.

layout: center level: 3

Web Corpus

Source


layout: center level: 3

Field Imagery

  • Latitude, longitude, heading
  • Can be audio, video, etc.
  • Any device (e.g., drone, webcam)
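A field imagery record with these attributes could be sketched as a small data class that serializes to a GeoJSON point feature. The field names, the validation ranges, and the example values are all assumptions made for illustration, not the project's schema.

```python
import json
from dataclasses import dataclass


@dataclass
class FieldCapture:
    latitude: float   # degrees, -90 to 90
    longitude: float  # degrees, -180 to 180
    heading: float    # degrees clockwise from north, 0 <= heading < 360 (assumed)
    media_type: str   # e.g. "video/mp4", "audio/flac"
    device: str       # e.g. "drone", "webcam"

    def __post_init__(self) -> None:
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude out of range")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude out of range")
        if not 0 <= self.heading < 360:
            raise ValueError("heading out of range")

    def to_geojson(self) -> dict:
        """Return a GeoJSON Feature; GeoJSON orders coordinates as [lon, lat]."""
        return {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [self.longitude, self.latitude],
            },
            "properties": {
                "heading": self.heading,
                "media_type": self.media_type,
                "device": self.device,
            },
        }


capture = FieldCapture(45.402295, -75.691511, 90.0, "video/mp4", "drone")
print(json.dumps(capture.to_geojson(), indent=2))
```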

layout: two-cols-header level: 3 hideInToc: true

::left::

Toronto Eglinton LRT

::right::

Your browser does not support videos. You may download it here.


layout: center

Your browser does not support videos. You may download it here.


layout: center

Download


layout: center

Your browser does not support videos. You may download it here.


layout: center

Download


layout: center level: 2

Communicate to Your Audience and Create Trust


layout: center hideInToc: true

Collaboration

  • In essence: speak people's language. A scientist might be interested in the facts; other stakeholders care about other things.

layout: center

Matrix Bridges


layout: center hideInToc: true

Questions?

Main Website · GitHub