Diego Ripley
|
dff7ea6fad
|
Fix mistake in creating pop_ctr Census of Population file. Select distinct pop_ctr_dguid before joining to Census of Pop attribute data
|
2025-09-18 18:52:45 +00:00 |
|
Diego Ripley
|
1a7ea7e40a
|
Update GDAL, install some other Python packages, and change port for postgres
|
2025-09-16 18:18:55 +00:00 |
|
Diego Ripley
|
901d051567
|
Made modifications and processed CMA for 2021 Census of Population
|
2025-09-13 15:13:29 +00:00 |
|
Diego Ripley
|
f5a2831cf6
|
Add code that was used for https://www.diegoripley.ca/files/census_of_population_vector_tiles_subset_august_12_2025/
|
2025-08-10 10:02:41 -04:00 |
|
Diego Ripley
|
bc9e7b5f8c
|
Made improvements
|
2025-08-09 17:02:35 +00:00 |
|
Diego Ripley
|
bd246e297a
|
2021 Census of Population for capital cities in Canada, and Toronto, at the Dissemination Area (DA) level
|
2025-08-08 13:24:31 -04:00 |
|
Diego Ripley
|
7d33ab4587
|
Playing around with uploading data to Zenodo
|
2025-07-10 17:04:28 +00:00 |
|
Diego Ripley
|
d38fb18de1
|
Add Census of Population 2021 example that describes the structure of the data
|
2025-07-10 16:56:42 +00:00 |
|
Diego Ripley
|
6b42a80529
|
Expand example to country
|
2025-07-10 16:11:11 +00:00 |
|
Diego Ripley
|
7462ab15bf
|
Rename presentation example
|
2025-07-10 12:09:06 -04:00 |
|
Diego Ripley
|
2c22c4fc10
|
Add example for presentation
|
2025-07-10 11:19:26 -04:00 |
|
Diego Ripley
|
aafb10b261
|
Add image showing download speed of torrent
|
2025-06-28 21:36:24 +00:00 |
|
Diego Ripley
|
bf653b7160
|
Add s2maps torrent example
|
2025-06-28 17:17:25 -04:00 |
|
Diego Ripley
|
ed319c2e92
|
Create torrent of a dataset. Has HTTP seed URLs that work (when created through py3createtorrent). I am able to max out my internet connection when downloading the torrent.
|
2025-06-28 21:08:05 +00:00 |
|
Diego Ripley
|
4d8fb1e7e9
|
Download datasets using rclone. HTTP end-point is https://data-01.dataforcanada.org
|
2025-06-27 09:47:31 -04:00 |
|
Diego Ripley
|
6ad2e2c4d6
|
Scraping the table names from https://www150.statcan.gc.ca/n1/en/type/data
Will compare against the productIds available at https://www150.statcan.gc.ca/t1/wds/rest/getAllCubesListLite
|
2025-06-27 09:12:08 -04:00 |
|
Diego Ripley
|
b88a2272b4
|
Clean Dockerfile
|
2025-06-27 08:49:32 -04:00 |
|
Diego Ripley
|
79a37a3afc
|
Some fixes for ubuntu user
|
2025-06-27 08:47:11 -04:00 |
|
Diego Ripley
|
2f346bfa4c
|
Fix #4. Needed to mount ~/.ssh and make sure that it has the same UID/GID as the host user
|
2025-06-26 15:14:33 +00:00 |
|
Diego Ripley
|
a55e1d325d
|
Made changes
|
2025-06-26 13:35:37 +00:00 |
|
Diego Ripley
|
4ed5fb4bbb
|
Add DuckDB example for duplicate column name
|
2025-06-25 15:38:12 +00:00 |
|
Diego Ripley
|
b71a7b326e
|
DuckDB issue with duplicate column names (e.g. 'Value' and 'VALUE' are treated the same)
|
2025-06-25 15:30:36 +00:00 |
|
Diego Ripley
|
e929850d4a
|
Finish comment on issue with Value and VALUE columns being treated the same by DuckDB
|
2025-06-21 18:03:29 +00:00 |
|
Diego Ripley
|
8875722d10
|
Made changes to processing of data tables
|
2025-06-21 18:01:16 +00:00 |
|
Diego Ripley
|
7c8211cb5f
|
Found some issues with the output parquet files
|
2025-06-21 05:26:50 +00:00 |
|
Diego Ripley
|
887291d2f7
|
Read all DGUIDs from subset parquet output (100,000 records each)
|
2025-06-21 00:54:26 -04:00 |
|
Diego Ripley
|
72ca6c87e1
|
Made changes
|
2025-06-20 17:32:01 -04:00 |
|
Diego Ripley
|
5a95616b3c
|
Calculate CSV file size by viewing inside of zip file
|
2025-06-20 16:01:20 -04:00 |
|
Diego Ripley
|
e836363cd1
|
Had to optimize the code. Leaving it outside of function for now in case I need to continue working on it
|
2025-06-20 16:00:51 -04:00 |
|
Diego Ripley
|
f6d88c5fd0
|
Continue work on processing data tables
|
2025-06-19 15:58:30 -04:00 |
|
Diego Ripley
|
ab8f40c708
|
Keeping track of processed files in case processing crashes and I have to restart again
|
2025-06-19 11:46:31 -04:00 |
|
Diego Ripley
|
faa63451ab
|
Experiment with Jupyter notebook on downloading and processing statcan cubes
|
2025-06-18 21:26:51 +00:00 |
|
Diego Ripley
|
c0899080f4
|
Remove scratch files after processing. Was running out of space
|
2025-06-18 09:26:18 -04:00 |
|
Diego Ripley
|
f85ce79ff2
|
Add polars to Dockerfile
|
2025-06-17 21:19:42 +00:00 |
|
Diego Ripley
|
ea603f2914
|
Convert statcan CSV into parquet
|
2025-06-17 20:46:24 +00:00 |
|
Diego Ripley
|
4d16ef8232
|
Process 2024-06 national address register
|
2025-06-07 15:39:28 +00:00 |
|
Diego Ripley
|
b3c5f8767f
|
Move export of 2021 geographic boundaries
|
2025-06-06 22:51:49 +00:00 |
|
Diego Ripley
|
5e26ec282e
|
Process 2024-12 national address register. Still need to make some improvements
|
2025-06-06 22:49:47 +00:00 |
|
Diego Ripley
|
2350d6d8d7
|
Update README files
|
2025-06-06 11:20:41 +00:00 |
|
Diego Ripley
|
ce999a3a4d
|
Remove unused code
|
2025-06-04 17:02:21 +00:00 |
|
Diego Ripley
|
3565b7c5a4
|
Update DuckDB lonboard example
|
2025-06-04 17:01:04 +00:00 |
|
Diego Ripley
|
fad884efc8
|
Create 2021 cartographic boundary files
|
2025-06-04 16:36:45 +00:00 |
|
Diego Ripley
|
13a86dc3dc
|
Add 2006, 2011, 2016 cartographic boundary files
|
2025-06-03 14:34:53 +00:00 |
|
Diego Ripley
|
0203e04b45
|
Add 2021 cartographic boundary files
|
2025-06-03 13:33:03 +00:00 |
|
Diego Ripley
|
23e3133118
|
Add 2021 cartographic boundary files
|
2025-06-02 22:03:49 +00:00 |
|
Diego Ripley
|
5a188a469c
|
Update README
|
2025-06-02 14:13:58 +00:00 |
|
Diego Ripley
|
7e0b001075
|
Enable downloading of 2021 hydro
|
2025-06-02 13:56:50 +00:00 |
|
Diego Ripley
|
4eae3622d2
|
Update DuckDB lonboard example
|
2025-05-30 22:30:35 +00:00 |
|
Diego Ripley
|
34d2c50046
|
Update DuckDB lonboard example
|
2025-05-30 17:29:58 +00:00 |
|
Diego Ripley
|
bd7e8ee9f4
|
Should have used count_total_4 in calculation instead of count_total_1
|
2025-05-30 16:15:27 +00:00 |
|