Diego Ripley | a55e1d325d | Made changes | 2025-06-26 13:35:37 +00:00
Diego Ripley | 4ed5fb4bbb | Add DuckDB example for duplicate column name | 2025-06-25 15:38:12 +00:00
Diego Ripley | b71a7b326e | DuckDB issue with duplicate column names (ex. 'Value' and 'VALUE' are treated the same) | 2025-06-25 15:30:36 +00:00
Diego Ripley | e929850d4a | Finish comment on issue with Value and VALUE columns being treated the same by DuckDB | 2025-06-21 18:03:29 +00:00
Diego Ripley | 8875722d10 | Made changes to processing of data tables | 2025-06-21 18:01:16 +00:00
Diego Ripley | 7c8211cb5f | Found some issues with the output parquet files | 2025-06-21 05:26:50 +00:00
Diego Ripley | 887291d2f7 | Read all DGUIDs from subset parquet output (100,000 records each) | 2025-06-21 00:54:26 -04:00
Diego Ripley | 72ca6c87e1 | Made changes | 2025-06-20 17:32:01 -04:00
Diego Ripley | 5a95616b3c | Calculate CSV file size by viewing inside of zip file | 2025-06-20 16:01:20 -04:00
Diego Ripley | e836363cd1 | Had to optimize the code. Leaving it outside of function for now in case I need to continue working on it | 2025-06-20 16:00:51 -04:00
Diego Ripley | f6d88c5fd0 | Continue work on processing data tables | 2025-06-19 15:58:30 -04:00
Diego Ripley | ab8f40c708 | Keeping track of processed files in case processing crashes and I have to restart again | 2025-06-19 11:46:31 -04:00
Diego Ripley | faa63451ab | Experiment with Jupyter notebook on downloading and processing statcan cubes | 2025-06-18 21:26:51 +00:00
Diego Ripley | c0899080f4 | Remove scratch files after processing. Was running out of space | 2025-06-18 09:26:18 -04:00
Diego Ripley | ea603f2914 | Convert statcan CSV into parquet | 2025-06-17 20:46:24 +00:00