[HN Gopher] Kart: DVC for geospatial and tabular data. Git for GIS
___________________________________________________________________
Kart: DVC for geospatial and tabular data. Git for GIS
Author : starkparker
Score : 84 points
Date : 2023-10-30 18:22 UTC (4 hours ago)
(HTM) web link (kartproject.org)
(TXT) w3m dump (kartproject.org)
| ulrischa wrote:
| Wow, cool. I needed something like this 10 years ago for a
| project.
| nix0n wrote:
| Neat!
|
| Does this improve Git's support for large binaries generally, or
| is it necessary to have introspection into any filetype you want
| to support?
|
| Is there good interoperability with existing Git repos?
| asabla wrote:
| Copied from their site:
|
| > Because Kart uses Git for data transfer and storage, you can
| host a Kart repository anywhere you can host a Git repository -
| for example, GitHub, Bitbucket...
|
| ref:
| https://docs.kartproject.org/en/latest/pages/basic_usage_tut...
| UberMouse wrote:
| >Does this improve Git's support for large binaries generally
|
| No, this still uses LFS for larger binary formats (ie raster or
| point cloud datasets)
| polemic wrote:
| Kart repositories are also Git repositories - they're
| 'interoperable' in the sense that there is a lot of tooling that
| will work, but the storage structure for vector data differs,
| and using Git on a Kart repository won't work very well.
|
| Kart serializes vector/tabular data into datasets in the
| repository, and manages the process of writing them out to
| useful working copies (GeoPackages, or into databases).
|
| For large binaries - rasters and pointclouds - we're using LFS.
| We include some additional spatial information in pointer
| files to enable some very useful GIS functionality, like
| spatially filtered clones (this works for vector data too).
| mjhay wrote:
| Great to see stuff like this! Data in general, and not just code,
| gets updated or corrected constantly. Given that data is used
| collaboratively in a distributed setting, it should be a first-
| class citizen wrt diffing and merging, just as line-wise text is.
| Anybody working in geophysics or other data-heavy scientific
| fields should see the value in this approach.
| solardev wrote:
| I was wondering about that. How DOES diffing work with this,
| like in a Geopackage?
| cyanydeez wrote:
| I assume it just displays rows edited.
|
| I don't see how they'd display anything other than points.
| That leaves XYZM for diffing.
|
| They might be showing summary stats like length, perimeter,
| area, volume, but that's usually not easy to generalize.
| polemic wrote:
| Hi there,
|
| Kart supports points, lines & polygons, as well as GeoTIFFs
| for imagery and LAZ for point clouds.
|
| Kart is a CLI tool, but provides fully machine readable
| outputs. You can use the QGIS plugin to get a visual diff
| of vector feature changes though.
| solardev wrote:
| > You can use the QGIS plugin to get a visual diff of
| vector feature changes though.
|
| This sounds like an amazing opportunity for a screenshot,
| btw :)
| asabla wrote:
| This looks super cool! I will for sure be testing this out and
| keeping an eye out for future additions.
| sccxy wrote:
| Cool project but homepage needs two things:
|
| * Docs should not be hidden in a small font and a disabled link
| color; make it a big button in the features list or make features
| clickable to relevant docs.
|
| * Add some screenshots
|
| I spent way too much time clicking every heading to figure out
| what this is all about until I found the Docs link.
| polemic wrote:
| Hi! It's neat to see Kart making HN. These are great points,
| we'll get that Docs link much more visible.
| Mertax wrote:
| Just saw this, which might be a better home page:
| https://koordinates.com/products/kart/
| peoplenotbots wrote:
| I wish this project every success. GIS is an undervalued and
| underserved technical system.
| polemic wrote:
| Hi everyone, I'm Hamish, PM for KartProject here. If you want to
| learn more about Kart:
|
| * Our CTO Rob Coup presenting on Kart at FOSS4G 23:
| https://www.youtube.com/watch?v=1B-HB2Z3Vlc
|
| * Docs are available at https://docs.kartproject.org/en/latest/
|
| * We also have a QGIS plugin! This gives you visual diffs of vector
| feature changes. https://plugins.qgis.org/plugins/kart/
|
| Happy to answer any questions!
| emj wrote:
| So how does it work for Openstreetmap? I mean git is not very
| good at handling large repositories, fsck and all that takes
| ages. So what performance do you get with a small geographical
| area?
| polemic wrote:
| Hi there,
|
| OpenStreetMap has its own versioning mechanisms (and a fairly
| specific-to-OSM data model) and Kart isn't really designed to
| work with OSM data as such. Kart adds version control to the
| GIS data that planners, academics, architects, civil engineers,
| etc, use day-to-day. There's a lot of data out there!
|
| "Large" is relative, but Kart works well with quite big vector
| datasets for these typical use cases. For example, we're
| regularly working with datasets that have over 2 million
| features, with a decade of weekly data changes.
|
| Kart includes some features specifically for working with small
| geographic areas. We can spatially filter cloned data so
| you're working with a small subset of a much larger dataset,
| but you still retain the ability to commit/merge/push to the
| source repo.
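|
| To make the spatial-filter idea concrete, here is a minimal
| Python sketch, not Kart's implementation. It assumes each feature
| carries a precomputed bounding box; the `features` list, the
| `bbox` key and the helper names are hypothetical.

```python
from typing import Dict, List, Tuple

# (min_x, min_y, max_x, max_y) in a single, shared coordinate system.
BBox = Tuple[float, float, float, float]


def bbox_intersects(a: BBox, b: BBox) -> bool:
    """Return True if two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def spatial_filter(features: List[Dict], area: BBox) -> List[Dict]:
    """Keep only the features whose envelope intersects the area of interest."""
    return [f for f in features if bbox_intersects(f["bbox"], area)]


# Hypothetical features: an id plus an envelope (lon/lat degrees).
features = [
    {"id": 1, "bbox": (174.7, -36.9, 174.8, -36.8)},
    {"id": 2, "bbox": (172.5, -43.6, 172.7, -43.5)},
    {"id": 3, "bbox": (174.75, -36.95, 174.9, -36.85)},
]
area_of_interest = (174.6, -37.0, 175.0, -36.7)

subset = spatial_filter(features, area_of_interest)
print([f["id"] for f in subset])  # [1, 3]
```

| A real tool would test against an arbitrary polygon and use a
| spatial index rather than scanning every feature.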
| 1attice wrote:
| Cool product! How does this compare to (e.g.) dolt, which is
| pitched as 'git for data'?
| polemic wrote:
| Dolt is a neat project, but it's tightly coupled to MySQL. Kart
| supports MySQL as a working-copy format but MySQL has some
| limitations around geometry support that make it unsuitable for
| most GIS usage - see our docs for more info:
| https://docs.kartproject.org/en/latest/pages/wc_types/mysql_...
|
| Kart works with GIS working copies that are more familiar to
| GIS people - e.g. GeoPackage, Postgres/PostGIS & MSSQL
| databases. Different users can use different working copies,
| and still collaborate together too.
| everybodyknows wrote:
| If you're wondering where to find an architecture document, the
| nearest to such may be:
|
| https://docs.kartproject.org/en/latest/pages/development/tab...
|
| Top takeaway being that it's not just versioned geo feature
| items, but versioned per-feature formats. Various popular GIS
| database formats are supported as the "checked-out"
| representation, analogous to a git local-filesystem tree. Maybe
| does conversions between standard GIS formats well -- wasn't
| obvious.
|
| One question I'm left with is performance:
|
| > Every database table row is stored in its own file. ...
| polemic wrote:
| One of the benefits of building on Git is a lot of people have
| put a lot of time into making it work _really well_ with lots of
| objects. And even though we say "files", Git abstracts that
| into packfiles etc very efficiently.
|
| So, we're seeing pretty good performance. We're maintaining a
| number of repositories with several million features, with a
| decade of weekly updates of ~10,000+ rows. It _does_ take some
| time to push that data around, but it's _vastly_ better than
| old ways, and once you have your clone, maintaining updates
| becomes extremely trivial - a _major_ unsolved problem in the
| GIS/data world.
|
| I'd add - Kart has GIS specific features that nullify some of
| these issues. The ability to spatially index the objects, then
| filtering them on Clone, means I can rapidly clone a tiny subset of
| the data to work with.
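|
| On the "every row is its own file" point, a rough Python sketch
| of why content addressing keeps this cheap: Git names each blob
| by a hash of its content, so rows that do not change between
| versions map to the same object. This assumes Git's default
| SHA-1 object format; `serialize_row` is a hypothetical JSON
| encoding, not Kart's actual on-disk format.

```python
import hashlib
import json


def git_blob_id(content: bytes) -> str:
    """Object ID Git gives a blob: SHA-1 of a 'blob <size>' header, a NUL byte, then the content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def serialize_row(row: dict) -> bytes:
    # Hypothetical encoding (sorted-key JSON), used only for illustration.
    return json.dumps(row, sort_keys=True).encode()


# Two versions of a small table; only row 2 changes.
v1 = [{"id": 1, "name": "Main St"}, {"id": 2, "name": "High St"}]
v2 = [{"id": 1, "name": "Main St"}, {"id": 2, "name": "High Street"}]

ids_v1 = {git_blob_id(serialize_row(r)) for r in v1}
ids_v2 = {git_blob_id(serialize_row(r)) for r in v2}

print(len(ids_v1 & ids_v2))  # 1 blob shared between versions
print(len(ids_v2 - ids_v1))  # 1 new blob needed for the second version
```

| Only the changed row needs a new object; the unchanged row is
| stored once and referenced by both versions.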
| everybodyknows wrote:
| > The ability to spatially index the objects, then filtering
| them on Clone, means I can rapidly clone a tiny subset of the
| data to work with.
|
| Okay -- so is the "--depth=N" filtering option to git-clone
| supported as well? And does it remain useful in the context
| of Kart applications?
| polemic wrote:
| Yes, you can do shallow clones with `--depth` as well. This
| is incredibly useful - it means we can publish massive Kart
| repositories of spatial data with lots of versioning info,
| but still allow users to work with small subsets of the
| most recent changes. Very important for typical GIS use
| cases.
| Mertax wrote:
| Is there a public git repo available somewhere that
| represents a Kart repository?
|
| Are the raw files in the working repository GeoPackages? How
| is it tracking the changes made inside the geopackages? What
| happens if it's replaced with an updated copy of the
| geopackage that was edited via some other application? How
| does it diff the changes?
| satuke wrote:
| Have you tried out https://underhive.in/ for this?
| polemic wrote:
| I haven't, but it looks like it's a Git repo hosting solution?
| The issue with using Git with data directly is that you generally
| lose the per-row/feature change information. With common
| binary GIS data formats, just putting them into Git loses a
| lot of the utility and will blow out the size of the repo as
| you apply changes.
|
| Kart gives you row-level tracking, so you can see who made what
| change & when, and diffs are small and fast to apply.
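|
| As an illustration of what row-level tracking buys you, here is
| a small Python sketch of a diff between two versions of a table
| keyed by primary key. It is an assumption-laden toy, not Kart's
| code; `diff_rows` and the sample rows are hypothetical.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def diff_rows(old: Dict[int, Row], new: Dict[int, Row]) -> Dict[str, Dict[int, Any]]:
    """Compare two table versions keyed by primary key."""
    inserts = {pk: new[pk] for pk in new.keys() - old.keys()}
    deletes = {pk: old[pk] for pk in old.keys() - new.keys()}
    # For updates, keep both sides so a reader can see what changed.
    updates = {
        pk: (old[pk], new[pk])
        for pk in old.keys() & new.keys()
        if old[pk] != new[pk]
    }
    return {"inserts": inserts, "updates": updates, "deletes": deletes}


old_version = {
    1: {"name": "Main St"},
    2: {"name": "High St"},
    3: {"name": "Old Rd"},
}
new_version = {
    1: {"name": "Main St"},
    2: {"name": "High Street"},
    4: {"name": "New Ave"},
}

changes = diff_rows(old_version, new_version)
print(sorted(changes["inserts"]))  # [4]
print(sorted(changes["updates"]))  # [2]
print(sorted(changes["deletes"]))  # [3]
```

| The diff size depends on the number of changed rows, not on the
| size of the table, which is what a whole-file binary diff cannot
| give you.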
| SOLAR_FIELDS wrote:
| Why is this going to succeed when something like Geogig never
| took off? I was super interested in the project at the time of
| its initial release but it's been dead for years. What did that
| project fundamentally do wrong vs what Kart is doing? Or is it
| just a super niche thing?
___________________________________________________________________
(page generated 2023-10-30 23:00 UTC)