[HN Gopher] H3: For indexing geographies into a hexagonal grid, ...
___________________________________________________________________
H3: For indexing geographies into a hexagonal grid, by Uber
Author : wiradikusuma
Score : 96 points
Date : 2025-03-09 03:32 UTC (18 hours ago)
(HTM) web link (h3geo.org)
(TXT) w3m dump (h3geo.org)
| Traubenfuchs wrote:
| Is the big thing here that those hexagons have the advantages of
| a circles (almost even max distance in all directions from
| center) with the advantages of squares (no overlap)?
| Epa095 wrote:
| Sqaures have overlap if you put them on a sphere, no?
|
| There are some nice things with this. It contains cells at
| different levels (sizes). So you can use very small cells if
| you care about small areas, or large cells if you care more
| about larger areas. Those are given here
| https://h3geo.org/docs/core-library/restable/#average-area-i...
|
| From a coordinate it is fast to get the cell at any level, so
| it's fast to group coordinates at whatever level you want.
|
| If is less about comparing two coordinates, it is more about
| bering able to group large numbers of coordinates such that
| they go into non-overlapping groups, where everything in the
| same group is 'close' to each other.
| at0mic22 wrote:
| I would assume you wouldn't care a lot about overlapping
| squares as the representation on the screen is mostly square,
| you don't order uber from the globe.
|
| Thus most screen projections are derived from wgs84. The idea
| behind h3 (as I see it) that when moving the map to lets say
| top-right, you'd need to fetch 1 leaf comparing to 3 leafs
| with rtree
| jillesvangurp wrote:
| Most grid systems don't have overlapping squares because they
| are quad trees. Each point identifies a single rectangle they
| are in, and all the parent rectangles that fully contain that
| rectangle.
|
| H3 doesn't have that property between levels. Cells at a
| level don't overlap. But the the seven children of a cell
| might overlap with neighboring parent cells.
|
| Of course rectangles aren't rectangular when projected on a
| sphere. Because there are no straight lines. Rectangles and
| hexagons are 2d shapes that are applied to the 2d projections
| of spheres. They only look like rectangles or hexagons in 2D.
| cpa wrote:
| Yes, but it's important to note that it's quite a specialized
| index. - It's an index that doesn't depend on the data, unlike
| traditional r-tree indices. This means non-uniform data won't
| be queried as efficiently as with rtree, but it may be faster
| to build.
|
| - This data independence property is VERY important for
| distributed or streaming queries. For example, if you want to
| join datasets using Spark or other big data tools, each team
| can add a column for h3 cells independently and join somewhat
| efficiently. For large volumes of data, constructing the rtree
| is just not feasible, or more precisely, very disconnected from
| the rest of the "data ecosystem".
|
| - It doesn't work with any coordinate reference system other
| than EPSG 4326 (which you may want if you only work on specific
| geographies to get more precision in your floats)
|
| - It's clearly built with points in mind. Polygons, curves, or
| lines are an afterthought. For example, the polygonToCells
| function returns a set of cells that are entirely within the
| polygon. If you want to join, you'd need to also have the set
| of all cells that entirely contain the polygon. I've never
| found a reliable way to get that.
|
| That being said, it's not bad at all, but if you don't have so
| much data that you can't compute rtree indices, just stick with
| PostGIS.
| riordan wrote:
| > Polygons, curves, or lines are an afterthought. For
| example, the polygonToCells function returns a set of cells
| that are entirely within the polygon. If you want to join,
| you'd need to also have the set of all cells that entirely
| contain the polygon. I've never found a reliable way to get
| that.
|
| With v4 of h3 they (finally) have a clean syntax for this
| with polygonToCellsExperimental[0].
|
| Now there's options for
|
| - Cell center is contained in the shape (default) - Cell is
| fully contained in the shape - Overlapping (covering): Cell
| overlaps the shape at any point - BBOX: Cell bounding box
| overlaps shape
|
| Makes life a fair bit easier if you've gotta deal with H3
| polys. And if you're working locally, DuckDB Spatial's r-tree
| indexing[1] can make for a nice stand-in for PostGIS as a
| quick point-in-polygon solution without the need to spin up a
| service.
|
| [0]: https://h3geo.org/docs/api/regions/#polygontocellsexperi
| ment... [1]: https://duckdb.org/docs/stable/extensions/spatia
| l/r-tree_ind...
| cpa wrote:
| Nice, I did not know about polygonToCellsExperimental.
|
| As for DuckDB & rtree, it's alas not a replacement of
| postgis yet and the indices cannot be used (yet) in joins.
| In fact, I even have workflows where I iterate over rows in
| python and run duckdb queries one after the other rather
| than joining in just one query because of this very issue.
| ethbr1 wrote:
| Oh, mon frere... there are many reasons.
|
| Hexagons. Are the bestagons.
| https://m.youtube.com/watch?v=thOifuHs6eY
| ahoka wrote:
| How is this better than space filling curves? Or does this solve
| a different problem? It's a bit hard to see what and how it
| solves. Why hexagons?
| meowtimemania wrote:
| If you used curves it'd get complicated deciding where one area
| starts and the other ends. With hexagons it's easier to divide
| the world such that (mostly) no hexagon overlaps.
|
| The purpose is to be able to predictably map any coordinate to
| its associated hexagon.
|
| In database applications this makes it easier to query all data
| associated with a hexagonal area.
| xhkkffbf wrote:
| Why not just squares or rectangles? Is there a simple
| explanation?
| jt_b wrote:
| You want a shape that can represent an approximately equal
| area, anywhere on a 3 dimensional surface.
| jandrewrogers wrote:
| It depends on what you are trying to do.
|
| Hexagonal tessellations are optimized for display and
| visualization because they approximate equal surface areas on a
| sphere. They have poor properties for scalable and efficient
| analytical data processing because they are not congruent.
|
| Indexing for scalable analytical processing on a sphere
| requires congruent equal _volume_ decomposition, even if you
| only care about the surface. You can trivially project it to
| the surface later if needed. Binary space decomposition, of
| which space-filling curves are a subset, are strongly
| preferable for this type of indexing.
|
| In practice, a lot of data processing systems will render data
| as an H3 tiles only for visualization as a final step. That
| conversion is fast and trivial and it makes pretty pictures. It
| is not as commonly used to index the underlying data model
| because scalability and performance is prohibitive unless the
| data is small.
| dplarson wrote:
| I found this presentation helpful for an intro to H3's
| design/motivation: "Engineering Sub-City Geos for a Hyper-Local
| Marketplace with Uber":
| https://youtu.be/wDuKeUkNLkQ?si=-9JmxZQJ2LZo6Kh4
| naiv wrote:
| This online tool gives you great idea about what this means on
| different levels:
|
| https://wolf-h3-viewer.glitch.me/
| kemotep wrote:
| This is like the advantage of using 6 mile hexes in a tabletop
| rpg map.[0]
|
| Each hex can be divided into smaller hexes until you get to the
| level of feet/meters as opposed to miles/kilometers.
|
| [0]: https://steamtunnel.blogspot.com/2009/12/in-praise-
| of-6-mile...
| crazygringo wrote:
| I've read all the comments here and still don't understand the
| rationale for hexagons as opposed to squares.
|
| In every sense, squares seem to be much easier to reason about
| and easier to hierarchically partition than hexagons are.
|
| There are certain advantages to hexagons in certain contexts,
| like six degrees of movement instead of four in board games, but
| I don't see how any of those advantages translate here for
| geographical indexing.
|
| I'd love to understand why hexagons as opposed to squares in
| _this_ context are a superior solution rather than unnecessary
| complexity?
| gleenn wrote:
| IIRC it has to do with tiling the globe. You can't correctly
| tild a sphere with swuares. On a local level, it seems easier
| to reason about squares on a flat surface, but the Earth is
| round and Uber is international.
| crazygringo wrote:
| Thanks! I'm still a bit confused because you can't tile the
| globe with hexagons either, if I understand correctly --
| everything I can find online says you need a mix of hexagons
| and pentagons, and you can think of a soccer ball as an
| example.
|
| Has Uber figured out a way to do it with just hexagons?
| noplacelikehome wrote:
| Since cars don't need to cross water, maybe this isn't
| actually relevant. Uber aren't trying to model the planet,
| just the routes within the geographies they support.
| tiagod wrote:
| h3 has the required pentagons in the ocean (dymaxion
| icosahedron vertices)
| Tistron wrote:
| At least one of the pentagons is on Norway. And another
| one contains Beijing.
| zamadatix wrote:
| I think what GP is saying is if you zoom to the max level
| 15 there remain only 12 cells which come out as pentagons
| and all 12 of those are in the ocean (albeit sometimes
| very near shore). If you zoom to the global scale those
| 12 get inherited into ever larger parent cells which do
| cover large areas of the Earth though, as you found.
| setr wrote:
| iirc it's exactly 12 pentagons required to tile the sphere
| with hexagons, independent of hex-tiling size. So it's
| _almost_ fully consistent, and uber tries to put most of
| those pentagons in the ocean
| korkoros wrote:
| Squares have a contiguity problem: does a square in a grid have
| 4 neighbors (rook contiguity) or 8 (queen contiguity)? Hexagons
| do not have this problem. All neighbors of a hexagon share a
| full edge, not just a vertex. All neighbors of a hex also have
| their centers equidistant from that hex. A massive number of
| spatial problems turn on neighborhood definitions, and hexes
| are almost always a better representation of reality than
| squares.
| crazygringo wrote:
| But if you assume rook contiguity then it seems equivalent to
| hexagons. All neighbors share a full edge, all neighboring
| centers are equidistant.
|
| I get that you're saying hexes are almost always a better
| representation. I still don't see a concrete example of why,
| for geographical indexing specifically.
|
| [Edit: sibling reply explained that at the end of the day,
| it's not about indexing but rather route planning.]
| ruined wrote:
| and if you assume cows are spherical and frictionless it's
| quite convenient
|
| sometimes roads go diagonal
| alexanderchr wrote:
| Let's say you are looking for the closest gas station. One
| that is in the close corner of a "diagonal neighbour" would
| be closer than most points in the "edge neighbours". So if
| you want to find something nearby, you'd usually want to
| look at all 8 neighbours. The hexagonal neighbours look
| more like a circle centered in the original hexagon, thus
| more convenient for that purpose.
| duncanfwalker wrote:
| (closer to) equal distance between all the neighbouring cells
| is really helpful when you're modelling a problem as a network
| - eg analysing and planning routes and capacity.
|
| Hierarchical partitioning with exact containment is useful when
| aggregating but that's not always the most critical property.
| crazygringo wrote:
| Ah OK thank you that is making more sense -- that since Uber
| is primarily about navigation and route planning, hexagons
| minimize the number of cells or total land coverage you need
| to retrieve/analyze between two points? I think that was the
| missing piece of information I was looking for!
| jandrewrogers wrote:
| For the purposes of visualization, you want each cell to
| enclose approximately equal surface area. These are your
| "pixels". H3 is a way of rendering any subset of a sphere for
| display.
|
| While squares have superior properties for analytical
| geospatial data processing that H3 doesn't have, such as
| congruency, they really only work for Euclidean spaces and the
| surface of a 2-sphere is non-Euclidean. Any system using
| squares will be a poor approximation of "equal area" relative
| to hexagons, which makes them poor for visualization. To use
| squares for indexing, you need an extra step that allows non-
| Euclidean space to be addressable from Euclidean space. There
| are two main ways of doing this.
|
| First, one can project the surface of the sphere onto the
| surface of a Euclidean cube. Second, one can use an embedding,
| indexing the 2-sphere in Euclidean 3-space. Both of these can
| be trivially projected to a hexagonal system like H3 for
| visualization purposes and commonly are.
|
| If you primarily need visualization and your data is small,
| using H3 eliminates the step where you need to figure out how
| to map non-Euclidean data models to Euclidean data models. If
| you are doing large-scale geospatial processing, it becomes
| worth the effort for both scalability and performance reasons.
| doktorhladnjak wrote:
| You can more smoothly interpolate a function for points on a
| surface using hexagons than with squares. For Uber, think about
| things like surge pricing. If you compute a surge % for each
| hexagon, you can interpolate it for each rider location.
| feverzsj wrote:
| In practice, there is little to no advantage. Any spacing
| filling index is trying to fit into an ordered index like
| B-tree, so existing horizontal scaling solution can be directly
| applied. The problem is it's much harder and inefficient to
| implement computational geometry algorithms on spacing filling
| index than a real spatial index like R-tree. On the other hand,
| scaling can be easily solved by table partition on different
| area.
| zamadatix wrote:
| This will seem like a nit at first but it's really a key driver
| to why you'd look at other shapes: a hexagon is more comparable
| to a quadrangle, a square is more comparable to what is called
| "a regular hexagon". Regular meaning the sides and angles which
| make up the shape are all equal. In the 2D world, such as on
| board games, equilateral triangles, squares, and regular
| hexagons can all tile a plane perfectly. This is not the case
| for the surface of a sphere, there is no tiling regular polygon
| in that case.
|
| From there you're just trying to optimize uniformity in
| distance to neighbors, how big the adjustments to the irregular
| polygons need to be to get them to tile on the surface are, how
| easy the polygon is to split up into smaller similarly shaped
| variants of itself as sub tiles, and trying to be somewhat
| close to a circle in shape as that means the average distance
| to the center of the area defined by the index is as close to
| as it can be.
|
| If you chunk through those you'll find quadrangles aren't
| attractively simple anymore and hexagons tend to optimize the
| parameters very well. H3 actually uses both hexagons and the
| occasional pentagon (12 total, no matter the zoom level). It
| all comes down to "tiling isn't going to be perfect - what is
| the most optimal answer for the purpose of the tiling".
| hankchinaski wrote:
| From the docs
|
| >The cell shape of that grid system is an important
| consideration. For simplicity, it should be a polygon that
| tiles regularly: the triangle, the square, or the hexagon. Of
| these, triangles and squares have neighbors with different
| distances. Triangles have three different distances, and
| squares have two different distances. For hexagons, all
| neighbors are equidistant
|
| And
|
| >This property allows for simpler analysis of movement.
| Hexagons have the property of expanding rings of neighbors
| approximating circles
|
| >Hexagons are also optimally space-filling. On average, a
| polygon may be filled with hexagon tiles with a smaller margin
| of error than would be present with square tiles.
| KaiserPro wrote:
| I'm not sure either. http://s2geometry.io/ works well and you
| only have to really worry about 4 points to draw the circle.
| serjester wrote:
| I created an h3 polars implementation and I think one of the
| big one's is flow modeling. As other's have mentioned, each
| hexagon has 6 equidistant neighbors. As a result, it lends
| itself to analyzing telematics data (hence Uber's backing of
| h3) - I attached a colab notebook where I do some of this.
| Trying to do this sort of analysis any other way would be
| incredibly painful.
|
| [1] https://github.com/Filimoa/polars-h3
|
| [2]
| https://drive.google.com/file/d/18jIVEbE_1QbwTbHdMqj0AVqguf2...
| tomtomistaken wrote:
| Hexagons could be a new kind of political entity of the future,
| where, for example, rewards for climate progress (measured by
| satellites) could be put into practice.
___________________________________________________________________
(page generated 2025-03-09 22:01 UTC)