
So a full planet OSM extract is about 300 million binary files, in total about 90 GB. The most popular ways to store it:

- MBTiles - an SQLite file, each tile is a row in a table, you need a server to serve it.

- PMTiles - a single file, optimised for serverless usage.

- Extract them into a directory, which in practice should be on a partition image. This is the approach I chose.
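For illustration, the third option can be sketched roughly like this. It is a minimal sketch, not the actual OpenFreeMap script: it assumes the standard MBTiles `tiles` table (zoom_level, tile_column, tile_row, tile_data) with TMS row numbering, and the `z/x/y.pbf` output layout and the function name are hypothetical choices.

```python
import sqlite3
from pathlib import Path


def extract_mbtiles(mbtiles_path, out_dir, ext="pbf"):
    """Write every tile of an MBTiles file to out_dir/z/x/y.ext.

    Tile bytes are written exactly as stored (vector tiles in MBTiles
    are often gzip-compressed; this sketch does not decompress them).
    Returns the number of tiles written.
    """
    out_dir = Path(out_dir)
    conn = sqlite3.connect(mbtiles_path)
    count = 0
    try:
        rows = conn.execute(
            "SELECT zoom_level, tile_column, tile_row, tile_data FROM tiles"
        )
        for z, x, tms_y, data in rows:
            # MBTiles uses TMS rows (origin at the bottom); flip to the
            # XYZ scheme (origin at the top) used by most web map clients.
            y = (1 << z) - 1 - tms_y
            tile_path = out_dir / str(z) / str(x) / f"{y}.{ext}"
            tile_path.parent.mkdir(parents=True, exist_ok=True)
            tile_path.write_bytes(data)
            count += 1
    finally:
        conn.close()
    return count
```

With hundreds of millions of tiles, a naive loop like this is mostly limited by how quickly the target filesystem can create small files, which is where the filesystem choice matters.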

I tested ext4 and Btrfs and I chose Btrfs. The reason is how the two handle inodes. Btrfs handles them much better than ext4: it doesn't allocate them up front, and it also allows tiny files to be stored inline, right next to the metadata.

Because the average tile size here is only 405 bytes, most of the tiles can actually be stored inside the Btrfs metadata blocks. From the latest run:

Btrfs data is 51.57GiB

Btrfs metadata is 84.37GiB



This sounds like a perfect application for EROFS[1]. While it comes from an embedded systems background, it has seen some usage in container use cases and is moving towards a general "mountable tar" application. It would also avoid the tedium you have to go through in shrink_btrfs.py because you can just generate the image out of a tree.

I wanted to give repackaging the btrfs image a shot but the download was pretty slow - I assume your server is getting HN-hugged a bit so I didn't want to make it worse and stopped the download.

[1] https://erofs.docs.kernel.org/en/latest/index.html


Thanks a lot, I didn't know about it! I also liked the fact that Btrfs is probably super well tested in the Linux kernel by now.

btrfs.openfreemap.com is just a public Cloudflare bucket, no idea why it might be slow.


> I also liked the fact that Btrfs is probably super well tested in the Linux kernel by now.

btrfs has certainly been around for longer, but in my (embedded systems only) experience, EROFS has been pretty solid - it's slowly being picked up by Android, so it is definitely seeing a lot of use in the wild (probably surpassing btrfs by the number of installations already).

> btrfs.openfreemap.com is just a public Cloudflare bucket, no idea why it might be slow.

I'm getting 30 MiB/s (on a gigabit uplink) - not great, not terrible. A .torrent would be nice, but I guess that outside of being on the HN front page, full-planet downloads by different people won't synchronize enough for this to be useful (and using web seeds is problematic in its own right with small-ish chunks).


Do you think Btrfs would work well for old-school pre-rendered raster tiles on disk?


Yes, I think it would. Have a look at the extract_mbtiles script: https://github.com/hyperknot/openfreemap/blob/main/modules/t...


I apologize if this is something stupid, or answered elsewhere, but is there a good way to trim Planet down to a smaller area?

I've been making some custom map stuff for mountain bike trails and I'd really like to move to self-hosted vector tiles for all layers, but too much of what I find says to start with Planet.osm when all I really need is a State (in the US) or even a few-miles-wide area.

(My goal is to basically snapshot OSM data, generate tiles, and use that until I decide to do another snapshot down the line, so the underlying data doesn't change. And limit it to a small area because that's all I need.)

Examples of maps I've done this way, and want to improve, are: https://trailmaps.app/ramba/ and https://trailmaps.app/dte/


Look into Planetiler [1] (which OP uses for tile generation). It supports downloading regions that are listed on Geofabrik [2] and converting them to mbtiles or pmtiles. Geofabrik has separate extracts for all US states, for example. If you need to extract an even smaller area from that result, GDAL has support for mbtiles so you could use gdalwarp [3] to extract a new mbtiles file out of it using bounds.

Another option is to use the extract functionality in pmtiles [4] to extract your area of interest from their daily full-planet pmtiles builds. You can then statically host that file and use that in your client with one of their client libraries.

[1] https://github.com/onthegomap/planetiler

[2] https://download.geofabrik.de/

[3] https://gdal.org/en/latest/programs/gdalwarp.html

[4] https://docs.protomaps.com/pmtiles/cli#extract


If you know the tile numbers, you can just copy them out of the Btrfs image.
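In case it helps, here is a rough sketch of working out which tile numbers cover an area. It uses the usual Web Mercator "slippy map" XYZ formula; the function names are made up for this example, and how the resulting z/x/y numbers map to paths inside the image is an assumption you would need to check against the actual directory layout.

```python
import math

# Web Mercator is only defined up to roughly this latitude.
MAX_LAT = 85.0511


def lonlat_to_tile(lon, lat, zoom):
    """Return the XYZ tile (x, y) containing a lon/lat point at a zoom level."""
    n = 2 ** zoom
    lat = max(min(lat, MAX_LAT), -MAX_LAT)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the far edges (e.g. lon = 180) stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(min_lon, min_lat, max_lon, max_lat, zoom):
    """List (z, x, y) for all tiles touching a bounding box at one zoom level."""
    # y grows southwards, so the northern edge gives the smallest y.
    x_min, y_min = lonlat_to_tile(min_lon, max_lat, zoom)
    x_max, y_max = lonlat_to_tile(max_lon, min_lat, zoom)
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]
```

Running `tiles_for_bbox` once per zoom level you care about gives the full list of tiles to copy for a few-miles-wide area.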



