Prometheus on Raspberry Pi (5pi.de)
77 points by discordianfish on Feb 10, 2015 | 22 comments


> "Cross-compiling this for Raspberry Pi is a pain. ARM != ARM, there are several variants and when I tried to cross-compiling Prometheus with CGO, it just lead to segfaults or invalid instructions."

I'm pretty sure I had similar issues on the original model B and found the problem was that I wasn't specifying which ARM version to target:

    export GOARCH="arm"
    export GOARM="5"
I don't know if the same is required for the Raspberry Pi 2, nor even which ARM version it uses (the same, I'm going to assume, since Raspbian / Arch / etc. still run), but it might be worth checking that you have your GOARM environment variable set.
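
For a plain (non-CGO) build, the whole cross-compile would then be something like this - flags from memory, so treat it as a sketch; I'd assume GOARM=7 is the right value for the ARMv7 Pi 2:

    GOOS=linux GOARCH=arm GOARM=5 go build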


The Pi 2 should be much easier to get going with this, since it's finally updated to ARMv7 and can actually run normal distros alongside Raspbian. I don't know what other extensions it has, but I'd hope it'll work out of the box there.


Just for completeness, my answer to the same question on the blog:

This doesn't solve cross-compilation with CGO. For that, you need the GCC cross-compiler for the Raspberry Pi. I've tried to build it with CGO enabled like this:

    CC=arm-linux-gnueabihf-gcc GOARCH=arm CGO_ENABLED=1 go build -ldflags="-extld=$CC"

But the binary still segfaulted on the Raspberry Pi. I also tried the cc from https://github.com/raspberrypi... but with the same result.


Ah ok. Shame.

Thanks for the update though :)


First time I've heard of Prometheus (by SoundCloud). I wonder how it compares to InfluxDB as a time-series DB (ignoring the monitoring features).


https://news.ycombinator.com/item?id=8995696

Previous discussion here, with both Prometheus and InfluxDB developers chiming in.

Short version from memory: Prometheus is more efficient for datasets where series have metadata but individual points don't, since InfluxDB stores metadata per point; InfluxDB will optimize this case in the next version.


See the previous HN post about it: https://news.ycombinator.com/item?id=8995696

and, more directly related to your question: http://prometheus.io/docs/introduction/comparison/#prometheu...


See also the official announcement blog post by SoundCloud: https://developers.soundcloud.com/blog/prometheus-monitoring...


Can someone tell me why one would want to write a database in Go/Python/...? Or is Go just a wrapper in this case?

I am looking to store small time series (only ~100 points each), but millions of them. I currently use SQLite. Any other suggestions?


I suppose for the same argument that gets made for single-language-but-inefficient configuration systems.

When you're doing a basic process, reliability >> efficiency.

And sometimes more pieces in your stack just decrease the reliability of the entire system. (Admittedly, rolling your own version of an already well-implemented stack component is not without its perils either.)


It all depends on what you're trying to do. SQLite is surprisingly robust for simple things. Are you experiencing any issues yet, and of what kind?

What's the transactional activity for the time series you're storing? (i.e. are you writing points individually every millisecond, or in batches every hour?)
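
By batching I mean wrapping many inserts in a single transaction, so SQLite only has to sync to disk per batch rather than per point. Roughly this, with the sqlite3 CLI and a made-up schema:

    sqlite3 ts.db <<'SQL'
    CREATE TABLE IF NOT EXISTS samples(series TEXT, ts INTEGER, value REAL);
    BEGIN;
    INSERT INTO samples VALUES ('cpu.load', 1423526400, 0.42);
    INSERT INTO samples VALUES ('cpu.load', 1423526460, 0.40);
    COMMIT;
    SQL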


I'd go with PostgreSQL; it takes 30-35 MB of RAM per running connection. The entire stack (DB + httpd + Ruby/Python/PHP) should take less than 150 MB, and you have another ~300 MB free on the original Model B.

Go should be faster and more efficient than Python/Ruby, so I'd expect even less RAM usage / better performance.

SQLite3 these days is good for embedded devices but that's about it IMHO.


Millions of time series with a small number of points each actually works very well with Prometheus.


RRDTool?


I'm intrigued. I collect real-time machine performance data using a UDP multicast daemon (written in C). It's fallible on some network topologies, etc., but it's extremely light and efficient.

I wonder if that could be plugged into Prometheus without the overhead of HTTP collection...


HTTP collection doesn't really add a lot of overhead.

You might be interested in https://github.com/prometheus/collectd_exporter, as collectd works in a similar way.

For machine monitoring, http://www.boxever.com/monitoring-your-machines-with-prometh... covers how to set it up with Prometheus - it's pretty easy to get working.


It does in the sense that an HTTP connection (on either the server or the client side) will expend roughly 20 times the CPU and buffer space of generating or handling a single UDP packet. It all adds up on small systems.


You'd usually transfer many metrics in a single HTTP request, so the cost is amortized in most cases.


I can pack a lot into a UDP packet as well...


Not sure if this is what you're comparing it to, but be aware that Prometheus' approach is fundamentally different from StatsD-like approaches where you send every event or a subsampling thereof to a monitoring server.

Prometheus is state-based, not event-based. It only stops by your monitored instances once every couple of seconds and gathers their current state. E.g. for counting events, clients simply expose cumulative counters over their lifetime, which they increment locally in memory, and Prometheus comes by, say, every 15 or 30 seconds and stores the current counter state.
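
A scrape is then just an HTTP GET of those current totals in a plain text format, along these lines (metric name and port made up):

    $ curl -s http://localhost:8080/metrics
    # HELP events_processed_total Events handled since process start.
    # TYPE events_processed_total counter
    events_processed_total 1027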

The HTTP traffic incurred in this case is not really a problem and you'll usually run into bottlenecks at other places (like storage sample ingestion) before you run into network transfer bottlenecks.


There's a problem with metrics over UDP: an overloaded server will not be sending its metrics, so it will appear as if it's doing ok.


Your assumption is wrong: the lack of data points in a given time interval is itself a data point. If metrics don't arrive within the set interval, the machine is flagged.



