
Man this one annoys me.

I have this server running a docker container with a specific application. And it writes to a specific filesystem (properly bind-mounted inside the container of course).

Sometimes docker starts before the filesystem is mounted.

I know systemd can be taught about this but I haven't bothered. Because every time I have to do something in systemd, I have to read some nasty obscure doc. I need to know how and where the config should go.

I did manage to disable journalctl at least. Because grepping through simple rotated log files is a billion times faster than journalctl. See my comment and the whole thread https://github.com/systemd/systemd/issues/2460#issuecomment-...

I like the concept of systemd. Not the implementation and its leader.



> I know systemd can be taught about this but I haven't bothered.

I think After=<your .mount> will work. If you believe it can be taught (and it can), why blame your lack of knowledge on the tool? That is not a strong argument against the quality of the tool.

> Because grepping through simple rotated log files is a billion times faster than journalctl.

`journalctl -D <directory of the journal files> | grep ...` will give you what you want. Systemd is incredibly configurable and that makes its documentation daunting but damn it does everything you want it to do. I used it in embedded systems and it is just amazing. In old times lots of custom programs and management daemons needed to be written. Now it is just a bunch of conf files and it all magically works.

The most fair criticism is it does not follow the 'everything is a file philosophy' of Unix, and this makes discoverability and traditional workflows awkward. Even so it is a tool: if it does what you want, but you don't want to spend time understanding it, it is hardly the fault of the tool. I strongly recommend learning it, there will be many Ah-ha moments.


You can also add fake filesystem parameters to the fstab entries that are parsed by systemd. Here's the doc on this. You might be forgiven for having missed it. It's under the section fstab. https://www.freedesktop.org/software/systemd/man/latest/syst...
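As a rough sketch of what such an entry could look like: the `x-systemd.before=` and `x-systemd.required-by=` options are described in the systemd.mount man page (availability depends on the systemd version), while the UUID, mount point and filesystem type below are made up for illustration. The script writes the example line to a scratch file rather than touching /etc/fstab.

```shell
#!/bin/sh
# Sketch only: the UUID, /mnt/data and ext4 are hypothetical placeholders.
# The example line goes to a scratch file for review, not to /etc/fstab.
FSTAB_SNIPPET="${FSTAB_SNIPPET:-$(mktemp)}"

cat > "$FSTAB_SNIPPET" <<'EOF'
# <device>  <mount point>  <type>  <options>  <dump>  <pass>
UUID=0000-hypothetical  /mnt/data  ext4  defaults,x-systemd.before=docker.service,x-systemd.required-by=docker.service  0  2
EOF

cat "$FSTAB_SNIPPET"
```

After a real fstab change, `systemctl daemon-reload` makes systemd regenerate the mount units.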

If you had followed my link to the systemd issue, you might have seen the commands I ran, as well as the tests and feedback of everybody on the issue. You might reach the conclusion that journalctl is fundamentally broken beyond repair.

edit: added link to systemd doc


> it does everything you want it to do

It does everything no one asked it to. I'm sure they will come up with obscure reasons why the next perfectly working tool has to be destroyed and redone by the only authority - the LP team. Like cron, sudo and yes - logging.

> journalctl -D ... will give you what you want

Look, I don't need the help of journalctl to grep through text. I can simply grep thru text.

> I used it in embedded systems

Good luck in a few years when you are flying home on the next Boeing 737-MAX-100800 and it fails mid flight because systemd decided to shut down some service because fuck you that's why.

> it does not follow the 'everything is a file philosophy'

It does not follow 'everything is a separate simple tool working in concert with others'. systemd is a monolith disguised to look like a set of separate projects.

> don't want to spend time understanding it, it is hardly the fault of the tool

It is, if we had proper tools for decades and they did work. I'm not a retrograde guy, quite the opposite, but the ideology that LP and the rest are shoving down our throats brings up natural defiance.

> there will be many Ah-ha moments

No doubts. systemd unit files and systemd-as-PID1 are excellent. They were NOT excellent the whole time but now they are. The rest? Designed to frustrate and establish dominance, that's it.


> I did manage to disable journalctl at least

My goodness. Absolutely fuck journald - a solution in search of a problem. I have created a bunch of different scripts to init my instances [1] on all projects. I do it differently from time to time, but one thing they all have in common is that journald gets removed and disabled.

[1] https://github.com/egorFiNE/desystemd


  > Because grepping through simple rotated log files is a billion times faster than journalctl
This is annoying, but there's a "workaround"

  $ time journalctl | grep "sshd" | wc -l
  12622
  journalctl  76.04s user 0.71s system 99% cpu 1:17.09 total
  grep --color=always --no-messages --binary-files=without-match "sshd"  1.28s   user 1.69s system 3% cpu 1:17.08 total
  wc -l  0.00s user 0.00s system 0% cpu 1:17.08 total

  $ time journalctl > /tmp/all.log && time wc -l /tmp/all.log
  journalctl > /tmp/all.log  76.05s user 1.22s system 99% cpu 1:17.56 total
  16106878 /tmp/all.log
  wc -l /tmp/all.log  0.03s user 0.20s system 98% cpu 0.236 total

  # THE SOLUTION
  $  time journalctl --grep=sshd | wc -l
  5790
  journalctl --grep=sshd  28.97s user 0.26s system 99% cpu 29.344 total
  wc -l  0.00s user 0.00s system 0% cpu 29.344 total
  
It's annoying that you need to use the grep flag instead of piping into grep but it is not too hard to switch to that mindset. FWIW, I have gotten slightly faster results using the `--no-pager` flag but it is by such a trivial amount I'll never remember it

  > Sometimes docker starts before the filesystem is mounted.
Look at the output of `systemctl cat docker.service` and you'll see "After", "Wants", and "Requires" directives in the unit. You're going to want to edit that (I strongly suggest you use `sudo systemctl edit docker.service`, for reasons stated above) and make sure that it comes after the drive you want mounted. You can set the Requires directive to require that drive so docker shouldn't ever start before it is mounted.

Alternatively, you can make the drive start earlier. But truthfully, I have no reason to have docker start this early.

Here's a link to the target order diagram[0] and Arch wiki[1]. Thing that gets messy is that everyone kinda lazily uses multi-user.target

[0] https://www.freedesktop.org/software/systemd/man/latest/boot...

[1] https://wiki.archlinux.org/title/Systemd#Targets
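A minimal sketch of such an override, assuming the data lives under a hypothetical /mnt/data. `RequiresMountsFor=` is a documented systemd.unit directive that adds both the Requires and the After dependency on whatever mount units are needed to reach the path. The script writes to a scratch directory by default; on a real host the drop-in would sit in /etc/systemd/system/docker.service.d/, which is where `systemctl edit` puts it.

```shell
#!/bin/sh
# Sketch only: /mnt/data and the file name wait-for-data.conf are hypothetical.
# DROPIN_DIR defaults to a scratch directory so nothing on the host changes.
DROPIN_DIR="${DROPIN_DIR:-$(mktemp -d)/docker.service.d}"
mkdir -p "$DROPIN_DIR"

cat > "$DROPIN_DIR/wait-for-data.conf" <<'EOF'
[Unit]
# Adds Requires= and After= on the mount unit(s) needed to reach this path.
RequiresMountsFor=/mnt/data
EOF

cat "$DROPIN_DIR/wait-for-data.conf"
```

Once a drop-in like this is in place, `systemctl daemon-reload` followed by `systemctl cat docker.service` should show it appended to the unit.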


journalctl --grep is still much slower than grep on simple files. And if you use ripgrep like I prefer, it's even faster still.

No really, I don't think journalctl makes sense in its current form. It's just broken by design.

I do like the potential of it. But not the implementation.


  > journalctl --grep is still much slower than grep on simple files
Idk what to tell you. You had a problem, showed the commands you used and the times it took. So I showed you a different way that took less than half the time to just dump and grep (which you said was faster)

My results don't match your conclusion.

  > if you use ripgrep
I've been burned by ripgrep too many times. It's a crazy design choice, to me, to default filter things. Especially to diverge from grep! The only things I expect grep to ignore are the system hidden files (dotfiles) and anything I explicitly tell it to. I made a git ignore file, not a grep ignore file. I frequently want to grep things I'm ignoring with git. One of my most frequent uses of grep is looking through build artifacts and logs. Things I'd never want to push. And that's where many people get burned, they think these files just disappeared!

The maintainer also has been pretty rude to me about this on HN. I get that we have different opinions but it's still crazy to think people won't be caught off guard by this behavior. Its name is literally indicating it's a grep replacement. Yeah, I'm surprised its behavior significantly diverges from grep lol


> I've been burned by ripgrep too many times. It's a crazy design choice, to me

Yeah, but storing logs in binary and having a specific tool to just read them is sure not a crazy design choice.



Given your criticisms of ripgrep, this is just deliciously ironic. What, you're the only one who can criticize the defaults of tooling? Oh my goodness, what a hoot.


> My results don't match your conclusion.

Alright. Let me entertain you.

In the data I provided, counting the lines in a big log file was 469.5 times faster than journalctl took to output all the logs.

From this information alone, it seems difficult to believe that journalctl --grep can be faster. Both had to read every single line of logs.

But it was on a rather slow machine, and a couple years ago.

Here /var/log and the current directory are on a "Samsung SSD 960 PRO 512GB" plugged in via M.2 NVMe, formatted in ext4 and only 5% used. Though this shouldn't matter, as I ran every command twice and kept the second run to ensure fairness, with everything in cache. The machine had 26GiB of buffer/cache in RAM during the test, indicating that everything is coming from the cache.

In my tests, journalctl was ~107 times slower than rg and ~21 times slower than grep:

- journalctl: 10.631s
- grep: 0.505s
- rg: 0.099s

journalctl also requires 4GiB of storage to store 605MB of logs. I suppose there is an inefficient key/value for every log line or something.

For some reason journalctl also returned only 273 out of 25402 lines. It only returns one type of message "session closed/opened" but not the rest. Even though it gave me all the logs in the first place without `--grep`?!

Let me know if I am still using it wrong.
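One possible explanation for the 273 vs 25402 gap, going by the journalctl man page: `--grep` filters on the `MESSAGE=` field only, whereas `grep sshd` on a text dump also matches the `sshd[PID]:` identifier that the default output prints in front of every message. That would fit seeing only the "session opened/closed" lines, if those are the ones that mention sshd in the message itself. This has not been checked on the machine above. The sketch below assumes the daemon logs under the syslog identifier "sshd", which may differ per distribution.

```shell
#!/bin/sh
# Sketch: three ways to count "sshd" entries in the journal.
# Assumption: the daemon logs under the syslog identifier "sshd".
count_sshd_entries() {
    if ! command -v journalctl >/dev/null 2>&1; then
        echo "journalctl not found; nothing to compare" >&2
        return 0
    fi
    # --grep filters on the MESSAGE= field only.
    echo "message matches sshd:   $(journalctl --no-pager --grep=sshd 2>/dev/null | wc -l)"
    # -t / --identifier filters on the syslog identifier instead.
    echo "identifier is sshd:     $(journalctl --no-pager -t sshd 2>/dev/null | wc -l)"
    # Piping to grep matches anywhere in the formatted output line.
    echo "formatted line matches: $(journalctl --no-pager 2>/dev/null | grep -c sshd)"
}

count_sshd_entries
```

If the second and third counts are close to each other and far above the first, the missing lines were never a `--grep` bug, just a different filter.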

    $ sudo hdparm -tT /dev/nvme0n1

    /dev/nvme0n1:
     Timing cached reads:   33022 MB in  1.99 seconds = 16612.96 MB/sec
     Timing buffered disk reads: 2342 MB in  3.00 seconds = 780.37 MB/sec

    $ du -hsc /var/log/journal
    4.0G /var/log/journal
    4.0G total

    $ time journalctl > logs
    real 0m31.429s
    user 0m28.739s
    sys 0m1.581s

    $ du -h logs
    605M logs

    $ time wc -l logs
    3932131 logs

    real 0m0.146s
    user 0m0.065s
    sys 0m0.073s

    $ time journalctl --grep=sshd | wc -l
    273

    real 0m10.631s
    user 0m10.460s
    sys 0m0.172s

    $ time rg sshd logs | wc -l
    25402

    real 0m0.099s
    user 0m0.042s
    sys 0m0.059s

    $ time grep sshd logs | wc -l
    25402

    real 0m0.505s
    user 0m0.425s
    sys 0m0.085s
PS: this way of using rg doesn't ignore any files, it is not used to find files recursively. But I don't have a .gitignore or similar in my /var/log anyways.
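For anyone who wants to check the arithmetic, the ratios can be recomputed from the "real" timings and the `du` sizes in the transcript above. A small sketch, taking 4.0G as 4096M:

```shell
#!/bin/sh
# Recompute the ratios from the timings quoted above ("real" time, seconds).
LC_ALL=C
export LC_ALL

journalctl_grep=10.631
grep_time=0.505
rg_time=0.099

# Print a / b with one decimal place.
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", a / b }'; }

vs_grep=$(ratio "$journalctl_grep" "$grep_time")
vs_rg=$(ratio "$journalctl_grep" "$rg_time")
# Journal directory (4.0G, taken as 4096M) against the 605M text dump.
size_ratio=$(ratio 4096 605)

echo "journalctl --grep vs grep: ${vs_grep}x slower"
echo "journalctl --grep vs rg:   ${vs_rg}x slower"
echo "journal size vs text dump: ${size_ratio}x larger"
```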


  > Let me know if I am still using it wrong.
You're not using it wrong, but you are measuring wrong.

Check out the filetype of the journal files

  $ file -bi /var/log/journal/1234blahblah/system@foobardedoopdedo.journal
  application/x-linux-journal; charset=binary
Your measurement procedure is wrong because the `journalctl` command is doing something different. It isn't just reading a plain file, it is reading a binary file. On the other hand, `grep` and `rg` are reading straight text.

  > it seems difficult to believe that journalctl --grep can be faster.
Why? It could be doing it in parallel. One thread starts reading at position 0 and reads till N, another starts at N+1 and reads to 2N, etc. That's a much faster read operation. But I'm guessing and have no idea if this is what is actually being done or not.

P.S.: I know. As I specified in my earlier comment, I get burned with build artifacts and project logs. Things that most people would have in their .gitignore files but you can sure expect to grep through when debugging.


What matters in this discussion is the outcome.

For the exact same logs:

journalctl takes 10s to search through 4GiB. And misses most of the matches.

(rip)grep takes (0.1)0.5s to search through 605MiB.

In other words, journalctl consumes much more space, and a lot more time, to return an order of magnitude fewer results to the same query.

What does journalctl offer that makes this resource, time and correctness tradeoff worth it?


Their measurement isn't wrong. It's demonstrating the exact point in question: that if the logs were just stored in plain text, then grepping them would be an order of magnitude faster (or multiple orders of magnitude in the case of ripgrep) than whatever `journalctl --grep` is doing.


Their measurements are wrong because the journalctl command is also performing a decompression operation.

THEY ARE NOT DOING THE SAME THING

  > that if the logs were just stored in plain text
So store them in plain text then

  cat /etc/systemd/journald.conf
  https://www.freedesktop.org/software/systemd/man/latest/journald.conf.html
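A hedged sketch of what that could look like as a journald drop-in. `Compress=` and `ForwardToSyslog=` are documented journald.conf options. As far as I know journald itself still writes its own binary format, so the plain-text files would come from a syslog daemon (rsyslog or similar, assumed to be installed) receiving the forwarded messages. The scratch directory and the file name plain-text.conf are placeholders.

```shell
#!/bin/sh
# Sketch only: writes the drop-in to a scratch directory by default.
# On a real host it would go in /etc/systemd/journald.conf.d/.
CONF_DIR="${CONF_DIR:-$(mktemp -d)/journald.conf.d}"
mkdir -p "$CONF_DIR"

cat > "$CONF_DIR/plain-text.conf" <<'EOF'
[Journal]
# Do not compress large journal objects.
Compress=no
# Hand every message to the local syslog daemon, which writes text files.
ForwardToSyslog=yes
EOF

cat "$CONF_DIR/plain-text.conf"
```

A change like this takes effect after `systemctl restart systemd-journald`.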


How it's doing the search is irrelevant. What's being measured here is the user experience. This isn't some kind of attempt to do an apples-to-apples grep comparison. This is about how long you have to wait for a search of your logs to complete.


> My results don't match your conclusion.

The results in your comment aren't measuring the same thing. There's no grep on the /tmp/all.log in the middle code block, which is the thing they're talking about comparing.


My second operation is covering that. The reason my results show better is because they are counting the decompression against journalctl. It is doing a decompression operation and reading while grep and rg are just reading.

Btw, you can choose not to store journald files as compressed.


Where exactly did you test the speed of "grep sshd /tmp/all.log"? The entire point of their argument is that's what's orders of magnitude faster than anything journalctl.


> The maintainer also has been pretty rude to me about this on HN.

This is AFAIK the only other interaction we've had: https://news.ycombinator.com/item?id=41051587

If there are other interactions we've had, feel free to link them. Then others can decide how rude I'm being instead of relying only on your characterization.

> but it's still crazy to think people won't be caught off guard by this behavior

Straw-manning is also crazy. :-) People have and will absolutely be caught off guard by the behavior. On the flip side, as I said 9 months ago, ripgrep's default behavior is easily one of the most cited positive features of ripgrep aside from its performance.

The other crazy thing here is... you don't have to use ripgrep! It is very specifically intended as a departure from traditional grep behavior. Because if you want traditional grep behavior, then you can just use grep. Hence why ripgrep's binary name is not `grep`, unlike the many implementations of POSIX grep.

> Its name is literally indicating it's a grep replacement.

I tried to correct this 9 months ago too. See also: https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md#pos...

For anyone else following along at home, if you want ripgrep to search the same files that GNU grep searches, then do `rg -uuu`. Or, if you don't want ripgrep to respect your gitignores but ignore hidden and binary files, then do `rg -u`.

It makes sense that folks might be caught off guard by ripgrep's default filtering. This is why I try to mitigate it by stating very clearly that it is going to ignore stuff by default in the first one or two sentences about ripgrep (README, man page, CHANGELOG, project description). I also try to mitigate it by making it very easy to disable this default behavior. These mitigations exist precisely because I know the default behavior can be surprising, in direct contradiction to "but it's still crazy to think people won't be caught off guard by this behavior."


Not gonna lie, that was a bit creepy. We're deep in a day old thread that you have no other comments in. Do you scrape HN looking for mentions of ripgrep?

Forgive me if I'm a bit surprised!

I still stand by the view that silent errors are significantly worse than loud ones

  | it's worse to not get files you're expecting vs get more files than you're expecting. In the latter case there's a pretty clear indication you need to filter while in the former there's no signal that anything is wrong. This is objectively a worse case.

  > The other crazy thing here is... you don't have to use ripgrep! 
If it wasn't clear, I don't ;)

I don't think grep ignoring .gitignore files is "a bug". Like you said, defaults matter. Like I said, build artifacts are one of the most common things for me to grep.

Where we strongly disagree is that I believe aliases should be used to add functionality, where you believe that it should be used to remove functionality. I don't want to start another fight (so not linking the last). We're never going to see eye-to-eye on this issue so there's no reason to rehash it.


> I don't think grep ignoring .gitignore files is "a bug".

I don't either? Like... wat. Lol.

> Where we strongly disagree is that I believe aliases should be used to add functionality, where you believe that it should be used to remove functionality.

Not universally, not at all! There's plenty of other stuff in ripgrep that you need to opt into that isn't enabled by default (like trimming long lines). There's also counter examples in GNU grep itself. For example, you have to opt out of GNU grep's default mode of replacing NUL bytes with newline terminators via the `-a/--text` flag (which is not part of POSIX).

Instead what I try to do is look at the pros and cons of specific behaviors on their own. I'm also willing to take risks. We already have lots of standard grep tools to choose from. ripgrep takes a different approach and tons of users appreciate that behavior.

> We're never going to see eye-to-eye on this issue so there's no reason to rehash it.

Oh I'm happy not to rehash it. But I will defend my name and seek to clarify claims about stuff I've built. So if you don't want to rehash it, then don't. I won't seek you specifically out.

> I don't want to start another fight (so not linking the last).

To be clear, I would link it if I knew what you were referring to. I linked our other interaction by doing a web search for `site:news.ycombinator.com "burntsushi" "godelski"`.

> If it wasn't clear, I don't ;)

OK, so you don't use ripgrep. But you're complaining about it on a public forum. Calling me rude. Calling me creepy. And then whinging about not wanting to rehash things. I mean c'mon buddy. Totally cool to complain even if you don't use it, but don't get all shocked pikachu when I chime in to clarify things you've said.


  > Calling me creepy
I didn't call you creepy.

I said it was creepy that you appeared seemingly out of nowhere in a very unexpected place.

I'm only giving this distinction because this category of error has happened a few times.


That's a fair clarification. Then you can change what I said to, "calling what I'm doing creepy." I don't think much else changes. My points certainly don't change.


"I didn't call you creepy, I called your behavior creepy".

This is just rhetoric.


Where did you come from?

Yes, it is creepy when someone randomly appears just after you allude to them. It is also creepy when someone appears out of nowhere to make their same point. Neither of you were participating in this thread and appeared deep in a conversation. Yeah, that sure seems like unlikely circumstances to me and thus creepy.


This is a public forum read by millions of people


> It's just broken by design

I have the impression that a) the majority of systemd projects are broken by design and b) this is exactly what the LP people wanted.


> Look at the output of `systemctl cat docker.service`

No. Either the initsystem works in a straightforward way or it doesn't. As soon as we need special commands to just get an impression of what's happening with the service, this init system can - again - fuck off with all that unnecessary complexity.

Init must be simple.

Unfortunately it isn't anymore. Unfortunately, systemd will not fuck off, it's too late for that. Unfortunately we now have to deal with the consequences of letting LP & co do what they did.


  > As soon as we need special commands to just get an impression of what's happening with the service,
I agree this is bad design. I do not intend to defend `--grep`; I was just trying to help solve the issue. I 100% believe that this creates an unreasonable expectation of the user and that piping to grep should be expected.

Although, my results showed equal times piping to grep and dumping to file then grepping that file. IFF `--grep` is operating in parallel, then I think that's fine that it is faster and I'll take back my critique since it is doing additional functionality and isn't necessary. That situation would be "things work normally, but here's a flag for additional optimization."

Is the slowdown the file access? I do notice that it gets "choppy" if I just dump `journalctl --no-pager` but I was doing it over ssh so idk what the bottleneck was. IO is often a pain point (it pains me how often people untar with verbose...).


> things work normally

With text log files.

> but here's a flag for additional optimization

Which wouldn't be even needed in the first place if that very tool that wants this flag just simply did not exist.


> you need to use the grep flag instead of piping into grep

I don't. It's the journalctl that does. And it can absolutely fuck off with everything and all of it.

Log files must be in the form of text files. This worked for decades and there is no foreseeable future where this stops working or ceases to be a solution for OS log collection.



