Nginx 1.7.1 adds syslog support (nginx.org)
119 points by kolev on May 28, 2014 | 43 comments


Why is logging to syslog as opposed to a separate file an important use case? Genuinely curious.


It's useful if you want to log to a centralized logging server. That puts all your logs in one place and also keeps them safe if someone breaks into your server.


It also lets you correlate them with backend services, web application firewalls, etc., to see if there are any current attacks or threats.


That's a pretty weak argument considering that syslog is entirely UDP and is bound to drop log data, sometimes en masse, most likely even silently. Not a good idea.

Why not use something like multilog or svlogd and wire up a tiny processor for it to kick logging data over someplace using something like rsync?

To boot, syslog is annoying to tune, depending on your particular implementation. rsyslog has a default buffer limit of 2k, whereas other syslog implementations (IIRC, syslog-ng and Solaris syslog at the very least) have default buffer limits of 1k, and this might not be obvious until you're running up against that and make the shocking discovery that you're losing data.

On an nginx server that services 2TB/mo worth of transit (which is distinctly possible since I've got infrastructure in production that does this), there's a good chance that you'll be stretching some of these limits a bit.


> That's a pretty weak argument considering that syslog is entirely UDP and is bound to drop log data, sometimes en masse, most likely even silently. Not a good idea.

rsyslog and syslog-ng have support for TCP

> Why not use something like multilog or svlogd

Additional point of failure.

> and wire up a tiny processor for it to kick logging data over someplace using something like rsync?

Additional point of failure (processor); additional point of failure (rsync/ssh); non-realtime log replication (which is bad for breaking/progressive system failure/etc).

> To boot, syslog is annoying to tune, depending on your particular implementation.

All of the examples you list are easier to learn about, tweak, and monitor than the suggestions you've proposed, however.

> On an nginx server that services 2TB/mo worth of transit (which is distinctly possible since I've got infrastructure in production that does this), there's a good chance that you'll be stretching some of these limits a bit.

If you're dealing with 2 TB/mo in transit, you're probably capable enough to understand the risks with centralized log management and mitigate/monitor them ahead of time.


Just FYI, rsyslog (available in RHEL/CentOS 6.x) has TCP support.


In addition to what danudey says, modern sysloggers also support Unix domain sockets, which are reliable. Typically this is /dev/log; on Linux, I believe the GNU C Library's syslog() uses this by default.

On Linux, sending UDP to localhost is very reliable and fast, essentially going through kernel buffers with very little overhead. You will only see dropped data if the system is extremely overloaded. I did some testing a few years back and was not able to induce packet loss on localhost.

The usual way to set up centralized logging with syslog is to have each node run a local syslog daemon (e.g., rsyslog), which then buffers the data and streams it to a central syslog daemon using a more reliable protocol such as RELP [1] over TCP.

[1] http://www.rsyslog.com/doc/omrelp.html
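A minimal sketch of the node side, assuming rsyslog v7+ with the omrelp module installed (hostname, port, and queue settings here are illustrative):

    # /etc/rsyslog.conf on each node
    module(load="omrelp")

    # forward everything; the disk-assisted queue buffers log data
    # locally if the central server is unreachable
    *.* action(type="omrelp" target="logs.example.com" port="2514"
               queue.type="LinkedList" queue.filename="relp_fwd"
               queue.saveOnShutdown="on")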


While I agree with you on the whole, one minor point about UDP/datagrams: they are not reliable even on localhost under some circumstances. The point of datagrams is that they are allowed to be lost without a trace if the consumer (syslog) is not consuming fast enough. For example, if process A starts spewing 10,000 log records (UDP or datagram UNIX socket packets) a second at syslog, and syslog can only handle 5,000, then the other 5,000 records will be lost. Any other process logging at the same time will also lose records, since none are guaranteed to be processed. The rate of loss is controlled by how large a datagram buffer the consumer's kernel has. Moreover, the processing will not be uniform: the buffer is FIFO, so older records will be processed while newer ones will be dropped.
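(On Linux you can inspect and raise that buffer with sysctl; values here are illustrative:)

    # inspect the default/max socket receive buffer sizes, in bytes
    sysctl net.core.rmem_default net.core.rmem_max

    # raise the ceiling if the consumer needs more headroom
    sysctl -w net.core.rmem_max=8388608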

On the other hand if you use stream sockets, the producer will either block or be told that the consumer is not ready to read any more data (beauty of TCP). In either case, TCP produces enough overhead compared to UDP to slow down the actual useful part of your application, which is often not desirable.

Neither one of these is a good solution, as either your consumer or your producer needs to keep its own very large buffers to accommodate spikes in traffic. Ideally, you do this anyway to ensure that you hold onto all the packets you received.

Having said that, I don't know exactly what rsyslog does so I cannot say if this would actually be a problem for it.


While it's certainly true that UDP is unreliable even on localhost, Unix datagram sockets aren't: they have flow control.


I have never seen a reference to this. Can you link a source?


AF_UNIX with SOCK_STREAM will behave like TCP, in that a "full" stream will block. See http://stackoverflow.com/questions/1478975/unix-domain-socke..., for example.


The use case is people with:

- A large enough deployment to want centralized logging

but are:

- Cheap enough not to buy nginx (for good reason or not)

and

- Too lazy to maintain a patchset against distro packages

and

- Too bad at Linux administration to use the file pipe trick to log to syslog anyway

So, yeah, syslog is nice, but this change does have quite a narrow use case. What it did have was vocal complainers who knew the right places to complain online and be noticed.


This is kind of a cheap shot at people who just wanted built-in syslog support. There are plenty of reasons someone might not be able to patch or upgrade a binary, and using a fifo is a pretty bad kludge considering the whole thing can hang if they're started/stopped in the wrong order (meaning your startup scripts now have to be rewritten). Building in syslog support means just getting the damn thing working without special hacks and kludges.
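For reference, the fifo kludge in question looks roughly like this (path, tag, and facility are illustrative):

    # the reader has to be attached before nginx opens the pipe,
    # or nginx blocks on open() -- hence the ordering problem
    mkfifo /var/log/nginx/access.pipe
    logger -t nginx -p local7.info < /var/log/nginx/access.pipe &

    # and in nginx.conf:
    #   access_log /var/log/nginx/access.pipe combined;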

But to answer OP's question, syslog is better than just appending to a file because syslog does a lot of things for you, like filtering your logs in real time and splitting them into new files, logging remotely to industry-standard aggregation devices, access control, (somewhat) standardized formatting, log rolling, etc.
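With 1.7.1 that's now just a directive or two (server address, facility, and tag here are illustrative):

    error_log  syslog:server=192.168.1.100 info;
    access_log syslog:server=192.168.1.100:514,facility=local7,tag=nginx,severity=info combined;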


My job as a sysadmin is to keep servers running and log their status and errors.

When a large team has many jobs:

- Yes, I want central logging

- I probably did buy nginx (you're telling me this was already available, but closed source? How rude.)

- No, managing distro patches and writing a web server are not my jobs

- No, I'd prefer to avoid hackish file redirects in place of real features, but I'll do what I have to.


The simplest answer I could think of is getting remote syslogging for free. This also makes it easier to process nginx logs in realtime without named pipe trickery. I'm pretty excited to clean up some log ingestion code.


I think one reason people wanted it was to reduce/eliminate disk IO.


Yeah, and then you can also optimise the logging server's secondary storage setup for mostly appending writes. Or so I hear.


syslog -> logstash -> elasticsearch -> kibana
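A minimal config for that pipeline might look something like this (port and host are illustrative; roughly logstash 1.4-era syntax):

    input {
      syslog { port => 5514 }
    }
    output {
      # kibana then queries elasticsearch for the dashboards
      elasticsearch { host => "localhost" }
    }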


Can I reach you privately? I need help with my ELK setup :/


You can usually get excellent support on the Logstash IRC channel. #logstash on irc.freenode.org


Thanks, I'll try that.


In short: remote syslog-ing.


Especially considering that it's been available as a community patch since the 0.8 days.

http://wiki.nginx.org/3rdPartyModules#Third_party_patches


True, it's been available, but having it in the core means it will get into Linux distros, i.e., people don't have to compile it themselves.


Yes, I'm saying it's really about time they added it to the core, especially because it's been available as a patch for a long time.

I get that they need to differentiate their paid version; I just wish it were in areas where the community hadn't already provided something that works.


Distros have been patching things like this for a long time.


True, but I can't recall any having the syslog module out of the box.


What does Nginx do if the log message is longer than the 1KB limit imposed by the syslog protocol?


That's an implementation-defined limit in practice. rsyslog, for example, defaults to 2K (apparently to be compatible with upcoming RFCs), but it's configurable to higher values. If you want to be compatible with other software, though, it may need to stay at 1K.
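For example, raising it in rsyslog is one directive, though it has to appear before any input modules are loaded (value illustrative):

    # /etc/rsyslog.conf
    $MaxMessageSize 8k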


How much buffering does it do, if any? If it's not sufficient, it can result in either reduced performance or dropped log messages.


Why would you buffer UDP packets?



They had to add a checkbox to http://nginx.com/products/feature-matrix/


Any systemd experts know if there would be any benefit to supporting journald directly, rather than through the syslog API, on Linux hosts?


Have a look at this[0] blog post. Also, I'm just guessing, but once kdbus is in the kernel, direct calls to journald will (or could potentially) be more efficient.

[0] http://0pointer.de/blog/projects/journal-submit.html


Finally... it worked well with logstash, but a native implementation is always better. Is that TCP only? I don't see any UDP settings.


It's UDP only. They also hard-code the port instead of looking it up with getservbyname, which is a little weird. If you want to use TCP, send syslog messages to localhost, and have a local rsyslog/RELP forwarder send messages remotely.
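A sketch of that arrangement (addresses illustrative):

    # nginx.conf: UDP to the local daemon
    access_log syslog:server=127.0.0.1 combined;

    # /etc/rsyslog.conf: relay everything upstream over TCP
    # (a single @ would mean UDP; @@ means plain TCP)
    *.* @@central.example.com:514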


getservbyname() under glibc uses dlopen(), effectively defeating static compilation. While this is a non-issue on Ubuntu, the nginx devs are aware of the number of people building against uClibc or musl to produce static binaries for embedded use. I've seen nginx running on bare metal -- just it and libc, no kernel underneath. Kudos to them for making this easier than they have to.


This is very useful when you need to consolidate error_log from hundreds of Nginx servers.


Did someone manage to set up remote logging using the buffer option? I get "parameter "buffer=32k" is not supported by syslog", but I'm pretty sure I'm doing something wrong.


It was about time, although I won't make any use of this feature.

PS: I always found metalog so much better than syslog.


I hope they add cache purge to the community version as well.


Hooray! Whiny complaining on the Internet works sometimes!




