Nuclio: Serverless for Real-Time and Data-Driven Applications (nuclio.io)
71 points by dominotw on Jan 21, 2018 | hide | past | favorite | 27 comments


Slightly OT, but I find it interesting how the definition of real-time depends on the industry:

* Embedded systems: nanoseconds to milliseconds (depends on the system, but at the very least needs to be predictable within some bounds);

* Application software (including web): tens to hundreds of milliseconds;

* TV broadcasting: seconds to minutes.

I'd love to hear more examples from different industries!


I always liked the way Bryan Cantrill defined it in one of his wonderful talks: basically, the difference between 'hard' real-time systems (e.g. missile guidance) and 'soft' real-time systems (e.g. WhatsApp) boils down to the consequences of being late.


Traditionally, "realtime" means you can tell, in terms of actual wall clock time, how long different operations will take. Tends to show up in embedded systems because they tend to control devices in which fixed deadlines matter.

Somehow it morphed into a shorthand for "fast".


Yup, realtime was always about predictability (e.g. 16.666 ms per frame for 60 fps). The time unit largely doesn't matter.

It generally precludes GC (yes, there are edge cases where you can shoehorn GC in; I don't care about them, since malloc is usually out of bounds as well).
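To make the 60 fps figure above concrete, here's a quick back-of-the-envelope sketch (the work and pause numbers are hypothetical, just for illustration):

```python
# Worked example: deadline budget for a 60 fps real-time loop.
FPS = 60
frame_budget_ms = 1000.0 / FPS  # ~16.666 ms per frame

# Hard real-time means the *worst case* must fit the budget. An
# unpredictable pause (GC stop-the-world, a slow malloc path) eats
# into it; these two numbers are made up for the example.
worst_case_work_ms = 12.0
worst_case_pause_ms = 8.0

meets_deadline = worst_case_work_ms + worst_case_pause_ms <= frame_budget_ms
# → False: a 20 ms worst case misses a 16.67 ms deadline, even
# though the average case (work only, 12 ms) fits comfortably.
```

This is why predictability, not average speed, is the defining property: the loop above is "fast enough" on average and still fails the deadline.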


If we want to talk about the minutiae, none of these are "real time". Heck, even the universe puts a limit on how quickly events can happen: we can look at the light from galaxies so far away that we see the galaxy billions of years into its past. So nothing can be instantaneous.

However, I would say real time generally means acceptable latency or acceptable time reach to consistency. Although, I generally think of real time more along the lines of embedded systems.

Moreover, the word "serverless" really irks me. It's not server-less. You just don't have to manage the server, or really interact with the server software that much. It's still a server.


server-lite?


> nuclio is an open source serverless platform which is faster than bare-metal code

WTF? Can somebody explain to me what they are trying to convey here?


Yeah, that's a terrible term to describe what's going on; "bare metal" usually means running directly on an OS that's installed directly on the hardware (not going through a hypervisor).


nuclio uses a real-time architecture (no context switches, zero copy, no ser/des, parallelism, ...). A Go function serves 400K events/sec/proc and Python does 70K events/sec/proc; this is faster than running your own function with standard HTTP serving on a bare-metal server.

For comparison, most serverless frameworks do 1-5K events/sec, with up to 100x higher latency.
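As a sanity check on those throughput figures, the implied per-event time budget on one processor is simple arithmetic (this says nothing about nuclio's internals, it's just unit conversion):

```python
# Implied per-event budget at the quoted single-proc event rates.
def budget_us(events_per_sec):
    """Microseconds available per event on one processor."""
    return 1_000_000 / events_per_sec

go_budget = budget_us(400_000)  # 2.5 us/event at 400K events/sec
py_budget = budget_us(70_000)   # ~14.3 us/event at 70K events/sec

# At budgets this small, a single context switch or an extra buffer
# copy (each typically costing microseconds) would dominate the whole
# budget, which is why zero-copy / no-ser-des designs matter here.
```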

Disclosure: I work on nuclio.


My point is that - as described - it is a dumb comparison.

> open source serverless platform which is faster than bare-metal code

doesn't make sense. It eventually has to run on top of bare metal (via a hypervisor or directly), so claiming that something that eventually runs on top of bare metal is "faster than bare metal" doesn't make sense at all.


Like AWS lambda but in a cluster you own.


So you take pretty much the single benefit of serverless, i.e. low cost of ownership at low volume, then ditch that and have to run your own cluster? I guess it makes sense if it's a zero-redevelopment migration path from Lambda.


Python and "real time"? TensorFlow as the example in the image?

Needs more buzzwords, for sure. Developers need to spend more time understanding "real-time" as well.


Clearly this needs blockchain in the title; it's not dense enough with meaningless buzzwords otherwise.


I had to look up "Serverless", especially after it says "here is how to start our docker app".

I might be old by now, but serverless feels like renting a sublet place by the hour. I guess it was always going to happen, after we went from one server, to many instances, to an unmanageable mess: "aaeaeh, let it be someone else's mess! Take my money!"


"Let it be someone else's mess, take my money," is the basis of most businesses. That's why I don't grow my own food, and I often don't make it.


I think I'm following along with this and it looks pretty cool.

Is there management for handling things with a slow startup time?

I've got a few things that have startup times in the region of 30s+ but then each thing to process takes very little time.

I found this, which I think is relevant: https://github.com/nuclio/nuclio/blob/master/hack/examples/p...

Are there pitfalls to be wary of / known bad edge cases?
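One common pattern for the slow-startup case is to pay the expensive initialization once at process load rather than per invocation. This is a generic FaaS sketch, not nuclio's actual API; the handler shape and names here are hypothetical:

```python
import time

def _expensive_init():
    """Stand-in for the ~30 s startup work (model loading, cache
    warming, etc.); shortened here so the example runs quickly."""
    time.sleep(0.01)
    return {"model": "loaded"}

# Runs once per process (i.e. per container), not once per event.
_STATE = _expensive_init()

def handler(event):
    # Fast path: every warm invocation reuses the preloaded state.
    return {"status": 200, "model": _STATE["model"], "event": event}
```

The trade-off: a cold start (new process) still pays the full init, so any framework that scales to zero will hit the 30 s on the first request after idling.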


What is the median latency in milliseconds?


This is what I'm curious about; serverless automatically makes me think of cold startup time.


Some poking around suggests that currently they scale to a single copy, so for sporadic workloads latency will be good (relative to normal latency for warm code).

Based on the roadmap in the repo[0] (under "In Design").

Disclosure: I also work on a FaaS project, Project Riff.

[0] https://github.com/nuclio/nuclio/blob/master/ROADMAP.md


nuclio auto-scales today (using the k8s HPA); the roadmap item is scaling down to zero (today it holds one instance when idle, and will soon be able to shut that down).

nuclio latency is as low as 30 microseconds in Python and Go, vs. tens of milliseconds in Lambda.


Warm vs cold start is not a very meaningful comparison, though. Lambda typically scales from zero. Any warm system is going to greatly outperform a cold one. Java on Lambda can vary from nanoseconds to seconds, depending on whether you hit a warm copy or a cold start.

Disclosure: I work on a different FaaS, Project Riff.
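The warm/cold asymmetry is easy to demonstrate even in-process. A toy sketch (not a Lambda or Riff benchmark; the sleep is a stand-in for container spin-up plus init):

```python
import time

_instances = {}

def invoke(fn_name):
    """First call per function simulates a cold start; later calls
    hit the already-warm instance."""
    if fn_name not in _instances:
        time.sleep(0.05)  # simulated cold start: spin-up + init
        _instances[fn_name] = object()
    return _instances[fn_name]

def timed(fn, *args):
    t0 = time.perf_counter()
    fn(*args)
    return time.perf_counter() - t0

cold = timed(invoke, "fn-a")  # pays the simulated cold start
warm = timed(invoke, "fn-a")  # reuses the warm instance

# cold is orders of magnitude larger than warm, which is why quoting
# only warm latency flatters a system that scales from zero.
```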


I benchmarked all the frameworks, cold and warm. Warm is also very slow, no concurrency, etc. How many events/sec can you do with a warm Riff instance, e.g. in Go and Python? What's the latency?


Let me put it another way: are you saying you have no cold-start penalty?

Edit: and to answer your question -- no, I have not personally benchmarked it. Can you point me to your suite? I've been thinking it would be useful to work on a benchmark that different implementations can share to catch regressions (and earn bragging rights, of course).


Great idea. We suggested creating an open benchmark at CNCF, but a bunch of members opposed it; we could come up with a joint proposal. You can use our test tool https://github.com/v3io/http_blaster: it can generate a workload of >600K HTTP req/sec with one client and is programmable.


Do you have a link to the benchmarking discussion? I'd be interested to see what the objections were.

I raised the topic today in a team meeting and I'll probably get to tinker over the next few weeks. I'm interested in exploring a few different axes of stress. RPS is one, but also things like "how does it behave with ten thousand functions installed?", "how does it behave with traffic yo-yoing?" etc.
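For the "traffic yo-yoing" axis, one simple approach is to drive whatever load generator you use with a periodic RPS schedule. A sketch (the load tool itself is out of scope; parameters are illustrative):

```python
import math

def yoyo_rps(t, low=10, high=1000, period=60.0):
    """Target requests/sec at time t (seconds): swings between low
    and high once per period, to exercise the autoscaler's scale-up
    and scale-down behaviour in both directions."""
    phase = (1 + math.sin(2 * math.pi * t / period)) / 2  # 0..1
    return low + (high - low) * phase

# Sample the schedule every 15 s over two periods.
schedule = [round(yoyo_rps(t)) for t in range(0, 120, 15)]
# → [505, 1000, 505, 10, 505, 1000, 505, 10]
```

Interesting failure modes usually show up on the downswing (instances torn down just before the next peak) rather than at steady max RPS.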


Well, I think this is frickin' awesome. Very well done!




