More

ryah · on Jan 4, 2013

A recent article in nytimes discussed research that estimates the cost of some government regulations on businesses. It's done by first noticing that company size follows a power-law distribution and then measuring deviations. I wonder if you were able to get enough data on investment returns if you would find any interesting deviations?

http://economix.blogs.nytimes.com/2013/01/02/why-49-is-a-mag...

ryah · on Sept 7, 2012

Kudos should go to Bert Belder and Ben Noordhuis of Cloud9 who built much of libuv and Igor Zinkovsky of Microsoft has contributed a lot of important work as well. As well many other contributors.

ryah · on July 28, 2012

Trouble? Node.js has linear speedup over multiple cores for web servers. See http://nodejs.org/docs/v0.8.4/api/cluster.html for more info.

KirinDave · on July 29, 2012

You know, ryah; I like your work ethic, I like your enthusiasm, I think you're a cool guy and it's great your project has so much traction.

But you say things like this and it worries me. Because a lot of people look up to you and either you said this because you feel defensive about your project or you said it because you genuinely don't understand the cases we're talking about here. And this is a problem because a lot of people look up to you and what you say, so when you say something as baffling as this response, you run the risk of leading a lot of people astray.

I was sort of at a loss for how to reply in the time I have to spare for Hacker News, but thankfully Aphyr did for me. But let me clarify what I said a bit, since I was a bit terse: The problem Node.js has is a social one. A lot of node hackers take the stance, "I thought Node.js solved the problems threads presented," (https://groups.google.com/d/msg/nodejs/eVBOYiI_O_A/kv6iiDyy9...) like there is a single axis of superiority and Node.js sits above the methods that came before. But the reality is that Node.js is really just another possible implementation in the Ruby/Python/Pike/-Perl-¹ space, and shares most of the same characteristics as those languages.

So you have a lot of people who are aces at front-end programming in the browser thinking they have a uniformly superior tool for tackling high-performance server problems, but really they don't have that; they just have a tool with familiar syntax. And so they fearlessly (and perhaps admirably) charge into the breech of platform programming without realizing that the way people scale big projects involves a lot of tools, a lot of thought about failure modes, and a lot of well-established algorithms with very specific tradeoffs.

And so this is Node.js's problem. It's just another gun in the gunfight, but its community thinks its a cannon. In a world where high-performance parallel VMs like Java or Erlang have very powerful and helpful languages like Clojure or Scala on top, we're in a funny situation. It becomes increasingly difficult to justify all these GIL-ridden implementations of languages.

Which is not to say these implementations don't have their place (and Node.js is hardly the first javascript implementation outside of a browser), but increasingly they are losing their place in the pieces of your code expected to shuffle data around efficiently in the backend of modern distributed applications.

¹ Correction, perl 6 doesn't plan share this behavior. What I read suggests it's not done yet.

ryah · on July 30, 2012

Node is popular because it allows normal people to do high concurrency servers. It's not the fastest or leanest or even very well put together - but it makes good trade offs in terms of cognitive overhead, simplicity of implementation, and performance.

I have a lot of problems with Node myself - but the single event loop per process is not one of them. I think that is a good programming model for app developers. I love Go so much (SO MUCH), but I cannot get past the fact that goroutines share memory or that it's statically typed. I love Erlang but I cannot get the past the syntax. I do not like the JVM because it takes too long to startup and has a bad history of XML files and IDE integration - which give me a bad vibe. Maybe you don't care about Erlang's syntax or static typing but this is probably because you're looking at it from the perspective of an engineer trying to find a good way to implement your website today. This is the source of our misunderstanding - I am not an app programmer arguing what the best platform to use for my website--I'm a systems person attempting to make programming better. Syntax and overall vibe are important to me. I want programming computers to be like coloring with crayons and playing with duplo blocks. If my job was keeping Twitter up, of course I'd using a robust technology like the JVM.

Node's problem is that some of its users want to use it for everything? So what? I have no interest in educating people to be well-rounded pragmatic server engineers, that's Tim O'Reilly's job (or maybe it's your job?). I just want to make computers suck less. Node has a large number of newbie programmers. I'm proud of that; I want to make things that lots of people use.

The future of server programming does not have parallel access to shared memory. I am not concerned about serialization overhead for message passing between threads because I do not think it's the bottleneck for real programs.

rvirding · on July 30, 2012

"I love Erlang but I cannot get the past the syntax."

I cannot understand this hangup about syntax. Syntax is the easiest thing to learn with a new language, you just look it up. It's the semantics and usage where the real problems are.

Erlang doesn't have static typing, it is dynamically typed. It has always been dynamically typed.

And however you look at it doing highly concurrent systems with processes is much easier. And who says that processes imply "parallel access to shared memory"? Quite the opposite actually.

zem · on July 30, 2012

syntax is the user interface of the language. it is one of the parts you have to deal with on a constant basis, both writing and reading code. i can well see how an unpleasant syntax can be a constant, grating annoyance when dealing with a language. (i happen to really like erlang's syntax, but it definitely has very different aesthetics from javascript's)

rvirding · on July 31, 2012

I think it would be dangerous if Erlang's syntax were to resemble that of another language like Java or C. Especially for beginners. That is because if things look the same you expect them to behave the same, and Erlang's semantics is very different from those languages whose syntax people think it should be like.

There is no getting around the difference in semantics. For this I think that having a different syntax is actually better. Also there are things in Erlang which are hard to fit syntactically into the syntax of OO languages, for example pattern-matching.

KirinDave · on July 30, 2012

Honestly? Erlang's syntax is not that bad. It's not great in that it has prolog legacy with uncanny valleys to C-legacy left and right, but it's effective.

Really the only really tricky thing for a newbie is strings.

zem · on July 30, 2012

these things are very much a matter of taste. what does "effective" mean with respect to syntax? for instance, lispers will argue that their syntax is effective because of the things it lets you do, but that's orthogonal to whether it's actually a pleasant syntax to program in, which comes down to personal taste[1]. likewise, i love mlish syntax, but opa, a language steeped in ml semantics, nonetheless switched to something more javascripty for their main syntax because it proved more popular[2]. and read through the wrangling over ruby versus python sometime - the two languages are very similar under the hood, but their respective syntaxes are one of the things proponents of each language complain about when they have to use the other.

[1] as larry wall famously said, "lisp has all the visual appeal of oatmeal with fingernail clippings mixed in."

[2] http://blog.opalang.org/2012/02/opa-090-new-syntax.html

silentbicycle · on July 31, 2012

It takes a lot of chutzpah for Larry Wall to say anything critical about Lisp syntax. Larry Wall. Come on.

I'm not the author of the comment above, but I think Erlang's syntax is effective in that it strongly emphasizes computation by pattern matching. If you write very imperative code in it (as people tend to, coming from Ruby or what have you), yes, it will look gnarly. Good Erlang code looks qualitatively different. There are pretty good examples of hairy, imperative Erlang code being untangled in this blog post: http://gar1t.com/blog/2012/06/10/solving-embarrassingly-obvi...

The . , ; thing is a bit of a hack, admittedly -- I suspect that comes from using Prolog's read function to do parsing for the original versions of Erlang (which was a Prolog DSL), and reading every clause of a function definition at the same time. Prolog ends every top-level clause with a period, not "; ; ; ; .". (Not sure, but a strong hunch, supported by Erlang's history.) I got used to it pretty quickly, though.

rvirding · on July 31, 2012

As in Prolog ',' and ';' are separators: ',' behaves like an and, first do then and then do this (as in Prolog); while ';' is an or, do this clause or do this clause (again as in Prolog). '.' ends something, in this case a function definition. Erlang's functions clauses are not the same as Prolog's clauses which explains the difference.

It is very simple really, think of sentences in English and it all becomes trivially simple.How many sentences end in a ';'?. And you almost never need to explicitly specify blocks.

Here http://ferd.ca/on-erlang-s-syntax.html are some alternate ways of looking at it.

silentbicycle · on July 31, 2012

Indeed. It makes sense to me. Are my intuitions about the origins of ;s separating top-level clauses accurate?

(Hello, Robert! :) )

rvirding · on Aug 2, 2012

Sort of. The syntax evolved at the same time we were moving from Prolog onto our own implementation, which forced us to write our own parser and not rely on the original Prolog one. The biggest syntax change came around 1991, since then it has been mainly smaller additions and adjustments.

rdm · on July 31, 2012

To be fair, that was Larry Wall reacting to criticism in 1994 https://groups.google.com/forum/?fromgroups#!msg/comp.lang.l...

Anyways, my experience is that all languages will get people criticizing them. And, in my experience, those kinds of criticisms should almost always be categorized as "does not want to talk about language FOO" and a proper response is probably something like "if you don't want to give that subject the respect it deserves, let's change the subject to something you find interesting".

sausagefeet · on Aug 1, 2012

Erlang syntax maybe jarring but one benefit of it is there is actually very little syntax relative to languages like Java, C++, Python. There isn't that much to hold in your head. I can switch between Erlang and another language pretty effortlessly because the context switch is so small.

KirinDave · on Aug 13, 2012

Hi. This mentality you have? I disagree.

Syntax benefits are not all about subjectivity. Anyone claiming this has effectively lost the plot and decided to go turtle in the discussion.

ches · on July 30, 2012

FWIW I believe "Maybe you don't care about Erlang's syntax or static typing" was referring to his earlier mention of Go as statically typed. A comma would have improved the clarity :-)

aeden · on July 30, 2012

"I love Erlang but I cannot get the past the syntax."

I also thought this. For some reason though the more I use it the more I am coming around to its syntax. There are lots of gotchas when compared with other popular languages, but still its syntax has grown on me recently.

At the opposite end of the syntactic spectrum is Lisp. The more I use Clojure the more I am loving it as well.

"The future of server programming does not have parallel access to shared memory."

I agree. The future of server programming is also not spawning multiple child processes. The question I have is how far off is that future? I know that most of my web applications today simply run in multiple spawned OS processes.

aphyr · on July 30, 2012

I agree. The future of server programming is also not spawning multiple child processes. The question I have is how far off is that future? I know that most of my web applications today simply run in multiple spawned OS processes.

Barring a massive shift in hardware architectures, shared access by cooperating threads is your only option for high-performance shared state. Look at core counts vs core flops for the last five years. Look at Intel's push for NUMA architectures. This is a long-scale trend forced by fundamental physical constraints with present architectures, and I don't see it changing any time soon.

Anyone telling you shared state is irrelevant is just pushing the problem onto someone else: e.g., a database.

canweriotnow · on July 30, 2012

"The future of server programming does not have parallel access to shared memory."

What about STM in Clojure? It's technically parallel access to shared memory, but the transactional nature obviates the need for mutexes and all the crap that makes shared memory a pain in the ass.

gar1t · on July 30, 2012

Erlang's syntax is one of those things that's just too hard to get past. Along with the other obstacles, it makes Erlang nearly impossible to attain!

http://www.youtube.com/watch?v=zY-IueSMAPc

rvirding · on July 30, 2012

Beware of "Kenneth the Swede"! Just stay on the straight and narrow and you will reach the Celestial City.

slantedview · on July 30, 2012

Wait, so, you wrote off the JVM because of XML and IDEs? Really?

ticking · on July 31, 2012

There is a ton of Languages for erlangs BEAM,

My favorite: Joxa a Clojure inspired (really just inspired ;)) lisp http://joxa.org/

Then there are Elixir (mentioned already) and Reia, Both seem to be inspired by Ruby. http://reia-lang.org/ http://elixir-lang.org/

Erlang has a far superior computational model an implementation than Node, it can handle way more requests faster and is newcomer friendlier as any web request is simply a message received.

KirinDave · on July 31, 2012

I do not see how you can simultaneously care about making programming better but not care about how people use what you're making to solve that problem.

Programming is a verb describing what people do with programming systems, and we're nowhere near the point where it can be done automatically yet. You cannot remove people from the equation and claim to be interacting with it.

angelbob · on Aug 3, 2012

I think he's saying he wants to make programming better for newbies.

It really kinda sucks for them right now.

Serious hard-core engineers that need serious tools are actually pretty well served by current tools. No, no, they're not perfect. But we're way better off than somebody who's just beginning in terms of what tools are aimed at us.

tjholowaychuk · on Aug 3, 2012

Agreed, and as computing grows messaging will just get cheaper and cheaper. zeromq is a great example of that, 5 million messages per second over TCP on a macbook air is not half bad.

Personally I like static typing, especially if somewhat optional, it's something we effectively do through documentation anyway (via JSdoc or similar), but makes it concrete.

I dont think light-weight threads sharing memory is so bad, symmetric coroutines are more or less the same as an event loop IMO, the thought put into working with them is more or less identical, just without callback hell and odd error-handling, but I suppose going all-out with message passing could be fine. I think that's still a bit of an implementation detail unless you get rid of the concept of a process all together and start just having a sea of routines that talk to each other.

dano · on Aug 4, 2012

I think the reaction from many long-term programmers who have switched technologies, careers, frameworks, languages is to the "...use it for everything?" mantra. I applaud your goals and the success at getting more people to program and build things. That's what we really do need. Let's hope this audience is reading HN with an open mind and figure out, as many here have, that the problem to be solved is the most important decision, not the tool.

Thank you for the nice write up.

tholschuh · on July 30, 2012

If you like Erlang but cannot get past its syntax you might want to give Elixir a try.

http://elixir-lang.org/

preist · on July 31, 2012

Syntax looks very Ruby like.

tholschuh · on July 31, 2012

Is that good or bad? I personally don't mind Erlang syntax at all or the syntax overhead to you have to write a gen_server that does nothing. Elixir saves you some of that overhead. It's kind of like CoffeeScript for Erlang. It's fully compatible since it compiles to Erlang AST which compiles to BEAM. You can do "everything" with Elixir you can do with Erlang. Of course, you have to understand how to work with Erlang/OTP to actually benefit.

rektide · on July 30, 2012

I admire Vert.x for trying to bring deployability to the separate-non-shared-event-loop world (aka: where Node is): Vert.x has "verticles," which are instantiated multiple times but share no data. It's very similar to node's cluster execution, except Vert.x is a thorough answer to the deployment problem (going as so far as to bake in Hazelcast if you want to scale out from one multi-threaded process to multiple likely-cross-machine processes).

Yet Node itself can not and should not solve the deployment problem: node is a javascript runtime, and contrary to earlier claims I'd declare not opinionated, not one to make this decision for us. The scaling out story is indeed not easy: even tasks like validating session credentials need to be mastered by the team (persist creds into a data-store, or use group communication/pubsub: building for Node is a lot like building for multi-machine Java). The level of DIY-it-ness here are indeed colossal.

What I'd contrast against your view- and I agree with most of your premise, that node is extremely seductive and dangerous and many are apt to get in way way way over their head- is that the comforts you describe are what kill these other languages, what strange and prevent us from becoming better more understanding programmers. Ruby, python, php, less so perl, the webdev that goes on there happens by and large at extreme levels of abstraction: developers flock to the known explored center, the tools with the most, the places that seem safest.

The dangerous dangerous scenario presented by most good web development tools is that it is the tools that know how to run things. Contrary to the charge into the breech throw up ad-hoc platforms in production every day mentality (of node), these (ruby, php, python) platforms stagnate, they fall to the ruin as their tooling strives towards ever reaching greater heights: the tools accrue more and more responsibility, there are better carved out & expected ways to do things, and incidental complexity, the scope of what must be known, how far one has to travel, to get from writing a page to it getting shipped over the wire or executing, balloons.

If anything, Node's core lesson to the world has been about how much is not required. Connect, the only & extremely extremely low-lifed common denominator of Node web world, is the meager-est, tiniest smallest iota of a pluggable middleware system (if only Senchalabs had been courteous enough to be more up front about it being a complete and total rip off Commons-Chain & to not add a thing, I would not bloody loath it). That pattern? bool execute(Context context). Did you handle this request? No? Ok, next. You need to deploy a bunch of processes on a bunch of boxes? You an probably write up something perfectly adequate in a week. Don't have a week? Go find a module: certainly Substack has at least one for whatever your cause (here it's Fleet, https://github.com/substack/fleet).

Node modules are wonderful. They all have some varyingly long list of dependencies, usually the tree is 4-8 different things, but the total amount of code being executed from any given module is almost always short of a couple dozen KB: your engineering team can come in and understand anything in a day or three, and gut it and rebuild it in another day or two. Modules, unlike how development processes have shaped up in hte past decade, are wonderfully delightfully stand-alone: there are no frameworks, no crazy deployment systems, no bloody tooling one is writing to: it's just a couple of functions one can use. The surface area, what is shown, is tiny, is isolated, is understandable, there's no great deep mesh. This runs so contrary to the Drupal, to the Rails, to the Cakes or Faces of the world where one is not writing a language, they're at the eight degree of abstraction writing tools for a library that implements enhancements for a framework that is a piece of an ioc container that runs on a application server that runs on a web server that runs in a runtime that actually does something with the OS.

We need to get more developers willing to charge into the breech and break out a gun fight. This stuff is not that complicated,* and the tools we have are hiding that fact from us more often than not.

So, I admire and love approaches like Vert.x, that take the reactor pattern (what powers Node) and blow it up to the n-th degree, that solve deployment challenges, but at the same time I don't think there is a huge amount of magic there: most node developers have not advanced their runtimes to the level of parity that is called for yet, but this I do not see as a colossal problem. Node, shockingly, even when hideously under tooled, under supported, under op'ed, seems to stand up and not fall over in, in a vast amount of cases. Problems of the rich, good problems to have, when your node system is having worrisome performance problems: most projects will not scale this big, Node will just work, and hopefully you have enough actual genuine talent with enough big picture understanding wtc on your side to not be totally frozen out when your traffic goes up 10x in two days and no one anticipated it. Node is not a land for hand holding, and I don't think that's a bad thing: I think it'll help us listen better to our machines, to not follow our toolings lead into the breech, but to consider what it is we really actually are building for ourselves.

aphyr · on July 28, 2012

It's parallel in the same sense that any POSIX program is: Node pays a higher cost than real parallel VMs in serialization across IPC boundaries, not being able to take advantage of atomic CPU operations on shared data structures, etc. At least it did last time I looked. Maybe they're doing some shm-style magic/semaphore stuff now. Still going to pay the context switch cost.

ryah · on July 29, 2012

it's all serialization - but that's not a bottleneck for most web servers. i'd love to hear your context-switching free multicore solution.

this is the sanest and most pragmatic way server a web server from multiple threads

aphyr · on July 29, 2012

Threads and processes both require a context switch, but on posix systems the thread switch is considerably less expensive. Why? Mainly because the process switch involves changing the VM address space, which means a TLB shootdown: all that hard-earned cache has to be fetched from DRAM again. You also pay a higher cost in synchronization: every message shared between processes requires crossing the kernel boundary. So not only do you have a higher memory use for shared structures and higher CPU costs for serialization, but more cache churn and context switching.

it's all serialization - but that's not a bottleneck for most web servers.

I disagree, especially for a format like JSON. In fact, every web app server I've dug into spends a significant amount of time on parsing and unparsing responses. You certainly aren't going to be doing computationally expensive tasks in Node, so messaging performance is paramount.

i'd love to hear your context-switching free multicore solution.

I claimed no such thing: only that multiprocess IPC is more expensive. Modulo syscalls, I think your best bet is gonna be n-1 threads with processor affinities taking advantage of cas/memory fence capabilities on modern hardware.

this is the sanest and most pragmatic way server a web server from multiple threads

What is this I can't even.

aphyr · on July 29, 2012

Note--think I'm wrong about these types of process switches requiring a TLB shootdown. It think it's just cache invalidation.

aphyr · on July 29, 2012

Don't believe me? Try it:

Node.js: https://gist.github.com/3200829

Clojure: https://gist.github.com/3200862

Note that I picked the really small messages here--integers, to give node the best possible serialization advantage.

    $ time node cluster.js 
    Finished with 10000000

    real 3m30.652s
    user 3m17.180s
    sys	 1m16.113s

Note the high sys time: that's IPC. Node also uses only 75% of each core. Why?

    $ pidstat -w | grep node
    11:47:47 AM     25258     48.22      2.11  node
    11:47:47 AM     25260     48.34      1.99  node

96 context switches per second.

Compare that to a multithreaded Clojure program which uses a LinkedTransferQueue--which eats 97% of each core easily. Note that the times here include ~3 seconds of compilation and jvm startup.

    $ time lein2 run queue
    10000000
    "Elapsed time: 55696.274802 msecs"

    real 0m58.540s
    user 1m16.733s
    sys	 0m6.436s

Why is this version over 3 times faster? Partly because it requires only 4 context switches per second.

    $ pidstat -tw -p 26537
    Linux 3.2.0-3-amd64 (azimuth) 	07/29/2012 	_x86_64_	(2 CPU)
    
    11:52:03 AM      TGID       TID   cswch/s nvcswch/s  Command
    11:52:03 AM     26537         -      0.00      0.00  java
    11:52:03 AM         -     26540      0.01      0.00  |__java
    11:52:03 AM         -     26541      0.01      0.00  |__java
    11:52:03 AM         -     26544      0.01      0.00  |__java
    11:52:03 AM         -     26549      0.01      0.00  |__java
    11:52:03 AM         -     26551      0.01      0.00  |__java
    11:52:03 AM         -     26552      2.16      4.26  |__java
    11:52:03 AM         -     26553      2.10      4.33  |__java

And queues are WAY slower than compare-and-set, which involves basically no context switching:

    $ time lein2 run atom
    10000000
    "Elapsed time: 969.599545 msecs"

    real 0m3.925s
    user 0m5.944s
    sys	 0m0.252s

    $ pidstat -tw -p 26717
    Linux 3.2.0-3-amd64 (azimuth) 	07/29/2012 	_x86_64_	(2 CPU)

    11:54:49 AM      TGID       TID   cswch/s nvcswch/s  Command
    11:54:49 AM     26717         -      0.00      0.00  java
    11:54:49 AM         -     26720      0.00      0.01  |__java
    11:54:49 AM         -     26728      0.01      0.00  |__java
    11:54:49 AM         -     26731      0.00      0.02  |__java
    11:54:49 AM         -     26732      0.00      0.01  |__java

TL;DR: node.js IPC is not a replacement for a real parallel VM. It allows you to solve a particular class of parallel problems (namely, those which require relatively infrequent communication) on multiple cores, but shared state is basically impossible and message passing is slow. It's a suitable tool for problems which are largely independent and where you can defer the problem of shared state to some other component, e.g. a database. Node is great for stateless web heads, but is in no way a high-performance parallel environment.

ryah · on July 30, 2012

Yep, serialized message passing between threads is slower - you didn't have to go through all that work. But it doesn't matter because that's not the bottleneck for real websites.

Also Node starts up in 35ms and doesn't require all those parentheses - both of which are waaaay more important.

KirinDave · on July 31, 2012

It is definitely a bottleneck I face daily in designing crowdflower. Our EC2 bill is much higher than it could be because we have to rely so much on process-level forking.

A great example of this is resque. It'd be great if we could have multiple resque workers per process per job type. This would save a ton of resources and greatly improve processing for very expensive classes of jobs. It's a very real-world consideration. But instead, because our architecture follows this "share nothing in my code, pass that buck to someone else" model like a religion, we waste a lot of computing resources and lose opportunities for better reliability.

What I find most confusing about this argument is that I challenge you to find me a website written in your share nothing architecture that, at the end of the day, isn't basically a bunch of CRUD and UX chrome around a big system that does in-process parallelism for performance and consistency considerations. Postgresql, MySQL, Zookeeper, Redis, Riak, DynamoDB ... all these things are where the actual heavy lifting gets done.

Given how pivotal these things are to modern websites, it's bizarre for you to suggest it is not something to consider.

bascule · on July 30, 2012

It's more than that. Several processes with small, independently garbage collected heaps are not as efficient as a single process with a large heap, parallel threads, and a modern concurrent GC (e.g. the JVM's ConcurrentMarkSweep GC)

In addition to that, processes severely inhibit the usefulness of in-process caches. Where threads would allow a single VM to have a large in-process cache, processes generally prevent such collaboration and mean you can only have multiple, duplicated, smaller in-process caches. (Yes, you could use SysV shared memory, but that's also fraught with issues)

The same goes for any type of service you would like to run inside a particular web server that could otherwise be shared among multiple threads.

rektide · on July 29, 2012

I prefer someone keep rolling for sanity loss & resume the work on isolates!

Web serving is OK & all, but I'd love if node could be an ideal runtime for petri-nets and webworker meshes too.

ryah · on July 20, 2012

Node follows a even/odd version scheme where even releases are stable and odd releases are unstable. v0.9.0 is just the first release of the unstable v0.9 series.

The versioning scheme is unfortunate because it confuses people. After Node 1.0 we'll switch a less confusing version system.

pfraze · on July 21, 2012

Is it not standard semver? What will node switch to?

ryah · on July 21, 2012

isaac likes semver - so probably that

ryah · on May 27, 2012

"not a good fit"

You're speaking of HTTP: the protocol on which a great deal of world's computing infrastructure is being set on top of. The protocol which we are communicating with each other over right now. HTTP on TCP is extremely successful and achieves all it was designed to for and has been extensible to handle much more. There are many basic problems which are abstracted well into stateless request response cycles: file serving, RPC. There was no "mistake" - it was not a bad fit. There are only expanding use-cases involving existing software infrastructure - for which websockets is solution. That's not to say that HTTP is perfect... It could be better in many ways - but it's ridiculous to write off its success and call it a mistake.

strictfp · on May 27, 2012

I am not dismissing HTTP. The mistake I refered to was to limit TCP usage for www by only allowing the request-response oriented HTTP protocol. HTTP is opinionated and its REST-based architecture was explicitly chosen to disallow partial updates of web pages, thusly crippling TCP. But if one wants to build "thick client" type apps the restrictions of HTTP will weigh you down hard. Allowing for unrestricted TCP usage makes a lot more sense in that case, and is exactly what WebSockets delivers.

One can argue that we were better off with the pure REST web before XmlHttpRequest, in fact I am inclined to agree there, but the crippling of TCP was bound to be cracked. The real humor is that the crack is now re-launched as a new great feature, when it was explicitly excluded for the break-down of REST that it causes.

MichaelGG · on May 28, 2012

You say HTTP crippled TCP so as to not break-down REST. HTTP 1.0 does not seem to be designed for REST; it doesn't seem that REST was really published until quite a while later. Calling HTTP "REST-based" seems a bit of a stretch.

On top of that, could you give a few examples of what you actually mean? Like what would a partial update be?

TCP seems like a perfect fit for the underlying transport for a request/response model; I don't see how choosing it is some sort of deliberate "crippling". What should they have done? Built a custom request/response protocol on top of IP? Isn't that just crippling IP's flexibility as a layer 3 protocol? I don't understand your objections.

strictfp · on May 28, 2012

Roy Fielding was involved in HTTP 1.0 and I believe that REST was a generalization of some of the principles used therein. I explained the breakdown in another comment, see http://news.ycombinator.com/item?id=4033717

strictfp · on May 28, 2012

I'm not saying that TCP was a super-bad choice, just that they wanted only a subset of the features and got a bit more than they wanted. Also see http://news.ycombinator.com/item?id=4033822 .

ryah · on Aug 20, 2010

docs are wrong, it defaults to utf8

ryah · on March 12, 2010

build and/or packaging systems

_3ex7 · on March 12, 2010

Are you referring to:

http://groups.google.com/group/nodejs/browse_thread/thread/c...

?

ryah · on March 6, 2010

Something that's different, at least from POE, EventMachine, Twisted is that it is attempting non-blocking purity. You can snub your nose at this but it turns out to be very important. It ends up abstracting away a problem in a way that EventMachine or Twisted will never be able to do simply because of the existence of massive amounts of blocking Ruby and Python libraries. The programmer simply doesn't have to know what non-blocking I/O is - they don't have to know that if they do "node script.js < hugefile" - that STDIN_FILENO cannot be selected on.

Erlang is different. Node presents a different programming model than Erlang - Node handles thousands of connections per process - and keeps itself close to the metal. Erlang is it's own operating system handling one connection per processes but allowing very many processes. Is Node's model better? Well at the moment it's incomparable since Erlang is very much a solid, production ready system and Node is not more than some dude's hack. But supposing that Node becomes stable and usable at some point, a major advantage of the Node model is that for the first 10,000-1,000,000 concurrent connections the programmer doesn't have to know anything about concurrency - a single process will just handle it. There is no forking or IPC - it's just synchronous events and callbacks. In Erlang you have to think about about IPC on the first connection. The Erlang-model might turn out to be the wrong level of process granularity. Maybe the right level of process granularity is how computers are already designed: 10,000-1,000,000 connections per process, use one per core.

I also wish for less hype around the Node project but I think you're short changing it by saying it's same thing as Twisted but in Javascript instead of Python. The main selling point is not that you can run the same code on browser and server. The main selling point is that you don't have to know what you're doing, at least for the first 10,000 concurrent connections.

jerf · on March 6, 2010

"There is no forking or IPC - it's just synchronous events and callbacks. In Erlang you have to think about about IPC on the first connection. The Erlang-model might turn out to be the wrong level of process granularity. Maybe the right level of process granularity is how computers are already designed: 10,000-1,000,000 connections per process, use one per core."

Also false. An Erlang "process" is not an OS process. It already works the way you say, except it can use multiple cores simultaneously. Automatically. Node.js has nothing on Erlang except a familiar syntax. Nor does IPC come up like some kind of blocker; it's baked into the core and quite natural, often all wrapped up behind a simple function call.

And let's not even talk about the baked-in clusterable Mnesia database or the OTP library, which Node.js can only dream about. (Erlang isn't just multicore, it's pretty easy to make it multi-system.) Or how Erlang has code replacement (the Node.js complaint of the day) built right in to the core, where it really has to be for it to work. And Erlang laughs at your "non-blocking purity"; in Erlang, you can call sleep and it won't block anything! No special support needed, no magic from the user, no arranging things in callbacks manually (how 1980s). Try that in Node.js.

It's actually impossible for Node.js to do some of these things because you have to start with them.

You don't understand how Erlang works. That's fine, except you started to criticize it. I refer back to my previous comment that the people hyping Node.js don't seem to have used the competition. Maybe my problem is that I have used the competition, quite extensively. Come to think of it, I've yet to hear anyone say "Gosh, I was using Erlang but Node.js is just so awesome I had to switch!:" That probably says something important.

ryah · on March 6, 2010

You should read my reply again (or maybe for the first time) and comprehend. I did not say nor imply that I think Erlang's processes are OS processes. Sorry for not using quotes ('process') if that's what has confused you so deeply.

jerf · on March 9, 2010

You can hardly expect me to read your mind on that point when you bury it in so many other wrong ideas about Erlang, which I note you do not challenge.

grayrest · on March 6, 2010

> I also wish for less hype around the Node project

I'm tremendously amused that you write this when every other poster on HN is desperately trying to generate hype for their project/company.

ryah · on Dec 1, 2009

clearly you're unfamiliar with http.

bumblebird · on Dec 1, 2009

I certainly am. It's a complete mystery to me. Is it like visual basic?

The only PITA with HTTP is chunked encoding. Whoever thought that gem up should be shot. The rest is fairly trivial. Certainly parsing headers is. This implementation looks pretty silly. Having individual states for each of the characters in "HTTP" etc? WTF?

edit: Instead of just downmodding me, why not explain exactly what part of parsing HTTP headers is non trivial?

viraptor · on Dec 1, 2009

The parser correctness is not trivial. The RFC2616 contains the complete grammar you need, so it's fairly simple to implement. OTOH, if you write a parser on your own, you're likely to miss stuff like section 4.2, which explains that header values can be multiline if they include LWS. The parser from this article will fail on (just looking at the source, I'm 99.5% sure of this):

    abc:
     def

It also doesn't like tabs and will not support comma-separated header values. It's not rocket science to write a "good enough" http parser, but writing a fully compliant one is something completely different. There are also cool parts of the spec that you can read 10 times and come to different conclusions - for example what does the "\" CR LF section mean if it's inside a quoted string and does it finish the header value or not. Writing a "correct" parser is a LOT of fun...

Keeping separate states for characters in HTTP saves you a couple of cycles probably, because you match as you go and can reject the message early and with the exact place that didn't match. It's a bit useless for a 4-letter string though.

ryah · on Dec 1, 2009

> Sure - speed++, but at what cost?

it only has to be written once. i think it's an important enough problem to warrant such code - definitely could use a few more macros though