Funny that you call this a meme; to me it is a real technical consideration, and I supplied some details about my concerns.
Answers following your numbering:
(1) Calling S3 "this week's fashionable data store" is like saying that an elephant is an interesting microbe. The rest of your points boil down to "please do not innovate, we have filesystems for 40 years and this is how you store your data". I do not agree. Disclaimer: I was a member of the team that moved amazon.com from NFS-based storage to S3. It was a great success, and it solved many of our problems, including the insane number of issues introduced by running an NFS cluster at that scale. And I would like to emphasize scale, because operational problems quite often grow worse than linearly with your scale.
I know about legacy code; I am running several legacy services in production right now. I can tell you one thing: there is a point when it is not financially viable to keep rolling with the legacy code. Where that point lies depends heavily on your actual use case: banks tend to run "legacy code", while web 2.0 companies tend to innovate and replace systems at a faster pace. I don't see any conflict here. We even built a compatibility layer for the new solution, so it was possible to run legacy code on the new system with the software untouched.
(2) Nested directories are a logical layer on top of how the data is stored, aka a view; you are a distributed FS developer, so I guess you understand that. S3 also supports nested directories no biggie here.
Security. Well, this is kind of weird, because last time I checked S3 had an extensive security http://aws.amazon.com/s3/faqs/#security_anchor
Now the rest of your question can be rephrased as: "I am used to X, why isn't there X with this new thing???" I am not sure how many file system users use SELinux; my educated guess is roughly 1-10%. It is a very complex system that not many companies invest in using. For our use cases the fine-grained ACLs were good enough, so we use those. File system durability: yes, it is very important, which is why I was kind of shocked by this bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/...
I guess you are right about the overhead of reading and writing, dealing with HTTP headers etc. If the systems that benefit the most from S3 were single-node systems, it would be silly to use S3 in the first place. We are talking about 1000 - 10000 computers using the same data storage layer. And you can tell me if I am wrong but if you would like to access the same files on these nodes using a FS than you are going to end up with a locking hell. This is why modern IO-heavy software moved away from in-place edits towards "lock free" data access. Look at the implementation of Kafka log files or how Aeron writes files. This is exactly the same scheme we use with S3. Accident? ;)
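To make the "no in-place edits" idea concrete, here is a minimal Python sketch of the append-only pattern: each node only ever creates new immutable objects under its own keys, so writers never need to lock a shared file. Everything in it is hypothetical and for illustration only: ObjectStore is an in-memory stand-in (not an S3 client), and SegmentWriter, read_all and the key layout are made-up names. The stand-in refuses overwrites to make the immutability explicit, which is an assumption of the sketch rather than a statement about how any real store behaves.

```python
import itertools


class ObjectStore:
    """Hypothetical in-memory stand-in for a key/value object store."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        # Assumption of this sketch: objects are write-once, never edited in place.
        if key in self._objects:
            raise KeyError(f"object {key!r} already exists")
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def list(self, prefix):
        # Return all keys under a prefix in lexicographic order.
        return sorted(k for k in self._objects if k.startswith(prefix))


class SegmentWriter:
    """Each writer appends new segments under keys that only it can produce."""

    def __init__(self, store, topic, writer_id):
        self.store = store
        self.topic = topic
        self.writer_id = writer_id
        self._seq = itertools.count()

    def append(self, records):
        # Sequence number plus writer id makes the key unique per writer,
        # so two writers never contend for the same object.
        key = f"{self.topic}/{next(self._seq):010d}-{self.writer_id}"
        self.store.put(key, b"\n".join(records))
        return key


def read_all(store, topic):
    """Readers reconstruct the log by scanning segments in key order."""
    records = []
    for key in store.list(topic + "/"):
        records.extend(store.get(key).split(b"\n"))
    return records
```

Two writers can call append() concurrently without coordinating, because neither ever touches an object the other created; the cost is that readers have to merge segments themselves.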
I would like to repeat my original question: I don't see huge market for a distributed FS. I might be wrong, but this is how I see it.
"please do not innovate, we have filesystems for 40 years and this is how you store your data"
Please don't put words in my mouth like that. It's damn rude. I never said anything that was even close.
"S3 also supports nested directories no biggie here."
Not according to the API documentation I've seen. There are buckets, and there are objects within buckets. Nothing about buckets within buckets. Sure, there are umpteen different ways to simulate nested directories using various naming conventions recognized by an access library, but there's no standard and thus no compatibility. You also lose some of the benefits of true nested directories, such as combining permissions across different levels of the hierarchy. Also no links (hard or soft) which many people find useful, etc. Your claim here is misleading at best.
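For anyone following along, this is roughly what such a naming convention looks like. It is a minimal Python sketch, not S3 client code: it mimics a prefix-and-delimiter style listing over a flat set of keys, and list_level and the sample keys are hypothetical names chosen for illustration.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Emulate one 'directory level' over a flat key namespace.

    Returns (objects, common_prefixes): keys directly under the prefix,
    and the distinct 'subdirectory' prefixes one level below it.
    """
    objects = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # The key continues past a delimiter: roll it up into a prefix.
            common_prefixes.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = [
    "logs/2015/01/a.log",
    "logs/2015/02/b.log",
    "logs/readme.txt",
    "photos/cat.jpg",
]
print(list_level(keys, "logs/"))
```

The "directories" here exist only as shared key prefixes. Nothing in the flat namespace carries permissions for an intermediate level or lets you link one path to another, which is the gap described above.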
"last time I checked S3 had an extensive security"
Yes, it has its very own permissions system, fundamentally incompatible with any other and quite clunky to use. That still doesn't answer the question of how you'd do anything like SELinux with it.
Open up your bug list and we can have that conversation. Throwing stones from behind a proprietary wall is despicable.
"you can tell me if I am wrong but if you would like to access the same files on these nodes using a FS than you are going to end up with a locking hell."
You're wrong. Maybe you've only read about distributed file systems (or databases which have to deal with similar problems) from >15 years ago, but things have changed a bit since then. In fact, if you were at Amazon you might have heard of a little thing called Dynamo which was part of that evolution. Modern distributed systems, including distributed file systems, don't have that locking hell. That's just FUD.
"I don't see huge market for a distributed FS."
Might want to tell that to the EFS team. Let me know how that goes. In fact you might be right, but whether there's a market has little to do with your pseudo-technical objections. Many technologies are considered uncool long before they cease being useful.
http://kafka.apache.org/ https://github.com/real-logic/Aeron