My understanding of this is that it's "A git style CMS for all your data"... so you can't nuke things as there's history of it, and you can put any data you want into it.

Where I struggle is that either the definition of "your data" is narrow, or I shouldn't be using it for all my data.

Back in 1999, when I first learned about MP3, I started ripping my CDs. I have several thousand CDs, so this took a lot of time. Before I completed the task, at a rate of a few CDs each evening, FLAC came into my life and I started back at the beginning. I deleted the MP3s as I replaced them with FLACs.

There is some data I really don't ever need to keep. But maybe it's not the kind of data that I should be putting in Camlistore? I think of it as my data; after all, these are my CDs.

I struggle with the concept of Camlistore as I have an 18TB NAS in RAID6, 12TB usable... and it's 80% full. If I had history I'd have a storage problem today.

I'm perhaps an outlier: I chose to self-host my data locally rather than rely on cloud-based services. And I chose to keep everything... photos, documents, email, video, music. And everything I keep is in the highest possible quality: FLACs, DVD VOBs, raw photos, etc.

But then... who is Camlistore aimed at if not the people who like to store and have control over their own data?

I guess I just find delete too valuable a feature for the larger data I store.

And perhaps I'm just wrong on the use case; maybe it's really "for all your data (that you cannot re-acquire)". I just don't want to ever rip those CDs again. But if I do, those old versions are dead to me.



It's not quite as much like git as you might think it is. It's git-style in that it stores data as blobs named by hash and tracks everything with pointers to those blobs, but isn't as committed to keeping everything forever. Git is designed to be able to reconstruct a set of data at any point in that data's history, so it makes sense to keep all previous data in its storage system.

However, even git will delete data if you delete the "tree" metadata, i.e. you nuke some branch that has no downstream dependencies because you never merged it or there are no branches off of it. In that case, if the blobs aren't reachable from any tree/graph, git can garbage-collect those blobs.

Camlistore does the same thing: if you delete all pointers to the data, those blobs might eventually be reclaimed. As a matter of implementation, Camlistore doesn't do that today, but it's not the case that Camlistore can't or won't let you delete data.
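
To make the "delete the pointers, then reclaim the blobs" idea concrete, here is a rough sketch of a toy content-addressed store with a mark-and-sweep collector. It is not Camlistore's schema or git's object format; the `BlobStore` class, its methods, and the "tree" encoding are all made up for illustration.

```python
import hashlib


class BlobStore:
    """Toy content-addressed store (hypothetical, not Camlistore's or git's format)."""

    def __init__(self):
        self.objects = {}  # ref (hex hash) -> (kind, payload bytes)
        self.roots = {}    # human-readable name -> ref of a root object

    def _put(self, kind, payload):
        # The name of an object is the hash of its kind plus its content.
        ref = hashlib.sha256(kind.encode() + b"\0" + payload).hexdigest()
        self.objects[ref] = (kind, payload)
        return ref

    def put_blob(self, data):
        return self._put("blob", data)

    def put_tree(self, child_refs):
        # A "tree" is just an object whose payload lists the refs it points to.
        return self._put("tree", "\n".join(child_refs).encode())

    def reachable(self):
        # Mark phase: walk from the roots and collect every ref we can reach.
        seen = set()
        stack = list(self.roots.values())
        while stack:
            ref = stack.pop()
            if ref in seen or ref not in self.objects:
                continue
            seen.add(ref)
            kind, payload = self.objects[ref]
            if kind == "tree" and payload:
                stack.extend(payload.decode().split("\n"))
        return seen

    def gc(self):
        # Sweep phase: drop every object that no root points to, even indirectly.
        live = self.reachable()
        dead = [ref for ref in self.objects if ref not in live]
        for ref in dead:
            del self.objects[ref]
        return len(dead)


store = BlobStore()
mp3 = store.put_blob(b"track01 as MP3")
flac = store.put_blob(b"track01 as FLAC")

store.roots["music"] = store.put_tree([mp3])   # first rip
store.roots["music"] = store.put_tree([flac])  # re-rip: the pointer now skips the MP3

removed = store.gc()  # the MP3 blob and the old tree are no longer reachable
```

Nothing is deleted when the pointer moves; the old objects only go away when the collector runs, which is why "keeps history" and "can delete" are not contradictory in this kind of design.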


I think you're an edge case. In fact, there is a very precise parallel with the fact that you store RAW/FLAC/VOB data, while the vast majority of people are perfectly content with JPG/MP3/AVI.

In addition to that, I think that there is generally no such thing as a space limit (not always, of course).

Specifically, one would imagine that the general attitude is "I have to store such and such, where do I find the space?"

Instead, I think that in general it is "Oh, I have this much space for free or cheap... let's store stuff!" Especially now that storage is measured in terabytes, I guess most of it is simply filled with movies, even if they're never watched, not even once.

Again, pay close attention to the bias. As an amateur photographer, I'm tempted to think that "raw is the law", but in fact, for the vast majority of people, it simply isn't.



