
This looks like a data warehouse for the archive. The two billion listings probably represent every expired ad ever posted. There is no way they have 2 billion active ads at any one time.


The above comment is correct.

The archive does have to be accessed by users though, since users can access listings from many years back.

The entire archive seems to be under 4 TB from what he described in the video (2 billion documents at 2 kilobytes each). They do not retain photos.
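For anyone who wants to check that estimate, here is a quick back-of-envelope sketch. The document count and per-document size are the figures quoted above; reading "2 kilobytes" as 2 KiB (2048 bytes) is my assumption, and the function name is just illustrative.

```python
# Back-of-envelope archive size, using the figures quoted in the comment above.
DOCS = 2_000_000_000       # "2 billion documents"
BYTES_PER_DOC = 2 * 1024   # assumption: "2 kilobytes" means 2 KiB


def archive_size_bytes(docs: int = DOCS, bytes_per_doc: int = BYTES_PER_DOC) -> int:
    """Raw payload size in bytes, ignoring indexes, replication, and overhead."""
    return docs * bytes_per_doc


total = archive_size_bytes()
print(f"{total / 10**12:.2f} TB (decimal units)")
print(f"{total / 2**40:.2f} TiB (binary units)")
```

Under that assumption the raw payload is about 4.1 TB in decimal units, or about 3.7 TiB in binary units, so "under 4 TB" holds if the figure is read in binary units. Indexes and replicas would add to this.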


Yup. You hit the nail on the head.


How much photo data do you handle? How long do you keep it?


The photos are removed once the posting is no longer live on the site (roughly). As for how many, I'd have to dig a bit to find that out...



