In addition to sites such as those listed, universities often provide institution-level data hosting. So long as the university doesn't go under, it ought to be stable. An advantage is that there are local people who can help researchers with the process, e.g. in setting up useful metadata and so forth.
I worry a bit about people just dumping data into large repositories, without thinking much about the format or the later uses, but only focussing on a checklist that needs to be ticked off to get that precious bean (publication) for the bean-counters (deans).
I work in research in a public university in Canada. IT is basically tech support, fix-my-email. There's no chance they would support hosting our data or any other sort of service.
The university expects that researchers self-fund their own stuff using grants. Need laptops for your grad students? Grant money. Need a server, a bunch of disks, and a sysadmin to care for it? Grant money. Which is only realistic if you're a huge lab with millions a year in grant money. And even then, what happens when this grant runs out? Your new grant does not pay for hosting some 10-year-old data; all the money is earmarked for your new project (literally, it would be illegal to spend it on another project). So the old hosting quietly goes away.
I checked this out. Turns out I was aware of the one for my university, but they only host written documents. Mostly PhD theses, a few other bits and pieces. No code or huge datasets.
Zenodo is a great place for code and datasets. For every paper we put up any associated code, plus any data that doesn't have a home in a more specialized repository.
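For anyone curious what that looks like in practice, here is a minimal sketch of Zenodo's REST deposit flow (create a draft deposition, upload a file, attach metadata, publish). The token, filename, and metadata values are placeholders, and it points at the sandbox instance rather than the real site; treat it as an outline of the documented API rather than production code, and check Zenodo's API docs before relying on it.

```python
# Sketch of a Zenodo deposit: create a draft, upload a file, add metadata, publish.
# Uses the sandbox instance and placeholder token/filename -- swap in the real
# https://zenodo.org API and your own token when doing this for real.
import requests

BASE = "https://sandbox.zenodo.org/api"
TOKEN = "YOUR-ACCESS-TOKEN"        # personal token with 'deposit' scope
params = {"access_token": TOKEN}

# 1. Create an empty deposition (a draft record).
resp = requests.post(f"{BASE}/deposit/depositions", params=params, json={})
resp.raise_for_status()
dep = resp.json()

# 2. Upload the data file into the deposition's file bucket.
with open("results.csv", "rb") as fp:  # placeholder filename
    r = requests.put(f"{dep['links']['bucket']}/results.csv",
                     data=fp, params=params)
    r.raise_for_status()

# 3. Attach descriptive metadata (title, type, description, creators).
metadata = {
    "metadata": {
        "title": "Example dataset accompanying our paper",
        "upload_type": "dataset",
        "description": "Processed measurements and analysis scripts.",
        "creators": [{"name": "Doe, Jane", "affiliation": "Example University"}],
    }
}
r = requests.put(f"{BASE}/deposit/depositions/{dep['id']}",
                 params=params, json=metadata)
r.raise_for_status()

# 4. Publish -- after this the record gets a DOI and the files become immutable.
r = requests.post(f"{BASE}/deposit/depositions/{dep['id']}/actions/publish",
                  params=params)
r.raise_for_status()
print("Published, DOI:", r.json()["doi"])
```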
Dalhousie used to have an Academic Computing Services department within IT, which was designed to provide computing expertise to researchers - development of software, hosting, and related services to support research projects. I was told it was fairly unique. It's been a while since I was there but AFAIK it was axed or at least cut back as being unnecessary.
I wish. We asked our university and they essentially wanted to charge our research group ridiculous amounts. IIRC we asked about 100 TB and their estimated cost was something like 10k euros a year.
Regarding your comment about dumping data: yes, that's certainly an issue. The problem is that researchers are required to do this, but there is no real consideration of the time it takes (it makes no difference to your career or reputation whether you publish good or bad data), it's largely outside the expertise of most researchers, and universities provide little help.
> In addition to sites such as those listed, Universities often provide institution-level data hosting.
(For those that haven't spotted it, these are permitted under 'Generalist repositories')
> An advantage is that there are local people who can help the researchers with the process, e.g. in setting up useful metadata and so forth.
I helped set up such a service nearly 10 years ago, and still help run it. There undoubtedly are advantages to depositing with us for the reasons you mention, plus we permit far larger publications than most services (our largest are around 1TB).
However, we are a large, general university, and so have to deal with deposits ranging from theology-related images to CT scans of fossil specimens to synthetic chemistry data. And all points in between.
Being general limits our capacity to give researchers detailed help with metadata and format standards, since we just don't have enough data librarians with those specialisms. So my advice is to use a community-established repository where one is available (the UK Data Archive is a good example).
You are right about people just dumping data. Since 2015 (IIRC), funders and publishers have expected researchers to plan their data storage and ultimately make the data available. That doesn't necessarily lead to quality publications, though our reviewers try their best.
To paraphrase one researcher: "I intend to give this process the minimum required." (Happily, this is not a typical response.)