Hacker News | Nib's comments

As someone who has actively participated in DDH for a while now, here are my views:

- A non-trivial part of the current contributions consisted of "cheat sheets", which IMO required a lot of effort to ensure correctness/usability but don't provide much improvement to search results (I don't think I've used the feature more than 3-4 times in the past 1.5 years). So this should free up time for DDG staff to focus on the more important instant answers and features.

- The community has been getting smaller and contributing less for a while now. This is backed by data from the official repos (the number of commits over time, that is)[1]. After all, there are only a finite number of instant answers before they just become redundant.

- The current model for triggers (when an instant answer gets displayed) is quite restrictive: it's just regex-based. IMO, a lot more growth could be achieved using ML models for triggering, A/B testing, etc.
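To illustrate why purely regex-based triggering is restrictive, here is a minimal sketch. The trigger names and patterns are made up for illustration and are not DuckDuckHack's actual triggers; the point is only that a query phrased slightly differently falls through.

```python
import re

# Hypothetical trigger table: each instant answer registers one regex.
# Names and patterns are illustrative only, not taken from DuckDuckHack.
TRIGGERS = [
    ("unit_conversion",
     re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(km|mi)\s+(?:in|to)\s+(km|mi)\s*$", re.I)),
    ("cheat_sheet",
     re.compile(r"^\s*(\w+)\s+cheat\s*sheet\s*$", re.I)),
]


def match_trigger(query):
    """Return (answer_name, captured_groups) for the first matching trigger, else None."""
    for name, pattern in TRIGGERS:
        m = pattern.match(query)
        if m:
            return name, m.groups()
    return None


if __name__ == "__main__":
    for q in ["vim cheat sheet", "10 km in mi", "how far is 10 km in miles"]:
        print(repr(q), "->", match_trigger(q))
```

The last query carries the same intent as the second one but matches nothing, which is the kind of gap a learned trigger model might close.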

I'm still kind of disappointed with this. Perhaps unrelated, but does anyone have any suggestions for people willing to work on similar open source projects?

[1]: https://github.com/duckduckgo/zeroclickinfo-spice/graphs/con... , https://github.com/duckduckgo/zeroclickinfo-goodies/graphs/c...


Kiwix - most people are too conditioned to think that search has to happen online and don't even realize what is possible offline.

Entire web archives, such as the full dumps of Wikipedia and Stack Exchange (including media and indexes for search), can be stored locally. The missing piece is Google-level search quality on the local machine. Brute-force substring search can process gigabytes in seconds nowadays, and with enterprise-grade server hardware things are reaching 1000GB/s. At this rate, there is no reason to think that, in a couple of years, local search of all known human knowledge can't happen on a local device at Google-level result quality.

For anyone interested in the search space, look into what's possible today in local offline search.
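As a rough idea of what brute-force substring search over a local dump looks like, here is a minimal sketch using a memory-mapped file. The function name is hypothetical, and this says nothing about the throughput figures above; it only shows the technique of scanning raw bytes without an index.

```python
import mmap


def count_occurrences(path, needle):
    """Count non-overlapping occurrences of `needle` (bytes) in the file at `path`.

    The file is memory-mapped, so the OS pages it in as the scan proceeds.
    Note: mmap cannot map an empty file; callers should handle that case.
    """
    if not needle:
        raise ValueError("needle must be non-empty")
    count = 0
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        pos = mm.find(needle)
        while pos != -1:
            count += 1
            # Continue scanning just past the match we found.
            pos = mm.find(needle, pos + len(needle))
    return count
```

Usage would be something like `count_occurrences("dump.txt", b"kiwix")`, where `dump.txt` is a placeholder for any local file.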


This is a great observation & seems to dovetail with technologies like IPFS.[1]

[1]: https://ipfs.io


You might be right, but human knowledge is also expanding, of course. The question is: will it expand faster than hardware capabilities?

Anyway, I wish we'd see more search and NLP related posts here on HN. It deserves far more attention than it gets.


For the average person this rate does not matter. They don't need access to the cutting edge of quantum physics, astronomy, dance, art or javascript.

All you have to do is look at the speed at which new info is being added to Wikipedia and Stack Overflow, which is stabilizing, i.e. it is not growing as fast as it once was. Basic/foundational knowledge is more or less all covered. https://en.wikipedia.org/wiki/Wikipedia:Modelling_Wikipedia%...

And that sum total comes to 50-60 GB compressed. Think about that number. It's not big.


The sum total of our collective intelligence is equal to an install of gtaV... Crazy.


Wikipedia is not the sum of our collective knowledge. It's little more than the preface.


We're talking about the "long tail" of information, which is huge also outside of science. Think popular culture.


It would be awesome if you could download dumps of Wikipedia filtered by category so you can get the size down. There's probably a lot of information in there that is useless to me.


Kiwix does this, at least to a certain degree: http://wiki.kiwix.org/wiki/Content


Listen to Wikipedia http://listen.hatnote.com


NLP is rightly ignored.

https://en.m.wikipedia.org/wiki/Neuro-linguistic_programming...

Edit: Fortunately I'm left feeling foolish, rather than horrified.



The average user's needs are so small.

You do not even need "Google level" for most of today's web users.

You can deliver what users need with respect to web search with much less than "Google level".

For example a simple "<title>" search. This is how Google started.

The entry point into the web should be search for domains. A "<title>" search can do that.
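A "<title>" search along these lines could be sketched as follows. This is only an illustration with hypothetical function names, working on an in-memory dict of pages rather than a real crawl, and it is not a description of how any search engine actually works.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the stripped <title> text of an HTML document ('' if none)."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


def search_titles(pages, query):
    """pages: dict of url -> html. Return URLs whose title contains every query word."""
    words = query.lower().split()
    hits = []
    for url, html in pages.items():
        title = extract_title(html).lower()
        if all(word in title for word in words):
            hits.append(url)
    return hits
```

Body text is ignored on purpose: a page only matches if the words appear in its title, which is what makes this a search for websites rather than within them.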

Most users today do not do much searching within websites via Google. They search for websites using Google.

Anyway, you are right about storage space and offline search but obviously that truth misaligns with the "cloud" business narrative and coaxing users to store all their personal data in datacenters instead of on their desk or in their pocket.

Expect much opposition to this simple truth.


http://web.archive.org/ now provides full-text search, mostly of website titles.

Try it out. You'll find that it's... it feels like a trip back to 1998.


I'd say especially the average user profits from a search system that's somewhat clever and finds things even if they do not ask the exactly right query.

And searching for domains is only a tiny part of it, especially now where a lot of information is stuck in general sites with a lot of content (wikis, Q&A sites, social media sites) and not on special-interest sites. And for many generic searches the special-interest domains are various levels of spam/affiliate marketing.


PCIe 3 x16 devices have a 16GB/s theoretical max, so 1000GB/s is still out of reach for single machine I/O (though it's not as though search needs anywhere near these bandwidths anyway).


The Intel i9-7900X has 44 PCIe 3.0 lanes, and Wikipedia tells me each lane has a throughput of 984.6 MB/s, so that's ~43 GB/s. Maybe fast compression could make that a small integer multiple.

https://www.intel.com/content/www/us/en/products/processors/...
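The arithmetic behind that estimate, as a quick sketch. The lane count and per-lane figure are the ones quoted in this thread; the function name is just for illustration, and real-world throughput would be lower than this theoretical aggregate.

```python
def aggregate_bandwidth_gb_s(lanes, mb_per_s_per_lane):
    """Theoretical aggregate PCIe bandwidth in GB/s (1 GB = 1000 MB)."""
    return lanes * mb_per_s_per_lane / 1000


if __name__ == "__main__":
    # Figures quoted in the thread: 984.6 MB/s per PCIe 3.0 lane.
    print(aggregate_bandwidth_gb_s(16, 984.6))  # a single x16 device
    print(aggregate_bandwidth_gb_s(44, 984.6))  # all 44 lanes of the i9-7900X
```

With these inputs a single x16 device comes out just under 16 GB/s and 44 lanes come out at roughly 43 GB/s, so 1000GB/s would still need a couple dozen such machines.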


AMD Threadripper has 64 in all available models: https://en.wikipedia.org/wiki/Zen_(microarchitecture)



That blog seems to imply you're using a distributed architecture, ie. not a single machine.


I've been using Google and Wolfram Alpha for these things over the years, but it has always irked me that I'm sending this info to a third-party, to run through their services that I have no way to read or improve the code, and knowing that these things are only available to me if I'm online. I was really happy when I found out the DuckDuckGo Instant Answers modules' source code is open.

It's been on my list of things that I will almost definitely never take the time to actually work on, but I wished what I had was (A) a browser extension or GNOME extension that incorporates an offline version of all the DuckDuckHack modules, and (B) the same thing in an open source mobile app. (This kind of thing could just as easily live in a command line app, though, and I'd be super happy if a project maintainer incorporated them into something like GNU Units.) I looked into it, especially for (B), but I realized that the DuckDuckHack code depends on Perl.


Well, about offline availability, a large number of instant answers (spices and fatheads, that is) use external APIs or indexed databases from websites, so they can't work offline.

DDG does have official(and unofficial) browser extensions and apps for iOS/Android.


> Well, about offline availability, a large number of instant answers [...] can't work offline

Sure, but there are a large number of instant answers that can and do work offline because they're simple, static tables, or are self-contained—existing only to apply transformations on the input (e.g., cheatsheets, natural language unit conversions, and calculations).

> DDG does have official(and unofficial) browser extensions and apps for iOS/Android

A browser extension that just sends the query the same as it would if you hit their homepage is in the "what's the point?" category, just like mobile sites that nag you to install their app when all it does is show you the same content that is (or could be) on the mobile site itself. The "is a browser extension" is not the interesting part. "Doesn't send data to a third party" and "can operate without being connected to the network" are.
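To make the "self-contained" point concrete, here is a minimal sketch of a unit-conversion answer that needs nothing but a static table. It is not a port of any DuckDuckHack module; the query grammar and function name are assumptions for illustration.

```python
import re

# Static table: conversion factors to metres (standard definitions).
TO_METRES = {"m": 1.0, "km": 1000.0, "mi": 1609.344, "ft": 0.3048}

# Hypothetical query grammar: "<number> <unit> in|to <unit>".
QUERY = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(\w+)\s+(?:in|to)\s+(\w+)\s*$", re.I)


def convert(query):
    """Answer a query like '2 km to m' entirely offline; return None if not understood."""
    m = QUERY.match(query)
    if not m:
        return None
    value = float(m.group(1))
    src, dst = m.group(2).lower(), m.group(3).lower()
    if src not in TO_METRES or dst not in TO_METRES:
        return None
    # Convert to metres first, then to the target unit.
    return value * TO_METRES[src] / TO_METRES[dst]
```

Nothing here touches the network, so the query never leaves the machine.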


Why can't we have an intermediary search service that grabs search results from Google and posts them on a search website anonymously?


Startpage [1] is what you're looking for.

[1] https://startpage.com


Right. StartPage.com delivers Google search results in privacy. Plus, it offers a free proxy with every search result so you can visit websites through StartPage anonymously, too.


In DuckDuckGo, !g more or less does this, in that it disables search bubbling, but I think Google can see your client IP when the results are served to your browser.


Banging into Google using !G is like searching Google directly. Banging from DDG doesn't confer any privacy protections. A lot of people don't know this.


Startpage does just that. DDG something and use !sp to search there.


Let me save you a lot of time for the future:

!s is enough to redirect to Startpage. :-)


searx proxies user requests to different search engines.

https://github.com/asciimoo/searx

There are different instances: https://github.com/asciimoo/searx/wiki/Searx-instances
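The general idea of a metasearch proxy can be sketched like this. To be clear, this is not searx's code or API: the backends are hypothetical callables standing in for real engines, and the ranking rule (more engines agreeing ranks higher) is an assumption for illustration.

```python
def metasearch(query, backends):
    """Fan a query out to several backends and merge the results by URL.

    backends: dict of engine_name -> callable(query) -> list of (url, title).
    Returns a list of (url, info) where info records which engines returned it.
    """
    merged = {}
    for name, backend in backends.items():
        for url, title in backend(query):
            entry = merged.setdefault(url, {"title": title, "engines": []})
            entry["engines"].append(name)
    # URLs returned by more engines come first; sorted() is stable,
    # so ties keep their first-seen order.
    return sorted(merged.items(), key=lambda kv: -len(kv[1]["engines"]))
```

Because the proxy makes the upstream requests itself, the engines see the proxy's address rather than the user's.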


What do you think is the best way for a high school student developer to create impact/maximise potential impact on the world? Elon Musk named a few things he thinks will be world-changing 10-20 years from now. What do you think high school students should be working on right now?



Yeah, right now it's only this. You can see one of my projects at http://nibnalin.tech/internet-history/


you should put something up on github.com


It did have a case. And everyone in the world can have this crash, I mean it wasn't even 4 feet. Totally rekt. I feel my specific piece was flawed, it's impossible for it to break at such low heights, any idea how to get it checked in any manner by an Apple guy or someone?


I get that, but my 5 survived a crash, and this didn't even come close, there was a pretty normal landing I would say, but it just shattered. Totally unexpected for it to do so..


So? Sometimes you get lucky, other times you don't. A single set of anecdotes isn't any evidence of any sort of widespread difference between the two device classes.


Hi, Here's my two cents.

TL;DR: I don't feel this is Apple. Though, it might be wrong speculation.

I decided to subscribe to their mailing list, and this was the footer in their confirmation email:

For questions about this list, please contact: updates@faradayfuture.com

Now, I decided to go ahead and send them an email, hoping it'd expose some detail about the company. But here's what I got instead:

Delivery has failed to these recipients or groups:

updates@faradayfuture.com The email address you entered couldn't be found. Please check the recipient's email address and try to resend the message. If the problem continues, please contact your helpdesk.

Now, this is understandable, but if it's being run by Apple, I'll be damned if they make such major mistakes, seeing their own line of work. But okay, that isn't concrete evidence of that.

Here's the more interesting part of the email:

Original message headers:

Received: from FF-MAIL1.faradayfuture.com (10.0.0.6) by FF-MAIL1.faradayfuture.com (10.0.0.6) with Microsoft SMTP Server (TLS) id 15.0.847.32; Sat, 7 Nov 2015 08:11:00 -0800

Received: from mail-ob0-f178.google.com (209.85.214.178) by FF-MAIL1.faradayfuture.com (10.0.0.6) with Microsoft SMTP Server id 15.0.847.32 via Frontend Transport; Sat, 7 Nov 2015 08:11:00 -0800

[1]

Did you see it yourself? Well, the company is running on Microsoft SMTP servers. I mean, Apple seriously would not be doing that. Using Microsoft servers themselves would be too much. It's still possible that they're taking a lot of measures to hide the fact that it's Apple, and this is a part of those.
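For anyone who wants to pull the server software out of bounce headers programmatically, here is a small sketch using Python's standard email parser on the two Received headers quoted above. The helper name and the regex are my own assumptions and only handle the "with ... id" shape seen here.

```python
import re
from email import message_from_string

# The two Received headers quoted above, followed by a blank line
# to mark the end of the header section.
RAW_HEADERS = (
    "Received: from FF-MAIL1.faradayfuture.com (10.0.0.6) by FF-MAIL1.faradayfuture.com (10.0.0.6)"
    " with Microsoft SMTP Server (TLS) id 15.0.847.32; Sat, 7 Nov 2015 08:11:00 -0800\n"
    "Received: from mail-ob0-f178.google.com (209.85.214.178) by FF-MAIL1.faradayfuture.com (10.0.0.6)"
    " with Microsoft SMTP Server id 15.0.847.32 via Frontend Transport; Sat, 7 Nov 2015 08:11:00 -0800\n"
    "\n"
)


def server_software(raw_headers):
    """Return the 'with <software>' part of every Received header, in order."""
    msg = message_from_string(raw_headers)
    found = []
    for hop in msg.get_all("Received", []):
        # Capture the text after 'with' up to an opening parenthesis or the 'id' keyword.
        m = re.search(r"\bwith\s+(.+?)\s+(?:\(|id\b)", hop)
        if m:
            found.append(m.group(1))
    return found


if __name__ == "__main__":
    print(server_software(RAW_HEADERS))
```

Both hops report the same software string, which is all the observation above relies on.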

Another thing that is very startup-y about the company is that they have liked something on their Facebook page[2], posted by some fanboy, which is not really possible unless they have the whole Apple PR department at their back, or they made it themselves and monitor it personally. I mean, it's tough for Apple to monitor it themselves. One more thing I saw: their Facebook page is Verified. Many other Facebook pages with far more likes and possibly equally big teams/impact[3] don't have their pages Verified. This shows it might be Apple.

[1]=https://imgur.com/kam9Yf8

[2]=https://www.facebook.com/faradayfuture/

[3]=https://www.facebook.com/LitMotors/


I hope I'm not being too harsh, but my answer is a straightforward no. It's a work in progress, but at least on an iPhone 5, it seems to be worse than the desktop version. It does save me a pinch or two, but in the end it doesn't look user friendly, and the upvote/downvote buttons are still the same. Commenting doesn't seem mobile friendly to me either. So essentially the only two things a user should be doing on HN, commenting and voting, haven't improved.

There is a web app, namely HackerWeb, which lacks some functionality but looks more humane and real. Try that out, and if you don't copy it, at least take hints from it to improve your version of the site.

One more thing, why don't you guys come up with an online open contest, and let the community hack a newer version? That'll be fun, and make your lives a little easier as well.


IDK, the Force Touch adaptation, renamed 3D Touch (surprise, surprise), is a nice touch, but really, I wonder if it's worth buying a new iPhone for. I guess, at least for me, it's somewhat of a privacy issue. Imagine your favorite apps revealing 25% of what you do in an app simply by holding your phone's home screen. Moreover, I often hold/touch my phone in different manners: asleep, just-got-up, really angry, running-to-some-place. It'll be really irritating if this kind of feature got in my way while working. Most of us do NOT always use our phones the way the advertisements expect us to.


People crying about Steve Jobs's views on a stylus:

1. Really, he said that about a phone. But even if you consider the argument valid for a tablet, well, to put it in scale, the iPad Pro is barely a tablet. It's nearly the size of the average laptop screen (not counting those shitty huge ones that don't fit in bags).

2. Have you even looked at that thing? It's pretty slick. Imagine the utility to artists (the Pencil from FiftyThree is an example of its utility).

3. They don't put an ass-like slot in the device to shove it up. Period. That is, I suspect, the biggest problem I've had with a stylus. They just do NOT make it compulsory for you to buy one, contrary to what Microsoft or Samsung would have done (and still do).


I mean, I'm looking to learn, so, even if you give me a project which I don't know anything about, I'll be happier...

Otherwise, I'm a competitive programmer(algorithms and stuff), but I did do web development a year or so back...


If you know algorithms well, create a visual simulator for algorithms. Take user input and then display the result of each step of the algorithm visually for people who want to learn it. Most algorithm books show the steps with boxes etc.; build those dynamically and display the changes the user's input goes through during each step.
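A possible starting point for such a simulator is to have the algorithm yield a snapshot after every step, and keep the rendering separate. Here is a minimal sketch with bubble sort and a plain-text renderer; the function names are hypothetical, and a real visualizer would swap the renderer for a graphical one.

```python
def bubble_sort_steps(values):
    """Yield the initial list, then a snapshot after every swap of a bubble sort."""
    data = list(values)
    yield list(data)
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                yield list(data)


def render(snapshot):
    """Draw one snapshot as rows of '#' characters, one row per value."""
    return "\n".join("#" * value for value in snapshot)


if __name__ == "__main__":
    for step, snapshot in enumerate(bubble_sort_steps([3, 1, 2])):
        print(f"step {step}: {snapshot}")
        print(render(snapshot))
```

Because the sort is a generator, a UI can pull one step at a time, for example on a button press or a timer.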


This would be awesome. One of my favourite recent articles is http://bost.ocks.org/mike/algorithms/ which covers this kind of thing.

