
"Machine learning" used to be a safe haven. You could flee there to escape the Terminators and brain-on-a-chip graphics. Business PR deliberately killed that. They wanted their ML algorithms to be refered to as AI, so they could fully ride the hype train.

AI used to be a tight, quirky community. Having the brain as inspiration led to all sorts of anthropomorphizing. This was ok. Researchers understood what was meant by "learning", "intelligence", "to perceive" in the context of AI. Nowadays, it is almost irresponsible to do this, not because you'll confuse your co-researchers, but because popular tech articles will write about chatbots inventing their own language and having to be shut down.

Still, as a business research lab, it is good to get your name out there, so all the wrong incentives are there: Careful researchers avoid anthropomorphizing, and lose their source of inspiration -- you cannot be careful with difficult unsolved problems, you need to be a little crazy and "out there". Meanwhile, profit-seeking business engineers and their PR departments obfuscate their progress and basic techniques, all to get that juicy article with "an AI taught itself to X and you won't believe what happened next".

The researchers actually busy solving the hard problems of vision, natural language understanding, and common sense, do not have time to write books about how AI is not yet general. Nobody from the research community ever claimed that, nobody came forward to claim they've solved these decades-old problems. It is people selling books railing against the popular reporting of AI. Boring, self-serving, and predictable, and you do not need to fit a curve to see that.

All this quarreling about definitions and Venn diagrams and well-known limitations is dust in the wind. Go figure out what to call it on your PowerPoint presentation by yourself, and quit bothering the community.


What’s wrong with anthropomorphizing?

I’ve noticed at least as many people under-anthropomorphize as over. People who seem obsessed with human exceptionalism and are personally offended at the idea that plants and animals (and computers!) might have subjective experiences like our own.

But to me it seems obvious we are far more like “lower” species than we are unlike them. I would say the cases of human exceptionalism are actually extremely rare. The main source of our uniqueness is that we amalgamate other species, not that we have transcended them.

My theory is that we are terrified that we might be simpler than we think, because socially we behave as if we are so singular. If we are simple, and animals and machines are like us, then maybe we should be treating them with more reverence.

But being afraid of that is OK for a random person. For a machine learning researcher I would hope they are more careful about what we have evidence for (the similarities between us) and what we don’t (that there is some ineffable magic about humans).


Anthropomorphizing is dangerous because it leads to metaphor that can both ascribe too much to the subject and create blind spots in the minds of researchers. Saying, for example, "Dogs want love," is fine for the owner but problematic for a researcher because love, as we understand it, is a human state. We'll never really understand what it means for a dog to feel loved. To the ethologist that is not to say that there are not similar emotional processes for dogs, it's to say that they cannot be understood by analogy to the human ones.

It's sort of like the color perception problem [1]. Dogs and machines do see colors, but what do they see?

1. https://newrepublic.com/article/121843/philosophy-color-perc...


You should go and read some stuff written by ethologists. Basically everything you said would be vehemently disagreed with by a large group of prominent ethologists. The term anthropodenial has even been coined to criticize your exact thinking and to describe the dangers of not anthropomorphizing enough. Not saying you can't overdo it, but the GP's comment is much more in line with thinking by modern ethologists. Frans de Waal is a good place to start.


OK, things may have changed since I studied ethology.


Right, to be fair to you this was a hotly debated topic in ethology (and still is to an extent). However, I would say most modern ethologists have come out on the side of embracing evolutionary parsimony and viewing our human experience as a valuable asset to understanding animals (especially mammals).

Probably the most cited paper regarding this debate is by Marc Bekoff, "Cognitive Ethology: Slayers, Skeptics, and Proponents" (http://cogprints.org/160/1/199709005.html). Your original comment would be categorized as a "slayer", a position which is widely criticized. In fact Bekoff's focus is on canines and he used your exact example with dogs, but to opposite effect.


Phew, I'm surprised to see such an emotionally-charged article on the subject. Everyone who is uncomfortable with anthropomorphism is biased and misguided in some way, but extremist proponents are merely overly enthusiastic.

I do wonder about the theoretical bird scientist trying to figure out the "fixed action patterns" of other animals. If anthropomorphism is the way to go, surely it goes in the other direction in some way.


A review I just read (https://www.frontiersin.org/articles/10.3389/fpsyg.2018.0220...) suggests both of our viewpoints and seems to allow for a continuum of approaches without resorting to name-calling. I think that there's definitely stupidity in the history of "anti-anthropomorphism" if it's really true that people dismissed an article that started by saying bees appear to dance. After all, the fact that they have a behavior like that suggests something interesting is going on. It's also really easy to go overboard in simplifying animal behaviors to our own poorly-understood human behaviors.


Or people who say "The computer thinks...". No, it's a machine that only does what people make it do.


We've seen that threshold crossed with neural agents like AlphaGo which can be reasonably described as thinking. It decides if moves are good or bad after a little pause for processing, its decisions improve with time, it has an opinion on the state of play, the opinion is formed using basically the same data as a human, different iterations of the neural network can have a different opinion but there is a link between it and the previous one.

I don't see a test that majorly distinguishes it from a human. It appears to be following the same process with a few tweaks around the edges. There are some exceptions in the 2-5 situations in Go where a human can actually use optimised logic to determine what will happen; but they aren't the meat of the game.


> We've seen that threshold crossed with neural agents like AlphaGo which can be reasonably described as thinking.

I don't recall ever reading in a technical paper, or in an interview, a leader in the field of ANNs claim they were thinking. If you have, I'd like to see a reference. Most are fairly honest about the differences between artificial neurons and real ones, and between human cognition and what ANNs are doing with data.


Is “thought” even a well defined scientific term? I doubt neuroscientists write about it either.


Chess is one of those areas where humans have developed computer-like abilities, such as exhaustive search. What's interesting is the appearance of intuition-like movement in modern chess computers, but is it ... intuition?


I feel that's just a semantics rabbit hole. "Think" is too broad of a term to be picky about.


They are both a problem; people do think humans are somehow exceptional. We all agree that we are apes but none of us want to admit when we get horny in public.

But ML, AFAIK, is so simple; it's literally glorified polynomial function fitting. The only thing it has going for it is the large data sets that we can train it on. It cannot "learn" anything from a small data set and extract any information out of it without a human imposing his/her knowledge on it.

For instance, take the concept of an even number. This simple knowledge is so powerful in solving algorithmic problems. But it's very hard to make a machine learn this concept in general.
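To make that concrete, here is a minimal sketch (my own toy setup, not from the parent comment, assuming sklearn is available): a small neural net trained to classify integers as even from their raw value can at best memorize the training range and falls to chance outside it, while the symbolic rule n % 2 == 0 generalizes everywhere.

    # Toy demonstration: curve fitters struggle with parity as a concept.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    X_train = rng.integers(0, 1000, size=(5000, 1)).astype(float)
    y_train = (X_train[:, 0] % 2 == 0).astype(int)

    clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=1000)
    clf.fit(X_train, y_train)

    # Integers outside the training range: performance drops to coin-flip level.
    X_test = rng.integers(10000, 20000, size=(1000, 1)).astype(float)
    y_test = (X_test[:, 0] % 2 == 0).astype(int)
    print("train accuracy:", clf.score(X_train, y_train))
    print("out-of-range accuracy:", clf.score(X_test, y_test))  # ~0.5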

I think the problem is really overestimating how "intelligent" humans are. We are only intelligent relative to our own imagination. It's possible that there is an entire class of intelligence outside of our imagination that we cannot fully grasp. Similarly, I am only conscious with respect to my own consciousness, but there may be another class of consciousness that is unimaginable to this monkey's brain.


"Don't antropomorphise computers. They really hate that" (NN)


Very well said. Also, curve fitting is not a corner case. Most relevant and intelligent things we care about can be solved with "just" curve fitting + extrapolation.


I think curve fitting is an important component of future AGI. But it definitely needs causal reasoning baked in, which leads to better models with less data [1,2].

My intuition is that there's a lot of important work to be done using logical representations of models and transforming them back and forth using well-understood semantic operators. Deep functions will be part of said models, but the whole model does not necessarily need to be deep. We can already see hints of the field going in this direction in deep generative models [3].

[1] http://web.stanford.edu/class/psych209/Readings/LakeEtAlBBS....

[2] https://probmods.org/

[3] http://pyro.ai/examples/
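For the curious, here is a minimal sketch of the kind of model-based reasoning the links above discuss, written against the Pyro library from [3]. The model and numbers are entirely made up for illustration: a latent cause (rain) is inferred from an observed effect (wet grass).

    import torch
    import pyro
    import pyro.distributions as dist
    from pyro.infer import Importance, EmpiricalMarginal

    def model():
        # Latent cause: did it rain overnight?
        rain = pyro.sample("rain", dist.Bernoulli(0.3))
        # Effect depends on the cause: wet grass is likely given rain.
        p_wet = 0.9 * rain + 0.1 * (1.0 - rain)
        pyro.sample("wet", dist.Bernoulli(p_wet))

    # Condition on the observed effect and infer the posterior over the cause.
    conditioned = pyro.condition(model, data={"wet": torch.tensor(1.0)})
    posterior = Importance(conditioned, num_samples=1000).run()
    print(EmpiricalMarginal(posterior, sites="rain").mean)  # ~0.79, up from the 0.30 prior

The point of [1,2] is that models with this kind of structure baked in can get away with far less data than fitting the full joint distribution from scratch.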


Causal reasoning is one thing that is lacking. But what about creativity? What about drive and desire? What about belief and the will to fail on the road to success? What about collective intelligence and the need to peer up in efforts? What about emotional intelligence?

I personally do not believe in AGI since I also do not believe in psychology, sociology or neurobiology being anywhere near understanding the holistic nature of our own intelligence. We are getting better at emulating human traits for specific tasks with ML. We lack the specific knowledge of what the algorithm should mimic to become equal to us in terms of our intellect though.


>> But what about creativity? What about drive and desire? What about belief and the will to fail on the road to success? What about collective intelligence and the need to peer up in efforts? What about emotional intelligence?

All this resulted from evolutionary processes. Any approximation of AI which will deal with other agents will develop something like that and more in order to be competitive, collaborate and survive.


> All this resulted from evolutionary processes. Any approximation of AI which will deal with other agents will develop something like that and more in order to be competitive, collaborate and survive.

How can we assume that a simulated evolutionary process of a simple mathematical model or some arbitrarily sized multi-dimensional matrices yields similar evolutionary results?

Just think of the ongoing debate about quantum entanglement effects inside the neural signaling process. On a rather ontological level, we are still unable to formulate a mere definition of our consciousness or things like creativity that lasts longer than a few academic decades...


> Causal reasoning is one thing that is lacking. But what about creativity? What about drive and desire? What about belief and the will to fail on the road to success? What about collective intelligence and the need to peer up in efforts? What about emotional intelligence?

Hi, I work at one of the intersections of machine learning with certain schools of thought in neuroscience. The following is based entirely on my own understanding, but is at least based on an understanding.

Your list here really only has three problems in it: causal reasoning, theory of mind, and "emotional intelligence". Emotional intelligence works in the service of "drive and desire", considered broadly. Creativity likewise works for the emotions. To be creative, you need aesthetic criteria.

Most of that, we're still really working on putting into mathematical and computational terms.


Admittedly, that list is an arbitrary poke into areas of debate in your fields of profession.

As a take on your interpretation of creativity: I would argue that the act of forming new and valuable propositions is not related to emotion or aesthetics per se.

Aesthetic theory is observing a very narrow subset of creative processes. And even there, our transition from modernism into the uncertainty of the post-modernist world defies any sound definition of the "aesthetic criteria". Yet we perceive aesthetic human-creativity all the time.

In a similar vein is the application of generative machine learning that spurs debate about computational aesthetics today. Nothing proves the incapability of modern ML to form real creativity better than the imitating nature of adversarial networks spitting out (quite beautiful) permutations of the simplified data structures underlying the body of Bach's compositions.

Now we could start on the assumed role of complex neurotransmitters in the creative process of the brain and the trivial way reinforcement learning rewards artificial agents, but that would push the scope of this comment.


>Now we could start on the assumed role of complex neurotransmitters in the creative process of the brain and the trivial way reinforcement learning rewards artificial agents, but that would push the scope of this comment.

You can't really separate emotion and aesthetics from the neurotransmitters helping to implement them! They're considerably more complex than anyone usually gives credit for.

Likewise, to form a valuable proposition, you need a sense of value, which is rooted in the same neurological functionality that creates emotion and aesthetics.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2666711/


> Likewise, to form a valuable proposition, you need a sense of value

Point taken


Wow. I want to thank you for engaging on that point! The "Hume's guillotine" dichotomization between "cognitive" processing and "affective" processing tends to be the thing our lab receives the most pushback on.


> The researchers actually busy solving the hard problems of vision, natural language understanding, and common sense, do not have time to write books about how AI is not yet general.

I've come to terms with the hype. There are still researchers doing the hard theoretical work, and they will still be toiling away after the next economic downturn. We can all choose every day whether to find fulfillment through seeking attention from other people, money, or satisfying our curiosity to solve problems.

> Nobody from the research community ever claimed that [AGI], nobody came forward to claim they've solved these decades-old problems. It is people selling books railing against the popular reporting of AI. Boring, self-serving, and predictable, and you do not need to fit a curve to see that.

Hear hear! That said, this is a good article by a respected researcher. Here's what LeCun had to say about it,

> ...In general, I think a lot of people who see the field from the outside criticize the current state of affair without knowing that people in the field actively work on fixing the very aspects they criticize.

> That includes causality, learning from unlabeled data, reasoning, memory, etc. [1]

[1] https://www.facebook.com/yann.lecun/posts/10156387222842143


This is currently true for almost all human endeavors. We're beset with PR people deliberately promoting misconceptions and outright lies. A recent article about "beewashing" is another good example of subverting human attention from real issues by oversimplifying for the purpose of corporate PR. We are constantly bombarded by noise and lies so we won't be able to make sound and rational decisions about anything. In recent years this transformed from a side effect of bottom-line mentality to outright weaponization by powerful entities, political and corporate.


Everything is a lie, until you're tautological. Machine learning itself seems a bit of a misnomer. High-dimensional curve fitting is a good description, imho.


Is nothing a lie, though?


If everything is a lie, then nothing is a lie, and that’s the truth (or not).


If everything is a lie, then "everything is a lie" is a lie.


Hm. No logic to be found there.


That was the point I tried to make. Arguably not very well...


"The researchers actually busy solving the hard problems of vision, natural language understanding, and common sense, do not have time to write books about how AI is not yet general."

Stuart Russell recently published a non-technical book on AI. I really hope tech journalists take note.


Honest question, aren’t the consequences for “real” researchers keeping their heads down quite severe? Won’t we have important policy decisions, both public and private, and billions in funding misdirected for years when they could best be put elsewhere? Sure the “real” researchers will have easier access to funding, which perhaps is a key motivating factor to not push back on the hype, but isn’t there a large opportunity cost to allowing hype and/or bullshit to go unchecked because “they don’t have the time to write a book”?


The consequences of technical subject matter experts dabbling in policy are often pretty bad.

You can get involved in this, but it takes real work (i.e. time taken away from your research area) and an honest understanding that policy issues are their own deep specialty, and that you are likely to be quite naive about them going in.


Hasn't this almost always been true?

On the plus side, it makes it fairly easy to ask cocktail-party-caliber questions and quickly suss out whether your conversation partner knows what the hell they're talking about.


> you do not need to fit a curve to see that

You haven't proven this statement. It's possible that your own brain is nothing more than a rudimentary curve fitting algorithm that allowed you to see this pattern.



This seems contradictory:

> AI can't predict social outcomes

> In most cases, manual scoring rules are just as accurate

So manual scoring rules don't work either for predicting social outcomes? There is some magic sauce that humans use for prediction that we haven't cracked yet? Nothing can predict social outcomes?

AI is perfectly capable of predicting social outcomes, and only in very few cases are manual scoring rules as accurate as black-box AI. The ethical concern is not about accuracy, but about our sensibilities when it comes to protected classes. The author cherry-picked examples where simpler approaches also worked, but says nothing of practical feasibility or increase in variance. Try actually doing face recognition or spam detection with manual rules.

Face recognition being way more accurate is just as much an ethical concern as a gun that is way more accurate. It all depends on who you point it at. Accurate face recognition at the border helps save lives as much as equipping the police with more accurate handguns.

The talk of AGI is misguided. Everybody can see that the economy will be increasingly automated with narrow AI. Just because "big data" was a hype word does not mean companies haven't been monetizing their big data (and were thus right to collect it).

We can predict probabilities about the future. The author is attacking these systems for not being 100% sure. Predictive policing is automated resource management. Militaries have been doing this for decades. It has its drawbacks, but also benefits (wiser usage of tax money, protecting low-income neighborhoods from falling into the hands of gangs).

The author also claims that algorithms automatically turn away people at the border for posting or liking or being connected to terrorist propaganda. But these systems just give a score and a human border guard makes the (more informed) decision.

A system not being 100% accurate is not an ethical concern, as long as we do not treat those systems as 100% accurate and give proper recourse.

Just a spelling check can and does weed out poor candidates. Why does HR want to automate? Because they get 1000+ resumes for a single position. The manual glance they give them pales in comparison to what an automated system can do.

What is more likely? That these HR systems show promise? Or that the VC market has completely lost it (despite working with software and automation for decades, and having AI experts on staff) and is pumping billions into tea-leaf reading, because now it's called "AI"?

If you cheat the system by adding "Cambridge" or "Oxford" in white letters to your CV, is that ethical? Why not add it to your education section in black letters? Would you hire a good potential candidate, if you knew they acted like 90s search engine spammers? Maybe a candidate from Oxford or Cambridge really deserves to be on the top of the pile, or is it now unethical to look at education when hiring?

This presentation likes to mix ethics with technical success. Just say that an HR system is unethical, without calling it bogus with zero proof other than "some AI experts agree that this is impossible".

Yes, there is a lot of snake oil AI, and this will only increase. But these systems can and do work. I am sure there are AI experts building these systems right now.


The US economy (to which RenTech contributes with profits made on foreign markets) is a national security interest. Working to be very rich and then donating a large chunk of money to promote science and maths education also advances technological knowledge. Those deprecated CRUD apps I wrote a few years back served neither.

We'd all like our opponent Poker players to play with their cards open, but if they did, they'd never win. Have to accept that you don't get to see the cards of winning players (which includes military technology until declassified).

I find this "Financial trading does not do any good for society" to be rather simplistic, romantic, and envious. The carpenter who turns wood into a chair is said to be producing value, but the investor who turned uncertainty into a profitable hardwood trade is not.


Trading is largely a zero-sum game. Companies like RenTech earn money through speculation. If a craftsman makes a chair, you have a chair in the economy. If RenTech extracts a few billion by betting on the stock market, someone else has lost a few billion. The only hypothetical benefit is some liquidity, but that is pretty meaningless in today's economy, and there's even some evidence that high-frequency trading has negative net effects on volatility.

So no, this has nothing to do with envy or romanticism. It's just a bad idea to have people with PhDs who could be changing the world play zero-sum gambling games on the stock market. These guys could be reinventing physics or bringing us to Mars. That's what makes people criticize these activities.

And as far as charity is concerned: yes, Simons has done a lot of good. However, the other famous RenTech guy is Robert Mercer, and he is in the business of funding climate denialism. So whether you get a good philanthropist or a bad one is pure luck and has nothing to do with the discussion on financial speculation.


Pretty sure the zero sum game trope is overplayed/inaccurate.


Strange that I think that a sarcastic reply that goes a bit further in moral punishment would be indistinguishable from a real reply these days. Mercer was practically forced to resign due to the backlash that his democratic political views generated against RenTech, but you deem the company as beyond redemption. You speak as if commenters here are idolizing, while being black-and-white religious and faux-cult-collective yourself, prescribing what others, we, should or should not worship. Take that to the street corner. If you gather a large enough audience, I will tell you that you are the one negatively influencing the lives of others. By my and your definition, companies and institutions would then be forced to disable you, your business, and your platform. But first let's go after Mercer.


I can only call “idolizing” the act of speaking well of someone only because he has had “outsized financial returns”. Where’s the “hacker” spirit in that? Why is that important for the alleged audience of this website?

Cry me a river for poor Mercer who was “forced” to resign. Was that comment satire? Had Simons not known who his CEO was before the “backlash”? Of course he did.

I didn’t get the corner part, must be a US thing. Apparently if you demand some intellectual decency you’re “religious” (?!?)


So, assuming nothing against the law, how would they do it legitly? I am guessing:

- Treat the markets as a complex dynamical system and use the tools from statistical physics such as the Gibbs Ensemble, to derive internal states from input and output.

- Treat the markets as an encryption algorithm and use the tools from cryptanalysis, such as differential cryptanalysis: Even when unable to decipher the full algorithm (total break), one may still derive details and a subset of system functionality.

- They were probably the first to heavily use Hidden Markov Models (see Baum–Welch algorithm and the IBM speech recognition recruitment) and keep on the frontline with new machine learning algorithms (their deep learning revolution would have started 10-15 years before industry).

- They'd have an extremely solid backtesting pipeline, where any new feature can be stress-tested for signal (a toy sketch of such a test follows at the end of this list). Features could be very arcane (% of mentions of the currency on neighboring state television) and are constantly (re-)added and removed: concept drift and market competition would gradually weaken signals, but fresh signals are added to keep performance up.

- All features are fed into a single final model (which may be an ensemble of many different forecasting techniques, so as to lower the variance). This model is very dynamic year-by-year (with just a few long-term signal features).

- Finally, I suspect there are strategies that only become available when you have 1 billion under control. In a physics sense: that is a lot of energy / control theory experimentation budget. Normally, hedge funds would like to avoid feedback loops and their trades moving the markets, but I suspect there is a lot of money to be made when you can calculate in which direction the market would move when the system is deprived of, or infused with, a jolt of energy. More hands-off: buy for 1 billion in stock at market open, sell at market close. The buy signal will take a few hours to converge and result in a higher price, so you make a profit when you sell your portfolio to the very buyer's market you created, causing a drop in price to complete the loop.

- The extreme returns for 2007/2008 could be due to the increase in volatility of the crisis (you can make more money when there is a lot of action, and competitors suffer from human herd bias / hysteria), but also, in part, due to them being the first to effectively exploit signals in growing social media platforms and search engines. A few years later it was public knowledge that gauging frequency and sentiment on Twitter was once a valuable signal.

- The NSA/CIA type recruits would not work on industrial spying, but on cryptanalysis, (graph) data mining, OSINT, HUMINT, IMINT, and for the security of the firm (which probably runs a tighter security than the intelligence agencies of smaller countries).
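As promised above, a toy illustration of that signal stress-test (entirely synthetic data and a made-up feature, nothing to do with RenTech's actual pipeline): compute the per-day rank correlation (the "information coefficient") between a candidate feature and that day's returns, then check whether its mean is statistically distinguishable from zero.

    import numpy as np
    from scipy.stats import spearmanr

    rng = np.random.default_rng(42)
    n_days, n_assets = 500, 100
    returns = rng.normal(0.0, 0.01, size=(n_days, n_assets))  # synthetic daily returns

    # Hypothetical candidate feature: yesterday's return plus noise (crude momentum proxy).
    feature = np.vstack([np.zeros((1, n_assets)), returns[:-1]])
    feature += rng.normal(0.0, 0.05, size=(n_days, n_assets))

    # Daily information coefficient: rank correlation between the feature and returns.
    ics = []
    for t in range(1, n_days):
        rho, _ = spearmanr(feature[t], returns[t])
        ics.append(rho)

    ic_mean = np.mean(ics)
    t_stat = ic_mean / (np.std(ics) / np.sqrt(len(ics)))
    print(f"mean IC = {ic_mean:.4f}, t-stat = {t_stat:.2f}")

On this random-walk data the test correctly finds nothing; a real candidate feature would have to clear that bar, and keep clearing it as competition erodes the signal.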


> Treat the markets as a complex dynamical system and use the tools from statistical physics such as the Gibbs Ensemble, to derive internal states from input and output.

No. This is what people like LTCM believe. It does not work; the underlying processes driving markets constantly change.

> - Treat the markets as an encryption algorithm and use the tools from cryptanalysis, such as differential cryptanalysis: Even when unable to decipher the full algorithm (total break), one may still derive details and a subset of system functionality.

> - They were probably the first to heavily use Hidden Markov Models (see Baum–Welch algorithm and the IBM speech recognition recruitment) and keep on the frontline with new machine learning algorithms (their deep learning revolution would have started 10-15 years before industry).

Yes, and as a fun note, Peter Brown, their current CEO, was Geoff Hinton's grad student.

> - They'd have an extremely solid backtesting pipeline, where any new feature can be stress-tested for signal. Features could be very arcane (% of mentions of the currency on neighboring state television) and are constantly (re-)added and removed: concept drift and market competition would gradually weaken signals, but fresh signals are added to keep performance up.

> - The extreme returns for 2007/2008 could be due to the increase in volatility of the crisis (you can make more money when there is a lot of action, and competitors suffer from human herd bias / hysteria), but also, in part, due to them being the first to effectively exploit signals in growing social media platforms and search engines. A few years later it was public knowledge that gauging frequency and sentiment on Twitter was once a valuable signal.

> - The NSA/CIA type recruits would not work on industrial spying, but on cryptanalysis, (graph) data mining, OSINT, HUMINT, IMINT, and for the security of the firm (which probably runs a tighter security than the intelligence agencies of smaller countries).

All correct.


We all know of market manipulation, like the story of the high school kid spamming stocks to Yahoo message boards [1], but what of competitor, industry, or country manipulation? You find out that your competitor is trading on search trend signals, so manipulate the signals and bet against them. Model predicts high uncertainty for the British Pound, so bet against the Pound, and promote a Brexit. Model predicts copper will see a long-term uptrend, so go long on copper futures, invest in heavy copper-use industry to expand their operations, strategically advise them to go long on copper stock so they have an advantage over competition; later-stage investors will notice the positive industry news and growth - and the uptrend in copper. Both demand and scarcity of copper will rise and your investments will be profitable.

BTW: RenTech made a fortune when they were long on oil futures and the Iraq war happened. Another possible legit use for the NSA/CIA type recruits could be for geopolitical intelligence.

[1] https://www.nytimes.com/2000/09/21/business/sec-says-teenage...


The one who ordered the drone strike is responsible (but hardly held responsible). Easy to draw the parallel with ordering the deployment of anti-personnel landmines: the one who ordered deployment is responsible (but may not have signed the treaty, and thus, is hardly held responsible).

Autonomy or explainability is often a red herring. Look at who gives the orders. It is unlikely to ever be the programmer, even if they made a grave mistake. We have a history for that with smart rocket systems.


Television incentivizes forgettable reality TV, the radio incentivizes meaningless poppy music, social media incentivizes bickering about the controversies of today.

But, nowadays, you can also use your TV to watch a French arthouse film, to go to Youtube and be recommended a Japanese jazz album from 1974, to join the conversation on Twitter and ask questions to leaders in their respective fields.

Now you can swim against the current: force all these power- and money-hungry institutions to fundamentally change their tune. Or you can find one of the many new waves to surf. Life is good, science is good, progress is good. The choice, as a scientist, is up to you. Can't write one groundbreaking paper a year? Write two or three mediocre ones. No amount of foundational change is going to make you a groundbreaking scientist. And change the channel once in a while: the world is only getting bigger and more connected.


In the Netherlands there is the NCSC (National Cyber Security Centre). They also scan the internet: "It continuously monitors all (potentially) suspect sources on the internet. When it identifies a threat (such as a virus or an attack on a website), it alerts public authorities and organisations." And they can act as a mediator: "If you discover a security flaw in another government body (such as a municipality or province) or in an organisation with a vital function (such as an energy or telecoms company), please contact the body or organisation first. If you receive no response, please notify the National Cyber Security Centre, which will mediate between you and the body or organisation concerned." With an anonymity guarantee: "The government treats the notifications it receives confidentially. It will not share your personal details with third parties without your permission unless required to do so by law or a court order." While avoiding court cases for doing your civic duty: "When you report the security flaw, check that you comply with the conditions described above. If you do so, the government will not attach any legal consequences to your notification."


The fruit machine was reincarnated for pedosexuals: a device attached to their genitals measures if they get sexual arousal from pictures of children. Those that do are not deemed ready for rehabilitation.

Where most people yell scam or digital phrenology, I have a somewhat contrarian view: these systems do work. It is possible to tell, better than random guessing, if someone is gay or has a violent disposition, with just a single picture. Prisons for violent crimes see way more inmates who are bald, bearded, acned, square-jawed (signs of high testosterone). Replicated studies have shown that the profile pictures of gay men are significantly different from those of straight men, from subtle effects, such as more attention to grooming, to more physically noticeable ones, like the shape of the jaw being more rounded.

I have no reason to disbelieve that an automated system could check for tell-tale signs that someone is hiding something: needing a lot of time to answer basic questions, using their lead hand to cover their chin, looking not in the direction commonly associated with recall, but that of imagination, trembling voice, anxious eye twitches, etcetera.

This is what flawed human border guards are already doing. Israel has the most advanced airport security and trains European and American border guards to detect suspicious behavior. The TSA has over 3000 behavior detection agents. These are people with their own political and religious beliefs, prejudices, and variance -- and they can't be audited rigorously. We just never hear the accuracy, so we can't say if lie detectors can beat this (or can help as a human tool). But I bet they can.

I was disappointed that the actual video chat with the journalist and the digital border guard was not included in the investigative article. They argue that the system should be interpretable, but give no full transparency themselves. I'd trust that she did not tell any lies, but I don't trust that they did not try to game/fool the system, so as to have an actual article to write about. Anyway, using just one test subject is majorly flawed, and comes close to not understanding that science can't provide 100% accurate predictions, just probabilities. I feel it is a reasoning flaw to discard any automated system by homing in on a single mistake.


Why would you wait until someone is no longer a pedophile before deeming them ready for rehabilitation?

Sure, one could grant that gay people on average are slightly more feministic, and criminals on average are more testosteronistic (as are athletes and law enforcement officers).

How do you define "work"? How is this information even slightly usable in a security context? How does it overcome the completely predictable shitshow that it will create in practice?

You say "telltale", but that's not supported by the evidence.

> Israel has the most advanced airport security

Because they have highly trained officers interrogating people and searching packages, not running AI dowsing rods.


Rehabilitation in society: most people do not want convicted pedosexuals who show no signs of betterment to be around children, just like most people do not want murderers released when they say to the prison doctor that they still have an urge to kill.

Work, as in serve as a double-check for a human border agent. If someone failed to correctly (as deemed by a reasonably accurate system) answer all 16 questions, I do not want to fly with that person, before a border guard has had a second look. This is how fraud detection often works: An automated system gives a high score, and possible explanations for this score, and then a human analyst can make a more informed decision.

Here are some telltale signs that someone is lying: https://parade.com/57236/viannguyen/former-cia-officers-shar... & https://www.businessinsider.com/how-to-tell-someones-lying-b...

Model-performance-based accuracy (for both humans and artificial neural networks) supports the evidence for efficacy.

> Because they have highly trained officers interrogating people and searching packages, not running AI dowsing rods.

These highly trained officers also sit behind video cameras to observe passengers. Do you think detecting suspicious behavior from video is AGI-complete? BTW: Israel invests a lot into large-scale face detection at its borders and has plenty of intelligent hardware devices aiding its security. It uses statistics to skip the pat-down of a 5-year-old Israeli boy. They track cars from the moment they enter the parking lot, track the time spent there, and cross-reference whether the car has been near the border or power plants. They may (not sure) do social media analysis, like the US is doing now. The Israeli army unit Intelligence Corps 8200 actively supports airport security. The Israeli border patrol focuses all its attention on passengers, not their luggage (why search their luggage after they've been cleared by a behavior check?). They use TraceGuard to swab clothes for substances. They have a similar Suspect Detection System called VR-1000 which automatically checks for signs of lies, such as profuse body sweat and eye movements. BellSecure ties up all sources of information on the web and in databases to get a better no-fly list. They track their own border agents with automated systems to spot opportunities for learning and misbehavior. WeCU also automatically checks facial cues. They have automated weapon scan systems. Vigilant's surveillance systems are deployed in Israel and the US and act as a digital border guard and motion/gait recognizer.

What may sound like an AI dowsing rod to you, could actually help combat airline terrorism.

> WeCU Technologies (as in "we see you") is a technology company based in Israel that is developing a "mind reading" technology for the purpose of detecting terrorists at airports. The company's products evaluate reactions to specific images for indications that someone is a potential threat.

> The technology involves projecting an image that only a terrorist would be likely to recognize onto a screen. The idea is that people always react when they see a familiar image in an unexpected location. For example, if a person unexpectedly saw an image of their own mother on the screen, their face and body would react. For the terrorist detection, the people passing by the screen would be monitored partly by humans, but mostly by hidden cameras or sensors that are capable of detecting slight increases in body temperature and heart rate. Other detection devices, which are more sensitive and currently under development, could be added later.


>> Here are some telltale signs that someone is lying: https://parade.com/57236/viannguyen/former-cia-officers-shar.... & https://www.businessinsider.com/how-to-tell-someones-lying-b....

That is all bunkum as evidenced by the sources quoted (business insider?).

Just to hand-pick an example I find particularly egregious - that touching one's face is a sign of lying. This guy would disagree:

https://www.youtube.com/watch?v=HlmNqwEhGIk

(Zizek ticking)


No, the sources are from CIA and FBI agents trained in interrogation and spotting lies (and wanting to sell their books, like researchers want their research read). One of the agents used these signs to know that Timothy McVeigh was lying. They also give a counter to your hand-picked example: observe the person in their natural, non-lying state, note any tics, and discount these when interrogating.

Place your lead hand thumb on your cheek and two fingers on your chin and imagine you are talking to someone standing one meter from you. Do you feel sincere?

There is plenty of research that shows that lie detection is not all bunkum, and that techniques such as cognitive overloading help catch lies and lower defenses (which need focus and don't come naturally to most people).


>> Place your lead hand thumb on your cheek and two fingers on your chin and imagine you are talking to someone standing one meter from you. Do you feel sincere?

I really can't think of anything I could do that could make me feel insincere when I was being sincere. This sounds a bit like the discredited claims about power-posing, or smiling to feel better etc.

I'm sorry but I really think you're letting yourself be taken in by some extraordinarily shoddy science and by the pseudo-scientific claims of people who are either engaging in magickal thinking and really believe they can "tell when you're lying" or just charlatans trying to take advantage of the naivete of others.


> I'm sorry but I really think you're letting yourself be taken in by some extraordinarily shoddy science and by the pseudo-scientific claims of people who are either engaging in magickal thinking and really believe they can "tell when you're lying" or just charlatans trying to take advantage of the naivete of others.

Did you win the Putnam?


Please do provide sources for all these claims



You're citing the Stanford gaydar paper, a pseudo-scientific attempt to cash in on the hype about neural nets. It was widely condemned for its ethical and technical deficiencies at the time.

e.g.:

https://thenextweb.com/artificial-intelligence/2018/02/20/op...

Edit: to clarify, I'm also interested in why you think all you say in your comment is true. The sources you cite either do not support your claims, or are disreputable like the deep gaydar paper [edit: or they are irrelevant like the sources about the training of border agents].

For example, I quote from the Wikipedia article on the plethysmograph:

>> 1998 large-scale meta-analytic review of the scientific reports demonstrated that phallometric response to stimuli depicting children, though only 32% accurate, had the highest accuracy among methods of identifying which sexual offenders will go on to commit new sexual crimes.

32% accuracy means those tests are incapable of detecting whatever they're looking for. Even if other tests are worse. My dowsing rod is better than my crystal ball at finding water, but that doesn't make it accurate.


> The sources you cite either do not support your claims, or are disreputable like the deep gaydar paper

"Measuring sexual arousal: https://en.wikipedia.org/wiki/Penile_plethysmograph & https://en.wikipedia.org/wiki/A_Place_for_Paedophiles" certainly seems to support the first claim: "The fruit machine was reincarnated for pedosexuals: a device attached to their genitals measures if they get sexual arousal from pictures of children. Those that do are not deemed ready for rehabilitation."


But the parent maintains that "these systems do work" when the Wikipedia page says the opposite is true.


No. This is what the Wikipedia page says for measuring sexual response in pedosexuals:

> In one study, 21% of the subjects were excluded for various reasons, including "the subject's erotic age-preference was uncertain and his phallometrically diagnosed sex-preference was the same as his verbal claim" and attempts to influence the outcome of the test.[28] This study found the sensitivity for identifying pedohebephilia in sexual offenders against children admitting to this interest to be 100%. In addition, the sensitivity for this phallometric test in partially admitting sexual offenders against children was found to be 77% and for denying sexual offenders against children to be 58%. The specificity of this volumetric phallometric test for pedohebephilia was estimated to be 95%.

> Further studies by Freund have estimated the sensitivity of a volumetric test for pedohebephilia to be 35% for sexual offenders against children with a single female victim, 70% for those with two or more female victims, 77% for those offenders with one male victim, and 84% for those with two or more male victims.[30] In this study, the specificity of the test was estimated to be 81% in community males and 97% in sexual offenders against adults. In a similar study, the sensitivity of a volumetric test for pedophilia to be 62% for sexual offenders against children with a single female victim, 90% for those with two or more female victims, 76% for those offenders with one male victim, and 95% for those with two or more male victims.[31]

> In a separate study, sensitivity of the method to distinguish between pedohebephilic men from non-pedohebephilic men was estimated between 29% and 61% depending on subgroup.[27] Specifically, sensitivity was estimated to be 61% for sexual offenders against children with 3 or more victims and 34% in incest offenders. The specificity of the test using a sample of sexual offenders against adults was 96% and the area under the curve for the test was estimated to be .86. Further research by this group found the specificity of this test to be 83% in a sample of non-offenders.[32] More recent research has found volumetric phallometry to have a sensitivity of 72% for pedophilia, 70% for hebephilia, and 75% for pedohebephilia and a specificity of 95%, 91%, and 91% for these paraphilias, respectively.

These systems work! And being scary, or invasive, or not 100% accurate is no argument to reason that they don't.
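For readers keeping the quoted numbers straight, sensitivity and specificity fall out of a confusion matrix like this (the counts below are hypothetical, chosen only to reproduce the 72%/95% figures quoted above):

    # Hypothetical counts, purely to illustrate the quoted metrics.
    tp, fn = 72, 28   # actual positives: detected vs. missed
    tn, fp = 95, 5    # actual negatives: correctly cleared vs. falsely flagged

    sensitivity = tp / (tp + fn)  # 0.72: fraction of actual positives caught
    specificity = tn / (tn + fp)  # 0.95: fraction of actual negatives cleared
    print(sensitivity, specificity)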


There has been no peer-reviewed paper calling the gaydar paper into question. There has been a master's student who tried to replicate the study with his own crawled dataset, and got better than human guessing, but slightly below the paper's accuracy. News outlets ran with that to say that the study was flawed. Another critique was by a Googler who claimed that the neural net solely looked at eye shadow or glasses, but he also got better than random and human guessing on his own sanitized dataset, and one could argue that eye shadow and glasses are fair game when classifying from a face picture, as they are included in the picture, and these pictures were also shown to the human evaluators (a level playing field).

The Next Web article is by a journalist with a history degree, not an ML scientist. But based solely on the merit of his arguments, he also agrees with the results of the paper:

> there’s nothing wrong with the paper and all the science (that can actually be reviewed) obviously checks out.

and seems to take more issue with the ethical considerations, binary sexuality, and builds his point around: humans have no functioning gaydar at all, so it is insignificant that a neural net could beat a coin flip. His point is weak, as he gives no evidence for humans lacking a gaydar, and the paper (which was not wrong as claimed) includes human assessments which are higher than random guessing.

I think my contrarian view is true from mere pragmatism: Israel has the best airport security in the world, and uses these Suspect Detection Systems extensively, seemingly constantly improving and making enough profit for new players to enter the market. AKA the people that actually do this for a living keep innovating on it, and I find that rather unlikely if all of this is tea leaf reading.

I think, in general, that the HN crowd overreacts when it comes to controversial tech, and that a simplistic "this does not work, and is a sham, and fraud to take research money" is an uninformed, weak claim. It takes a lot of chutzpah to denounce the many months of work of legit scientists as obviously flawed from behind a keyboard when one probably has not even read the full paper. The authors, by picking such a controversial topic, are partly to blame for this pushback and popular media reporting, but that does not make it right.

I will not defend the use of plethysmograph and eye tracking studies to measure a sexual response. I just claim that it is better than random guessing, that it allows for better treatment when measurements are out of line with self-reports, and that it is still in use and very similar to the Fruit Machine. The Fruit Machine is already back.

> My dowsing rod is better than my crystal ball at finding water,

This I do not get -- what do you refer to? (I know you as an ML-knowledgeable person from your other comments, so I am afraid to assume things, but if your crystal ball is random, and your dowsing rod is better than random, you are successfully doing predictive modeling -- no, not a sham? [1]). These systems do not need extremely high accuracy, if they do not auto-deny a person, and it is moving the goalposts a bit to demand accuracy when better than random guessing has been demonstrated (which is questioned by the majority of the commenters here).

> or they are irrelevant like the sources about the training of border agents

User kindly requested sources for all of my claims. I claimed this and sourced it. My point was that we already have human Suspect Detection Systems in place, so either those must go (you have a fundamental problem with SDSs) or they can't be automated (because you don't trust AI research or believe these systems need the common-sense problem solved first). I could then offer counter-arguments to both.

For the question about the eye direction, look at the sourcing for telltale signs of lies I posted in reply to another commenter. It depends on whether you are left- or right-handed.

[1] > A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. - The Strength of Weak Learnability
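That equivalence is exactly what boosting exploits in practice. A minimal sketch on synthetic data (my own illustration, not from the cited paper): depth-1 decision stumps are weak learners on their own, and boosting combines them into a much stronger classifier.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # A single weak learner: a depth-1 decision stump.
    stump = DecisionTreeClassifier(max_depth=1).fit(X_tr, y_tr)
    print("single stump:", stump.score(X_te, y_te))

    # AdaBoost's default base learner is such a stump; 200 of them do far better.
    boosted = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print("boosted stumps:", boosted.score(X_te, y_te))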


Regarding the gaydar paper, yes, I have read the full paper (if memory serves, I read two versions, a pre-print and the published paper). At the time, I wanted to publish a rebuttal, perhaps a letter in a journal or something, but in the end I didn't think I'd be adding much to the debate and the paper had been widely discredited already anyway.

My objection with the methodology in the paper was that the authors had assembled a dataset where the distribution of gay men and women was 50% of the population, i.e. there were as many gay women as straight and as many gay men as straight in the data. This was for one of their datasets, the one where everyone had a picture. There were two more where the distribution was less even but still nothing like what it's usually estimated to be. This despite the fact that the paper itself cited a result that gay men and women are around 7% of the population.

The reason for this discrepancy was clearly to improve the results by reducing the number of false negatives which are expected when there are many more negative than positive examples in binary classification.

This from the point of view of machine learning. There were other flaws that others pointed out, e.g. the choice of metric (I don't remember what it was now, I can look it up if you like), the premising of the paper on prenatal hormone theory, which is another piece of bunkum without any evidence to back it, etc.

And of course there were the ethical considerations.

Sorry but I don't have the courage to reply to the rest of your comment. You write way too much.


Rebalancing an imbalanced dataset is common in industry and academia. You use that when you focus on accuracy, to make claims like "We were 54% accurate at classifying the sexuality of females" easily interpretable, without needing a distribution-balanced benchmark (you simply know the baseline is a coin flip).

If there is signal in the rebalanced dataset, there should be signal in the imbalanced dataset. If they'd switched to logloss or AUC and an imbalanced dataset, do you think now their results would be as good as random? Because that is what you are implying, and you are basically implying the research is fraudulent. This is a very strong claim to make, in the absence of legit discrediting studies that failed to replicate any predictability, and requires more than guessing the authors' rebalancing act was "clearly" to improve the accuracy (with a 7% negative class, you could get 93% accuracy by always predicting the positive class, so if they wanted to inflate the accuracy, they shouldn't have rebalanced).
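A minimal sketch of that point on synthetic data: under heavy class imbalance, accuracy rewards a useless always-predict-the-majority model, while AUC does not.

    import numpy as np
    from sklearn.metrics import accuracy_score, roc_auc_score

    rng = np.random.default_rng(0)
    y = (rng.random(10000) < 0.07).astype(int)  # 7% minority class

    majority = np.zeros_like(y)                 # always predict the majority class
    print("accuracy:", accuracy_score(y, majority))   # ~0.93, yet the model is useless

    random_scores = rng.random(10000)           # a model with no signal at all
    print("AUC:", roc_auc_score(y, random_scores))    # ~0.5: AUC exposes it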

The ethical considerations are moot/personal opinion, as they passed the ethics board of Stanford. Those are people who evaluate ethics of academic research for a living, or are you saying they were also shoddy and wrong to give this a pass?

Magical thinking is not wanting something to be true, because it would be an uncomfortable truth, and so deeming that something which is objectively true, must be false, so you can continue to think happy thoughts in line with your world view.

You keep talking about the paper being widely discredited, but can't provide a single academic source for this. Instead, you question my sources (business insider?) while posting articles from The Next Web written by a journalist with a history degree who does not want the concept of binary sexuality to be true, or even allow it in constructing a dataset of gay and straight people by self-classification.

It takes more energy and letters to attack a point than to make a point. You made quite a lot of weak points.


>> Rebalancing an imbalanced dataset is common in industry and academia. You use that when you focus on accuracy, to make claims like "We were 54% accurate at classifying the sexuality of females" easily interpretable, without needing a distribution-balanced benchmark (you simply know the baseline is a coin flip).

You quoted The Strength of Weak Learnability and I figured you must have at least a passing acquaintance with computational learning theory. In computational learning theory (such as it is) it's a foundational assumption that the distribution from which training examples are drawn is the same as the true distribution of the data, otherwise there cannot be any guarantees that a learned approximation is a good approximation of the true distribution.

The following is a good article on machine learning with unbalanced classes:

http://www.svds.com/learning-imbalanced-classes/

I recommend it as a starting point.

>> This is a very strong claim to make, in the absence of legit discrediting studies that failed to replicate any predictability, and requires more than guessing the authors' rebalancing act was "clearly" to improve the accuracy (with a 7% negative class, you could get 93% accuracy by always predicting the positive class, so if they wanted to inflate the accuracy, they shouldn't have rebalanced).

The gay class was the positive class and the straight class negative, in this case. If you did what you say and identified everyone as straight, you'd get a very high number of false negatives: you'd identify every gay man and woman as being straight. You'd get very high recall but abysmal precision. The authors validated their models using an AUC curve plotting precision against recall and such a plot would immediately show the weakness of an always-say-straight classifier.

>> You keep talking about the paper being widely discredited, but can't provide a single academic source for this.

An "academic source", like a publication in a peer-reviewed journal is not always necessary. For example, you won't find any peer-reviewed work debunking Yuri Geller. In this case my instinct is that no reputable scientist would want to get anywhere near that controversy (and that was one reason I also stayed away).


As to the work being widely discredited, the following is an article that summarises and links to criticisms:

https://greggormattson.com/2017/09/12/tracking-wang-and-kosi...

Some of the criticisms are technical, some are from the point of view of ethics. It would be a grave mistake to discount the ethical concerns, but if you prefer technical explanations there is quite a bit of meat there.


Thanks! That article has a lot of critique and I also like that the author collected the responses from one of the authors.

But, to me, most of the critiques seem uninformed (not made by ML practitioners) and focus on the ethics (where I agree with the authors: we need solid research into weaponized algorithms and to show what is currently possible by ML practitioners, who may use such technology adversarially, and can look at reclassifying profile pictures to the same degree as we do information about sexuality, religion, or political preference). By my estimation, most of the critiques are by people who find this research to be threatening to them, their friends, and their sexual identity. That may very well be the case, but it also leads people to conclude the scientific study was flawed and that an automated gaydar can't possibly work. Two replications by scientists who took issue with the paper, and lack incentive to fudge the data or metric to dress up their paper, also demonstrated a better than random automated gaydar. These systems work! (And that poses a problem we can now tackle, where before we did not even know this was possible, and the majority in this thread still thinks it is all bunkum.)


Many statistical assumptions are regularly broken, for pragmatic reasons (it just works better), or because the world is not static (and so the IID assumption is broken). There is an entire subfield of learning on imbalanced datasets, which includes resampling, subsampling, oversampling, and algorithms like SMOTE. It is common to use these techniques to get a better performance, including on unseen out-of-distribution data. Fraud, CTR, and medical diagnosis models are regularly rebalanced for other purposes than trying to break assumptions or cheat oneself into a seemingly higher accuracy. Plus, the signal does not disappear when training only on originally balanced data. These systems do not work by the grace of a rebalancing trick alone, but they may work better (as is usually the case with neural nets, which do not even give convergence guarantees: something only a statistician would worry about).
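For reference, a minimal sketch of the kind of routine rebalancing described here, using the imbalanced-learn implementation of SMOTE on synthetic data:

    from collections import Counter
    from imblearn.over_sampling import SMOTE
    from sklearn.datasets import make_classification

    # Synthetic dataset with a ~7% minority class, mirroring the base rate above.
    X, y = make_classification(n_samples=10000, weights=[0.93], random_state=0)
    print("before:", Counter(y))

    # SMOTE synthesizes new minority examples by interpolating between neighbors.
    X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
    print("after: ", Counter(y_res))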

You can switch the negative with the positive class and my point remains: if the authors wanted to fraudulently hack the accuracy score, this is way easier with imbalanced data. The AUC metric is robust to class imbalance anyway: the ranking won't change for unseen data out of distribution, and you can just adjust the threshold to match it.

I'd say an academic source is necessary in this case, because you implicitly accuse these scientists of doing shoddy, hyped-up work, with fudging tricks to appear more accurate. I need more than popular media sources or previous HN discussions to admit this paper was "widely discredited".

Your Uri Geller example is a red herring: one is a stage magician, the other is peer-reviewed science. But to oblige: https://scholar.google.com/scholar?q="yuri+geller"


Yes, of course many theoretical assumptions are broken - but that is because people who break them either ignore them completely, or deliberately violate them in order to produce better-looking results. That is more common in industry, where it's easier to pull the wool over the eyes of senior colleagues, but it's not unheard of in academia, quite the contrary. Anyway, just because people do shoddy work and then report impressive results doesn't mean that we should accept poor methodology as if it was good.

In particular about the gaydar paper, the authors cook up their data to get good results and then use those results to claim that they have found evidence for an actual natural phenomenon (hormones influencing haircuts etc). That's just ...pseudoscience.

Is your google scholar link humour?


You seem to be under the assumption that rebalancing is always bad or ignorant. That techniques, such as SMOTE, are only used to produce better-looking results and pull the wool over someone's eyes. This is simply not true. Rebalancing is not shoddy, but accepted practice. It is certainly fair to question it, but not to draw the conclusion of fraud or shoddy science (without looking pretty silly).

Again, I do not think rebalancing data justifies the conclusion that the authors were cooking up their data to report better results. Take a step back and assume good faith: could there be any other reasons to resample data, other than wanting to commit fraud?

The Google Scholar link includes 10+ cited and peer-reviewed papers on the Uri Geller drama.

I don't know enough about hormone theory to say anything against or for their conclusion, just focusing on showing that working automated gaydars that perform better than average/random guessing exist and have been scientifically demonstrated. I can agree with you on that the connection is spurious, without dropping my point that this controversial technology actually works (rebalanced or no).


> On deep nets having better gaydars than average human: https://psycnet.apa.org/doiLanding?doi=10.1037%2Fpspa0000098

Didn't that recognition system boil down to being an eyeglass and eyeshadow detector?


No. (and I feel there is no justification for downvoting requested sources).


>> looking not in the direction commonly associated with recall

Which direction is that?

