Roughly, we have three types of color-sensitive cone cells in our eyes. They each have different behavior in terms of how much they "react" to different wavelengths of light. Take a look at this chart: https://en.wikipedia.org/wiki/Trichromacy#/media/File:Cones_...
Each individual wavelength activates all three to some extent: think about this as a point in 3d space. For example, 400nm corresponds to something like (0.1, 0.05, 0.0), from that chart. 500nm might be (0.1, 0.4, 0.3).
But we experience a mix of many different wavelengths at once. So we don't just experience these individual points in 3d space; we can also experience any linear combination of them. For instance, a mix of half 400nm and half 500nm light might be "sensed" by us as (0.1, 0.225, 0.15), even though there may be no individual wavelength that corresponds to that point. Linear mixes of any number of wavelengths cover the entire gamut of what we can perceive.
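To make the linear-mixing idea concrete, here's a tiny sketch using the example numbers above. The response values are rough guesses read off the chart, not measured data:

```python
# Toy illustration: each wavelength's cone response is a point in 3d space,
# and a mix of wavelengths is a weighted (linear) combination of those points.

def mix(responses, weights):
    """Linearly combine per-wavelength cone-response tuples."""
    return tuple(
        sum(w * r[i] for w, r in zip(weights, responses))
        for i in range(3)
    )

nm_400 = (0.1, 0.05, 0.0)   # approximate cone responses at 400nm (guessed)
nm_500 = (0.1, 0.4, 0.3)    # approximate cone responses at 500nm (guessed)

# Half 400nm light plus half 500nm light:
print(mix([nm_400, nm_500], [0.5, 0.5]))  # close to (0.1, 0.225, 0.15)
```

No single wavelength needs to produce that output point; the mix still lands somewhere inside the perceptible gamut.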
The question then for someone picking primary colors for an additive display is: if I can only do a linear mix of three wavelengths, what wavelengths should I pick? What covers the biggest subset of the whole perceptible gamut? It just so happens that red, green, and blue do the best.
If you swapped green for yellow, there would be a section in that 3d space that you could no longer create. Specifically, the area where M cones are strongly activated compared to L and S. Unsurprisingly, this would be the greenest greens.
The tough thing is that, of course, you are right. It's obviously in your best interest to vote against new housing if you own a house. But it's not in society's best interest.
1. why are you so sure? What makes you an authority on the whole society's interests?
2. even if so, are you saying that collective interests trump individual rights? This is socialism 101. Government will decide who should live (and work) where, and who should own what.
1. I am not sure. This is all just my layman's interpretation. I'll admit I could be very wrong and have no expertise. I'm just discussing.
2. No, in general I believe there's a difficult balance to be struck between individual rights and collective interests. I tend to lean more toward individual rights actually.
I would love it if local governments would stop restricting the rights of local developers and allow the free market to determine what is built where more often.
It's been about 4 years since I've been in this world, but I remember there being several products all doing a very similar thing: Presto, Hive, SparkSQL, Impala, perhaps some more I'm forgetting. Is the situation still the same? Or has Presto "won out" in any sense?
Presto and SparkSQL are SQL interfaces to many different datasources, including Hive and Impala, but also any SQL database such as Postgres, and many other types of databases, such as Cassandra and Redis; these SQL tools can query all these different types of databases with a unified SQL interface, and even do joins across them.
The difference between Presto and SparkSQL is that Presto runs on a multi-tenant cluster with automatic resource allocation, whereas SparkSQL jobs tend to need a specific resource allocation up front. This makes Presto (in my experience) a little more user-friendly. On the other hand, SparkSQL has better support for writing data to different datasources, whereas Presto pretty much only supports returning results to a client or writing data into Hive.
I know Hive can definitely query other datasources like traditional SQL databases, Redis, Cassandra, HBase, Elasticsearch, etc. I thought Impala had some support for this as well, though I'm less familiar with it.
And SparkSQL can be run on a multi-tenant cluster with automatic resource allocation - Mesos, YARN, or Kubernetes.
People tend to focus on what has been left out, but think about what they actually did learn about:
Drafting from that pool, item and skill builds, last-hitting, creep aggro, laning in general, jungling, item and spell usage, ganking, team-fight positioning, pushing objectives, warding, map control, farm priority, when to retreat vs engage. All of these require an understanding of micro vs macro goals and how they relate.
Someone who administers some SAP thing for a huge supermarket chain tells me they do all of their upgrades on weekends because their users aren't at work.
Makes sense for them, but it would be a crazy thing to do at the consumer-focused interweb company that I work for.
I rewrote an app that used to only deploy on weekends, because we do a lot of processing on weekday afternoons and nights. I changed the deploys to the mornings.
I justified it by saying that if something breaks, we have everyone there to fix it. Plus, I'm not going to spend my weekend working if I don't have to.
My app has global traffic. Sometimes stuff breaks Sunday night and Asia is the first to find out.
Ugh, I know the feeling - one client I worked for would only do deploys a few times a year if that, and only at night. I mean we'd get extra pay or free time for doing night shifts, but really, it's not good practice.
I read the article, thought to myself, "let's see how HN finds a way to say this is actually bad for privacy", clicked through to comments here, and was not disappointed. The hivemind anti-Google kneejerking is quite out of control.
I'm an engineer who has worked on ad systems like this and I'm really struggling to make sense of this article - what hope does a layman have?
Here's my understanding: Google runs real-time bidding ad auctions by sending anonymized profiles to marketers, who bid on those impressions. The anonymous id used in each auction was the same for each bidder, which is in violation of GDPR. If Google were to send different ids for each bidder, it would be ok? Is this correct?
Why would it matter that the bidders are able to match up the IDs with each other, aren't they all receiving the same profile anyway? Wouldn't privacy advocates consider the sending of the profiles at all an issue?
This is a problem because companies can use this ID to correlate private user data, without anyone's knowledge or consent.
There are companies that specialise in sharing user information. Some of them work by only sharing data with companies that first share data with them (an exchange).
If you got this Google ID, and you had a few other pieces of information about the user, you could share that data with an exchange, indicating that the Google ID is a unique identifier. Then, the exchange would check if it has a matching profile, add the information you provided to that profile, and then return all of the information they have for that profile to you.
So, let's say you're an online retailer, and you have Google IDs for your customers. You probably have some useful and sensitive customer information, like names, emails, addresses, and purchase histories. In order to better target your ads, you could participate in one of these exchanges, so that you can use the information you receive to suggest products that are as relevant as possible to each customer.
To participate, you send all this sensitive information, along with a Google ID, and receive similar information from other retailers, online services, video games, banks, credit card providers, insurers, mortgage brokers, service providers, and more! And now you know what sort of vehicles your customers drive, how much they make, whether they're married, how many kids they have, which websites they browse, etc. So useful! And not only do you get all these juicy private details, but you've also shared your customers' sensitive purchase history with anyone else who is connected to the exchange.
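As a toy model of the mechanism described above (all names, IDs, and fields here are invented): the only thing the exchange needs is a shared key, and the Google ID serves as that key.

```python
# Hypothetical sketch of a data exchange: unrelated companies contribute
# what they each know about a person, keyed on a shared ID, and get back
# the merged profile. Purely illustrative, not any real API.

class Exchange:
    def __init__(self):
        self.profiles = {}  # shared_id -> merged profile dict

    def share(self, shared_id, data):
        """Contribute data keyed on the shared ID; receive the merged profile."""
        profile = self.profiles.setdefault(shared_id, {})
        profile.update(data)
        return dict(profile)

exchange = Exchange()

# A retailer contributes a customer's email and purchase history...
exchange.share("gid-123", {"email": "jane@example.com", "bought": "stroller"})

# ...and an insurer, using the same ID, contributes its own data and
# gets back everything the retailer shared, merged into one profile.
merged = exchange.share("gid-123", {"car": "minivan", "married": True})
print(merged)
```

The point is that neither party needs to know who "gid-123" is for the merge to work; the common ID alone is enough to link their datasets.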
I have no doubt that if you had a record of my browsing habits for 2-3 days you could readily identify who I am the next time you have my browsing habits for that period of time.
I wouldn't be surprised at all if 2-3 hours of active browsing was enough for this.
It seems likely that the ad network could detect the change in ID if the expiration happens in the middle of a browsing session. And considering user habits, people are probably online at the same time every day, or have habits that cycle weekly.
Also, considering we largely do the same things every week and every day, I suspect a single day would give you at least 50% of a user's identifying data, and a week at least 80%. That leaves a whole week of pretty accurate tracking.
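A crude sketch of why habitual browsing defeats ID rotation: compare the set of sites seen under a fresh ID against sets from old profiles, and link the fresh ID to the best match. Real systems are far more sophisticated; all IDs and site names here are invented, and set overlap (Jaccard similarity) stands in for whatever matching a real tracker uses.

```python
# Toy re-identification: a "fresh" anonymous ID is linked back to an old
# profile because the user visits mostly the same sites every day.

def jaccard(a, b):
    """Similarity of two sets: size of intersection over size of union."""
    return len(a & b) / len(a | b)

old_profiles = {
    "gid-old-1": {"news.example", "forum.example", "mail.example"},
    "gid-old-2": {"shop.example", "video.example"},
}

# Sites observed under a brand-new ID, after the old one expired:
fresh = {"news.example", "forum.example", "mail.example", "blog.example"}

best = max(old_profiles, key=lambda gid: jaccard(old_profiles[gid], fresh))
print(best)  # the fresh ID links back to "gid-old-1"
```

Rotating the ID only helps if the new ID's behavior looks nothing like the old one's, which for most people's routines it won't.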
I think you've made a pretty wild claim that 14 days isn't enough time to build a useful profile. Regardless, even if the usefulness of the data over two weeks is questionable, it's still illegal to share the data in this way. You wouldn't be too happy if someone broke into your house and "only" stole a single fork.
Considering how much time many people spend online, and how efficient these profiling systems have become, I wouldn't be surprised if 14 days was plenty of time.
The time of validity and how hard it might be to build a profile are not factors in whether or not this is legal under GDPR. Here's the actual text from GDPR on pseudonyms and synthetic keys of this type[1]
> The principles of data protection should apply to any information concerning an identified or identifiable natural person. Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person
So PII that has been pseudonymized (mapped to a gid in this case) is protected in exactly the same way as if it had not been pseudonymized, provided the data could still be attributed to a natural person by the use of additional information. The pseudonym (gid) is itself also considered PII under GDPR.
[1] https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CEL...
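To illustrate the "additional information" point (all records here are invented): pseudonymised events look anonymous on their own, but a separately held mapping makes them attributable to a person, which is exactly the condition the recital describes.

```python
# Sketch: pseudonymised data plus "additional information" re-identifies
# a natural person, so the pseudonymised data is still personal data.

pseudonymised_events = [
    {"gid": "gid-42", "page": "/mortgage-rates"},
    {"gid": "gid-42", "page": "/divorce-lawyers"},
]

# Held separately (possibly by a different party), but combinable:
additional_info = {"gid-42": "Jane Doe"}

for event in pseudonymised_events:
    person = additional_info.get(event["gid"])
    print(person, "visited", event["page"])
```

The events table alone never names anyone; the moment the mapping exists anywhere, the whole dataset is information on an identifiable natural person.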
> The pseudonym (gid) is itself considered PII under GDPR.
I know of multiple systems that use a UID but throw away a user’s information, including the UID mapping, when the user leaves. This allows historic metrics to be retained without ever identifying a user who isn’t still using the system.
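A minimal sketch of that retention scheme, with invented names: aggregate metrics stay keyed by UID, but the mapping from UID to a real person is deleted when the user leaves, so the remaining history no longer identifies anyone.

```python
# Toy version of "keep historic metrics, throw away the UID mapping".

user_mapping = {"uid-7": "jane@example.com"}     # identifying: UID -> person
historic_metrics = {"uid-7": {"logins": 120}}    # metrics keyed by UID only

def forget_user(uid):
    """Delete the identifying mapping but keep the aggregate history."""
    user_mapping.pop(uid, None)

forget_user("uid-7")
print("uid-7" in user_mapping)    # False: the user is no longer identifiable
print(historic_metrics["uid-7"])  # metrics retained
```

Whether this fully escapes GDPR's definition of personal data depends on whether the metrics themselves could still single the person out, but deleting the mapping is what removes the obvious link.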
Thank you: that explanation is the first that makes sense to me.
I get the impression that this structure would require an exchange: retailers would not trust each other otherwise.
Wouldn’t commercial pamphlets, interviews with salespeople, etc., from the exchange be obvious proof of illegal behaviour there? Google’s implementation is imperfect but, for the loophole to work, it would need coordination between several competitors and a third party whose business model is explicitly and almost exclusively about circumventing GDPR.
If I can risk a comparison: Google is like a chemical company selling fertilizer, and the exchange is selling bombs made from raw materials bought from others.
Am I missing the point? Shouldn’t this article be about those exchanges and their clients, not Google?
> Why would it matter that the bidders are able to match up the IDs with each other, aren't they all receiving the same profile anyway?
I would guess that yes, they're all receiving – _from Google_ – the "same profile", but they're also collecting additional info that they can then share with each other and, because they can match profiles exactly, they can access each other's info about specific people.
> Wouldn't privacy advocates consider the sending of the profiles at all an issue?
I'd imagine that the profile Google has and shares is by itself fairly anodyne, but I could be (very) wrong about that. The problem seems to be more (if not entirely) that different advertisers can share info using a common profile ID.
I'd imagine that even a single advertiser would be able to perform a similar 'attack' by, e.g. running multiple different campaigns, but I may be misunderstanding exactly what info is being shared. It's possible advertisers are able to match the Google profiles to specific unique identities and thus are sharing much more than just the info they're collecting directly from their ads.
I'd imagine they are responsible too, not Google alone, and that Google is a much more attractive target for GDPR enforcement both because they're larger, have more money, and are more visible, and because they're directly facilitating the "different advertisers" sharing that info.
If Google ceases to provide them the means of readily sharing info, then all of those entities will no longer be violating the GDPR, in that scenario anyway.
Are they maybe only receiving a partial profile, with info relevant to that ad buy? And by compiling that data with the unique identifier, they can match it with other partial data from other ad buys?