Easy. 1. Find names appearing in both log files. From this create a Set of custo... | Hacker News

Hacker Newsnew | past | comments | ask | show | jobs | submit

		corethree on Nov 14, 2023 \| parent \| context \| favorite \| on: My favorite coding question to give candidates Easy. 1. Find names appearing in both log files. From this create a Set of customers that appear in both. In python it's just creating a set from the first file, creating another set from the second file and unionizing them. O(N) 2. concatenate both files to treat as one. create a Map with key: Customer and value: Set[Page]. This is basically iterating through the log, when you see a customer append the customer_id to the map and add the page to the set if it already exists. O(N) 3. Filter the map for all customers with length(Set[Page]) > 1. To get the Set of all customers that visited more than one page. O(N) 4. Combine the sets of customers who visited multiple pages with customers that appeared in both log files into a new set. O(N). You can do this with, again the union operator in python. The irony is I'm more able to do this in python then I am in SQL. For SQL stuff at work I just look it up. I never remember the exact syntax. Total runtime O(N) total memory O(N) This is just your basic hashmap stuff for fast look up and aggregation. Usually fang questions are harder than this.

bvrmn on Nov 14, 2023 [–]

union -> intersection

corethree on Nov 14, 2023 | [–]

Thanks for the correction.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact