Not specific to this, but if you can solve the issues with progressive lenses, you have a big market. Progressive lenses suck. They simply do not work, cause too much strain, and are practically unusable day to day. What would be great is a digital lens that changes depending on your activity. Driving? Switch to distance vision. In front of a laptop? Switch to reading glasses, etc. I will be your customer if you solve this.
OpenAI continues to muddy the benchmarks, while Claude continues to improve in intelligence. Claude will win long term. It'd be wise not to rely on OpenAI at all. They are the first movers who will just burn cash and crash out, I suspect.
When marketing talks about the price delta and not the quality of the output, it is DOA. For LLMs, quality is the more important metric, and Nova would be playing catch-up with the leaderboard forever.
Maybe. The major models seem to be about tied in terms of quality right now, so cost and ease of use (e.g. you already have an AWS account set up for billing) could be a differentiator.
Using LLMs via Bedrock is 10x more painful than using the direct APIs. I could see cost consolidation via the cloud marketplace being a play - but I don't see Amazon's own LLM initiatives ever taking off. They should just shut those shops down and buy one of the frontier labs (while it is still cheap).
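To make the pain concrete, here's a minimal sketch of the same one-shot call both ways (Python, with boto3's bedrock-runtime and Anthropic's SDK; model IDs are examples). The per-provider body schema on Bedrock is most of the friction:

    import json
    import boto3

    # Bedrock: model access must be enabled per region first, and the
    # request/response body format differs per model provider.
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
    resp = bedrock.invoke_model(
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # example ID
        contentType="application/json",
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 256,
            "messages": [{"role": "user", "content": "Hello"}],
        }),
    )
    print(json.loads(resp["body"].read())["content"][0]["text"])

    # Direct API: one SDK, one message format, no per-model body schema.
    from anthropic import Anthropic

    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    msg = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=256,
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(msg.content[0].text)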
The major models are not tied in terms of quality. GPT-4 and o1 still beat everyone else by a significant margin on tasks that require in-depth reasoning. There's a reason why people just don't go for the cheapest option, whatever the benchmarks say.
Exactly. Citing cost has been an AWS play that worked during the early days of cloud - so they are trying to stick to those plays. They don't work in the AI world. No one wants a faster/cheaper model that gives poor results (besides, the cost of frontier models keeps coming down - so these are just dead initiatives IMO).
On LLMs, my experience with Claude has been much better than with OpenAI's models (though my use case is mostly code generation).
For more complicated stuff, I did some experiments using LLMs to drive high-level AI decisions in video games. Basically, the model gets a data schema and a question like "what do you do next?", and can query the schema to retrieve the info it thinks it needs to give the best answer. GPT-4, and o1 especially, are consistently the best performers there, both in the richness of the queries they produce and in how they make use of the results.
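Not my actual harness, but a toy sketch of the loop I mean (OpenAI's Python SDK assumed; the state, schema, and QUERY/ACTION protocol are made up for illustration):

    import json
    from openai import OpenAI

    client = OpenAI()

    # Hypothetical game state the model can inspect.
    GAME_STATE = {
        "units": [{"id": 1, "hp": 40, "pos": [3, 7]}],
        "enemies": [{"id": 9, "hp": 60, "pos": [4, 6]}],
        "resources": {"gold": 120, "wood": 30},
    }
    SCHEMA = ("Top-level keys: units, enemies, resources. "
              "Reply QUERY:<dotted.path> to inspect state, "
              "or ACTION:<decision> when ready.")

    def run_query(path):
        # Resolve a dotted key path like "resources.gold" against the state.
        node = GAME_STATE
        for part in path.split("."):
            node = node[part]
        return node

    history = [{"role": "user", "content": SCHEMA + "\nWhat do you do next?"}]
    for _ in range(5):  # cap the query budget
        reply = client.chat.completions.create(model="gpt-4o", messages=history)
        text = reply.choices[0].message.content.strip()
        history.append({"role": "assistant", "content": text})
        if text.startswith("ACTION:"):
            print("decision:", text)
            break
        if text.startswith("QUERY:"):
            result = run_query(text[len("QUERY:"):].strip())
            history.append({"role": "user", "content": "RESULT: " + json.dumps(result)})

The interesting signal is in which queries the model chooses to run before committing to an action, not just the final decision.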
There's also a bunch of interesting examples along the same lines here: https://github.com/cpldcpu/MisguidedAttention. Although I should note that even the top OpenAI models have trouble with much of this stuff.
https://github.com/fairydreaming/farel-bench is another interesting benchmark because it's so simple, and yet look at the number disparity in that last column! It's easy to scale, too.
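The premise is simple enough to re-derive. Here's a sketch of the same idea (not the repo's actual generator): emit parent-of facts in shuffled order and ask for the relationship between the ends of the chain; difficulty scales with chain length:

    import random

    NAMES = ["Ada", "Ben", "Cleo", "Dan", "Eve", "Finn", "Gus", "Hana"]

    def make_problem(depth):
        # A parent chain of `depth` links, with the facts shuffled so the
        # model has to reassemble the chain before naming the relation.
        people = random.sample(NAMES, depth + 1)
        facts = [f"{people[i]} is a parent of {people[i+1]}." for i in range(depth)]
        random.shuffle(facts)
        answer = {1: "parent", 2: "grandparent"}.get(
            depth, "great-" * (depth - 2) + "grandparent")
        return " ".join(facts) + f" What is {people[0]} to {people[-1]}?", answer

    q, a = make_problem(3)
    print(q)  # e.g. "Dan is a parent of Eve. Ada is a parent of Ben. ..."
    print(a)  # "great-grandparent"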
Unfortunately, we're still at the point in this game where even seemingly trivial and unrelated changes to the prompt (slightly rewording it, and in some cases even capitalization) can have a large effect on the quality of the output, which IMO is a tell-tale sign that the model is operating in "stochastic parrot" mode more than doing any kind of actual reasoning. Thus benchmarks can be used to screen out the poorly performing models, but they cannot reliably predict how well a model will actually do what you need it to do.
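This is cheap to check for yourself. A minimal sketch (OpenAI's Python SDK assumed; any chat client works): ask the same question in trivially different phrasings at temperature 0, and any disagreement between answers is pure prompt sensitivity:

    from openai import OpenAI

    client = OpenAI()

    BASE = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 "
            "more than the ball. How much does the ball cost?")
    VARIANTS = [
        BASE,
        BASE.lower(),  # capitalization change only
        BASE.replace("How much does the ball cost?",
                     "What is the price of the ball?"),  # light rewording
    ]

    for prompt in VARIANTS:
        resp = client.chat.completions.create(
            model="gpt-4o",
            temperature=0,
            messages=[{"role": "user", "content": prompt}],
        )
        print(prompt[:40], "->", resp.choices[0].message.content.strip()[:60])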
I continue to be puzzled by leetcode-type things. Will coding exercises like this become obsolete someday, or will the old guard continue to push for them in interviews? Today's coding copilots can easily replace leetcode "winners".
It replaced “did you go to a top university” and “do you know our exact stack and tools” as the top hiring practice. Now anyone can compete for the same jobs from anywhere in the world. It’s a leveller, in my mind.
I applied to Atlassian as a 50 year old from South Africa who has very broad and deep experience but has never worked in big tech. Failed at the last interview (of 5) but I found the process and feedback very fair and much better than others I’ve seen elsewhere. The focus was on collaboration and communication and not being a jerk, along with the problems to be solved. I enjoyed it.
The people / companies using it to hire will tend to hire people who studied leetcode.
The companies that don't use leetcode / don't use it as a strong signal will continue not using it.
In my opinion, leetcode is a poor signal if used as a binary decision ("were they right or wrong?"). The more important thing is communication and how they worked through the problem. I've heard this in multiple places and I absolutely agree: attitude over aptitude. You can teach knowledge; you can't teach attitude or the ability to problem-solve.
The companies that use them as a strong signal are the ones that will be absolutely demolished in engineering because they are on a fast track for a staff full of rote memorization rather than strong creative problem solving. They'll be handicapped when it comes to solving problems that aren't covered by leetcode, because no one there ever bothered to learn how to solve them.
> The companies that use them as a strong signal are the ones that will be absolutely demolished in engineering because they are on a fast track for a staff full of rote memorization rather than strong creative problem solving.
What metric are you using to measure this? I think it's a positive correlation at best as far as good business outcomes are concerned, and maybe uncorrelated at worst. Google and Meta have been known to ask contrived puzzle questions for years, even before Leetcode (and Meta is now infamous for the high bar of its Leetcode interviews), and I do not see their engineering being "demolished", as far as results are concerned.
What people don't realize is that Leetcode selects for generally positive traits, no matter how it's solved. In my mind, here are the things LC selects for:
1. Actual innate skill. If someone has never practiced leetcode before but can solve an arbitrary new coding question, they probably have a pretty decent aptitude for problem solving.
2. Determination. If someone practiced leetcode for hours a day just to pass an interview, that's commendable. Does it really matter that they don't "know" the problems if they studied hard and passed? They might be more likely to work hard on the job, given the right rewards.
Leetcode-style interviews were basically created as a thinly veiled aptitude test, and in theory they are still good at that. If you _can't_ solve an easy-to-medium leetcode question, then what does that say about you, given that failure is basically the inverse of the two signals above?
> I've heard this in multiple places and I absolutely agree: attitude over aptitude. You can teach knowledge; you can't teach attitude or the ability to problem-solve.
Seems like society should try to figure this out... if attitude is so important, why can’t we cultivate it systematically?
The way to cultivate it is letting people figure things out on their own and rewarding atypical solutions that arrive at the same conclusion.
A lot of education tries to deliver everything in a specific box and in a specific way, and any other method is punished. Society would likely get more value if education were more diverse in perspectives and methods. It took me way too long to realize that I didn't care about anything but computers. I started doing better in my classes when I framed everything through my own lens rather than viewing it through the instructor's lens.
I don't know the correlation, but I think a lot of entrepreneurial families tend to have children who are good problem-solvers - or maybe it is selection bias among my friends. It could be genetic, but I think these families tend to reward (or at least not punish) ingenuity. In addition, I believe part of the equation is leaving children alone to figure things out on their own. It sucks, but I think letting your kids jump into the deep end of the proverbial pool makes them better for it, even if they flail for a bit.
Give them boundaries, but give them the space to figure things out and make mistakes. I know it drove my parents crazy that my siblings and I took everything apart and argued systematically, but it paid off in our adulthood.
> Seems like society should try to figure this out... if attitude is so important, why can’t we cultivate it systematically?
We do. It's called culture.
But culture doesn't optimize for something so narrow since it exists in a much wider context than just "what's good for knowledge workers in a capitalist system".
If candidates know going in that it's a leetcode interview it filters for "do you want to work here so much that you'll spend a month or two drilling on these exercises". Which can be a useful filter if there are thousands of qualified applicants.
Not sure what you mean. Leetcode-type things solve a problem of the past. The future will be copilots doing a lot of the grunt coding work, and the human value-add will be instructing copilots the right way and bringing in broader contextual information. Solving binary search tree problems during interviews will eventually go away (once the old guard goes away).
We agree that solving binary search tree problems by hand is not needed - that's an old problem. The problem is that if you need people to choose and adapt algorithms for a specific new problem, the copilot will have a hard time, and so will you if you don't know how the algorithms work and have never adapted one. Copilots are not good at choosing the correct algorithm and adapting it to new problems, as far as my experience with the competitive programming course I took goes, anyway.
If you have an old problem, copilot can solve it, true.
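To make "adapting" concrete: textbook binary search finds an exact match, but many real problems need the boundary variant - the first index where some predicate flips - and reshaping the invariant is exactly the part that takes understanding rather than recall. A sketch:

    def first_true(lo, hi, pred):
        # Binary search adapted from "find this exact value" to
        # "find the first index in [lo, hi) where pred becomes True".
        # Invariant: pred is False everywhere left of lo, True at hi and beyond.
        while lo < hi:
            mid = (lo + hi) // 2
            if pred(mid):
                hi = mid      # answer is at mid or to its left
            else:
                lo = mid + 1  # answer is strictly to the right
        return lo

    # e.g. find the first broken build in an ordered build history:
    builds_ok = [True, True, True, False, False]
    print(first_true(0, len(builds_ok), lambda i: not builds_ok[i]))  # -> 3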
This is unreal. I have never seen anything this fast. How? I mean, how can you physically ship the bits this fast, let alone run an LLM?
Something about the UI doesn't work for me. Maybe I like the OpenAI chat interface too much. Can someone bring their own data and train? That would be crazy!