This video shows the systems being built and shipped with cooling, cabling, etc.
It’s pretty mind-blowing what this video shows, from the manipulation of atoms and electrons all the way up to these clusters. Particularly mind-blowing for me, someone who has cable-management issues with a ten-port router.
This interested me as a simple-ish approach to a tough-ish problem. From the example Trip Planner file:
"""Command line interface to process a trip request.
We use Gemini flash-lite to formalize a freeform trip request into dates and
a destination. Then we use a second model to compose the trip itinerary.
* This simple example shows how we can reduce perceived latency by running a
fast model to validate and acknowledge the user's request while the good but
slow model is handling it.
* The approach from this example can also be used as a defense mechanism
against prompt injections. The first model, which has no tool access,
formalizes the request into the TripRequest dataclass. The attack surface is
significantly reduced by the narrowness of the output format and the lack of
tools. The second model is then run on this cleaned-up input.
Before running this script, ensure the `GOOGLE_API_KEY` environment
variable is set to the API key you obtained from Google AI Studio.
"""
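The two-model split that docstring describes can be sketched roughly as below. The `TripRequest` field names, the JSON schema, and the stub return values are my assumptions, not the example file's actual code, and the Gemini calls are stubbed out so the structure runs without an API key:

```python
# Sketch of the fast-formalizer / slow-composer pattern from the docstring.
# TripRequest's fields and the JSON shape are hypothetical; the real script
# calls the Gemini API where the stubs below return canned strings.
import json
from dataclasses import dataclass


@dataclass
class TripRequest:
    destination: str
    start_date: str  # ISO dates keep the output schema narrow
    end_date: str


def formalize(freeform: str) -> TripRequest:
    """Fast, tool-less model: freeform text -> narrow schema.

    Stubbed here; in the real example this would prompt Gemini flash-lite
    to emit only a JSON object matching TripRequest.
    """
    raw = '{"destination": "Kyoto", "start_date": "2025-04-01", "end_date": "2025-04-07"}'
    fields = json.loads(raw)
    return TripRequest(**fields)  # raises if the model emits extra or missing keys


def compose_itinerary(req: TripRequest) -> str:
    """Slow, capable model: runs only on the cleaned-up dataclass,
    never on the raw (possibly injection-laden) user text."""
    return f"Itinerary for {req.destination}, {req.start_date} to {req.end_date}"


req = formalize("take me somewhere with cherry blossoms in early April")
print(compose_itinerary(req))
```

The property doing the work is that `compose_itinerary` never sees the raw user text, only the narrow dataclass, which is what shrinks the injection surface.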
Sadly not! It lives in our FE monorepo at work; it just kind of jumped out at us as something to split out and demo once we started using it to test some of our ideas. It's something we'd consider for sure, but for now the lack of an open repo is a bit of tech debt. It's easier to dev in our monorepo to get something out fast.
Iran (and some of its neighbors) starts the new year on the spring equinox, the first day of spring. It’s named Nowruz, which translates to "new day." Kinda makes sense to kick off the year in spring. It’s also pretty precise, given that it’s an astronomical event. It dates back to at least Zoroastrian times (15th century BCE).
All the equinoxes and solstices are celebrated there. The winter solstice is named Yalda Night, which was a few nights ago, and Christmas may be related to this astronomical event. There are also Mehregan and Tirgan. The ancients did like to get together and party.
I like that. I'm in favor of a calendar that works that way. The spring equinox does make a lot of sense: it's when plants start growing again where most people in the northern hemisphere live. The southern hemisphere's seasons being the opposite of the north's actually makes an equinox a more equitable choice for a global calendar start/end point than a solstice.
"Dec"ember used to be the tenth month, which puts the old new year at the beginning of March, a few weeks before the equinox. Also, I haven't noodled this in my head much, but I think it works out in which direction the date slips (hmm, maybe not). It wasn't until Pope Gregory that the accumulated drift from the Julian calendar's leap-year rule was recognized as serious enough to reform the calendar.
The future of training seems to be, at least partly, in synthetic data. I can imagine systems where a “data synthesizer” LLM is trained on open data and probably some licensed data. The synthesizer then generates data “to spec” to train larger models. MoE-type models will likely take different approaches, insofar as something like a mathematics expert likely gets a long way with training data from out-of-copyright works by Newton, Euler, et al.
It's already how we fine-tune open-source LLMs. All of them live off data exfiltrated from GPT-4, and it seems to help close the gap fast. Microsoft had a whole family of papers on this idea: TinyStories, Phi-1, Phi-1.5, Phi-2...
Synthetic data has many advantages - it is free of copyright issues, the downstream models can't possibly violate copyright if they never saw the copyrighted works to begin with.
It is also more diverse, and we can ensure higher average quality and less bias. It can also merge information across multiple sources. Sometimes we can filter using feedback from code execution, simulations, preference models, or humans. If you can "execute" the LLM output and get a score, you're on to a self-improving loop. LLMs can act as agents, collecting their own experiences and feedback.
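A minimal sketch of that execute-and-score filtering loop. The LLM sampling step is replaced here with hand-written candidates (my stand-in, so the loop runs standalone), but the shape is the same: generate, execute against known checks, keep only what passes as new training data.

```python
# Filter synthetic code samples by executing them, as the comment describes.
# generate_candidates stands in for sampling an LLM; it returns one correct
# and one buggy candidate so the filter has something to reject.
def generate_candidates(prompt: str) -> list[str]:
    return [
        "def add(a, b): return a + b",  # correct
        "def add(a, b): return a - b",  # buggy: should be filtered out
    ]


def score(candidate: str) -> float:
    """Execute the candidate and check it against a known test case."""
    ns: dict = {}
    try:
        exec(candidate, ns)
        return 1.0 if ns["add"](2, 3) == 5 else 0.0
    except Exception:
        return 0.0  # syntax errors, crashes, missing function, etc.


# Keep only (prompt, completion) pairs that survive execution feedback;
# these become the next round's fine-tuning data.
dataset = [
    (p, c)
    for p in ["write add(a, b)"]
    for c in generate_candidates(p)
    if score(c) == 1.0
]
print(len(dataset))
```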
I think GPTs are a ploy by OpenAI to collect synthetic data with humans in the loop and tools, to improve their datasets. This would also be in-domain for users and for LLM errors: the data would contain LLM errors and the feedback on them. Very good data, on-policy. My estimate for 100M users at 10K tokens per month per user is 1T synthetic tokens per month. In a year they double the size of the GPT-4 training set. And we're paying and working for it.
But fortunately 12 months after they release GPT-5 we will recover 90% of its abilities in open source models.
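For what it's worth, the token arithmetic in the comment above checks out (the user count and per-user usage are that commenter's assumptions, not measurements):

```python
# Back-of-the-envelope check of the estimate: 100M users x 10K tokens/month.
users = 100_000_000
tokens_per_user_per_month = 10_000

monthly = users * tokens_per_user_per_month  # 1T tokens per month
yearly = monthly * 12                        # ~12T tokens per year

print(monthly, yearly)
```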
> Synthetic data has many advantages - it is free of copyright issues, the downstream models can't possibly violate copyright if they never saw the copyrighted works to begin with.
I feel like we don't know if this is true or not. If we decide models trained on copyrighted data aren't fair game, it's possible we'll decide "laundered" data also isn't.
I mean, maybe that's not feasible. And I hope we don't decide training on copyrighted material is bogus anyway. But I don't think we know yet.
But also - you can totally violate copyright of something you never saw.
Sure, but what matters for copyright is output, not input. For now.
If we make the (poor, imo) decision to prevent training on copyrighted data, that's a restriction on the training process, not on its result.
And in the world where we're making bad decisions to put legal restrictions on the training process, "can't train on data obtained by models that were trained without these restrictions" seems on the table.
This supports the hypothesis that the early molecules we think are needed to get to life form elsewhere in the universe.
There are only a few chemistries that could even support life. Carbon is the one we know the most about, and also the one for which we find the most evidence of reaching that level of chemical complexity in the real universe. Nobody knows, of course, but if there is other life out there, odds seem good that it is carbon-based. (Nobody really can know either; the universe is so far away we can't really detect details very well.)
Hydrogen, carbon, and oxygen are among the most common elements in the universe, so probabilistically life will be carbon-based. And it will probably first form in water. But it would develop tools on land once opposable thumbs are formed (from climbing trees). So I think there is a high chance aliens with technology will look a lot like us.
That logic follows... right up until the leap to arboreal lifeforms with opposable thumbs being an inevitability (or even a prerequisite to tool use!).
Opposable thumbs are not a guarantee. Nor are tree-dwelling lifeforms, nor trees, nor thumbs, nor digits, nor four limbs. For all we know, intelligent life elsewhere might better resemble intelligent octopi using alkaline metals as their first rudimentary energy source as we did with fire.
Why even assume multicellularity is some kind of inevitability?
Hell, it took two billion years to get to single-celled eukaryotes on Earth. But the Earth was teeming with prokaryotes the whole time (two entire separate branches of them, too), and they seem to have sprung into existence almost as soon as the Earth cooled enough.
There's still a lot more of them in terms of total weight than animals and protists. A lot.
I was describing the most probable path, not the only path.
I do think that developing more sophisticated tools underwater is difficult. Once you have plants on land, taller “trees” are very probable, because they are competing for light. Once you have trees, it’s likely that animals will climb them for safety or food. Once they climb, they will likely develop better gripping, etc.
I guess another way to look at this is that life on Earth is not special (although it is still an insanely amazing occurrence).
Abundant carbon is not the most compelling reason for aliens to be carbon based.
Carbon is one of a few elements to be able to easily bind to other elements including itself and able to form long and complex chains. Silicon is another that can form long, complex chains.
Leopards can't begin to approach what a monkey can do [1]: they can't hang safely, because they don't have thumbs. Leopards rely on the penetration of their claws, or on being on top of the branches, like squirrels. It works for squirrels because they're small and the tension forces applied to the bark are negligible, so they can just crawl upside down. You won't see a leopard hanging from anything, in a non-comical way. A leopard does have the advantage of clawing up thick trunks, which is one of the reasons they hunt monkeys that are still on the ground, rather than already in the trees.
Yes, great at hanging, like a coat hanger. You can only easily pull with a hook. With a thumb, you can push, pull, and easily grip, with much less risk of "unhooking." That's the same risk for a leopard: unhooking. Sloths can't climb a thinner vertical rope, or hang from a thinner branch, because their grip strength is relatively weak, due to the leverage. And good luck using any sort of tool, like a stick to get termites, with giant hooked claws!
I think there are lots of worlds that don’t have dry land. I doubt they would develop advanced technology; it's hard to build a fire to melt metals underwater. But perhaps nature will find another way?
The trace chemicals are even more interesting. Phosphorus, while only needed in small amounts, has a critical role in life as we know it. Phosphorus also turns out to be extremely rare in the known universe, to the point that it is noteworthy to find a star that has it.
I don't think this is an accurate reading of that article. See, for example, https://arxiv.org/abs/1704.08282, whose abstract reads in part:
> We also found average [P/Si] = 0.02 ± 0.07 and [P/S] = 0.15 ± 0.15 for our sample, showing no significant deviations from the solar ratios for [P/Si] and [P/S] ratios.
Quoting from the introduction I find:
Phosphorus abundances have been derived in planetary nebulae .... and in damped Lyman alpha systems using ionized phosphorus lines
...
Anomalously high phosphorus abundances have been measured using optical phosphorus features in blue horizontal branch stars ....
Molecular forms of phosphorus, such as PO, PN, and CP, have been detected and used to understand phosphorus chemistry in the interstellar medium ...
For example, phosphorus molecules have been detected in the interstellar medium ... and in star forming regions ... Phosphorus molecules have also been found in the circumstellar envelopes of evolved stars ... Finally, the diffuse interstellar medium has been measured using P II lines....
From afar it seems like the issues around Maven caused Google to pump the brakes on AI at just the wrong moment with respect to ChatGPT and bringing AI to market. I’m guessing all of the tech giants, and OpenAI, are working with various defense departments yet they haven’t had a Maven moment. Or maybe they have and it wasn’t in the middle of the race for all the marbles.
https://youtu.be/1la6fMl7xNA?si=eWTVHeGThNgFKMVG