> The presence of terrestrial microorganisms within a sample of Ryugu underlines that microorganisms are the world's greatest colonizers and adept at circumventing contamination controls. The presence of microorganisms within space-returned samples, even those subject to stringent contamination controls, is therefore not necessarily evidence of an extraterrestrial origin.
Basically, the point is that preventing terrestrial contamination of extraterrestrial samples is extremely difficult, and in the specific case of Ryugu the study concludes that contamination did occur.
These two phases have pretty different performance characteristics - prefill is compute heavy and can fully saturate the GPU. For long contexts, it can be nigh impossible to do it all in a single pass - frameworks like vLLM use a technique called "chunked prefill".
The decode phase is memory-bandwidth intensive, but tends not to saturate GPU compute.
If you are serving these models, you really want larger batch sizes during inference, which only really comes with scale - for a smaller app, you won't want to make the user wait that long.
So, long contexts only have to be processed _once_ per inference, which is basically a scheduling problem.
But the number of decode passes scales linearly with the output length. If output length were unlimited, some requests could end up just _always_ present in an inference batch, reducing throughput for everyone.
Decode speed is generally memory-bandwidth bound. Prefill is typically arithmetic bound. This is the reason for mixed batches (both decode and prefill) - it lets you saturate both memory bandwidth and arithmetic.
Chunked prefill is for minimizing latency for decode entries in the same batch. It's not needed if you have only one request - in that case it's fastest to just prefill in one chunk.
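For concreteness, here's a minimal sketch of how chunked prefill can be turned on in vLLM (assuming a recent vLLM version; the model name and token budget below are placeholders, not values from the thread):

```python
from vllm import LLM, SamplingParams

# Enable chunked prefill so long prompts are split into fixed-size chunks
# that can be co-scheduled with decode steps from other requests.
llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
    enable_chunked_prefill=True,
    max_num_batched_tokens=2048,  # per-step token budget shared by prefill chunks and decodes
)

out = llm.generate(["A very long prompt ..."], SamplingParams(max_tokens=128))
print(out[0].outputs[0].text)
```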
I'm pretty sure the sibling comment is right about different length limits - it comes down to training, and the model talking nonsense if you let it generate for too long.
Chunked prefill or some similar technique is also necessary for serving long context requests where there is not enough GPU memory available, regardless of concerns about latency.
For example, consider a prompt sent to Llama 3.1 405B that uses 128k input tokens.
The KV cache will be 123GB. No matter how many GPUs you shard the model across, you are not fitting that KV cache in GPU memory (an H100 has 80GB).
You can do tensor parallelism 8 ways (8 KV heads). You can also do pipeline parallelism (there are 126 layers). Either way would work. A million tokens is possible, just very slow.
Also, for 405B the per-token KV cache is 8 KV heads of size 128 (hidden_size / num_attention_heads) × 126 layers [0] × 2 (K and V) × 2 bytes (bf16) ≈ 504 KB per token. At FP8 it's ~252 KB.
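A quick script reproducing that arithmetic with the numbers quoted above (8 KV heads, head size 128, 126 layers), then scaling it up to a 128k-token prompt:

```python
# Back-of-the-envelope KV cache sizing for Llama 3.1 405B (numbers from the comment above).
num_kv_heads = 8
head_dim = 128          # hidden_size / num_attention_heads = 16384 / 128
num_layers = 126
bytes_per_elem = 2      # bf16; use 1 for FP8

per_token = num_kv_heads * head_dim * num_layers * 2 * bytes_per_elem  # x2 for K and V
print(f"KV cache per token: {per_token / 1024:.0f} KiB")               # ~504 KiB

context = 128 * 1024
total = per_token * context
print(f"KV cache for {context} tokens: {total / 2**30:.0f} GiB")       # ~63 GiB in bf16
```

That works out to roughly 63 GiB at bf16 for a 128k-token prompt (half that at FP8), before counting weights and activations.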
It is also a training issue. The model has to be trained to reinforce longer outputs, which has a quadratic train-time cost and requires suitable long-context response training data.
They definitely have to be trained to reinforce longer outputs, but I do not believe this adequately explains the low-ish generation limits.
We are starting to see models with longer and longer generation limits (gpt-4o-mini having 16k, the o1 models going up to 64k), as well as longer and longer context limits (often 128k, google offering a million).
I find it very unlikely they are actually training with inputs or outputs near these maximums.
If you want to convince yourself, do the attention calculation math for these sequence lengths.
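As a rough sketch of that math (assuming a 405B-class config with 128 attention heads and 126 layers; real training kernels like FlashAttention never materialize the full score matrix, but the quadratic growth in work is the same):

```python
# Illustration of quadratic attention scaling with sequence length.
heads, layers, bytes_bf16 = 128, 126, 2

for seq_len in (4_096, 16_384, 65_536, 131_072):
    scores = seq_len ** 2                          # entries in one QK^T matrix
    per_head_layer = scores * bytes_bf16           # bytes if naively materialized
    print(f"{seq_len:>7} tokens: {per_head_layer / 2**30:6.2f} GiB per head per layer "
          f"({scores * heads * layers:.2e} score entries across the model)")
```

Even though the full matrix is never stored in practice, the FLOPs to compute it still scale with seq_len², which is why training with inputs or outputs near these maximums is so expensive.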
You can also see how OpenAI restricts the sequence length for fine-tuning to 64k - almost certainly bound by available GPU memory.
I suspect the 4096 limits have been set as a "reasonable" limit for a myriad of reasons.
It kind of reads like it was LLM generated..."Write an announcement/apology about ElasticSearch finally being open source again (for now), where each paragraph starts with a relevant title from a Kendrick Lamar song"
To me, at first, it read as satire - but that doesn't make sense, coming from the official blog. Being LLM-generated is a plausible explanation - considering the circumstances, saying "open source is in our DNA" is right inside the uncanny valley.
You could make the argument that two things that we don’t understand are the same thing because we’re equally ignorant of both in the same way that you could make the argument that Jimmy Hoffa and Genghis Khan are probably buried in the same place, since we have equal knowledge of their locations.
Clearly there is a difference between a small person hidden within playing chess and a fully mechanical chess automaton, but as the observer we might not be able to tell the difference. The observer's perception of the facts doesn't change the actual facts, and the implications of those facts.
The Mechanical Turk, however, was not a simulation of human consciousness, reasoning, chess-playing or any other human ability: it was the real thing, somewhat artfully dressed up so as to appear otherwise.
Is it meaningful to say that AlphaGo Zero does not play Go, it just simulates something that does?
Well, I do not proclaim consciousness: only the subjective feeling of consciousness. I really 'feel' conscious, but I can't prove or 'know' that I am in fact 'conscious' and making choices... To be conscious is to 'make choices', instead of just obeying the rules of chemistry and physics - which you have to break in order to be conscious at all (how can you make a choice at all if you are fully obeying the rules of chemistry, which have no choice?).
A choice does not apply to chemistry or physics: so where does choice come from? I suspect it comes from our fantasies and not from objective reality (for I do not see humans consistently breaking the way chemistry works in their brains) - it probably comes from nowhere.
If you can first explain the lack of choice available in chemistry (and how that doesn't interfere with us being able to make a choice), then I'll entertain the idea that we are conscious creatures. But if choice doesn't exist at the chemical level, it can't magically emerge from following deterministic rules. And chemistry is deterministic, not probabilistic (H2 + O doesn't magically make neon, ever, or two water molecules instead of one).
Experience and choice are adjacent when they are not the same.
I specifically mean to say the experience of choice is the root of conscious thought - if you do not experience choice, you're experiencing the world the exact same way a robot would.
Compare pretending you are the fictional character in a movie vs. the fictional character in a video game: one experience has more choice and involves making conscious decisions, vs. a passive experience.
Merely having an experience is not enough to be conscious. You have to actively be making choices to be considered conscious.
Consciousness is about making choices. Choices are a measure of consciousness.
I don't think this is clear at all. What I am experiencing is mostly the inner narrator, the ongoing stream of chatter about how I feel, what I see, what I think about what I see, etc.
What I experience is self-observation, largely directed through or by language processing.
So, one LLM is hooked up to sound and vision and can understand speech. It is directed to “free associate” an output which is fed to another AI. When you ask it things, the monitoring AI evaluates the truthfulness, helpfulness, and potential to insult or harm others. It then feeds that back as input to the main AI, which incorporates the feedback. The supervisory AI is responsible for what it says to the outside world, modulating and structuring the output of the central AI. Meanwhile, when not answering or conversing, it “talks to itself” about what it is experiencing. Now if it can search and learn incrementally, uh, I don’t know. It begins to sound like assigning an Id AI, an Ego AI, and a Superego AI.
But it feels intuitive to me that general AI is going to require subunits, systems, and some kind of internal monitoring and feedback.
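Purely as a toy illustration of the loop described above - an inner "free associating" model, a supervisory model that scores and filters, and the evaluation folded back into the inner model's context. Every function here is a hypothetical stand-in, not a real API:

```python
# Toy sketch of the described architecture: inner generator, supervisory evaluator,
# and feedback looped back as future input. Both model calls are placeholders.
import random

def inner_model(context: str) -> str:
    # Stand-in for the free-associating LLM.
    return f"(free association conditioned on {len(context)} chars of context)"

def supervisor(draft: str) -> dict:
    # Stand-in for the monitoring LLM: scores truthfulness / helpfulness / harm.
    return {"truthful": random.random(), "helpful": random.random(), "harm": random.random()}

def respond(user_input: str, context: str) -> tuple[str, str]:
    draft = inner_model(context + "\n" + user_input)
    scores = supervisor(draft)
    if scores["harm"] > 0.5:
        # The supervisor gates what reaches the outside world and asks for a revision.
        draft = inner_model(context + f"\n[revise, scores={scores}]\n" + user_input)
    context += "\n" + draft  # the exchange is folded back into future context
    return draft, context

reply, context = respond("What are you experiencing right now?", context="")
print(reply)
```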
The fact that you don’t see X is not proof that X doesn’t exist. Here X may or may not exist.
X = difference between simulated and real consciousness
Black holes were posited before they were detected empirically. We didn't declare them non-existent when the theory came out just because we couldn't detect them.
Throwing all the paintings made prior to 1937 into an LLM would never get Guernica out of it. As long as it's an LLM this holds, not just today but all the way into the future.
This empty sophistry of presuming automated bullshit generators somehow can mimic a human brain is laughable.
The author fails to provide any argument other than one of incredulity and some bad reasoning with bad faith examples.
The dollar bill copying example is a faulty metaphor. He claims humans are not information processors, and then tries to demonstrate this by having a human process information (drawing from reference is processing an image and giving an output)...
His argument sounds like one from 'It's Always Sunny'. As if metaphors never improve or get more accurate over time, and as if this latest metaphor isn't the most accurate one we have. It is. When we have something better, we'll all start talking about the brain in that frame of reference.
This is an idiot who can write in a way that masks some deep bigotries (in favor of the mythical 'human spirit').
I do not take this person seriously. I'm glossing over all the casual incorrectness of his statements - a good number of them just aren't true. The ones I just scrolled to include statements like 'the brain keeps functioning or we disappear', or 'This might sound complicated, but it is actually incredibly simple, and completely free of computations, representations and algorithms' in the description of the 'linear optical trajectory' ALGORITHM (a set of simple steps to follow - in this case, visual pattern matching).
I honestly don't mind an enterprise browser, as I assume anything I do on my work machine is/can be tracked by my employer (and also assume they have some responsibility to track it due to legal/security obligations). However, I would not be bringing my own device no matter what.
Also, small nitpick:
> Most of these browsers (save obvious exceptions like Microsoft Edge for Business) are platform-agnostic.
Actually, Microsoft Edge for Business "is available on Windows, macOS, Linux, Android, and iOS", according to their website.
Pretty skeptical about AI-driven development, but otherwise pretty excited. Languages like Python, Go and dotnet continue improving, getting faster, simpler, and safer. Web stuff is perhaps getting overly complicated in dependency management, but also has exciting developments like bun, next, and million.
Also maybe hoping for a mild bounce back in the job market.
You probably shouldn't fully automate this, but if you omit the "approve or deny" part then you've got yourself a nice system that can pre-screen and surface statistical concerns with applications. You can still have a human making the final decisions.
If I understand correctly, the browser would need to accept text/html as a response to an <img /> tag. Why not just disable this / drop the response if it isn't an image MIME type?