Well, for one, by eliminating external tool calling, the model gains a measure of security. The tools an LLM calls can be compromised, and in this scenario compromised tools would simply never be called.
There will be no "unlocking of AGI" until we develop a new science capable of artificial comprehension. Comprehension is the cornucopia that produces everything we are: given raw stimulus, an entire communicating Universe is generated, with a plethora of highly advanced predator/prey characters in an infinitely complex dynamic, and human science and technology have no idea how to artificially make sense of that as a simultaneous unifying whole. That's comprehension.
The fear of missing out is driving all of this. The article even states that the majority of users have no use case. Except handing over $20 a day to the service providers.
I just have no words. There will be so many scams and other issues if (or when) OpenClaw is hacked... Identity theft, bank transfers, deleted accounts, stolen photos, etc.
I gave this advice to a non-developer friend yesterday: There is a huge interest in creating all types of automation. Don’t waste your time doing this too. Someone else is going to create a way to automate things that is much more secure than the extremely insecure manners the crowd is experimenting with right now. Let them experiment, make messes, and probably figure out that this method, that method, and 10 others people dream up and try are not good for long-term use. Let them drive wrecked infrastructure for a while, and then pick up the working, clean, and safe methods 2–3 years from now, long after this crowd of early experimenters has cognitively damaged and intellectually exhausted itself multiple times over. Let their pain be your future easy solution.
But who am I kidding, this is fashion and has nothing to do with tech.
Liveness detection: confirming that what appears to be a person in video is not a photo, a sculpture, or some other attempt to falsely identify to an identity system as someone else.
I'd expect the primary use case to be liveness detection, to validate that the person the facial recognition identifies is not a photo, a sculpture, a person wearing facial prosthetics, or a mask. I've written such software for that exact purpose.
I don't think this paradigm will last, or that it will become the more common structure in the future. It still suffers from conflicts of persona and objective, plus it has the issue that individual apps will need protected file hierarchies to prevent malicious injections. I don't see this as a solution, just a deck-chair shuffle.
I've been researching and building with a different paradigm: an inversion of the tool-calling concept that creates contextual agents of limited scope, composed into pipelines, with the user in control three ways: as author of the agent, as operator of an application with a clear goal, and as a conversational collaborator on a task with one or more agents.
I create agents that live inside open source software, making that application "intelligent", and the user has control to make the agent an expert in the type of work that human uses that software for. Imagine a word processor that, when used by a documentation author, has multiple documentation agents co-working with the author, while that same word processor, used by, say, a romance novelist, has similar agents that are experts in a different literary/document goal. Then do this with spreadsheets and project management software, and you get an intelligent office suite with amazing levels of user assistance.
In this structure, context/task-specific knowledge is placed inside other software, providing complex processes the user can conversationally request and compose on the fly, then use and save as a new agent for repeated use, or discard as something built for the moment. The agents are inside other software, with full knowledge of that application in addition to task knowledge related to why the user is using that software. It's a unified environment for creating, using, and live-editing agents and their chain of thought, in context with what one is doing in other software.
I wrap the entire structure into a permission hierarchy that mirrors departments, projects, and project staff, creating an application suite structure more secure than this filesystem approach, with substantially more user controls and without exposing the potential for malicious use. The agents are each built for a specific purpose, which limits their reach and potential for damage. Being purpose-built, the agents are easily edited and enhanced by their users (who are task-focused, not developers), because the task is the job/career they already know and continue to do, just with agent help.
Your project, while interesting as an approach, is orders of magnitude more complex than the proposition here, which is to rely on agents' skills with file systems, bash, Python, sed, grep, and other CLI tools to find and organize data, but also to maintain their own skills and memories. LLMs have gained excellent capabilities with files and can generate code on the fly to process them. It's people realizing that you can use a coding agent for any cognitive work, and it's better since you own the file system while easily swapping the model or harness.
I personally use a graph-like format organized as a simple text file: each node is prefixed with [id] and can reference other nodes inline by [id]. This works well with replace, diff, and git, and is navigable at larger scales without reading everything. Every time I start work I have the agent read it, and at the end update it. This ensures continuity over weeks and months of work. This is my take on the file system as memory: make it a graph of nodes, but keep it simple, a flat text file. Don't prescribe structure, just node size. It grows organically as needed; I once got one to 500 nodes.
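A minimal sketch of what parsing such a file might look like, assuming (since the comment doesn't specify) that each node starts on a line beginning with its [id] header and that any [id] token appearing in a body is an outgoing edge:

```python
import re

def parse_memory(text):
    """Parse a flat node file into {node_id: (body, referenced_ids)}.

    A node begins at a line starting with '[id]' and runs until the
    next such header; '[id]' tokens inside a body are treated as edges
    (only ids that actually exist as nodes are kept).
    """
    lines_for = {}
    current_id = None
    for line in text.splitlines():
        m = re.match(r"\[([\w-]+)\]", line)
        if m:
            current_id = m.group(1)
            lines_for[current_id] = [line[m.end():].strip()]
        elif current_id is not None:
            lines_for[current_id].append(line)
    nodes = {}
    for node_id, body_lines in lines_for.items():
        body = "\n".join(body_lines).strip()
        refs = set(re.findall(r"\[([\w-]+)\]", body)) & lines_for.keys()
        nodes[node_id] = (body, refs)
    return nodes
```

One caveat of this simple scheme: a body line that happens to start with a bracketed token would be misread as a new node header, so inline references are best kept mid-line.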
It ends up being similar to how early PC software was written before people realized malicious software could be running. There used to be little to no memory safety between running programs, and this treatment of files as the contextual running memory is similar. It's a great idea until a security perspective is factored in. It will need to end up being very much like closed applications writing proprietary files, which will need some security layer that is not there yet.
Reminds me of early data-driven approaches. Early CD-based game consoles had memory constraints, which I sidestepped by writing a ridiculously simple game engine: the game loop was entirely data driven, and "going somewhere new" in the game simply triggered a disc read at a raw sector offset for a given number of sectors. The data read was then interpreted as a repeated series of records: the first 4 bytes gave the memory address to write to, the next 4 bytes how many bytes to copy, followed by the bytes themselves. That simple mechanism, paired with a data organizer for creating the disc images, enabled some well-known successful games to have "huge worlds" with an executable under 100K, leaving the rest of the console's memory for content assets, animations, whatever.
Which games were these out of interest? I enjoy reading about game dev from the nascent era of 3D on home consoles (on the Saturn in particular) and would love to hear more.
Of course not. It is just yet another example of a 7-8 figure expensive attorney and their billion-dollar corporation wasting everyone's time and taxpayers' dollars, and demonstrating that the law applies to us and not them. I expect them to just stop showing up in court in time. What can the court do when these people own the people who write the laws?
There really should be some type of panel for frivolous legal arguments. If they are used by a corporation, all of the lawyers, leadership, and shareholders involved are thrown into jail. Could even get a jury on this and have them give a majority opinion.