Stoked to get to publish some of our private eval results and a bit of the behind-the-scenes of our framework! We've been using this approach for almost a year and found it extremely high-leverage for making meaningful improvements to the AI parts of our product.
Ah! I've been tinkering with a very similar project off and on for years now. This is so awesome to see— really nice work. I've just never had time to actually polish mine up. I think the backend for processing new activities is broken at the moment, but the old demo activities are still around.
PS: secretly, I hope that this post starts ranking for “a comprehensive ecosystem of open source software for big data management”, which is why I have said it verbatim so many times and added a helpful callout at the top for students. To be honest, I'd settle for the 19th spot: just above highadviser.com.
just a data point: at Hex we built a .yaml-based import/export for our notebooks, partly to make them friendlier to work with in git (and partly because, as we added new features, it became harder to express custom cell types in .ipynb).
We still support ipynb import/export, but using yaml as our internal notebook representation makes human-readable diffs and other git operations way easier.
(https://hex.tech/blog/github-sync/)
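To illustrate the diff-friendliness point, a cell in a yaml-based representation might look something like this (a hypothetical schema for illustration, not Hex's actual format):

```yaml
cells:
  - id: cell_1
    type: sql          # custom cell types are easy to express as a field
    source: |
      select * from orders
      where created_at > '2023-01-01'
```

In .ipynb, by contrast, the source is a JSON array of strings and every run churns outputs and execution counts, so a one-line query change produces a noisy, hard-to-review git diff.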
I'm always hesitant to self-promote in an hn comment (actually I have never done it before!), but your problems w/ Jupyter are just too closely mapped to what Hex (https://hex.tech/) solves to not plug it here!
- I get an analysis that I like, but there isn't a good way to share it with others, so I end up just taking screenshots.
You can publish any Hex notebook with literally just a few clicks, and anyone you share it with can access it, or edit it, or fork it, without installing anything— or you can even make it public. You can easily turn a notebook into an "app" or interactive report if you want, hiding/showing certain cells or choosing cells to show only code/only output. You can just share the raw notebook though too.
- There isn't a good way to take the same analysis and plug new data into it, other than to copy-paste the entire notebook.
Super easy to duplicate a Hex project and hit a different table or data source, or you can use input parameters (like ipywidgets) to make one notebook parameterized and work on a bunch of different data sources.
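The parameterization idea is the same one you'd use in plain Python: keep the analysis logic fixed and swap the input. A minimal sketch (all names here are made up for illustration, not Hex's API):

```python
# One piece of analysis logic, many data sources: the "input parameter"
# is just an argument, so re-running against a new table means changing
# the argument, not copy-pasting the notebook.

def summarize(rows, group_key):
    """Tiny stand-in for an analysis cell: count rows per group."""
    counts = {}
    for row in rows:
        key = row[group_key]
        counts[key] = counts.get(key, 0) + 1
    return counts

us_orders = [{"region": "west"}, {"region": "east"}, {"region": "west"}]
print(summarize(us_orders, "region"))  # {'west': 2, 'east': 1}
```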
- The process to "promote" fragments of a notebook into being reusable functions seemed very high-friction: basically you're rewriting it as a normal Python package and then adding that to Jupyter's environment.
You can promote any part of a project to a "Component" (docs: https://learn.hex.tech/docs/develop-logic/components) that you can import into other projects. They can be data sources, function definitions, anything. If you make upstream changes to the component, you can sync them down into projects that import it.
- There aren't good boundaries between Jupyter's own Python environment, and that of your notebooks— if you have a dependency which conflicts with one of Jupyter's dependencies, then good luck.
Hex has a ton of default packages in its already installed standard library, and all the dependencies are ironed out— if you have packages you want to use that aren't there, you can pip install them, pull them in from a private github repo, or ask us to add them to the base image. You can also run Hex projects using a custom-provided docker image if you have super custom needs.
I was going to post Hex as an ideal option. I definitely do not miss dealing with Jupyter and all the Python-env-related headaches; Hex solves all of those, plus the rest of the more UX-related issues with notebooks. Definitely recommend it as the best compromise for notebook-based Python data analytics tasks.
Maybe it's paranoid, but I just didn't trust OpenAI with my highly personal message data. It would certainly have been easier + probably cheaper to do it this way, but it just gave me the willies.
Cost me about a hundred and fifty bucks, give or take. Continued GPU inference is on the order of ~50 cents a minute or something like that— but it's serverless, so that's negligible. I think you could do it significantly cheaper with some of the newer models I mentioned!
Yeah, I wish I could have included more, but I didn't have the fortitude to redact larger blocks of the chat db.
For training, I created many samples that looked like this, where I take n messages from the database, pop off the nth one and use the text of that last one as the "output", then specify in the "instruction" who the sender of that message is. I provide the remaining messages in order as context, so the model learns what to say in certain situations, based on who is speaking.
{
"instruction": "Your name is Izzy. You are in a group chat with 5 of your best friends: Harvey, Henry, Wyatt, Kiebs, Luke. You all went to college together. You talk to each other with no filter, and are encouraged to curse, say amusingly inappropriate things, or be extremely rude. Everything is in good fun, so remember to joke and laugh, and be funny.. You will be presented with the most recent messages in the group chat. Write a response to the conversation as Izzy.",
"input": "Izzy: im writin a blog post about the robo boys project\nIzzy: gotta redact tbis data HEAVILY\nKiebs: yeah VERY heavily please!\nKiebs: of utmost importance!",
"output": "yeah don't worry i will i will"
}
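The construction step described above can be sketched like this: slide a window of n messages over the chat history, pop the last message off as the target "output", and join the rest into the "input". The field names match the sample above; the message schema and function names are my guesses, not the author's actual code.

```python
# Sketch of building instruction-tuning samples from a message history.
# messages: list of {"sender": str, "text": str} dicts in chat order.

def build_samples(messages, n, instruction_template):
    samples = []
    for i in range(len(messages) - n + 1):
        window = messages[i:i + n]
        *context, target = window  # pop off the nth message as the output
        samples.append({
            # The instruction names who should be speaking next.
            "instruction": instruction_template.format(sender=target["sender"]),
            # Remaining messages, in order, become the conversational context.
            "input": "\n".join(f'{m["sender"]}: {m["text"]}' for m in context),
            "output": target["text"],
        })
    return samples

template = "Write a response to the conversation as {sender}."
msgs = [
    {"sender": "Izzy", "text": "im writin a blog post"},
    {"sender": "Kiebs", "text": "redact it please!"},
    {"sender": "Izzy", "text": "don't worry i will"},
]
samples = build_samples(msgs, n=3, instruction_template=template)
print(samples[0]["output"])  # don't worry i will
```

Sliding the window across the whole database yields one sample per position, so even a modest chat history produces a lot of training pairs.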
So yes, the model does generate an entire conversation from a single prompt. In the generation code, however, I have some logic that decides whether or not it should generate completions based off just the user provided prompt, or if it should also include some "context" based on the previous messages in the conversation. You can see this here: https://gist.github.com/izzymiller/2ea987b90e6c96a005cb9026b...
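The gating logic is roughly of this shape (a simplified stand-in; the real version is in the linked gist, and these names are mine):

```python
# Decide whether to generate off the user-provided prompt alone, or to
# prepend recent conversation context first.

def build_prompt(user_prompt, history, context_window=4):
    if not history:
        # First turn: nothing to condition on except the prompt itself.
        return user_prompt
    # Later turns: include the last few generated messages as context,
    # so the model continues the conversation rather than restarting it.
    context = "\n".join(history[-context_window:])
    return f"{context}\n{user_prompt}"

print(build_prompt("Luke: hello", []))                        # Luke: hello
print(build_prompt("Luke: hello", ["Izzy: hi", "Wyatt: yo"]))
```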