I'm not a commercial software developer, but a scientist working on technology development. I've been programming for 30+ years.
Jupyter has become my lab notebook. In the past, I always had illegible, disorganized notebooks, files, and program code, all over the place. A Jupyter notebook lets me organize all of that stuff in one place, in a narrative fashion, allowing me to reconstruct what I did, long after I've forgotten the details. The reasons for open communication of methods and results to the public also apply to internal work.
My notebooks become my reports. I've abandoned PowerPoint, and my colleagues, including managers, don't seem to mind. Seeing the actual work might actually give them a feeling of involvement, like inviting them into the lab. They're also a good way of communicating a prototype of a process to the software development team, when an idea ends up in a product. Even if they don't like Python, the programmers can read and understand it.
I can actually run some of my data acquisition code directly within Jupyter. A code cell that spits out an inline graph is practically the default interface for a lot of this kind of work, so I don't have to build a unique GUI for every kind of test. This speeds up incremental refinement of an experimental technique, even if the routines that I write end up in a "straight" Python program when it's time to let an experiment run for a few hours or days.
Granted, Jupyter won't turn bad programmers into good. Learning good programming methods is still a gap in the education of scientists.
Probably the longest time-to-knowing-what-the-hell-this-thing-actually-is I've ever seen on HN. Clicked on the link, clicked on a Github link, clicked to the Github root, clicked on the link to the project's site, clicked on the first item in the table of contents, got a vague idea what it was.
I know. Open source folks, when you put a 'Home' button in the corner, make it go to the project home page, not the blog home page. If there's one thing I can't stand it's a blog post about an update for something that I don't know what it is, and my patience to click around trying to find out whether it is something I would be interested in or not is very limited.
I'm glad I did in this case because an open-source equivalent of Mathematica is a pretty sweet tool, but the site navigation sucks enough that it's likely limiting your audience a bit.
Just a heads up: Jupyter Notebook is not an open source alternative to Mathematica. Originally, Jupyter was the IPython notebook, an IDE of sorts for data science and analysis in Python that lets you write code and Markdown together in a more coherent and integrated way. Then they incorporated a host of other popular open source languages for computational science such as R, Julia, F#, etc., so that we could use the best tool for each task, all in one document.
I'll add that it's a great way to teach Python to students. Notebooks can be shared, and students basically retain all their legwork as they learn. It's very helpful to see code work visually if you are new.
Well, not actually an IDE; I still use Emacs to create my IPython Notebooks in Jupyter.
Although it does have extensions, and allows the creation of new ones, which provide a lot of useful features. In this version, the nbextensions are introduced as Python packages, so that they're even easier to use/install.
The best thing about Jupyter Notebooks is that one can write text with Markdown, show formulas with LaTeX (using MathJax), and show code inputs and outputs in REPL style, all together with possible Bash use, other languages' snippets, and nice cell %magics.
Nice thing with open source is you "just" open an issue on GitHub: https://github.com/jupyter/jupyter/issues/139 , and it's fixed soon after. These are design issues we become blind to after working too long on the project, and we would love to have people helping us with that. It's just hard when users don't tell you!
Another thing which is often missing is a very clear "What's new in the latest version and when was it actually released" message on the homepage. Sometimes this very vital information is just impossible to find.
I agree that it is annoying to have the "home" or "main" link go to the blog home. But when you are at blog.domain.com, is it really that difficult for you to figure out how to get to domain.com?
Yes, on a mobile phone it might take a few tries just to alter that web address, especially when the domain name looks like sslmadomaiinnahas. Try remembering that. Navigating to the front part of the URL to copy that domain is borderline impossible on a tiny Safari screen. Clearing the blog part or the part after domain/long-url-ending can also make you rip your hair out because your device might delete the entire URL.
The landing page of their site isn't too bad. The first thing you see is: "The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more."
Though I am not sure why their blog doesn't link to the landing page of the site.
I explain notebooks as just how-to's with embedded code/graphics/data that are editable in real time. Possibly that's just as vague, but "notebook" sounds like a plaintext editor to me. The term comes from the logs, kept in notebooks, that science labs use to note and reproduce prior work.
It's something that allows you to embed code and text and execute said code and text. Let me take a screenshot for you. Notice the LaTeX integration in the second Imgur link.
One is for R code and one is for asymptote graphs, which are amazing!
> The IPython Notebook is now known as the Jupyter Notebook. It is an interactive computational environment, in which you can combine code execution, rich text, mathematics, plots and rich media.
/opt/conda/lib/python3.4/site-packages/IPython/kernel/__init__.py:13: ShimWarning: The `IPython.kernel` package has been deprecated. You should import from ipykernel or jupyter_client instead.
"You should import from ipykernel or jupyter_client instead.", ShimWarning)
I love beaker. I spent a week or so trying to force myself to use Jupyter since others in my team use it, but I bailed out and went back to beaker eventually. It's got everything Jupyter has but is all-round nicer to use. It's not without its problems, but neither is Jupyter by a long shot.
But I'll add that I've never been able to set it up satisfactorily for everyone in my team to use it on a shared server. Shared installs don't seem to work very easily (I'm sure it's possible and probably easy, but it's very unclear to me how). It would be awesome if it had a daemon similar to RStudio where people can just log in with a user id on a server and it forks off an instance for them.
definitely look for the announcement next week for help on this front.
currently the standard method is for users to ssh to the shared server, run their own beaker server from a shell, and then connect to the port/URL that it prints out. not exactly the easiest UI, but that's something we are working on...
Any idea if it's faster at presenting output? One of my biggest gripes with Jupyter is that it's crazy slow when presenting even a few hundred lines of something. It makes working with e.g. AWS APIs even more headache-inducing.
Thanks for the links! I've been having a lot of frustrations with jupyter and since it's almost impossible to find anything technical on google these days, I haven't been able to find any good alternatives.
There are several startups in that space, e.g., Domino Data Labs, that have been trying to make it easy to do collaborative/versioned notebooks. I have only seen the product videos so don't know how well they work etc.
For some background on the direction of the Jupyter project, check out this recent talk at PyData Amsterdam (http://pydata.org/amsterdam2016) by Min Ragan-Kelley & Thomas Kluyver.
They talk about how Jupyter has "evolved from a Python-specific tool to a general data science tool that supports many different languages."
I had my first experience with Jupyter last weekend when I was trying to learn about document clustering with Python. It seems like a cool idea, but in practice ended up being kind of annoying: https://github.com/brandomr/document_cluster/issues/7
Jupyter notebooks are for running ad hoc blocks of code. The biggest advantage of doing so is being able to check the output of each block to make sure the output is expected. This feature in particular makes for a great tool for tutorials. (Example of mine: https://github.com/minimaxir/facebook-page-post-scraper/blob...)
It is definitely not a tool to replace a typical Python workflow.
Jupyter is great for prototyping and playing with ideas; essentially it's great if you want persistent data. But after you more or less know what you want, and loading the initial dataset isn't a constant annoyance, you're usually better off in an IDE or other real programming environment.
I use it to ask a whole bunch of exploratory questions about a dataset then productionize the result in PyCharm (my preference, other ways work great too :)
This is actually my main problem with the whole idea of these "notebooks". They explicitly encourage exactly the kind of ad hoc coding and practises that plague a lot of scientific work. It's nearly impossible to practise good software engineering while inside one of these things. I know the rationale will be that this is for ad-hoc exploration and the code should be rewritten / redesigned when it's moved into an app, but just like all prototype code that has ever been written, that is not what happens.
I would love something that combines this style with support for good software practices. For example, something that lets you seamlessly move snippets of code into functions, classes, modules, and then create tests for them. RStudio is actually the closest I have found, which is ironic since as a language R is horrible for encouraging good software practices.
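To make the wish above concrete, here is a minimal sketch of what "move a snippet into a function, then test it" can look like inside a single notebook cell using only the standard library. The function `normalize` is a hypothetical example, not something from any project mentioned in this thread, and this is only one possible approach.

```python
import doctest


def normalize(values):
    """Scale a list of numbers so that they sum to 1.

    >>> normalize([1, 1, 2])
    [0.25, 0.25, 0.5]
    """
    total = sum(values)
    if total == 0:
        raise ValueError("values must not sum to zero")
    return [v / total for v in values]


# Run only this function's docstring examples.
# Nothing is printed when they all pass; failures are reported inline.
doctest.run_docstring_examples(normalize, {"normalize": normalize}, name="normalize")
```

It doesn't solve the "seamless" part of the complaint, but it does keep a test next to the code while the code still lives in a cell, and the function can later be pasted into a module unchanged.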
I teach an internal class on Python at my office, and the notebook makes it significantly easier to work and play with code. It's a step between a REPL and a file you load and run each time you make a change. It speeds up the dev/test cycle significantly.
It works really well in a class scenario. A professor can just show a notebook on the screen “live”: walk through it, or even change things and demonstrate the effects. And when finished, the whole thing can be posted as a file for students to download and try themselves.
Thanks for the feedback on the announcement blog post. I added a short description and link to the project home page at the top of the blog post after reading the feedback here. Thanks!
It's fun to play with code and try out new ideas in such a rich interactive environment. When you want to get the work done in a production scenario, the shortcomings of being unable to use version control and the overhead of the interactive environment just kill it.
What works is that you get a subset of your data and try to develop some code to process it and generate a handful of graphs. You can then save the code in its true text form and edit with your favorite editor, and run it on your real data.
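For the "save the code in its true text form" step, a minimal sketch, assuming the notebook is in the nbformat 4 JSON layout (a top-level `cells` list whose code cells carry a `source` string or list of lines). The helper name `notebook_code` and the sample notebook are made up for illustration; `jupyter nbconvert --to script` is the existing tool for this job.

```python
import json


def notebook_code(nb):
    """Join the source of every code cell in an nbformat-4 notebook dict."""
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # "source" may be a single string or a list of lines
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"


# A tiny in-memory notebook standing in for a real .ipynb file (hypothetical content)
example = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Load a subset\n"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 1, "outputs": [],
         "source": ["rows = [1, 2, 3]\n", "total = sum(rows)"]},
        {"cell_type": "code", "metadata": {}, "execution_count": 2, "outputs": [],
         "source": "print(total)"},
    ],
}

# Round-trip through JSON text, as json.load() would do for a file on disk
nb = json.loads(json.dumps(example))
script = notebook_code(nb)
print(script)
```

Cells that use %magics are IPython syntax rather than plain Python, so a script produced this way would need those lines handled separately before it runs outside the notebook.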
While working on a data science team some months ago, these notebooks helped me build something that explained, in detail and at a high level, the implementation details of an algorithm to sales and others not familiar with data science techniques. It was awesome and so easy.
I also used them when we did a capture the flag contest to help explain visually how a multi time pad vulnerability works.
"The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more."
One of our data analysts had fun drawing all the shots from Kobe during his whole career.
He did everything in Jupyter, which I discovered at that moment, and I was blown away by how well it worked and how powerful it is.
I don't think I would ever use this tool as a seasoned software engineer, but I can definitely see the power it has for newer people who want to learn, or simply people like him who know a little bit of code and just wanted to run it.
Your comment implies that using a Jupyter notebook is a programming crutch for those who are not skilled. That's far from the case; Jupyter notebooks allow for logical separation of code and output, which is important for comprehension.
It also allows for reproducibility of results, which is arguably even more important, especially in the data science case.
sorry, maybe my comment is wrong because of my lack of knowledge about the tool (I've seen it used for 10 minutes), but it seems like it's a nice UI replacing what a terminal does, if I'm right. So all I'm saying is, since I'm comfortable using a terminal, and I'm usually already there using vim, etc., it's more convenient for me to test stuff in my terminal than to go into my browser to test in jupyter
Right, but what if you want to combine your test commands, their output, and comments? Share that with other people? Let them change things and re-run the examples in the same place? NOW you're cooking with notebooks.
As a seasoned software engineer, I use it as a persistent, better-organizable shell. Even when working on a remote host, I forward its port and work there rather than open a shell on the remote machine, and I'm someone whose IDE is vim.
It's just much better when you can see all your functions in one place, edit a function far back and have the changes propagate to the last command you ran.
I totally get the use for that, but I usually don't do this kind of thing, which is why I said that I could see how amazing and powerful the tool is and still don't see myself using it intensely
exactly. But I'm already in iTerm using vim, so I use the iTerm repl instead. Not saying jupyter wouldn't do a great job, just that seems more convenient to use what's in my environment instead :)
The nice thing about Jupyter notebooks is that rather than having a command history which can be unwieldy when tweaking functions, you have cells that are easy to go back and edit. Additionally, when you're done with your experiments you can tidy things up, stuff a few markdown comments in there and you have a nice tutorial for other developers.
can you comment on why you wouldn't use it? I am somewhere in the middle of software engineering and (biological) data science. I have seen some jupyter notebooks as companions to genomics software and I enjoyed those presentations. I was considering using it in the future based on that; so, I wanted to hear your thoughts.
As another "software engineer", I can say that we simply don't ever need a tool like Jupyter for our jobs. It's excellent for exploratory programming, research, and publishing -- but it's not really meant for software development.
I kind of missed the point of how notebooks are (and should be) used, and it actually seems very interesting :)
So you should definitely look into it and see if you like doing whatever you do with them!