
> PPS: In theory GNU Coreutils is portable and you might find it on any Unix. In practice I believe it's only really used on Linux.

Oh how times have changed:) But even with the demise of most commercial unixen, I was under the impression that Darwin preserved the ancient tradition of installing the system and then immediately replacing its coreutils with GNU?



Darwin's coreutils are mainly updated from FreeBSD, I think. They've got some GNU stuff, but it hasn't been updated since GNU switched to GPLv3.


Right; the coreutils built into Darwin are BSD-derived. I was referring to a user installing the GNU versions on top of the OS proper.


I bet many of us do this due to small differences between BSD and GNU coreutils. One that always gets me is:

  sed -i '' 's/foo/bar/g'  # BSD sed
  sed -i 's/foo/bar/g'     # GNU sed


That is indeed an incredibly frustrating difference, because there is no version of it that works on both GNU and macOS. In practice I often end up using perl to do an in-place search-replace instead.
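
For reference, the Perl equivalent behaves the same on both (a minimal sketch with a placeholder file.txt; -p, -i, and -e are standard Perl switches, but test on your own files first):

  perl -pi -e 's/foo/bar/g' file.txt        # in-place, no backup, GNU and macOS alike
  perl -pi.orig -e 's/foo/bar/g' file.txt   # keeps a file.txt.orig backup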


It takes a second to install GNU coreutils with Homebrew and make Macs barely usable.
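
Roughly, assuming a default Homebrew setup (the gnubin trick comes from the formula's own caveats; adjust the prefix if yours differs):

  brew install coreutils
  # Homebrew installs the GNU tools with a g- prefix (gls, gdate, ...).
  # To use them unprefixed, put gnubin first in PATH:
  export PATH="$(brew --prefix)/opt/coreutils/libexec/gnubin:$PATH"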


I much prefer Linux but having used Macs a few times for work, I absolutely cannot recommend making GNU coreutils the default option on Macs.

If you make GNU coreutils the first binaries in PATH, you can expect subtle, _very nasty_ issues.

The last one I encountered was `hostname` taking something like 5 seconds to run. Before that, there was some bug with GNU’s stty which I don’t remember the specifics of.


Or just use Perl


As long as you're using stone knives, might as well throw in some bear skins.

https://www.youtube.com/watch?v=F226oWBHvvI


Fun! But honestly there's a place for Perl.

There's far less variance between Perls on different systems than there is between implementations of sed/awk/shell. So if you want performant, portable code, Perl does better than all of those.

I'd never use Perl for a 'big' program these days but it still beats the crap out of the mess of sed/awk/bash/ksh/zsh.


Apple has been threatening to remove perl for ages. One of these days they'll find the "courage" to do so and we'll need a better strategy.


FreeBSD removed Perl from base ages ago, in FreeBSD 5 or 6. That went alright.

I don't even have it on my Linux system any more; for better or worse, fewer and fewer things use it.


But "sed -i'' s/../../ FILE" will work on both, no? Or -i.orig if you want to keep a backup file?


That does not work on macOS. "-i''" is the same as just a bare "-i", and it will then interpret the next argument as the backup suffix.


Oops yes, that should have been -i '' with a space. But I see now that GNU doesn't accept -i '', or even -i .orig. It MUST be without a space: "-i.orig". The reason for that, I assume, is there's no way to disambiguate:

  echo foo | sed -i pattern
  echo foo | sed -i .orig

But you can still use "-i.orig"; that should work on both. That will leave you with a .orig file to clean up, but arguably that's not a bad thing, as -i can clobber files.
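
So a reasonably portable in-place edit ends up as something like this sketch (placeholder file.txt; backup suffix plus explicit cleanup, removed only if sed succeeds):

  sed -i.orig 's/foo/bar/g' file.txt && rm file.txt.orig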


Yes, like I did first with homebrew and now with nix.


That was the case for a while, but when upstream relicensed them under GPLv3, Apple stopped updating them, probably to avoid licensing problems on iOS. Nowadays you can install them from Homebrew, MacPorts, straight from source, or by other methods.


Yes, macOS takes its utils from FreeBSD.

An interesting bit of trivia: the FreeBSD/macOS sort(1) was using GNU code until recently, since sort is quite tricky to implement. Eventually it was reimplemented for GPL-avoidance reasons.

We do consider macOS though, and ensure all tests pass on macOS for each release.


Does Apple compensate or sponsor that work at all?


Definitely not


I tend to install coreutils on FreeBSD. Besides some minor annoyance with every command being prefixed with "g", some of the programs work a bit nicer than the FreeBSD-shipped versions (or some Linux-centric programs just want the coreutils versions...).


I'd be interested in learning which commands you have in mind and what specifically is a bit nicer about their coreutils implementation.


In GNU utilities, option arguments can come after (or between) positional arguments. Personally I find this small convenience invaluable, because I'm used to it.
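
A concrete example of the difference (GNU getopt permutes argv; BSD stops parsing options at the first operand):

  ls -l /tmp    # works everywhere
  ls /tmp -l    # long listing with GNU ls; BSD ls complains "ls: -l: No such file or directory"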


Oh, I had no idea that GNU utilities allow this.

As a Unix graybeard I always place options first. Options last feels like the Windows command prompt, so not something I want to see...

I always tell younger colleagues who place options at the end: it might work with some commands, but just don't do it. I did not know that "some" includes all of GNU coreutils. A single common code style is a virtue, even in interactive use if there are onlookers. So I guess I will continue to point it out.


Even many command-line parsing libraries support it and scan the entire argv for options. You should always terminate the options with "--" in scripts where any of the positional arguments are variables that might or might not start with a dash.
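
For instance, with a hypothetical $file that a user could set to something dash-prefixed:

  file="-rf"
  rm "$file"      # parsed as options: rm sees -r and -f
  rm -- "$file"   # "--" ends option parsing, so -rf is treated as a filename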


Unfortunately not. macOS comes with violently outdated FreeBSD coreutils because of the GPLv3 situation. Though the default shell was recently changed from Bash 3.2 (2007) to zsh.


Right, which is why I thought it was common for people to install the operating system, then immediately use macports or homebrew to go get the GNU coreutils and a modern version of bash because the BSD versions are less friendly.

(Thus continuing an extremely long tradition of layering GNU over the vendor tools; e.g. Sun had their own tools, but everyone liked the GNU versions better.)


Oh sorry, I interpreted your comment as asking whether Darwin installed GNU coreutils itself in some roundabout, very-expensive-lawyer sanctioned manner :)


FWIW that's exactly what I've done


People do it, but why you would, when zsh is right there, is beyond me.

Possibly the same type of people that target bash specifically in shell scripts I guess.


What's wrong with bash though? Targeting bash specifically has massively improved my productivity and decreased the incidence of easily avoidable mistakes. Portable POSIX shell scripting is hell on earth but bash scripting with shellcheck can be surprisingly pleasant. I had the same experience with portable makefiles and GNU Make.
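
If anyone hasn't tried it, the workflow is just running it over a script (deploy.sh here is only a placeholder name; the --shell flag overrides the dialect shellcheck infers from the shebang):

  shellcheck --shell=bash deploy.sh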

For example, I managed to create a surprisingly good test suite with bash and the rest of the GNU coreutils:

https://github.com/lone-lang/lone/blob/master/scripts/test.b...

It even runs in parallel. Submitted a patch to coreutils to implement the one thing it couldn't test, namely the argv[0] of programs. I should probably go check if they merged it...


> Portable POSIX shell scripting is hell on earth but bash scripting with shellsheck can be surprisingly pleasant.

Assuming you meant `shellcheck`: you know it works for POSIX-compatible shell too, right?

What's wrong with bash is that people invariably end up requiring features that aren't available in some deployed version, and then you've lost a lot of the benefit of writing in shell in the first place.

Similar to shellcheck, shunit2 works just fine for running unit tests in posix compatible shell.
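
The shunit2 pattern, for anyone curious, is roughly this (a minimal sketch from memory, assuming shunit2 is on your PATH; check its docs for the exact invocation):

  #!/bin/sh
  testAddition() {
    result=$(( 1 + 1 ))
    assertEquals "1 + 1 should be 2" 2 "$result"
  }
  # shunit2 discovers and runs the test* functions when sourced:
  . shunit2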


> Assuming you meant `shellcheck`

I did. Edited my comment, thanks. I don't know why I'm so prone to that particular typo. My shell history is full of it.

> What's wrong with bash is that people invariably end up requiring features that aren't available in some deployed version, and then you've lost a lot of the benefit of writing in shell in the first place.

But I've gained quite a lot too. Bash has associative arrays. I just can't go back to a shell that doesn't have that.

Shell scripting makes it simple to manage processes and the flow of data between them. It's the best tool for the job in these cases. So there are still reasons for scripting the shell even if one is willing to sacrifice portability.
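
To make the comparison concrete, this has worked since bash 4.0 and has no POSIX sh equivalent (a trivial sketch):

  declare -A ports
  ports[http]=80
  ports[https]=443
  for name in "${!ports[@]}"; do
    echo "$name -> ${ports[$name]}"
  done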


I just can't believe it's 2023 and we're actually praising a shell for having arrays and dictionaries. Or that there are still multiple shells in use that don't. Or that people still ask questions like "What's wrong with bash though?" with a straight face, as if they don't know.

Now how long are we going to have to wait until somebody invents a way to do named parameters? That will revolutionize the computer industry! I guess it's way too much to ask for a built-in json or yaml parser; all we can hope for is maybe a stringly typed sax callback based xml parser after another 20 years from now, because dom objects in a shell would be heretical and just so unthinkably complicated.

Why are people so afraid to just use Python? Shell scripting and cobbling together ridiculously inefficient incantations of sed, awk, tr, test, expr, grep, curl, and cat with that incoherently punctuated toenail, thumbtack, and asbestos chewing gum syntax that inspired perl isn't ever any easier than using Python, especially when you actually need to use data structures, named function parameters, modules, libraries, web apis, xml, json, or yaml.


This answer is pretty convincing:

https://stackoverflow.com/a/3640403/512904


Thank you! That's a great analysis, and I love the design of PowerShell, which addresses most of my arguments against bash. The Unix community (especially Sun Microsystems) has traditionally had this arrogant self-inflicted blind spot of proudly and purposefully cultivated ignorance in its refusal to look at and learn from anything that Microsoft has ever done, while Microsoft is humble and practical enough to look at Java and fix it with C#, look at bash and fix it with PowerShell, etc.

Here's the summary of a discussion I had with ChatGPT about "The Ultimate Shell Scripting Language", in which I had it consider, summarize, and draw "zen of" goals from some discussions that I feel are very important (although I forgot about and left out PowerShell, that would be a good thing to consider too -- when I get a chance I'll feed it the discussion you linked to, which made a lot of important points, and ask it to update its "zen of" with PowerShell's design in mind):

The Zen of Python.

https://peps.python.org/pep-0020/

Discussion about Guido van Rossum's point that "Language Design Is Not Just Solving Puzzles".

http://lambda-the-ultimate.org/node/1298

https://www.artima.com/weblogs/viewpost.jsp?thread=147358

Discussion of Ousterhout's dichotomy.

https://en.wikipedia.org/wiki/Ousterhout%27s_dichotomy

https://wiki.tcl-lang.org/page/Ousterhout%27s+Dichotomy

Email from The Great TCL War Part 1, started by RMS's "Why you should not use TCL".

https://news.ycombinator.com/item?id=12025218

https://vanderburg.org/old_pages/Tcl/war/

Email from The Great TCL War Part 2, started by Tom Lord's "GNU Extension Language Plans".

https://vanderburg.org/old_pages/Tcl/war2/index.html

Summarization of the important points in those discussions that apply to The Ultimate Shell Scripting Language, resynthesized into a "zen of" list.

Discussion of Support for Declarative and Procedural Paradigms and how it applies supporting standard declarative syntaxes including json, yaml, and xml (which bash still doesn't and probably never will, and PowerShell does of course).

Suggestions for some more "zen of" and design goals, specifically focused on addressing weaknesses or design flaws in popular languages like sh, bash, tcl, python, perl, etc.

Discussion of how "Simplified Debugging and Error Handling" can be balanced with "Macro Processing and Syntax Flexibility", better than debugging C++ templates, Lisp macros, TypeScript code compiled to minified JavaScript, etc.

Discussion of how JavaScript / TypeScript fall short of those goals.

Discussion of visual programming languages for interactive shell scripting as well and general purpose programming.

Discussion of layering visual programming languages on top of textual programming languages.

https://donhopkins.medium.com/the-shape-of-psiber-space-octo...

Interoperability of text and visual programming languages with LLMs for efficiently and reliably analyzing and generating code.

https://docs.google.com/document/d/1QJ98QwC2ubsTNKOFAzUAw6Zy...

Discussion of using Python to build a visual data flow node based shell scripting language on top of Blender (that just happens to support 3D, image processing, video editing, GPU programming, machine learning, Python module integration, and everything else that Blender is great at).

https://www.youtube.com/watch?v=JOeY07qKU9c

Discussion of how to make efficient use of token budgets when using LLMs with text and visual programming languages.

Here is the condensed discussion with the bulk text I had it analyze omitted, so you can more easily read the summaries and recommendations and "zen of" manifestos:

https://docs.google.com/document/d/1wKhdEoLWCZX9TNaftQxLp6ot...

Here is the entire unexpurgated discussion including all the email messages and articles I had it analyze, if you want to see what it was considering (they're interesting discussions to read in their entirety if you're interested in that kind of stuff, but ChatGPT is excellent at summarizing them and highlighting the important and relevant points):

https://docs.google.com/document/d/1RTxeYjZ2vZsNU4xTj0ZhiXXj...

With the rise of LLMs I think now is a great time to reconsider the design of text and visual scripting and programming languages, with LLM interoperability and compatibility in mind.

ChatGPT:

Designing a text representation for a visual programming language that interoperates well with Language Models like GPT necessitates careful planning to ensure the representation is both meaningful and efficient. Here are several considerations and strategies to optimize the text representation:

Token Efficiency: The text representation should be concise to make efficient use of the token budget of the language model. Avoid verbose syntax and prefer shorthand notations where possible.

Syntax Simplicity: A simple and consistent syntax helps the language model predict and generate code more accurately. Overly complex syntax can lead to increased errors and confusion for both the model and the user.

Standardization: Using standardized syntax and well-known programming patterns can help the language model better understand and process the code since it has likely been trained on similar patterns.

Semantic Density: Each token should carry a significant amount of meaning. This can be achieved through the use of expressive keywords or operators that encapsulate complex actions.

Modular Design: Break down the text representation into modular components. This approach allows the language model to process each part in isolation, reducing the complexity and cognitive load.

Contextual Keywords: Use keywords that clearly indicate their function within the language, making it easier for the language model to provide contextually relevant suggestions.

Window Size Awareness: Design the text representation to be legible and understandable within a typical window size, avoiding lines that span too long and require horizontal scrolling.

Minimize Ambiguity: Ambiguity in text representations can lead to multiple interpretations by the language model, resulting in inaccurate predictions. Clear and unambiguous syntax helps maintain consistency.

Commenting and Documentation: Incorporate support for comments and documentation within the text representation. This not only aids human understanding but can also provide language models with additional context for generating code.

Escaping Visual Elements: For elements that are inherently visual and do not translate well into text, provide an escape mechanism that allows these elements to be described textually in a way that the language model can process.

Alignment with Model's Training Data: Align the text representation with the structures and syntax that the language model has been trained on. This can include following common coding standards and naming conventions.

Iterative Refinement: Use an iterative approach to refine the text representation based on feedback from testing with the language model, ensuring that the design evolves to better suit the model's capabilities.

By considering these aspects, you can create a text representation for a visual programming language that is both language model-friendly and conducive to an efficient programming experience.


Agreed. If you don't test your script in other shells and on other platforms, why advertise compatibility? And not just the script, but also the programs it calls.

I got burned by supposedly "portable" scripts during the whole Ubuntu dash disaster, and again when I started using Mac OS X, and then again once I used Cygwin and msys2 on Windows.

I do keep portability in mind when writing shell scripts to ease porting later, but without testing there's really no way to be sure "/bin/sh" is right. And some of the Bash features such as arrays are legitimately useful.


I favor POSIX sh myself, but Bash sits at a happy medium of portability and features; zsh might well win on features, but Darwin is the only OS I know of that installs it by default, whereas Bash is nearly universally installed by default on Linux distros and still has considerably more features than POSIX /bin/sh.


For scripts I'm with you on POSIX.

I was referring specifically to interactive use: ie why someone would install bash to use as their shell, when zsh is already there.


I use bash mostly out of habit and because it's installed by default, but also because zsh had some sort of incompatibility with some of my aliases that I never got around to debugging.


The BSD-licensed stuff isn't outdated; it's actually directly from FreeBSD 14.0 afaict. zsh is current (5.9) in Sonoma, for example.


Installing GNU coreutils etc. was also common on Solaris. However, as it usually wasn't in your PATH ahead of the SysV utils, it was only used via its full path.


And in case anyone wants to bring up GNU/kFreeBSD: that's officially dead as of July.


Tragically true:( But you can still install https://www.freshports.org/sysutils/coreutils/ on FreeBSD if desired.



