Bringing GNU Emacs to native code [video] (toobnix.org)
223 points by zeveb on May 4, 2020 | 83 comments


I tried this out on my system last night. Compile time was quite long (I would say about an hour and a half on my Ryzen 3600X). I use the DOOM Emacs config, and was surprised to find most things working out of the box with the native compilation. I noticed no difference in startup time. The speed boost was surprisingly noticeable, however, when e.g. opening a buffer that causes a language server to start. Opening CCLS was... instant. It's usually quite quick, but this was noticeably faster.

Great job all around!


Similar story for me (though unfortunately on a few-years-old i7, so the compile time was well over 2h).

I haven't had any stability issues with it, either, it just works.


does it do anything to change/fix Emacs shitting itself on large buffers?

last time i gave it a try opening a large file would completely kill performance, and iirc in particular really long lines (ex. 1000+ chars) would make the thing chug even if the actual file wasn't super big or anything.


Large buffers are fine - long lines are not.

If you don't have long lines but you are experiencing slowdowns on large buffers, your mode is doing something. As an example, large XML files bring my Emacs to a crawl, but if I switch to a text mode it's fine.

For long lines, I believe the latest Emacs has a fix: https://www.reddit.com/r/emacs/comments/ccoksw/solong_mitiga...


I've had good luck on big (multi-GB) SQL dumps, which often have very long lines, with

https://github.com/m00natic/vlfi


I think I remember Emacs 27 will include some improvements in that regard, even without native code compilation.


> in particular really long lines (ex. 1000+ chars) would make the thing chug

My uninformed guess is, this is bottlenecked by an inefficient algorithm that accesses the memory too much, i.e. it's nonlinear with respect to the number of characters. I doubt that tweaking the overall speed would fix it.


My experience is it's generally either

- inefficient font-lock regexes, which are rarely benchmarked / optimized since most files don't have long lines and because font-lock behavior is quite complicated to begin with;

- inefficient thing-at-point implementations / use, e.g. an O(n*n) thing-at-point called at various points by an O(n) function;

- modes using font-lock regexps where they really just need "fixed" styled text, but other Emacs architecture makes it difficult, e.g. compilation-mode and derivatives.
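The second bullet can be made concrete with a hypothetical sketch (my own illustration, not code from any real mode): calling an O(line-length) primitive once per character turns a single long line into quadratic work.

  ;; Hypothetical example: `thing-at-point' rescans the text around
  ;; point on every call, so invoking it for each character of a
  ;; 100k-char line does on the order of 100k * 100k character touches.
  (defun my/count-word-chars-slowly ()
    "Count characters that sit inside a word -- the slow way."
    (let ((count 0))
      (save-excursion
        (goto-char (point-min))
        (while (not (eobp))
          (when (thing-at-point 'word)   ; rescans around point each call
            (setq count (1+ count)))
          (forward-char 1)))
      count))

With short lines the rescans are cheap and nobody notices; with one long line every call pays the full line length.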


Emacs dev here. The primary cause is something else.

Emacs's inefficiency in handling files with long lines is due to two factors: (a) The primitive unit of work for the display engine (the code that determines how to combine text, font metrics, syntax highlighting, inline image display, etc.) is a line (a newline-delimited span of characters). (b) The redisplay routine is called very frequently, not just when the screen is repainted, but pretty much any time the screen location of a buffer element needs to be calculated, and so, e.g., during navigation. So Emacs is constantly, under the hood, going back to the previous newline and re-calculating how the buffer contents from that point forward should be rendered.


I’m not an Emacs dev so I’ll concede your expertise here, but if it is fundamental to the core navigation and redisplay, why does “stepping down” to simpler modes help so much? Fundamental and Text mode certainly aren’t great with long lines, but they are usually orders of magnitude better (e.g. a few seconds per command instead of tens of seconds).

(I have not tried so-long-mode and am mostly on Emacs 26 with some 25.)


What you're observing is caused by what I'm describing. The features of these other modes are expensive because they do things that implicitly invoke redisplay.

Consider an ostensibly simple operation like determining the column in which a particular character appears. The answer to that isn't the number of characters since the last occurrence of `\n`, because whatever modes are active can inject arbitrary spacing, or display particular strings as something shorter or longer (a trivial example is the mode that causes `lambda` to be displayed as `λ`), or cause some characters to be displayed in a non-roman font where a single glyph is more than one column wide, and so on. To determine the character's column accurately you need to take into account everything that could affect how you'd paint the screen at that position, i.e., run the whole redisplay loop for some portion of a buffer. The richer the set of active modes, the more expensive it is to do that and the more often it is done.
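This can be seen without any fancy modes at all; even a plain tab character decouples "column" from "characters since the last newline" (a hypothetical scratch-buffer example of mine, not from the Emacs sources):

  ;; In a buffer containing a TAB followed by "abc", with point at the end:
  (with-temp-buffer
    (insert "\tabc")
    (list
     ;; characters since line start:
     (- (point) (line-beginning-position))   ; 4
     ;; display column, which depends on `tab-width' (default 8):
     (current-column)))                      ; 11
  ;; => (4 11)

Add variable-width glyphs, `prettify-symbols-mode`, or invisible text, and the only correct answer goes through the display engine.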


Thanks for your work on Emacs :)


I routinely edit C# files in the 5k to 15k LOC range, and emacs works well for me. Occasionally the "smooth scrolling" feature stutters on large files but otherwise it's perfectly usable.


Are you opening it in fundamental mode? Without actually doing profiling, I'd guess it's syntax highlighting that's the real issue there. I've opened 100+ MB files without an issue, but that was on a box with an absolute ton of RAM.


Nah. IME it really is just line length.


Note: this is a video. Here are the slides: http://akrl.sdf.org/gccemacs_els2020.pdf


I'd recognize a LaTeX beamer presentation anywhere.

Here is the corresponding paper: https://arxiv.org/pdf/2004.02504.pdf


From org-mode too, as it has the pointless Outline page that org always inserts into beamer presentations.


It's not pointless, but the outline only shows up if the generated tex file is compiled twice. The default org to latex export will only compile it once.

Alternatively, it is easy to suppress the outline slide altogether (add "toc:nil" to "#+OPTIONS:").
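For reference, a minimal export header with the outline suppressed might look like this (the title and class lines are just illustrative):

  #+TITLE: My talk
  #+LATEX_CLASS: beamer
  #+OPTIONS: toc:nil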


Oh I know. I have toc:nil as my default because it annoys me.

It's just a bit of a giveaway as to how the file was made.


I have a Gentoo ebuild [0] almost working for this (I think I'm missing the step where the eln files are loaded before dumping the base image). The compile times are ... substantial. However they don't affect the development process for new code since the interpreter is always there, and I am excited to see what performance gains we will see.

From an engineering perspective this is an excellent example of a direct path from interpreted to compiled code. The trade-offs are clear (heck, they are numbered 0-3), and while there is complexity, all the engineering time has been effectively concentrated inside a single project, rather than forced upon tens of thousands of maintainers and users. Bravo. I wonder what other bytecode interpreters could benefit from this toolchain. Compile times below are from qlop.

On an Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz:

Without native compilation: `2020-05-03T15:33:27 >>> app-editors/emacs: 6′40″`

With native compilation: `2020-05-03T18:35:43 >>> app-editors/emacs: 2:11:45`

0. https://github.com/tgbugs/tgbugs-overlay/blob/master/app-edi...


nice, thanks for the ebuild!


As an exclusive Emacs user for the last 20 years, I'm quite excited; my main complaint about Emacs is how it slows down with some of the more sophisticated packages. I wonder if it improves magit performance with large codebases.

Pigs fly just fine with enough thrust.


One reason magit is slow (that last I checked still hasn't been fixed) is that it spawns a large number of processes calling out to git for each operation. Most of these are redundant and could/should go away with either a redesign or a working caching scheme. This issue is more than obvious on platforms (e.g. some macOS versions) where fork/vfork are not as fast as one would expect. I like the paradigm behind magit, but the implementation leaves a lot to be desired. So the pig in this case is definitely not Emacs.

Example:

  # Doing a simple magit refresh results in the following processes being spawned (tracked with dtrace)
  # Notice how many calls are completely redundant

  2020 May  4 12:46:22 11819 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11820 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11821 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11822 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11823 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11824 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11825 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11826 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11827 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11828 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11829 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11830 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11831 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11832 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11833 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11834 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11835 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11836 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11837 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11838 <67471> 64b  /opt/local/bin/git --no-pager -c core.preloadindex=true -c log.showSignature=false <...>
  2020 May  4 12:46:22 11839 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11840 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>
  2020 May  4 12:46:22 11841 <67471> 64b  /opt/local/bin/git --no-pager -c core.preloadindex=true -c log.showSignature=false <...>
  2020 May  4 12:46:22 11842 <67471> 64b  /opt/local/bin/git --no-pager -c core.preloadindex=true -c log.showSignature=false <...>
  2020 May  4 12:46:22 11843 <67471> 64b  /opt/local/bin/git --no-pager -c core.preloadindex=true -c log.showSignature=false <...>
  2020 May  4 12:46:22 11844 <67471> 64b  /opt/local/bin/git --no-pager -c core.preloadindex=true -c log.showSignature=false <...>
  2020 May  4 12:46:22 11845 <67471> 64b  /opt/local/bin/git --no-pager -c core.preloadindex=true -c log.showSignature=false <...>
  2020 May  4 12:46:22 11846 <67471> 64b  /opt/local/bin/git --no-pager --literal-pathspecs -c core.preloadindex=true -c <...>


This issue is more obvious on Windows, where process creation is very slow.

Emacs 26.1 should perform much better at process creation as it uses vfork on macOS instead of fork.

I don’t think that caching is an easy solution. Git can do a lot of weird things so heuristics are bound to be unreliable, brittle, or both. Furthermore the wins from caching are small for the typical users who use systems that are capable of starting processes in good time.


Magit is working on switching to libgit2 https://github.com/magit/magit/issues/2959 to fix this.


This issue has been open since 2017, and it's not a panacea since it comes with its own (substantial) tradeoffs, which is why, as far as I know, magit is not "switching to libgit2" but planning to offer libgit2 as an additional option. Also, for such an important feature, rather than going with FFI, it would be much better if Emacs linked with libgit2 (similarly to the improved JSON support in 27). That way the core Emacs development team would review all code and have a say in its design. That also implies magit being bundled with Emacs (or at the very least available in ELPA), which may not be realistic in terms of copyright assignment / code quality concerns.

Performance issues aside, a pure Lisp magit is the simplest and most universal setup, which is why an attempt to fix the process-spawning problems there should be made.


Why do I want the Emacs team reviewing libgit2 or Magit?


They would be reviewing the C interface to libgit2, not Magit itself, since -you know this- C code written either to work through FFI or as part of Emacs can potentially corrupt memory and destabilize Emacs in serious ways. Code written in Emacs Lisp, even low quality code, does not generally suffer from these issues.

So yes, I'd much prefer C code that Daniel Colascione / John Wiegley / Eli Zaretskii / Stefan / Paul .. have reviewed over newly-written FFI code that hasn't been through that thresher. Everyone that jumped on the emacs-libvterm train early knows what I mean.

I keep Emacs running for months at a time and it's the foundation of pretty much everything I do on a computer. Other than the OS, it generally is the most stable, continuously running, continuously stressed piece of software that I have ever used.


You can somewhat fix the macOS Magit perf problem --- at least the Aquamacs one --- by patching the vfork in the source. Doing that took me from keystroke-lag speed to regular interactive speed (but I've since switched to railwaycat Emacs, which I don't think has the problem, though maybe I just don't notice it).


This may have just been that you moved to railwaycat Emacs at the same time you upgraded past 26.1, which fixed the vfork thing on macOS.


Here's a caching patch that eliminates a few calls to git rev-parse per magit-diff/status etc operation that John Wiegley posted in one of the magit performance issues (modified to apply to recent magit): https://github.com/dandavison/magit/commit/08317454bf180d502...


that's probably one reason, but as far as I can tell, a large cost is formatting and colouring the master diff buffer when there are a lot of changes. When doing very large merges, it takes forever to refresh the buffer, but aborting whatever magit is doing will leave an uncoloured but perfectly usable buffer.


Also whenever somebody accidentally commits or stages node_modules. Seriously, had that happen to me before, and Magit choked on it.


Meaning, it's only really slow on macOS.


Multiple versions of macOS. And Windows. And possibly other less known systems.

Besides, why would a discerning engineer put up with magit creating all these wasteful processes knowing that most of them are redundant, regardless of performance feel? If I was a magit developer, I would surely try to fix this.


On Windows it's not as fast as Linux, but it's not terrible either (as far as I heard).

Also see https://stackoverflow.com/a/16902730/615245 (I think it's not needed with the latest versions of Git anymore, though).

> If I was a magit developer, I would surely try to fix this.

Since you're just a regular user, did you check out the issue tracker for related discussions?


Spawning processes on macOS is annoyingly slow :( I think the Linux kernel developers have put a lot more time into making sure forks complete much more quickly.


Here are some rough, non-scientific benchmark statistics.

It took me 124.98 minutes to build the native-comp branch with a 4-core/8-thread i7-4790K (32GB RAM) in an LXC instance, while the master branch took 244 seconds. Both branches were built from the latest available git snapshot as of 2020-05-04 20:20 CDT.

The following function is passed to (benchmark-run-compiled 10 ...) for each run.

  ;; -*- lexical-binding: t -*-

  (require 'cl-lib)

  (defun bf-1 ()
    (/ (apply '+
              (cl-loop repeat 300000
                       collect (cl-random 1.0)))
       300000.0))
gc-cons-threshold is set to 268435456 (256 MiB).

Before each run, (garbage-collect) function is called.

With Emacs's built-in core Lisp functions, the cl-lib functions, and the above benchmark function all native-compiled, it took 0.5823 seconds to complete.

With the Emacs built-ins and cl-lib native-compiled but the benchmark function byte-compiled, it takes 0.6411 seconds to complete.

With byte-compiled Emacs built-in/cl-lib/benchmark functions, it takes 1.3574 seconds to complete.

With byte-compiled Emacs built-in/cl-lib functions and an interpreted benchmark function, it takes 78.054 seconds, with 1 GC taking 75.094 seconds, which implies the execution itself takes roughly 2.96 seconds.
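A rough reconstruction of the harness, assuming `bf-1` as defined above (the exact invocation is my guess, not the original poster's):

  (setq gc-cons-threshold 268435456)   ; 256 MiB, as described above
  (garbage-collect)                    ; forced before each run
  (benchmark-run-compiled 10 (bf-1))
  ;; returns (total-elapsed-seconds gc-count gc-elapsed-seconds)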

I also ran the same benchmark on a 4-core A10-6800K and observed similar ratios on builds from 2020-05-03.


I posted the same content 5 days ago here: https://news.ycombinator.com/item?id=23021574

No reaction at all. I am really curious to understand what I did wrong at the time.


Nothing at all. You just lost the Hacker News lottery.


And to add insult to injury, your comment is being down-voted. It's a cruel web.


"Arguably the most deployed Lisp today?"

That's an interesting question. My other guess would have been Gimp, and based on a quick glance at Debian's popcon, I'd say Gimp might be slightly in the lead.

Then again, like Open Firmware deployed Forth on millions of computers, right under our noses, it would not surprise me at all if there were a simple Lisp implementation hidden on every computer in the world.

Is there anything more recent than DSSSL?


Looks really cool! One thing I noticed was the generated code seemed to have fairly poor register allocation, looking more like it was just pulling things straight out from locals into registers and immediately storing them back. From the talk, it looks like that was what was being provided to libgccjit, but surely it could optimize that further?


Just when I thought I knew enough compsci to understand everything at a sufficiently high level, this guy totally lost me starting from LIMPLE


Compilers are "just" a series of transformations / translations from higher-level code to lower-level code. The top is code like C, python, elisp, whatever, and the bottom is machine code for amd64, arm7, whatever. All the in-between code is in some intermediate representation (IR).

Each successive step takes care of different optimizations, modifying the code as it goes down. At the last step, he converts LIMPLE to an IR (intermediate representation) that libgccjit understands, and hands it off to gcc for native compilation.

Could you just start with elisp and emit amd64 machine code in one step? Absolutely, but it would be hell to maintain, and then you lose out on all the pluggability of modern compilers. If you (consume and/or) emit standard(-ish) IRs, you get to participate in a pretty amazing ecosystem.


Also, even if you do it in "one step", you'll want an intermediate data format such as SSA for the more efficient optimizations


Did anyone successfully compile it for x86 32-bit target? The docker image is 64-bit and the compilation seems to be hitting the gcc-i386 limit of 3GiB.


Very keen to try this out but that branch won't compile for me. What's the trick?


This seems convoluted compared to, say, moving the Lisp implementation from Emacs Lisp to Common Lisp, of which several native-code-compiling implementations exist.


Rewriting thousands of packages (some of which have had man-decades of work poured into them) would also be rather convoluted.


You don't need to rewrite packages if you have a layer where Emacs Lisp is compiled/interpreted as Common Lisp.

You can easily hack the readtable in Lisp, and rewrite Emacs Lisp sexps as Common Lisp sexps.
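As a tiny, hypothetical illustration of that readtable hack (a draft of mine, and note this naive version would break any symbol containing a `?`): one could teach the CL reader Elisp's `?a` character syntax, which evaluates to the character's code point.

  ;; Draft sketch only: make "?a" read as 97, as in Emacs Lisp.
  (set-macro-character #\?
    (lambda (stream char)
      (declare (ignore char))
      (char-code (read-char stream t nil t))))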

Some interesting things to consider are file-local variables, buffer-local variables, dynamic scope vs. lexical scope, and how to handle floating point (-0.0e+NaN in Emacs Lisp, custom floating-point rounding modes/traps in SBCL for example), but this looks like a reasonable approach overall.

For example (just a draft, this might be incorrect):

    USER> (defun make-local-variable (symbol)
            (eval `(define-symbol-macro ,symbol (bvar (quote ,symbol)))))
    MAKE-LOCAL-VARIABLE

    USER> (make-local-variable 'foo)
    FOO

    USER> (macroexpand-all '(list foo))
    (LIST (BVAR 'FOO))
    T
    T
The BVAR forms would then be able to access a buffer-local variable in the current buffer. This works with SETF too (but Emacs Lisp SETQ should be replaced by SETF during the transform).

BVAR could expand into code that calls FFI functions, for compatibility. Using an FFI approach would allow to progressively rewrite some parts of the runtime into CL.

Just to clarify, I know this is a huge work to undertake, but this comes from the assumption that we want to keep all existing Elisp files running identically.


>You can easily hack the readtable in Lisp

The funny thing about Lisp is that all things are easy, and yet other ecosystems offer more finished libraries.


Other ecosystems also become obsolete or dead within years counted on one hand. That’s perfect for today’s culture of disposable code though. Lisp isn’t good for that.


That's the curse right here. Anyone can easily make a half-baked solution to their problem, so those solutions end up as half-baked libraries, if they get released at all.


I would expect that an ELisp interpreter written in Common Lisp would likely be slower than the existing ELisp interpreter written in C.

As for compilation to Common Lisp, I don't know how feasible it would be. Apparently they tried something similar with Guile and it ran into problems.


Guile (the VM) runs elisp just fine. There has been zero optimization work done though so it is quite slow. The reason they did not go with guile was more political than anything else. It was also understandable from their point of view (and I say this as a guile weenie)


It was somewhat political, but it was more of an issue of manpower. Also, Guile is basically maintained by one developer (as far as I've heard). It's pretty dangerous for Emacs to rely on a project like that.


No, the reason is still slow string buffers. Politically everybody wants emacs to switch to guile.


IIRC Guile-emacs relies on Emacs for all string-related functionality. The overhead comes from dynamic bindings, which are currently slower in Guile's Elisp implementation. I haven't looked into it since talking to Robin about it in 2016.

You may have some more inside info, but back in 2016 I remember a lot of complaints, and that Guile was a liability for the much bigger Emacs project. Maybe even people threatening to retire as maintainers.


Emacs contains a quarter million lines of C. What do you do with all the elisp that calls into it? Maybe that could be turned into a "libemacs" and used from FFI.


The real trouble with using the Emacs C core from another language is that you have to use all the internal data structures of the C core, and these are exactly the Elisp data structures.

This implies you either have to convert everything back and forth, or you just can't use the native data structures of the new programming language (making the whole operation often quite pointless).

I must confess that most of the comments in this thread seem to start from the assumption that the people working on this over the last two-plus decades are probably dumb, and this is sad.


Or maybe you rewrite it in Common Lisp. If you're going to take the Common Lisp route, I think it might be worth going all in and just re-implementing Emacs in Common Lisp.


That's already been done. More than once, in fact. The problem is that, although there are perfectly serviceable Emacsen written in Common Lisp, none of them are GNU Emacs.

For example, there's Hemlock from CMUCL, and its descendants built into Lispworks and Clozure Common Lisp.

Those implementations aren't likely to work well as a substitute for GNU Emacs. For one thing, there's a substantial ecosystem of software that depends on specific APIs and other characteristics of GNU Emacs. Writing code to bridge GNU Emacs APIs with those available in the Hemlock descendants would be a lot of work.

There are some other obstacles as well. CCL's implementation of Hemlock uses the Cocoa text architecture, so it's not portable to platforms other than macOS.

The Lispworks implementation is portable across Windows and numerous UNIXEN, but the Lispworks license will not allow delivery of a proper substitute for GNU Emacs (it forbids building an application that can be construed as a Lisp development system, and when I pressed them about exactly what limits that policy implies, they explicitly used Emacs as an example of an application that would be forbidden).

The original CMUCL Hemlock and its portable version are designed to work with CLX. It could probably be made to work in a modern X environment, but making it fit well into modern GUI environments and porting it to all the platforms GNU Emacs works on would be a huge amount of work.

I'm probably overlooking some other Emacsen, but I don't know of any off the top of my head for which the situation is any easier.


Robert Strandh is working on a Common Lisp Emacs based on McCLIM, but McCLIM basically forces you to use Linux.


There's also lem, which is more portable: https://github.com/cxxxr/lem


It hasn't already been done, no. The Emacs variants you mentioned (that IMV should not be referred to as "Emacs") are from-scratch new implementations of just the concept behind Emacs, but not GNU Emacs itself. They are not compatible with GNU Emacs, they don't even support a small fraction of GNU Emacs features, and are thus doomed to obsolescence / non-existent market share.

So far, nobody -that I know of- has tried to move GNU Emacs to Common Lisp. GNU Emacs is a well-known, well-used platform and moving to Common Lisp could very well be worth the upfront costs.

Wishful thinking aside, the GNU Emacs development community has hunkered down behind Emacs Lisp, so improving it will bring immediate and tangible benefits to every GNU Emacs user. I love Common Lisp and do use SBCL for a lot of personal projects, but I also love GNU Emacs and I will take any improvements there that I can get. Watching Emacs Lisp become a more viable general purpose language (improved performance means an expanded set of problems that it can now address) is a great development. Stefan Monnier on emacs-devel:

  The main benefit of such a compiler is not to run existing Elisp
  code faster (99% of existing Elisp code runs fast enough that the
  user won't notice if it runs faster) but to make it practical to
  write other Elisp code which would otherwise be too slow. But for
  that to work well, you want the new compiler to be
  available "everywhere", rather than just on some platforms. So it
  will only start being useful when it works on GNU/Linux, macOS,
  Windows, ARM, RISC-V, x86, amd64, MIPS, younameit (AFAIK
  lbgccjit's CPU coverage is already good enough for that, so the
  main barrier here is the OS support), and when it's not just an
  option at compile-time but when it's included in all builds.


You appear to consider "Emacs" and "GNU Emacs" synonymous. I explicitly do not.

Hemlock is not a reimplementation of GNU Emacs; it's derived from earlier Emacsen, just as GNU Emacs is. Hemlock in particular is based on ZWEI, and was in version 0.99 around the time that work started on the GNU Emacs project. It's a slightly older sibling of GNU Emacs, rather than a descendant of it.

GNU Emacs has undoubtedly become the de facto standard Emacs, but I don't see why that disqualifies other branches of the Emacs family tree from using the name that all of them inherit from the venerable TECO implementation.

My point was that Emacsen have already been built in Common Lisp, but that they do not serve as substitutes for GNU Emacs--a point with which you appear to agree.

Maybe we even agree on the reason why: because no Emacs implementation can substitute for GNU Emacs without supporting its ecosystem. The amount of work needed to do that would be enormous, whether the proposed substitute is written in Common Lisp or in something else.


Sure, I just wanted to clarify and make it very clear that GNU Emacs in CL hasn't even been attempted since that was what the poster you originally replied to, equated "Emacs" with.

As you said, the value lies primarily in the ecosystem. Common Lisp is by far the better language, but GNU Emacs is by far the most empowering/practical environment. I don't think enormous work would be needed, there are plenty of powerful shortcuts one could take (some ideas are in this thread) and Emacs Lisp can be made to run on top of Common Lisp but it's safe to say that for whatever reasons the will/intent is simply not there.

Emacs Lisp is what we have to go on with and it's nice to see that it is not stagnating.


Primitive Emacs Lisp functions are defined in C, yes, using helper C macros. There might be a way to extract the symbol names from those macros, and maybe enough information to generate the FFI bindings.

Any kind of rewrite of Emacs is going to be a lot of work, I am not denying that. But if we assume there is a layer where everything works as it currently does (the C primitives), the Emacs Lisp defined on top of that needs not be defined in C, as far as I know.

Also, the last time I checked, there was a ton of legacy stuff in the configure script for different platforms.


I wonder whether a modern, modularized architecture might be the better way to handle this. Similar to Neovim, where you have a backend which communicates with a detached frontend and other backends over some RPC mechanism.

Refactor the existing Emacs to work headless and become the legacy core, while also establishing channels to attach any other core and frontend.


https://github.com/xi-editor is an example of the architecture you're referring to I believe. Xi's README makes it clear that very high performance is one of the main goals: in particular it mentions the latency between keypresses and screen painting. To what extent does this RPC-based architecture limit performance compared to a single-process architecture? I know that the original author of Xi very much knows what he is doing, so I guess the answer is "it doesn't" -- I was just hoping to be educated on this point.


An important optimization discussed in the video is to optimize through calls into C code (i.e. elisp->C->elisp). Adding RPC at the boundaries would go in the opposite direction.


It's optimizing for a single process vs. multiple processes.

Additionally, when you free up your architecture and loosen the internal bindings between different areas, you sometimes open chances for significant improvements which were not possible before, and can thus still be faster with a "worse" solution.

Considering how old GNU Emacs is, how many historical problems it has, and how often you hear that this and that can't be optimized for this and that reason, this seems like such a situation, where loosening up would open faster roads long term.


Well, this is not to my taste, but I take it as an existence proof that an RPC at the boundary could make for some amazing things:

https://github.com/rhysd/NyaoVim


It brings GCC into the address space via libgccjit. GCC is not robust enough to be integrated into applications that stay running, and be repeatedly invoked.


I don't think that is how libgccjit is being used in this project. Emacs is not being linked with libgccjit. libgccjit is only being used for AOT compilation.


I assume you mean writing an Emacs lisp interpreter in Common Lisp — which could be done with a small lexer, a lot of macros, and some library support. That would probably be a win.


Depends on how difficult it is to integrate Common Lisp into Emacs. Or would you just be rewriting Emacs in Common Lisp?


The issue is the importance of all the existing packages — invalidate them and nobody will use your implementation.

There are small syntactic differences between Emacs lisp and Common Lisp so a lexer that handled those would suffice.

There are some small semantic differences but they could be handled by porting code. And the Emacs c code could be rewritten in lisp or ported into the Common Lisp implementation, depending on your taste.


The existing package ecosystem is one of the main value propositions of Emacs. Otherwise a re-implementation with a different extension language could have replaced it a long time ago.

> There are some small semantic differences but they could be handled by porting code.

I suspect there are more than just small differences. And someone would have to do the work to port the code. I suspect that if packages that people use don't work, they will just avoid using the new version of Emacs, as opposed to abandoning the package.

It's not an impossible project, but it's not clear to me it's easier/better than the proposed solution.


The point of my suggestion was to not have to rewrite any existing .el files.


You don't think it would be difficult to keep compatibility with all of the existing Emacs Lisp packages that exist now?



