
Forcing a panic when 2 routes are matched seems counterintuitive to me (versus, idk, every other web framework which uses the first-to-be-registered route that matches).

Are there go-specific reasons for that? The edge case of "you might register HTTP routes in a bunch of places and it's harder to find that if multiple routes match" seems like something that could be worked around with tooling.

I've (ab)used the behavior of "first to be matched wins" a ton in my career - there's been a bunch of "business case" times when I've needed like `/foo/bar` to be one registered route, and `/foo/{id}` to be another.



You can't guarantee the order of registration will always be the same, so it's really undefined behavior. This is how the original ServeMux was designed and implemented, and they felt that was useful behavior to continue to support.

From the design proposal[1]:

  > Using specificity for matching is easy to describe and
  > preserves the order-independence of the original ServeMux 
  > patterns. But it can be hard to see at a glance which of 
  > two patterns is the more specific, or why two patterns 
  > conflict. For that reason, the panic messages that are 
  > generated when conflicting patterns are registered will 
  > demonstrate the conflict by providing example paths, as in 
  > the previous paragraph.
See also the background discussion[2]:

  > The semantics of the mux do not depend on the order of the 
  > Handle or HandleFunc calls. Being order-independent makes 
  > it not matter what order packages are initialized (for 
  > init-time registrations) and allows easier refactoring of 
  > code. This is why the tie breakers are not based on 
  > registration order and why duplicate registrations panic 
  > (only order could possibly distinguish them). It remains a 
  > key design goal to avoid any semantics that depend on 
  > registration order.
[1]: https://github.com/golang/go/issues/61410
[2]: https://github.com/golang/go/discussions/60227


This seems like an issue caused by registration at a distance. I'm new to Go and it is one of the things that felt most wrong.

I'm used to a router where you define a big list of routes all in one place, maybe in some DSL. That's very easy to refactor, and doesn't have any ambiguity. If a library wants a route registered it puts a snippet in the docs for users to copy.

Go in general seems to value clarity over magic at the expense of verbosity, and this is a puzzling exception.


It’s not an exception. There is no magic.

Large code bases are large, and not every program is your simple CRUD app with a handful of endpoints that can be meaningfully managed in a single function.


Hard panic in a "not simple app with large codebase" app because you couldn't match a route is amateur hour.


I think you're assuming that the panic happens when a request is received, but it actually happens when a conflicting route is registered.

To me, that's reasonable behavior and is consistent with other things such as https://pkg.go.dev/regexp#MustCompile
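
To make the timing concrete, here is a minimal sketch (the `/status` pattern and the `panicValue` helper are made up for illustration). It uses a duplicate registration, the simplest kind of conflict, and suggests that both `regexp.MustCompile` and `ServeMux.Handle` fail during setup, before any request exists:

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
)

// panicValue runs f and returns the value it panicked with, or nil if f
// returned normally. It is a hypothetical helper for this demonstration.
func panicValue(f func()) (v any) {
	defer func() { v = recover() }()
	f()
	return nil
}

// compileBadRegexp mimics a package-level `var re = regexp.MustCompile(...)`
// whose pattern is invalid.
func compileBadRegexp() {
	_ = regexp.MustCompile("(") // unbalanced parenthesis
}

// registerDuplicate registers the same pattern twice on a fresh mux.
func registerDuplicate() {
	mux := http.NewServeMux()
	mux.Handle("/status", http.NotFoundHandler())
	mux.Handle("/status", http.NotFoundHandler()) // panics here, at registration
}

func main() {
	// No server is listening and no request has been made at this point.
	fmt.Println("MustCompile panicked:", panicValue(compileBadRegexp) != nil)
	fmt.Println("Handle panicked:", panicValue(registerDuplicate) != nil)
}
```

In a real program nobody would recover like this; the point is only that the failure surfaces when the routing table is built, not when a path is requested.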


I think you are vastly oversimplifying what has been said or why it works the way that it does. I really recommend you read the design document and reasoning, as it is rather clear why they are doing it the way that they are. If you have a cogent contribution to the discussion, please do share.


Others have already contributed more than enough.

People's willingness to defend any and all of Go's dubious decisions is really baffling.


Registering handlers is a startup activity ... Practically the only time I allow a panic in my code. I also have unit tests for the entire startup sequence so theoretically code with ambiguous handlers would never be committed.


Go's flag, log, image, and sql packages all work this way, so it's not just the HTTP router. I sort of see both sides. Having implicit registration makes it very easy to have different teams working on different packages and you just import the package and it registers itself. But it also makes the behavior of the resulting final binary hard to understand and based on implicit code instead of explicit. I personally try to just have one big routes() http.Handler function that returns everything all in one place, but I get why that isn't always practical.
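
For what it's worth, a rough sketch of the "one big routes() function" approach could look like the following. The paths and handler bodies are invented for illustration; the only point is that the whole table is built in one place and can be exercised in memory with `httptest`:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// routes builds the whole routing table in one place.
// The paths and responses are hypothetical.
func routes() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	mux.HandleFunc("/users/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "users")
	})
	return mux
}

// get sends an in-memory GET request to h and returns the status and body.
func get(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := routes() // a conflicting registration would panic on this line
	fmt.Println(get(h, "/healthz"))
	fmt.Println(get(h, "/users/7"))
	fmt.Println(get(h, "/missing"))
}
```

Calling `routes()` from a test is then enough to catch a registration panic before anything is deployed.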


It doesn't feel to me like much of a gain over having said team expose a specific default self registration function. Then that act can be explicit rather than action at a distance on some dependency defined global state.

I think this pattern in the standard library is a mistake.


Yeah, the trade-off is literally:

   import _ "thing"
Vs:

   import "thing"
   thing.Register()
But one uses a strange construct to save a single line, loses the ability to control order, and encourages people to use globals that they can't control.


I very much dislike import side effects in any language. I'm a lot happier where thing.Register() is still forced to happen in the main function.

Still, I can understand it for some components like loggers, to avoid adding boilerplate to every library. However, I was very uneasy to see that enabling gzip decompression in a gRPC server is done through a magic _ import. You have to initialize the server anyway, so why not just make it an explicit function argument?


thing.Register() may register a route that conflicts with yours. If declaration order decided the winner, one of the two routes would never be matched. You might discover this too late, either after that thing has missed important calls or after requests intended for it have caused unintended effects on your first-declared route.


That is all also true when doing the same thing in an init func.


Well, yeah, with the "implicit registration" model you can either split it up or do it in a central place, whereas if they would require you to do it in a central place, you wouldn't have that choice. It probably boils down to "library" vs. "framework" - Go's standard library doesn't want (as much as possible) to be a framework that forces you to do things its way, and as far as I'm concerned that's ok...


How is this new to Go? Almost all Java frameworks that rely on annotations do registration at a distance. Controllers are routinely declared in different packages and even injected from dependency libraries.


I agree with most of this post. The one exception in my experience: Vertx. I never saw any annotations for routing. It is a very verbose library, but that means there is no black magic.


> Are there go-specific reasons for that?

In general, the reason you use a compiled / typed language like golang at all (instead of, say, perl) is to "left shift" your bugs: A bug caught when you first spin up your application is better than a bug caught after a corner case acts up in the wild, and a bug caught when you compile is better than a bug caught when you first spin up your application.

I recently ran across a bug in a side project of mine (using gorilla/mux) where I had accidentally made overlapping routes. If the router had panicked instead of running, I would have been nudged to refactor the URLs and completely avoided the bug.


Wouldn’t the “left-shift” be a compiler error instead of a panic?


That's where the focus usually is, but aborting at startup is the next best thing. It's an undervalued technique.


This particular issue is undecidable at compile time in the general case, so panicking on registration (i.e., preventing the process from starting at all) is the next-most visible and noisy error option.


It cannot be decided at compile time, but it provides the basis for a build-time "error". It's enough to have a test case that registers your routes, and the bug is discovered potentially even before you commit.


Ideally.

But that's impractical, so the next soonest time is at startup.


Go as a language is generally not sophisticated enough to do that sort of thing.

A counter- and/or example of this is what Service Weaver does to try to bomb builds when generated files are older than their dependencies.


I'm having trouble thinking of a mainstream production language with a stronger type system that could make a compile error out of the string argument passed to an HTTP router; can you think of one? Or of a non-string-typing for routes that solves the same problem, again in something people ordinarily use to deploy to production?


> that could make a compile error out of the string argument passed to an HTTP router

For example, Phoenix Verified Routes in Elixir: https://hexdocs.pm/phoenix/Phoenix.VerifiedRoutes.html


OCaml can definitely do it (for example, you get a compiler error if you pass the wrong arguments to a `printf` where the format string specifies, say a number, but you pass in a string).

Rust can very likely do it by leveraging their `build.rs` stuff to parse and validate call sites of the registration and parameters.

Zig can probably do it with their comptime stuff.

In theory, Go could do the same (but that would mean special-casing the `net/http` handler registration in the compiler). At least `go vet` is smart enough to yell at you about wrong format string arguments.


TypeScript's type system is known to be Turing complete, and people have implemented things such as sorting/tree-walking just using types (i.e. during compile time) [1].

I'd imagine something like this should be possible as well, but I'm not sure it would be worth it, considering the effort it would take to implement.

[1]: https://twitter.com/anuraghazru/status/1511776290487279616


I'm inclined to suggest that TypeScript could make compile errors of ambiguous routes, though I don't see any obvious way without an explicitly referenced aggregate type for existing routes. So perhaps not if routes are initialised implicitly like this.

Template type strings with inference would also allow you to parse the strings in the type system.

I cannot imagine wanting to build my routes table implicitly through import graph rather than having a specific place to aggregate.


Most languages with macro or templating should be able to define this behavior as a library. Rust and C++ come to mind, for example.


Can you provide an example of a Rust HTTP routing library that can generate a compile-time error for overlapping routes?


I don't know of one, but you seemed to doubt not that it's actively being done, but that it's even possible, which is a very different proposition.

Remember C++ actually implements checks at compile time for the modern std::format function. That is, if you mess up the text of a format string so that it's invalid, C++ gives you a compile time error saying nope, that's not a valid format.

You might think that's just compiler magic, as it is for say printf-style formats in C, but nope, works for custom formats too, it's (extremely hairy) compile time executed C++.


> "left shift" your bugs

<< bug

Am I doing it right?


ug


> Are there go-specific reasons for that? The edge case of "you might register HTTP routes in a bunch of places and it's harder to find that if multiple routes match" seems like something that could be worked around with tooling.

The panic is the tooling. It ensures that you can’t write code that becomes ambiguous and hard to reason about, or that becomes dependent on something random like the order code is initialised in, which may change for completely arbitrary reasons (for example, renaming a file when you're taking advantage of Go’s special file-level init func).

> I've (ab)used the behavior of "first to be matched wins" a ton in my career - there's been a bunch of "business case" times when I've needed like `/foo/bar` to be one registered route, and `/foo/{id}` to be another.

It’s pretty trivial to write a handler that accepts the base path, and then routes to a different set of handler functions. Then the routing is clear and explicit.

Ultimately if you want the behaviour not in the stdlib, plenty of other libraries exist out there with more functionality.

The Go stdlib, and the language in general, has always skewed towards conservative behaviour in the face of possible ambiguity. Something I’ve always appreciated, because it means it’s easy to build a strong and accurate intuition of how the stdlib works. You rarely find yourself in situations where what you think is the “obvious” answer isn’t the correct answer, simply because your idea of “obvious” doesn’t perfectly align with the author’s idea of “obvious”. Being able to quickly and accurately read and understand code is far more valuable than saving a handful of seconds when writing it.


> It’s pretty trivial to write a handler that accepts the base path, and then routes to a different set of handler functions. Then the routing is clear and explicit.

It seems a little pointless to have route handling functionality that requires more route handling for pretty normal cases.


Almost all the examples provided as "pretty normal cases" of having overlapping path registrations are already handled without panicking by the path precedence rules. All the cases where you have a wildcard path + handlers for specific variants of the wildcard values, are handled as expected without panics. Only the rather extreme corner case of having two wildcard paths, with wildcards in different locations, but still matching the same path, results in a panic.

Honestly, if you're running into the second case above, I would question the wisdom of whatever it is you're attempting to do, because reasoning about the behaviour is unlikely to be clear and obvious if registration order is the only differentiator.


I believe it allows `/foo/bar` and `/foo/{id}` as the first one is more specific and has precedence. This looks fine to me.

Looks like it will panic in case you have `/foo/{id}/delete` and `/foo/bar/{action}`. /foo/bar/delete will match both, none is more specific so it panics. Feels reasonable. Having a first one wins precedence might be better though.


The least surprising behavior would be matching `/foo/bar/{action}`, since we're dealing with path elements separated by slashes, and the precedence case already exists otherwise.


I would expect and prefer the longer literal prefix to match. If the goal is to not have so many different routers in use, having some options like this case would be better than having one opinionated Go-way, which is how we end up with many libraries to fill common gaps.


Na it’s dumb. Bar is very specific. It should be picked.


It's a 404 Not Found or 500 Server Error.

HTTP server panicking is never reasonable


The panic happens at handler registration, i.e. binary startup, not while performing path matching. Having literally any tests in your application, literally anything that executes your binary before you ship it, will tell you there's a problem. Your code simply won't be able to start serving traffic if there's path routing ambiguity.

If you've managed to ship code that panicked during execution due to path routing ambiguity, then honestly your code is probably so riddled with bugs this is going to be the least of your issues.


Getting an error instead of the incorrect data because the route that was actually run was another one is pretty good for debugging. I have actually run into this with Django: one route was missing the ending "$" in its regex.


I get the frustration, but URL routing correctness is fairly trivially tested in Django and other frameworks to work around this issue, whereas it sounds like the matching algorithm here simply does not support certain common use-cases.


What common use-cases does it not support?


The `/foo/bar` and `/foo/*` use-case, where you want the first to go to a special page and the latter to go to some regular page. Perhaps the former is hard-coded/static and the latter looks up some URL parameter in a database.

You can of course always implement this within `/foo/*`, but then you're implementing your own URL routing and working around the framework rather than working inside the framework.

This feels to me like it's a change designed for APIs, not a change designed for user-facing websites. For APIs URL structure is often well defined in a technical sense, and this sort of use-case is rare. For a user-facing site however there are UX concerns, marketing concerns, SEO concerns, all sorts of reasons why some lovely, technically correct REST-style URL structure just won't work in practice. Unfortunately I know this from experience.


This works; the panic is only triggered when the precedence rules can't resolve a conflict.

The behaviour documented here is actually extremely sensible and is best practice for anyone calling themselves a software engineer. Fail as early as possible, with as clear a message as possible.


But there is a precedence to pick for the panic case too.


I'm pretty sure that example worked even before the recent changes. From the documentation:

> Longer patterns take precedence over shorter ones, so that if there are handlers registered for both "/images/" and "/images/thumbnails/", the latter handler will be called for paths beginning with "/images/thumbnails/" and the former will receive requests for any other paths in the "/images/" subtree.


The two routes `/foo/bar` and `/foo/*` are not ambiguous and would be allowed by the router


Are they? It seems that the latter matches the former. i.e. for the path `/foo/bar` there are 2 matching routes. One must take precedence, but this router doesn't appear to allow that.

This may just be a syntax thing, I was being loose with syntax, and meant that the `*` would match anything for the purposes of this example.


It does allow that, the one without wildcards is more specific so it'll pick that one


This pattern already works and would work with this change, you'd just need to define `/foo/bar` first.


I understood what you were saying but it might be easier to read if you escaped the * characters.


Ha! Whoops! Thanks


The panics are really annoying. Sometimes, you generate routes dynamically from some data, and it would be nice for this to be an error, so you can handle it yourself and decide to skip a route, or let the user know.

With the panic, I have to write some spaghetti code with a recover in a goroutine.


What kinds of routes would you generate dynamically that couldn't be implemented as wildcards in the match pattern? Genuine question


> or let the user know

Many are misunderstanding when the panic happens. It does not happen when the user requests the path, it happens when the path is registered. The user will never arrive at that path to be notified. You will be notified that you have a logic error at the application startup. It can be caught by the simplest of tests before you deploy your application.


Mmmm, code with recover is just a valid code. Calling it spaghetti seems unjustified.


I'll always take a footgun with a sizable bang over a sinister undefined behavior.

Debugging the latter is much harder.


If it panics, you discover it instantly and can fix. If it didn’t panic, you could unknowingly deploy and rely on behavior you did not know about until it is an issue.


It's a programmer's error, so it should lead to a panic. There is simply no reason to ever handle this at runtime (as opposed to network errors or other things that can go wrong during normal operation and should be handled at runtime).


Not always, I can imagine a (weird, for sure) scenario where routes are a part of configuration, or are added dynamically at runtime.

There’s `recover` of course, but I see no harm in returning an `error`, especially given that `errors.Join` is a thing now, so one doesn’t need to copy-paste ifs.

The only reason not to is keeping the function signature intact.


And I see no harm in making this extremely weird use case less user-friendly and forcing you to go through panic/recover, rather than forcing everyone else, who will never get errors, to handle errors.


It's not "extreme weird", just significantly less common, so just weird, but nowhere near extreme. Dynamic routing isn't exactly unheard of, and isn't some sort of perversion - servers like Traefik and Caddy do this (except that they don't use stdlib mux, of course). Basically any DIY proxy with dynamic service discovery may need it, and while people typically pick an off-the-shelf solution, some write their own lightweight one.

And panic/recover is not idiomatic Go here, as it's not an exceptional situation, just an error.

YMMV, of course.


Do they, though? First, both of them (and, AFAIK, all other routers) need to reload, so for all intents and purposes the routing table is static: you cannot add routing rules at runtime. (Their apparent "dynamism" comes from sending SIGHUP or reloading the process to put the new rules in place.)

Secondly, and most importantly, they don't error. Caddy will simply match the last declared rule, while Traefik has a system of priority labels, and yet for conflicting rules with the same priority it will again match the last one (or, even worse, a random one). Both of these are very dangerous because you may realise late (from statistics logs) that you are routing the wrong requests to the wrong handler, and have already missed countless calls that should have gone to the first declared one.


> I've (ab)used the behavior of "first to be matched wins" a ton in my career

Even if the framework could guarantee the order of registration, this source-order heuristic is easy when you work alone or in a small team. Good luck guaranteeing order of routes in a large project with many teams.

Actually forcing you to not depend on things that can cause hidden bugs is a good design decision.


This comment is interesting to me, because in Clojure there is a data-driven routing library called Reitit, and by default it will refuse to compile conflicting routes unless you explicitly tag the conflicting routes as conflicting. So Golang's new behavior is more intuitive to me than the current behavior.

I have also run into situations where I needed a routing tree like

  [["/foo/bar"]
   ["/foo/:id"]]
but it's better IMO for routing libraries to force the user to acknowledge they are introducing conflicting routes rather than silently resolve conflicts. That way the user is forced to understand the behavior of the router.


I feel conflicted on it. I've abused routes in that way, but I've also been confused or encountered bugs because I didn't notice the conflict.


I would have expected the last one to override the first when registering routes. That seems to be the behavior I see most (not specific to web servers).

Given opposite expectations, erroring out makes sense, but a panic? Does that mean it crashes the whole web server when a client first accesses it, when you launch the server, or does it return a 500 to the client?


When you launch the server. Idea is that you want to catch errors as early as possible in the development process, and by crashing the server, the programmer that wrote it will catch it.



