
I never liked encountering code that chains function calls together like this:

    email.bulkSend(generateExpiryEmails(getExpiredUsers(db.getUsers(), Date.now())));

Many times it has confused my co-workers when an error creeps in: where is the error happening, and why? Of course, this could just be because I have always worked with low-effort co-workers; hard to say.

I have to wonder if programming should have kept Pascal's distinction between functions that only return one thing and procedures that go off and manipulate other things and do not give a return value.

https://docs.pascal65.org/en/latest/langref/funcproc/



> email.bulkSend(generateExpiryEmails(getExpiredUsers(db.getUsers(), Date.now())));

What makes it hard to reason about is that your code is one-dimensional: you have functions like `getExpiredUsers` and `generateExpiryEmails` which could be expressed as compositions of more general functions. Here is how I would have written it in JavaScript:

    const emails = db.getUsers()
        .filter(user => user.isExpired(Date.now()))  // Some property every user has
        .map(generateExpiryEmail);  // Maps a single user to a message

    email.bulkSend(emails);

The idea is that you have small but general functions, methods and properties and then use higher-order functions and methods to compose them on the fly. This makes the code two-dimensional. The outer dimension (`filter` and `map`) tells the reader what is done (take all users, pick out only some, then turn each one into something else) while the inner dimension (`isExpired` and `generateExpiryEmail`) tells you how it is done. Note that there is no function `getExpiredUsers` that receives all users; instead there is a simple and more general `isExpired` method which is combined with `filter` to get the same result.

In a functional language with pipes it could be written in an arguably even more elegant design:

    db.getUsers() |> filter(User.isExpired(Date.now())) |> map(generateExpiryEmail) |> email.bulkSend

I also like Python's generator expressions which can express `map` and `filter` as a single expression:

    email.bulk_send(generate_expiry_email(user) for user in db.get_users() if user.is_expired(datetime.now()))


I guess I just never encounter code like this in the big enterprise code bases I have had to weed through.

Question. If you want to do one email for expired users and another for non expired users and another email for users that somehow have a date problem in their data....

Do you just do the const emails =

three different times?

In my coding world it looks a lot like doing a SELECT * FROM users WHERE isExpired < Date.now

but in some cases you just grab it all, loop through it all, and do little switches to do different things based on different isExpired.


> If you want to do one email for expired users and another for non expired users and another email for users that somehow have a date problem in their data....

Well, in that case you wouldn't want to pipe them all through generateExpiryEmail.

But perhaps you can write a more generic function like generateExpiryEmailOrWhatever that understands the user object and contains the logic for what type of email to draft. It might need to output some flag if, for a particular user, there is no need to send an email. Then you could add a filter before the final (send) step.
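
A minimal Python sketch of that idea. The `User` fields, the message texts and the rule that active users get no email are all assumptions made up for illustration; `None` plays the role of the "no need to send" flag.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class User:
    name: str
    expiry: Optional[datetime]  # None stands in for a "date problem" (assumption)


def draft_email(user: User, now: datetime) -> Optional[str]:
    """Decide which email, if any, this user should get; None means no email."""
    if user.expiry is None:
        return f"Hi {user.name}, please contact support about your account dates."
    if user.expiry <= now:
        return f"Hi {user.name}, your account has expired."
    return None  # assumption: active users need no email


def draft_all(users: list[User], now: datetime) -> list[str]:
    drafts = (draft_email(user, now) for user in users)
    # The filter before the final (send) step drops the "no email" cases.
    return [email for email in drafts if email is not None]
```

The send step then only ever sees real drafts, and the decision logic lives in one place.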


Since we're just making up functions...

    myCoolSubroutine = do
      now <- getCurrentTime
      users <- getUsers
      forM users (sendEmail now)

    sendEmail now user =
      if user.expiry <= now
        then sendExpiryEmail user
        else sendNonExpiryEmail user
The whole pipeline thing is a red herring IMO.


What language is this?


Looks like Haskell


> Question. If you want to do one email for expired users and another for non expired users and another email for users that somehow have a date problem in their data....
>
> Do you just do the const emails =
>
> three different times?

If it's just two or three cases I might actually just copy-paste the entire thing. But let's assume we have twenty or so cases. I'll use Python notation because that's what I'm most familiar with. When I write `Callable[[T, U], V]` that means `(T, U) -> V`.

Let's first process one user at a time. We can define an enumeration for all our possible categories of user. Let's call this enumeration `UserCategory`. Then we can define a "categorization function" type which maps a user to its category:

    type UserCategorization = Callable[[User], UserCategory]

I can then map each user to a tuple of category and user:

    categorized_users = ((categorize(user), user) for user in db.get_users())  # type Iterable[tuple[UserCategory, User]]
Now I need a mapping from user category to processing function. I'll assume we call the processing function for side effects only and that it has no return value (`None` in Python):

    type ProcessingSpec = Mapping[UserCategory, Callable[[User], None]]
This mapping uses the user category to look up a function to apply to a user. We can now put it all together: map each user to a pair of the user's category and the user, then for each pair use the mapping to look up the processing function:

    def process_users(how: ProcessingSpec, categorize: UserCategorization) -> None:
        # Pair each user with its category; map(categorize, ...) alone would drop the user
        categorized_users = ((categorize(user), user) for user in db.get_users())
        for category, user in categorized_users:
            process = how[category]
            process(user)

OK, that's processing one user at a time, but what if we want to process users in batches? Meaning I want to get all expired users first, and then send a message to all of them at once instead of one at a time. We can actually reuse most of our code because of how generic it is. The main difference is that instead of using `map` we want to use some sort of `group_by` function. There is `itertools.groupby` in the Python standard library, but it's not exactly what we need, so let's write our own:

    from collections import defaultdict

    def group_by[T, U](what: Iterable[T], key: Callable[[T], U]) -> Mapping[U, list[T]]:
        result = defaultdict(list)
        # When we try to look up a key that does not exist defaultdict will create a new
        # entry with an empty list under that key
        for x in what:
            result[key(x)].append(x)
        return result
Now we can categorize our users into batches based on their category:

    batches = group_by(db.get_users(), categorize)

To process these batches we need a mapping from category to a function which processes an iterable of users instead of just a single user:

    type BatchProcessingSpec = Mapping[UserCategory, Callable[[Iterable[User]], None]]
Now we can put it all together:

    def process_batched_users(how: BatchProcessingSpec, categorize: UserCategorization) -> None:
        batches = group_by(db.get_users(), categorize)
        for category, users in batches.items():
            process = how[category]
            process(users)

There are quite a lot of small building block functions, and if all I was doing was sending emails to users it would not make sense to write these small functions that add indirection. However, in a large application these small functions become generic building blocks that I can use in higher-order functions to define more concrete routines. The `group_by` function can be used for many other purposes with any type. The categorization function was used for both one-at-a-time and batch processing.
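
For anyone who wants to run the batched version, here is a self-contained sketch. It makes a few assumptions that are not in the snippets above: users are passed in as a list instead of coming from `db.get_users()`, the `User` fields and the two categories are invented, and "sending" just appends to a list. It uses `TypeVar` rather than the newer generic syntax so it also runs on Python versions before 3.12.

```python
from collections import defaultdict
from collections.abc import Callable, Iterable
from dataclasses import dataclass
from datetime import datetime
from enum import Enum, auto
from typing import TypeVar

T = TypeVar("T")
K = TypeVar("K")


class UserCategory(Enum):
    EXPIRED = auto()
    ACTIVE = auto()


@dataclass
class User:
    name: str
    expiry: datetime


def group_by(what: Iterable[T], key: Callable[[T], K]) -> dict[K, list[T]]:
    result: dict[K, list[T]] = defaultdict(list)
    for x in what:
        result[key(x)].append(x)
    return dict(result)


def process_batched_users(
    users: Iterable[User],
    how: dict[UserCategory, Callable[[list[User]], None]],
    categorize: Callable[[User], UserCategory],
) -> None:
    for category, batch in group_by(users, categorize).items():
        how[category](batch)


def demo() -> list[tuple[str, list[str]]]:
    now = datetime(2024, 1, 1)
    users = [
        User("ann", datetime(2023, 6, 1)),
        User("bob", datetime(2025, 1, 1)),
        User("cy", datetime(2023, 12, 31)),
    ]
    sent: list[tuple[str, list[str]]] = []  # records "sends" instead of emailing
    how = {
        UserCategory.EXPIRED: lambda batch: sent.append(("expiry", [u.name for u in batch])),
        UserCategory.ACTIVE: lambda batch: sent.append(("reminder", [u.name for u in batch])),
    }

    def categorize(user: User) -> UserCategory:
        return UserCategory.EXPIRED if user.expiry <= now else UserCategory.ACTIVE

    process_batched_users(users, how, categorize)
    return sent
```

Adding a twentieth category then means adding one enum member and one entry in `how`; the loop itself does not change.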

I have been itching to write a functional programming book for Python. I don't mean a "here is how to do FP in Python" book, you don't need that, the documentation of the standard library is good enough. I mean a "learn how to think FP in general, and we are going to use Python because you probably already know it". Python is not a functional language, but it is good enough to teach the principles and there is value in doing things with "one hand tied behind your back". The biggest hurdle in the past to learning FP was that books normally teach FP in a functional language, so now the reader has to learn two completely new things.


Your post was very interesting in terms of how to translate requirements to a functional solution. You should write that book on how to do that.


In Elixir this would be written as:

  db.getUsers()
  |> getExpiredUsers(Date.now())
  |> generateExpiryEmails()
  |> email.bulkSend()
I think Elixir hits the nail on the head when it comes to finding the right balance between functional and imperative style code.


Not a single person in this thread commented on the use of Date.now() and similar - surely clock.now() - you never ever want to use global time in any code, how could you test it?

clock in this case is a thing that was supplied to the class or function. It could just be a function: () -> Instant.

(Setting a global mock clock is too evil, so don't suggest that!)
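
A small Python sketch of what I mean, with the clock as a plain function. The names `Clock`, `utc_now`, `fixed_clock` and the dict-shaped users are made up for the example.

```python
from collections.abc import Callable
from datetime import datetime, timedelta, timezone

Clock = Callable[[], datetime]


def utc_now() -> datetime:
    """Production clock: the real current time, in UTC."""
    return datetime.now(timezone.utc)


def get_expired_users(users: list[dict], clock: Clock = utc_now) -> list[dict]:
    now = clock()  # the only place this function learns what time it is
    return [user for user in users if user["expiry"] <= now]


def fixed_clock(instant: datetime) -> Clock:
    """Test clock: always returns the same instant."""
    return lambda: instant
```

A test passes `fixed_clock(...)` and never touches global state, so it gives the same answer next quarter as it does today.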


I was just referring to how pipes make these kinds of chained function calls more readable. But on your point, I think using Date.now() is perfectly ok.


> I think using Date.now() is perfectly ok.

This is why we have tests which we need to update every 3 months, because somebody said this. This is of course, after a ton of research went into finding out why the heck our tests broke suddenly.


I would call those badly-written tests. The current date/time exists outside the system and ought to be acceptable for mocks, and in Python we have things like freezegun that make it easy to control without the usual pitfalls of mocks.


What are those mock pitfalls that freezegun avoids, when freezegun is a mock even according to its own authors? IoC and clocks solve the same problem. So what are the pitfalls of using those instead of this other mock?


Applying it to the wrong module, a very common mistake in Python due to how imports work.


What happens during a daylight savings adjustment?


You use UTC which doesn't adjust daylight savings.


Nitpicking the example code while missing the point the code was trying to demonstrate. I see why TAOCP used pseudocode.


Agreed! But I didn't miss the example... I also thought it was interesting that all the various declarative or applicative examples called Date.now(), which I see as a big thing to avoid.


    bulk_send(
        generate_expiry_email(user)
        for user in db.getUsers()
        if is_expired(user, datetime.now())
    )

(...Just another flavour of syntax to look at)


The nice thing with the Elixir example is that you can easily `tap()` to inspect how the data looks at any point in the pipeline. You can also easily insert steps into the pipeline, or reuse pipeline steps. And due to the way modules are usually organized, it would more realistically read like this, if we were in a BulkEmails module:

  Users.all()
  |> Enum.filter(&Users.is_expired?(&1, Date.utc_today()))
  |> Enum.map(&generate_expiry_email/1)
  |> tap(&IO.inspect(&1, label: "Expiry Email"))
  |> Enum.reject(&is_nil/1)
  |> bulk_send()
The nice thing here is that we can easily log to the console, and also filter out nil expiry emails. In production code, `generate_expiry_email/1` would likely return a Result (a tuple of `{:ok, email}` or `{:error, reason}`), so we could complicate this a bit further and collect the errors to send to a logger, or to update some flag in the db.

It just becomes so easy to incrementally add functionality here.
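
For readers more at home in Python, the "collect the errors" step could look roughly like this. It is only a sketch: the `("ok", ...)` / `("error", ...)` tuples imitate the Elixir convention, and the user dicts and message text are invented.

```python
from collections.abc import Iterable


def generate_expiry_email(user: dict) -> tuple[str, str]:
    """Return ("ok", email_text) or ("error", reason)."""
    if not user.get("email"):
        return ("error", f"user {user['name']} has no email address")
    return ("ok", f"To: {user['email']}\nYour account has expired.")


def partition_results(results: Iterable[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Split tagged results into (emails to send, errors to log)."""
    emails: list[str] = []
    errors: list[str] = []
    for tag, payload in results:
        (emails if tag == "ok" else errors).append(payload)
    return emails, errors
```

The emails go on to the bulk send and the errors go to a logger, without either step knowing about the other.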

---

Quick syntax reference for anyone reading:

- Pipelines apply the previous result as the first argument of the next function

- The `/1` after a function name indicates the arity, since Elixir identifies functions by name and arity

- `&fun/1` expands to `fn arg -> fun(arg) end`

- `&fun(&1, "something")` expands to `fn arg -> fun(arg, "something") end`


Not sure I like how the binding works for user in this example, but tbh, I don't really have any better idea.

Writing custom monad syntax is definitely quite a nice benefit of functional languages IMO.


> I have to wonder if programming should have kept pascals distinction between functions that only return one thing and procedures that go off and manipulate other things and do not give a return value.

What you want is to use a language that has higher-kinded types and monads so that functions can have both effects (even multiple distinct kinds of effects) and return values, but the distinction between the two is clear, and when composing effectful functions you have to be explicit about how they compose. (You can still say "run these three possibly-erroring functions in a pipeline and return either the successful result or an error from whichever one failed", but you have to make a deliberate choice to).


Making a distinction between pure and effectful functions doesn't require any kind of effect system though.

Having a language where "func" defines a pure function and "proc" defines a procedure that can perform arbitrary side effects (as in any imperative language really) would still be really useful, I think.


> Having a language where "func" defines a pure function and "proc" defines a procedure that can perform arbitrary side effects (as in any imperative language really) would still be really useful, I think

Rust tried that in the early days, the problem is no-one can agree on exactly what side effects make a function non-pure. You pay almost all the costs of a full effect system (and even have to add an extra language keyword) but get only some of the benefits.


The definition I’ve used for my own projects is that anything that touches anything outside the function or in any way outlives the function is impure. It works pretty well for me. That is, no i/o, mutability of a function-local variable is okay but no touching other memory state (and that variable cannot outlive the return), the same function on the same input always produces the same output, and there’s no calling of impure code from within pure code. Notice this makes closures and currying impure unless done explicitly during function instantiation, making those things at least nominally part of the input syntactically. YMMV.


Nim does that, and they are called exactly that.


I would have written each statement on its own line:

    var users = db.getUsers();
    var expiredUsers = getExpiredUsers(users, Date.now());
    var expiryEmails = generateExpiryEmails(expiredUsers);
    email.bulkSend(expiryEmails);

This is not only much easier to read, it's also easier to follow in a stack trace and it's easier to debug. IMO it's just flat out better unless you're code golfing.

I'd also combine the first two steps by creating a DB query that just gets expired users directly rather than fetching all users and filtering them in memory:

    expiredUsers = db.getExpiredUsers(Date.now());

Now I'm probably mostly getting zero or a few users rather than thousands or millions.
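
A rough sketch of the query-side filter using Python's built-in `sqlite3`. The `users` table, its columns and the ISO-8601 text timestamps are assumptions for the example, not anything from the article.

```python
import sqlite3
from datetime import datetime, timezone


def get_expired_users(conn: sqlite3.Connection, now: datetime) -> list[tuple[int, str]]:
    # ISO-8601 strings in one fixed timezone compare correctly as text,
    # so the database does the filtering and only expired rows come back.
    rows = conn.execute(
        "SELECT id, email FROM users WHERE expires_at <= ? ORDER BY id",
        (now.isoformat(),),
    )
    return rows.fetchall()


def demo() -> list[tuple[int, str]]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, expires_at TEXT)")
    conn.executemany(
        "INSERT INTO users (email, expires_at) VALUES (?, ?)",
        [
            ("old@example.com", datetime(2023, 1, 1, tzinfo=timezone.utc).isoformat()),
            ("new@example.com", datetime(2030, 1, 1, tzinfo=timezone.utc).isoformat()),
        ],
    )
    return get_expired_users(conn, datetime(2024, 1, 1, tzinfo=timezone.utc))
```

Passing `now` in as a parameter also keeps the query testable with a fixed date.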


Yeah. I did not mention what I would do, but what you wrote is pretty much what I prefer. I guess nobody likes it these days because it is old procedural style.


There's nothing procedural about binding return values to variables, so long as you aren't mutating them. Every functional language lets you do that. That's `let ... in` in Haskell.


(author here)

This is actually closer to the way the first draft of this article was written. Unfortunately, some readability was lost to make it fit on a single page. 100% agree that a statement like this is harder to reason about and should be broken up into multiple statements or chained to be on multiple lines.


Took me a bit of scrolling to find this. I believe most of the other folks are functional devs or something. The 5 functions on a single line wouldn't pass code review in most .NET/Java shops.

The rule I was raised with was: you write the code once and someone in the future (even your future self) reads it 100 times.

You win nothing by having it all smashed together like sardines in a tin. Make it work, make it efficient and make it readable.


Glad to see this. This style seems like it’s out of vogue now, but I find it much, much easier to reason about.


I agree because it reads as it will process in the direction I normally read. But I do think one of the benefits of the function approach is that the scope isn't cluttered with staging variables.

For these reasons one of the things I like to do in Swift is set up a function called ƒ that takes a single closure parameter. This is super minimal because Swift doesn't require parentheses for the trailing closure. It allows me to do the above inline without cluttering the scope while also not increasing the amount of indirection that discrete function declarations would cause.

The above then just looks like this:

  ƒ { 
    var users = db.getUsers();
    var expiredUsers = getExpiredUsers(users, Date.now());
    var expiryEmails = generateExpiryEmails(expiredUsers);
    email.bulkSend(expiryEmails);
  }


But then you are creating references with larger than needed "reachability".


I don't see a problem with that. This code would typically be inside its own function anyway, but regardless I think your nitpick is less important than the readability benefit.


It also invites exceptions as error handling instead of a monadic (result) pattern. I usually do something more like

    Result<Users> userRes = getExpiredUsers(db);
    if(isError(userRes)) {
        return userRes.error;
    }

    /* This probably wouldn't actually need to return a Result IRL */
    Result<Email> emailRes = generateExpireyEmails(userRes.value);
    if(isError(emailRes)) {
        return emailRes.error;
    }

    Result<SendResult> sendRes = sendEmails(emailRes.value);
    if(isError(sendRes)) {
        return sendRes.error;
    }
    
    return sendRes; // successful value, or just return a Unit type.
This is in my "functional C++" style, but you can write pipe helpers which sort of do the same thing:

    Result<SendResult> result = pipe(getExpiredUsers(db))
        .then(generateExpireyEmails)
        .then(sendEmails)
        .result();

    if(isError(result)) {
        return result.error;
    }

If an error result is returned by any of the functions, it terminates immediately and returns the error there. You can write this in most languages, even imperative/OOP languages. Java has a built-in class called Optional, with an option to treat null returns as empty:

    Optional.ofNullable(getExpiredUsers(db))
        .map(EmailService::generateExpireyEmails)
        .map(EmailService::sendEmails)
        .orElse(null);

or something close to that, I haven't used Java in a couple of years.

C++ also added a std::expected type in C++23:

    auto result = some_expected()
        .and_then(another_expected)
        .and_then(third_expected)
        .transform(/* ... some function here, I'm not familiar with the syntax*/);
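
The short-circuiting pipe helper described above is small enough to sketch in Python too. This `Result` class and the three step functions are hypothetical stand-ins written for this comment, not a real library.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Result:
    value: Any = None
    error: Optional[str] = None

    @property
    def is_error(self) -> bool:
        return self.error is not None

    def then(self, step: Callable[[Any], "Result"]) -> "Result":
        # Once an error has occurred, every remaining step is skipped.
        return self if self.is_error else step(self.value)


def get_expired_users(db: list[dict]) -> Result:
    return Result(value=[user for user in db if user["expired"]])


def generate_expiry_emails(users: list[dict]) -> Result:
    if not users:
        return Result(error="no expired users")
    return Result(value=[f"Dear {user['name']}, your account has expired." for user in users])


def send_emails(emails: list[str]) -> Result:
    return Result(value=len(emails))  # pretend to send; report how many
```

Usage is `get_expired_users(db).then(generate_expiry_emails).then(send_emails)`, and the caller checks `is_error` once at the end.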


I may have gotten nerd sniped here, but I believe all of these examples so far have some subtle errors. Using elixir syntax, I would think something like this covers most of the cases:

    expiry_date = DateTime.now!("Etc/UTC")

    query = 
          from u in User,
          where: 
            u.expiry_date < ^expiry_date 
            and u.expiry_email_sent == false,
          select: u

    Repo.all(query)
    |> Enum.map(&generate_expiry_emails(&1, expiry_date))
    |> Email.bulkSend()  # Returns {:ok, %User{}} or {:error, _reason}
    |> Enum.filter(fn 
      {:ok, _} -> true
      _ -> false
    end)
    |> Enum.map(fn {:ok, user} ->
      User.changeset(user, %{expiry_email_sent: true})
      |> Repo.update()
    end)

Mainly a lot of these examples do the expiry filtering on the application side instead of the database side, and most would send expiry emails multiple times which may or may not be desired behavior, but definitely isn't the best behavior if you automatically rerun this job when it fails.

----

Edit: I actually see a few problems with this, too, since Email.bulkSend probably shouldn't know about which user each email is for. I always see a small impedance mismatch with this sort of pipeline, since if we sent the emails individually it would be easy to wrap it in a small function that passes the user through on failure.

If I were going to build a user contacting system like this I would probably want a separate table tracking emails sent, and I think that the email generation could be made pure, the function which actually sends email should probably update a record including a unique email_type id and a date last sent, providing an interface like: `send_email(user_query, email_id, email_template_function)`
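
To make that interface a bit more concrete, here is a hedged Python sketch. The extra parameters (`sent_log`, `deliver`, `now`) are my own additions so the example is self-contained: a dict stands in for the "emails sent" table and `deliver` for the real mail client.

```python
from collections.abc import Callable, Iterable
from datetime import datetime


def send_email(
    users: Iterable[dict],
    email_id: str,
    template: Callable[[dict], str],
    sent_log: dict[tuple[int, str], datetime],
    deliver: Callable[[str, str], None],
    now: datetime,
) -> int:
    """Send `email_id` to each user at most once; return how many were sent."""
    sent = 0
    for user in users:
        key = (user["id"], email_id)
        if key in sent_log:
            continue  # already recorded as sent, so a rerun skips this user
        deliver(user["address"], template(user))  # template stays pure
        sent_log[key] = now  # recorded only after delivery did not raise
        sent += 1
    return sent
```

Rerunning the job after a failure then only contacts users who have no record for that `email_id`.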


Ada is a great modern language that preserves the distinction between functions and procedures that you mention.


That's pretty hardcore, like you want to restrict the runtime substitution of function calls with their result values? Even Haskell doesn't go that far.

Generally you'd distinguish which function call introduces the error with the function call stack, which would include the location of each function's call-site, so maybe the "low-effort" label is accurate. But I could see a benefit in immediately knowing which functions are "pure" and "impure" in terms of manipulating non-local state. I don't think it changes any runtime behavior whatsoever, really, unless your runtime schedules function calls on an async queue and relies on the order in code for some reason.

My verdict is, "IDK", but worth investigating!


It has been so long since I worked on the code that had chaining functions and caused problems that I am not sure I can do justice to describing the problems.

I vaguely remember the problem was one function returned a very structured array dealing with regex matches. But there was something wrong with the regex where once in a blue moon, it returned something odd.

So, the chained functions did not error. It just did something weird.

Whenever weird problems would pop up, it was always passed to me. And when I looked at it, I said, well...

I am going to rewrite this chain into steps and debug each return. Then run through many different scenarios and that was how I figured out the regex was not quite correct.
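
Something like the following Python sketch captures the approach: run the same steps in order but keep every intermediate value. The helper `run_traced` and the three toy steps are invented for illustration; the original code was nothing this tidy.

```python
import re
from typing import Any, Callable


def run_traced(value: Any, *steps: Callable[[Any], Any]) -> tuple[Any, list[tuple[str, Any]]]:
    """Apply steps in order, recording (step name, result) after each one."""
    trace: list[tuple[str, Any]] = []
    for step in steps:
        value = step(value)
        trace.append((step.__name__, value))
    return value, trace


def find_numbers(text: str) -> list[str]:
    return re.findall(r"\d+", text)


def to_ints(matches: list[str]) -> list[int]:
    return [int(match) for match in matches]


def total(numbers: list[int]) -> int:
    return sum(numbers)
```

Calling `run_traced("a1 b22 c", find_numbers, to_ints, total)` returns the final value plus the list of intermediate results, which is where an odd regex match would show up.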


> you want to restrict the runtime substitution of function calls with their result values?

I don't get how you got there from parent comment.

Pascal just went with a needless syntax split of (side-effectful) procedures and (side-effectful) functions.


Since everyone's giving !opinions, in my C# DDD world you'd ideally be able to:

  _unitOfWork.Begin();

  var users = await _usersRepo.Load(u => u.LastLogin <= whateverDate);
  users.CheckForExpiry();

  _unitOfWork.Commit();
That then writes the "send expiry email" commands from the aggregate, to an outbox, which a worker then picks up to send. Simple, transactional domain logic.


On the same page here. I read it multiple times to see if I could convince myself, but it is a bit off in terms of reading the code in the order it is executed. There is a high chance of people making mistakes over time with such patterns. As usual there is a trade-off involved, and readability is the one taking the hit here.


These chains become easy to read and understand with a small language feature like the pipe operator (Elixir) or threading macro (Clojure) that takes the output of one line and injects it into the left or rightmost function parameter. For example:

(Elixir)

    "go "
    |> String.duplicate(3)               # "go go go "
    |> String.upcase()                   # "GO GO GO "
    |> String.replace_suffix(" ", "!")   # "GO GO GO!"

(Clojure)

    ;; Nested function calls
    (map double (filter even? '(1 2 3 4)))

    ;; Using the thread-last macro
    (->> '(1 2 3 4)
         (filter even?)  ; The list is passed as the last argument
         (map double))   ; The result of filter is passed as the last argument
    ;=> (2.0 4.0)  ; core `double` converts to a float, it does not multiply by two

Things like this have been added to Python via a library (Pipe) [1] and there is a proposal to add this to JavaScript [2].

1: https://pypi.org/project/pipe/

2: https://github.com/tc39/proposal-pipeline-operator


If you get an exception, you might not know where it comes from unless you get a stack trace. Code looks nice but not practical imo


I use Clojure all the time and I haven’t noticed the gripe you’ve got, but these are built in features of (somewhat) popular programming languages. Might not be for you but functional programming isn’t for everyone.


You could write the logic in a more straightforward, but less composable way, so that all the logic resides in one pure function. This way you can also keep the code to only loop over the users once.

    email.bulkSend(generateExpiryEmails(db.getUsers(), Date.now()));
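
In Python, that one pure function might look roughly like this. The dict-shaped users and the message text are placeholders I made up.

```python
from datetime import datetime


def generate_expiry_emails(users: list[dict], now: datetime) -> list[str]:
    """Filter and map in one pass; pure, because `now` is passed in."""
    emails = []
    for user in users:
        if user["expiry"] <= now:
            emails.append(f"To: {user['email']}\nYour account expired on {user['expiry']:%Y-%m-%d}.")
    return emails
```

All the side effects (reading users, reading the clock, sending) stay at the call site.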



