Have you found a good writeup on using Python like that? I was talking to someon...

weberc2 · on June 10, 2017

This drives me crazy. Rather than a uniform separation of object definition and object graph assemblage, we'll just weld arbitrary parts of the object graph assemblage to arbitrary object definitions and write a unique snowflake hack in each test case to work around it, all because we can. The lack of consistency will make the entry point every bit as illegible if not worse, but surely if we weren't supposed to write code this way, Python would make it difficult.

Honestly, this is my favorite argument for static typing. Stupidity should be difficult.

I say this as a Python developer. We waste a lot of time debating this in code review and debugging this junk code when it inevitably leads to bugs or inexplicably broken test cases. Static typing would have precluded these hacks (and others).

chubot · on June 10, 2017

I was going to write that I don't have a good pointer, but actually I think the idea is basically the "onion architecture" or "clean architecture". I don't recall if I watched all of this talk, but in general I like Brandon Rhodes' PyCon talks:

https://www.youtube.com/watch?v=DJtef410XaM

http://rhodesmill.org/brandon/slides/2014-07-pyohio/clean-ar...

I don't come from a Java background, but I think that many of these ideas originated in the Java community. They came to Python later (if at all), and it's a little funny to see Go struggling with the same thing.

I think I came at it from more of a functional programming perspective, and the style gradually converged with what people were thinking about OOP. I'm not actually aware of all the details of "clean architecture" and "onion architecture", but they are definitely related to dependency injection.

Basically all your effects and state can be instantiated on the "outside" in main() and passed INWARD. But I don't believe it necessarily has to be pure. I encapsulate all my state, but I don't encapsulate all my side effects, like printing to the screen.
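To make that concrete, here is a minimal sketch of the shape I mean. The names (Counter, Greeter) are made up for illustration and aren't from any real project: the state and the output stream are created in main() and handed inward, so nothing below main() reaches for a global.

```python
import io
import sys


class Counter:
    """Holds state; created once in main() and passed inward."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


class Greeter:
    """Receives its state and its output stream instead of finding them itself."""

    def __init__(self, counter, out):
        self._counter = counter
        self._out = out

    def greet(self, name):
        n = self._counter.increment()
        self._out.write("hello %s (#%d)\n" % (name, n))


def main(argv, out=sys.stdout):
    # All wiring happens here, on the "outside".
    greeter = Greeter(Counter(), out)
    for name in argv[1:]:
        greeter.greet(name)
    return 0
```

A test can call main(["prog", "a", "b"], out=io.StringIO()) and inspect the buffer, with no patching involved.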

Some people appeared to like this comment, which is somewhat related:

https://news.ycombinator.com/item?id=11841893

I'm not sure how much this will help, but here is 12K+ lines of code I wrote with zero globals:

https://github.com/oilshell/oil

The main directories to look at are core/ and osh/. The main program imports a lot of stuff and wires it together:

https://github.com/oilshell/oil/blob/master/bin/oil.py

(There is a bunch of surface messiness, but the architecture is sound.) I've also written even larger programs with zero globals.

Also, I believe that web frameworks are some of the worst offenders in terms of preventing you from using this style. I don't think I've seen a single web framework that made the handlers easily testable.

I think you may be fighting against the grain in some parts of the Python world using this style... I can do it in certain projects because I'm writing most of the code from scratch, and forking some dependencies.

Jach · on June 10, 2017

Thanks. I skimmed the slides, and it looks like he's saying three things. 1 is that dependency injection is doable in a few ways (he didn't mention mine) with Python, but it's kind of tacky. 2 is that what you really want is functional programming so that you don't need to do dependency injection at all, or at least not on as many things: the dependency is just data, and pure functions won't change output with the same input (so they can't do I/O, which is one of the bigger common dependencies to factor out)! (Reminds me of Clojure's value of values.) Tests are easier with functional programming (and not stated but I like that it encourages doctests too). 3 is that the "onion architecture" (I think this is the same thing as the "hexagonal architecture" which you can find around the internet with pretty high levels of praise) is great and helps with this and other goals of good design. All 3 I agree with, though only in principle on #3 because I haven't had a chance to try it all the way in a large system, and I feel like it'd be a lost cause trying to bring it into an existing legacy system without full team support. We just need to get our crap under test first so the fires can die down, and worry later about better architecture refactoring where testing can fall out naturally.

I came from Python but have worked a lot in Java (and primarily in Java for the last few years). It took me a while to accept dependency injection but now I see it as fundamental to good Java design. But I don't see the same for Python. My reasoning is that it comes down to the difference in static and dynamic typing and how imports work. Maybe even mostly how imports work, since DI did take off in Angular for JS, but if JS had had a module system it might not have needed DI, as we see with Python. Ideas that quickly spread and cross quite different language boundaries are the ones I pay more attention to (like the industry shift to more functional designs and patterns, "composition over inheritance" when OOP is still in heavy use, and so on). DI hasn't been one of those.

The canonical example of dependency injection in Java is having a method that constructs a new Date object (or something that is based on the current system time) and does something with it, and asking the question of how you would test it. Well the JVM is pretty dynamic under the hood so if you have to (useful for legacy systems or third party code you can't really modify) you can cheat with something like PowerMock and overwrite the bytecode for the built in Date class with your mock one. But normally, such usage is frowned upon, just change the code under test. So DI comes into play where you bring either the Date class or some common time-related interface up into the method signature argument, and you provide the value in your test. You may or may not have another method for prod code to call that supplies a default value, depending on whether you want the burden of giving a good default (if such a thing makes sense for the use case) to be put on the client or not.

When you have dependencies that your whole class uses, then it's common to put those on the constructor so clients will pass you in the object, instead of you making one yourself. Frameworks like Spring make that slightly easier to work with (especially when you get into cross-module service classes you want to depend on) by letting you just specify a setter() and it will magically make sure it autowires an object for your class methods to use when needed. You can easily resolve circular dependency problems this way too: A <-> B is resolved by having two more modules for IA and IB that each of A and B depend on instead, with something behind the scenes to make sure it all wires up correctly. Interfaces are great since now you can depend on much less. If you want a mock, just implement a test class implementing the interface. You can use the less powerful Mockito library to help only mock out the parts of the interface you actually need to test (but this implies access to the source so you can know such things about the implementation). With interfaces though you can in principle even test while treating the function as a black box, and if you can prove that you've passed all possible states for the dependency interface you can prove interesting things. Or just fuzz it. Or be allowed to have your IDE helpfully autocomplete types and suggestions.

Python doesn't have interfaces, being duck-typed, nor does it force static typing. This alone makes DI less useful. I can write:

    def unix_time(time_dependency):
        return time_dependency.time()

and have it live in its own file, nothing else in it. Java would throw a fit, it needs to know what type time_dependency is. Python doesn't care, as long as it can call the .time() function on it. I think this is great, but some programmers are terrified. But anyway it's following DI and assuming time_dependency.time() is pure (it may or may not be) would make unix_time also pure itself. But it's not very helpful if you're looking at the signature or even the full source itself. Something more helpful would be:

    import time
    def unix_time(time_dependency=time):
        return time_dependency.time()

Then you at least have an idea what to pass it yourself besides just something that can have .time() called on it, or you can pass it nothing. Here is something closer to what you would see in Java, except that Java would overload a single method name where Python requires two different names:

    import time
    def unix_time_internal(time_dependency):
        return time_dependency.time()
    
    def unix_time():
        return unix_time_internal(time)

But the most common implementation in Python is probably this:

    import time
    def unix_time():
        return time.time()

The Pythonista in me also thinks this is in fact the best implementation. It's just as testable as any of the others. As I mentioned in my first comment, in your test code, instead of passing in a time dependency directly, or using a mock/patch library that starts to smell like PowerMock, you can literally just set (assuming the function was in a file called timeutil.py) timeutil.time = myTimeDep and then call timeutil.unix_time() with your substitution used. This is because unlike in Java, where imports are definitions for a compiler to look up and insert things about, in Python they are global (to the file) variables, and they can be redefined. With Python's dynamism it looks up the name of the thing each time to get its value instead of doing it once and coding a static reference to it, so your redefinition will take precedence. Of course this pattern isn't even that unusual in non-unit Java tests; frequently the test will save the current value of something and overwrite it in the setup, run the test, and then restore the old value in the teardown. At least in Python you can have a clean decorator for that.
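A rough sketch of that save/overwrite/restore idea, in case it helps. Everything here is hypothetical: timeutil is built in-process with types.ModuleType only so the example runs on its own (in real code it would be the timeutil.py file), and FakeTime and swapped are names I made up.

```python
import contextlib
import time
import types

# Stand-in for a hypothetical timeutil.py containing the last version above.
timeutil = types.ModuleType("timeutil")
exec(
    "import time\n"
    "def unix_time():\n"
    "    return time.time()\n",
    timeutil.__dict__,
)


class FakeTime:
    """Anything with a .time() method works, thanks to duck typing."""

    def __init__(self, now):
        self._now = now

    def time(self):
        return self._now


@contextlib.contextmanager
def swapped(module, name, replacement):
    """Overwrite module.<name>, then restore the old value even on error."""
    original = getattr(module, name)
    setattr(module, name, replacement)
    try:
        yield
    finally:
        setattr(module, name, original)


with swapped(timeutil, "time", FakeTime(1234.5)):
    patched_value = timeutil.unix_time()  # sees the fake, looked up at call time
```

Since contextlib.contextmanager objects also work as decorators, the same helper can sit on top of a test function as @swapped(timeutil, "time", FakeTime(0)).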

In your code I opened util.py at random. Looking at GetHomeDir, I can't test this in the Java sense, it has dependencies in its implementation on os and pwd. But that's ok, because unlike in Java, those dependencies are actually global (to the file) variables that I can mutate. Plus you probably tested it manually in the REPL, another development benefit Python has over languages like Java, you can build and test things bottom up sometimes. Thanks for the code sample though, I'll keep it around for future reference. (Would you say it's common these days in the community to have test files living next to the implementation files? The only place I see that these days are with JS components.)

Probably the initial resistance I had with DI is that taken to the ultimate extreme, DI would take the form of a style of functional programming whose name escapes me at the moment, but essentially one where everything is passed as a parameter, including language built-ins, and only at the top level where you define a main() can you specify all of these things that the language will inject when it runs your program.

When I work with Python, I don't see a problem with not having the same DI toolkits or seeing projects not following it very well. No more so than not having privates or a culture in love with a sea of getter and setter methods. But I can see why some might disagree, like your sibling comment, and cry out for static types as the answer. Well, there's Python 3, not sure how much it gives you for this, and there's a ton of other static languages out there, I'd suggest working with those and leaving the Python community to do its thing. Functional programming in general, though, that's the popular thing in pretty much every language in modest use at this point, it's good stuff. A lot of bugs in old Python code I wrote could have been avoided if I had written in a more functional style, probably not as many if I had followed DI more rigorously.

chubot · on June 10, 2017

Addendum: I get that you can test anything by monkey patching in Python, and I definitely used to do that in setUp and tearDown.

But I guess I have stopped doing that, in favor of testing against my own interfaces, using integration tests with shell scripts, or the odd param that is only for testing. I think I just like to have some indication in the code of where I tested something, and it doesn't cost very much and doesn't come up too often.

chubot · on June 10, 2017

Yes I totally get what you are saying. At one point I might not have written the last unix_time() example, but that is my default now, UNTIL say I hit a bug and need to test it thoroughly.

GetHomeDir() is a great example that proves that point... if I were trying to be very pure, I would have reified the pwd dependency and environment dependency.

One reason to avoid the complication is that I'm using shell scripts for tests. So in shell you already have a test environment. You can set $HOME to whatever, and in theory /etc/passwd, although that's a little trickier.

Actually that was one of my motivations for writing a shell :) You can test "one level out" at the OS level. Instead of the "seam" being the language, the seam is the OS.
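I do this from shell, but the same OS-level seam can be sketched in Python by running the code under test in a child process with a controlled environment. This is only an illustration: CHILD and run_with_home are made-up names, and the child just echoes $HOME rather than calling the real GetHomeDir().

```python
import os
import subprocess
import sys

# Hypothetical child program: it only reports what it sees in the environment.
CHILD = "import os; print(os.environ.get('HOME', ''))"


def run_with_home(fake_home):
    """Run the child with HOME overridden; the parent's environment is untouched."""
    env = dict(os.environ, HOME=fake_home)
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Nothing inside the program had to be parameterized for this; the test environment is the process environment.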

I prefer to test against STABLE INTERFACES, not against things that I made up internally. You don't want your tests to calcify the structure of your code. I've seen that happen a lot with fine-grained testing, and it's a big pitfall.

I would say at the beginning, I write unit tests for tricky parts, but don't aim for 100% unit test coverage. I aim more for high integration test coverage. And then at the end of the project, when you are fixing bugs, that is when you can do fine-grained testing without worrying about making a mess of your code.

Here I did parameterize random() (irr_rand), because it's very important to the function of the code and needs to be tested:

https://github.com/google/rappor/blob/master/client/python/r...
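The general shape of parameterizing randomness looks something like this. To be clear, this is not the RAPPOR code; flip_bits and ScriptedRand are hypothetical names, and the only assumption is that the dependency has a .random() method like the stdlib random module does.

```python
import random


def flip_bits(bits, prob, rand=random):
    """Flip each bit with probability prob, drawing from rand.random()."""
    return [b ^ 1 if rand.random() < prob else b for b in bits]


class ScriptedRand:
    """Test double that replays a fixed sequence of 'random' values."""

    def __init__(self, values):
        self._values = iter(values)

    def random(self):
        return next(self._values)


# Draws of 0.1 and 0.4 are below 0.5, so positions 0 and 2 get flipped.
noisy = flip_bits([0, 1, 0, 1], 0.5, rand=ScriptedRand([0.1, 0.9, 0.4, 0.6]))
```

Production callers leave rand alone and get the real module; the test gets a deterministic result.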

So it all depends on the context. That's why I say it takes some practice. It takes practice to:

- not end up with more than 3-4 parameters for each class.

- not end up with too many classes. Java code seems to fall into this. A lot of things are just functions with dependencies. Actually this style I think reduces the need for classes -- they are your parameters rather than being your context!

- not structure your code as a deep tree of calls. Instead it should be a relatively flat object graph.

A relatively static "Object graph" is really the idea that distinguishes OOP from "structured programming" (i.e. a pyramid).

I think I would differ with you in that I don't think DI and functional programming are that different. I think they are trying to get at the same core idea.

Things I ALMOST NEVER use in Python:

- setters and getters.

- Especially, setters for dependency injection. I always pass params through constructors.

- decorators. This can always be accomplished with composition of objects. I find that style a lot more readable. decorators are non-trivial code at the global level, when it really should be in main().

- classmethod and staticmethod. Static methods should just be functions. Class method is kind of a hack for singleton-like behavior.

- As mentioned, the singleton pattern is banned. Singletons are just classes you instantiate once.
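As a sketch of what I mean by composition instead of decorators (Fetcher and CachingFetcher are invented for the example): the caching behavior that a decorator would attach at import time is instead an object wrapped around another object, and the wrapping happens in main().

```python
class Fetcher:
    """The plain object, with no caching logic in it."""

    def fetch(self, key):
        return "value-for-" + key


class CachingFetcher:
    """Wraps any object with a fetch() method; plays the role of a cache decorator."""

    def __init__(self, inner):
        self._inner = inner
        self._cache = {}
        self.misses = 0

    def fetch(self, key):
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._inner.fetch(key)
        return self._cache[key]


def main():
    # Instantiated exactly once here, which is all a "singleton" needs to be.
    fetcher = CachingFetcher(Fetcher())
    first = fetcher.fetch("a")
    second = fetcher.fetch("a")
    return first, second, fetcher.misses
```

A test that doesn't want caching just constructs Fetcher() directly, with nothing to undo.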

I like that this style gets rid of a lot of concepts: thread local (as mentioned above), explicit singleton pattern, and "static/class methods".

Also I think it's true that Java's static type system might get in the way a little bit, but I don't have a strong conclusion on that. However, I also wish Python were a little more strict. I don't use 90% of the dynamism of classes.