Hacker News

Oh good, another generation of programmers with their brains stuck on box-and-arrow diagrams, with no clue how to store a binary tree in a way that doesn't blow your cache, or a graph in an adjacency matrix.
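For anyone who hasn't seen it, the adjacency-matrix version really is just a flat array with index arithmetic, no node objects and no pointer chasing. Rough Python sketch to show the idea (Python only for readability; the cache argument obviously applies to compiled code):

```python
# A graph as an adjacency matrix: one flat list of n*n cells.
# Edge (u, v) is a single index; each row is contiguous in memory.
n = 4
adj = [0] * (n * n)

def add_edge(u, v):
    adj[u * n + v] = 1
    adj[v * n + u] = 1  # undirected graph, so mirror the edge

def neighbors(u):
    # Scanning a row is a linear walk over adjacent memory.
    return [v for v in range(n) if adj[u * n + v]]

add_edge(0, 1)
add_edge(1, 2)
```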


Wait, there are people out there who don't know what you know? Sound the alarms, everyone! This new generation of amateurs is going to ruin everything!


You're making a personal issue out of a pedagogical complaint. Which one of us is approaching this poorly?

It's not the next generation's fault you've utterly failed them.


I think you’re overestimating the demand for people who can implement a binary tree.

Then again, when I used to work more with node, I would complain daily that those kids were just reimplementing solved problems in computing … poorly. I’m working more with elixir these days and everything is so fast that I’ve never stopped to dig into the implementation.


CS is built on abstractions, which make things more modular and easier to think about.

A data structure can be seen as an interface, a logical structure, a physical layout, or an encoding. When you teach them, you have to start from somewhere. The logical structure is usually a good choice, because it contains the key ideas of the structure. If you cram in too many details and too many levels of abstraction, you are just going to confuse the audience.


A data structure can be seen as a mathematical object with an attendant definition of structure and operations, which can be useful for formal reasoning; or an implementation, an actual executable artifact encoding those structures and operations into a particular runtime, which can be useful to, like, write/run some code.

What you're calling a "logical structure" is not really a thing. It's a sort of outline or mnemonic, usually for the mathematical objects but sometimes the program. It's per se insufficient for either mathematical or practical use. When the outline fails to capture the right details (as it fails here) it's also harmful to anyone trying to understand either view of the data structure.


That's a narrow point of view. It would make minor variations of the same structure different data structures, which is not a very useful approach.

Logical structure is the heart of the data structure. It can be understood as the set of invariants maintained by the structure. Those invariants determine which operations can be supported efficiently by some version of the data structure. Layout and encoding choices affect the performance of those operations. The interface determines which operations are available, but it's often a poor match for the actual structure.

The lines between the logical structure, the layout, and the encoding are vague, but the concepts are useful. For example, if two data structures have the same logical structure and layout, they can often share the same serialization format. That implies that there should be a simple but very efficient conversion algorithm between the two, which can prove useful if you sometimes need to support different operations efficiently.


I don't think "computers have caches that are faster to access than main memory" is just an implementation detail though; if you're designing data structures for a specific kind of storage you should teach the properties of that storage first and the data structures after.


If you are designing data structures for a specific kind of storage, you have hopefully studied them beyond the first/second year algorithms and data structures class. Like most introductory classes, that class mostly teaches you concepts and ideas.

My job is largely about designing and implementing new data structures. Those four levels of abstraction are the ones I've personally found useful. None of them deals with implementation details, as they are all abstractions. Implementation details are an orthogonal concept that is relevant at every level.


> CS is built on abstractions, which make things more modular and easier to think about.

CS abstractions also tend to lead to garbage performance, because computers simply don't function according to the assumptions CS makes...

Better to teach people how computers actually work, and then go from there.

Doesn't feel particularly surprising anymore that software has been getting slower far more quickly than hardware can get faster.


In the CS curricula I'm familiar with, the introductory algorithms and data structures class is usually a prerequisite to the core systems classes. In order to learn how computers really work, you must first be familiar with the key ideas of organizing and manipulating data. Designing algorithms and data structures for real computers is an advanced topic few students take, because they are more interested in software/systems/theory/ML.


Abstract data types can be zero-cost abstractions...


I think you learn the box-and-arrow diagrams in an algorithms class. You learn about cache and memory in a systems class. When you get a real job, if someone asks you to write a performant binary tree in a low-level language, you combine knowledge from both these experiences and make an inference on what to do. Maybe you try, test, and iterate. Maybe you read some blog posts or find a canonical implementation in another language.

Either way, it doesn't seem like a huge knowledge gap.


Yeah, I was thinking this as well. This box-and-line diagram style is so far away from how you would implement many of these structures that you might not recognize them if you saw them.


What's the preferred way of implementing a binary tree, and how much more efficient is it over just nodes and references?


The heap implementation in Coran Leaderson and Revests’ Algorithms book is super interesting. But, I used to refer to that book as the cure for insomnia as an undergrad. It won’t be very engaging content to most people.
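For readers who never made it through that chapter: the core trick is that the heap is just an array, with parent/child relationships computed from indices rather than stored as pointers. A minimal Python sketch of the idea (0-based indices here, not the book's 1-based pseudocode):

```python
# Binary max-heap stored in a plain list: no nodes, no pointers.
# Children of index i live at 2*i + 1 and 2*i + 2; parent at (i - 1) // 2.

def sift_down(heap, i, size):
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < size and heap[left] > heap[largest]:
            largest = left
        if right < size and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest

def build_heap(items):
    # Bottom-up heapify: sift down every internal node, last first.
    heap = list(items)
    for i in range(len(heap) // 2 - 1, -1, -1):
        sift_down(heap, i, len(heap))
    return heap
```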


I sadly think some of these subjects become sort of a snooze fest because at the time you are learning them, you are so far away from applying the patterns. It's far closer to stamp collection than any actual implementation for the average undergrad.

Back in first year, when we had linear algebra, I was pretty much the only guy who absolutely aced the exam. Not because I was the smartest or hardest working, but because I applied everything we learned to a game I was building, a 3d spaceship simulator. Vector products, planar projections, rotation matrices, I mainlined all of it; I pored over the book in my spare time with great enthusiasm. For the rest of the class it was just one abstract concept after another being piled on, with no place to apply them.


It's Cormen, Leiserson, Rivest. Was it a spell-check error or a deliberate misspelling for some reason?


Lol. That was all Siri. Now, I’m questioning her politics.


As alluded to in the sibling comment, implicit trees are a good starting point for trees that are to be reasonably dense. Removing explicit pointers reduces the size and increases the data locality. Depending on what manner of tree you've built, you can also iterate over these values in a fast and cache-friendly way.
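To make that concrete: the same pointer-free trick works for a search tree. Take a sorted array and rearrange it so the children of index i sit at 2i+1 and 2i+2, and a lookup becomes pure index arithmetic over one contiguous buffer. Python sketch just to show the layout; the locality win only matters in a compiled language:

```python
def implicit_bst(sorted_vals):
    # Fill an implicit tree (root at 0, children at 2i+1 / 2i+2)
    # by an in-order traversal over the index space, so the result
    # is a valid BST stored in a flat array.
    out = [None] * len(sorted_vals)
    it = iter(sorted_vals)
    def fill(i):
        if i < len(out):
            fill(2 * i + 1)
            out[i] = next(it)
            fill(2 * i + 2)
    fill(0)
    return out

def contains(tree, x):
    # Standard BST search, but "following a child" is index math,
    # not a pointer dereference.
    i = 0
    while i < len(tree):
        if tree[i] == x:
            return True
        i = 2 * i + 1 if x < tree[i] else 2 * i + 2
    return False
```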



