Hacker News | srean's comments

My uni had its manual in its library is all I can say :)

https://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93...

is an alternative if visuals are all that matter. It can and will wreak havoc in Fourier space.
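For reference, a minimal Python sketch of the Ramer-Douglas-Peucker idea (recursive form; epsilon is just an illustrative tolerance in the same units as the points):

    import numpy as np

    def rdp(points, epsilon):
        """Ramer-Douglas-Peucker: keep only the points that deviate from the
        chord between the first and last point by more than epsilon."""
        pts = np.asarray(points, dtype=float)
        if len(pts) < 3:
            return pts
        start, end = pts[0], pts[-1]
        chord = end - start
        chord_len = np.linalg.norm(chord)
        if chord_len == 0:
            # Degenerate chord: fall back to distance from the start point.
            dists = np.linalg.norm(pts - start, axis=1)
        else:
            # Perpendicular distance of each point to the chord (2D cross product).
            dists = np.abs(chord[0] * (pts[:, 1] - start[1])
                           - chord[1] * (pts[:, 0] - start[0])) / chord_len
        idx = int(np.argmax(dists))
        if dists[idx] > epsilon:
            # Split at the farthest point and simplify both halves.
            left = rdp(pts[: idx + 1], epsilon)
            right = rdp(pts[idx:], epsilon)
            return np.vstack([left[:-1], right])
        return np.vstack([start, end])

    # A wiggly curve collapses to a handful of points.
    xs = np.linspace(0, 10, 500)
    curve = np.c_[xs, np.sin(xs)]
    print(len(curve), "->", len(rdp(curve, epsilon=0.05)))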


You are exactly right.

Control theory and reinforcement learning are different ways of looking at the same problem. They have traditionally and culturally focused on different aspects.


Fun memories.

We have successfully replaced thousands of complicated deep net time series based anomaly detectors at a FANG with statistical (nonparametric, semiparametric) process control ones.

They use 3 to 4 orders of magnitude fewer trained parameters and have just enough complexity that a team of three or four can handle several thousand such streams.

The amount of babysitting the deep net models needed was astronomical, and debugging and understanding what had happened was quite opaque.

For small teams with limited resources, I would still heavily recommend stats-based models for time series anomaly detection.

It may not be your best career move right now, for political reasons. Those making massive bets do not like to confront the possibility that some of their bets were not well placed. They may try to make it difficult for contrary evidence to become too visible.


Super cool, thanks for sharing!

This is one of the reasons I am so skeptical of the current AI hype cycle. There are boring, well-behaved classical solutions for many of the use-cases where fancy ML is pushed today.

You'd think that rational businesses would take the low-risk, snooze-fest, high-margin option any day over unintelligible and unreliable options that demand a lot of resources, and yet...


>This is one of the reasons I am so skeptical of the current AI hype cycle. There are boring, well-behaved classical solutions for many of the use-cases where fancy ML is pushed today.

In 2013 my statistics professor warned that once we are in the real world, "people will come up to you trying to sell fancy machine learning models for big money, though the simple truth is that many problems can be solved better by applying straightforward statistical methods".

There has always been ML hype, but the last couple of years are on a whole different level.


It does not work that way in the short term.

Say you have bet billions as a CEO, CTO, or CFO. The decision has already been made. Such a steep price had to come at the cost of many groups, teams, and projects in the company.

Now is not the time to water plants that offer alternatives. You will have a smoother ride choosing tools that justify that billion-dollar bet.


Decision-making in organizations is definitely a hard problem.

I think an uncomfortable reality is that a lot of decisions (technology, strategy, etc.) are not optimal or even rational, but more just an outcome of personal preferences.

Even data-driven approaches aren't immune since they depend on the analysis and interpretation of the data (which is subjective).


Data-informed is good. Purely data-driven is a bad idea.

After all, even in physics big advances came from thought experiments. Data is one way to reason about a decision; logic and a knowledge base are another. Both can be very powerful if one retains the humility of fallibility.

In organizations one common failure mode is that the organisational level at which decisions are made is not the same level at which their effects are felt.

It's a really difficult problem to solve. Too much decentralisation is also a bad idea. You get the mess of unplanned congested cities.


For a while now, I've been summarizing the ease with which everything turns into a "Humanity Complete" problem via: "Delegation affords so much, but trust sure is tricky."

This has been observed forever in various forms/contexts. Planning & policy people call them "Wicked Problems" (https://en.wikipedia.org/wiki/Wicked_problem). The Philosophy of Science one goes by the Demarcation Problem (https://en.wikipedia.org/wiki/Demarcation_problem) { roughly, in the sense that the really hard nugget connects to "trust" }.

At least one aspect of all of it is that trust is a little like money/capital and "faking it" is a bit like "stealing". The game theory of it is that since faking is virtually always vastly cheaper, there are (eventually) huge incentives to do so, at some point, by someone(s). So, almost any kind of trust/delegation structure has a strong pull toward "decay", from knock-off brands to whatever. It just takes a sadly small fraction of Prisoner's Dilemma defectors to ruin things/systems thereof. The 2nd law of thermo makes order cost energy, and this decay feels like an almost isomorphic (maybe even the same..?) thing. It's not just product/tech enshittification, but that might be yet another special case/example.

Anyway, I have no great answers or as some responder to me a while back said, if I did, I'd "have a Nobel and possibly be the first president of the united planet".


> There are boring, well-behaved classical solutions for many of the use-cases where fancy ML is pushed today.

I know some examples but not too many. Care to share more examples?


Some off the top of my head...

- Instead of trying to get LLMs to answer user questions, write better FAQs informed by reviewing tickets submitted by customers

- Instead of RAG for anything involving business data, have some DBA write a bunch of reports that answer specific business questions

- Instead of putting some copilot chat into tools and telling users to ask it to e.g. "explain recent sales trends", make task-focused wizards and visualizations so users can answer these with hard numbers

- Instead of generating code with LLMs, write more expressive frameworks and libraries that don't require so much plumbing and boilerplate

Of course, maybe there is something I am missing, but these are just my personal observations!


With all due respect, all of those examples are examples of "yesterday" ... that's how we have been making money for businesses for decades, no? Today we have AI models that can already do as well as, almost as well as, or even better than the average human at many, many tasks, including the ones you mentioned.

Businesses are incentivized to be more productive and cost-effective since they are solely profit-driven, so they naturally see this as an opportunity to make more money by hiring fewer people while keeping the amount of work done roughly the same, or even increasing it.

So "classical" approach to many of the problems is I think the thing of a past already.


> Today we have AI models that can already do as well as, almost as well as, or even better than the average human at many, many tasks, including the ones you mentioned.

We really don't. There are demos that look cool onstage, but there is a big difference between "in store good" and "at home good" in the sense that products aren't living up to their marketing during actual use.

IMO there is a lot of room to grow within the traditional approaches of "yesterday". The problem is that large orgs get bogged down in legacy + bureaucracy, and most startups don't understand the business problems well enough to make a better solution. And I don't think there is any technical silver bullet that can solve either of these problems (AI or otherwise).


I am wondering how often you use AI models, because I use them on a daily basis, and as much as they have limitations, I find them to perform incredibly well. They are far, very far, from being a demo. The last time it was a demo that merely looked "cool" was around 2020/21, when they were spitting out haiku poetry, and perhaps 2022, when the capabilities were not as good. But today? Completely mind-blowing.

If you're not convinced, I suggest you look at law firms, hospitals, and laboratories, all of which are using AI models today for both research and boilerplate work. Creative industries are literally being erased by generative AI as we speak. What will happen to Photoshop and other similar tools when I can create whatever I want using a free AI model in literally 2 seconds, without prior knowledge? What will happen to the majority of movie effects makers when a single person can do the work of 5 people at the same time? Or interior designers? Heck, what will happen to Google search? I anticipate nobody will be using it in a year or two. I already don't, because it's a massive sink of time compared to what I can do with Perplexity, for example.

There are many, many examples. You just need to have your mind open to see it.


You're making a ridiculously overconfident statement.

* Show me a discrete manufacturing company using AI models for statistical process control or quality reporting

* Show me a pharmaceutical company using AI models for safety data analysis

* Show me an engineering company using AI models for structural design

The list goes on and on. There are precious few industries or companies that have replaced traditional analysis & prediction with AI. Why? Because one of three things is true: 1) their data is already in highly structured relational stores that have long legacies of SQL-based extraction and analysis, 2) they're in regulated industries and have to have audit-proof, explainable reporting, or 3) they need evidence-based design and analysis with a key component coming from real people observing real processes in action.

For all the hyped "AI Automation" you read about, there are 100 other things that aren't, or where firms don't believe they can be, or where they'll struggle to for [reasons].


Right, right, I get it. Pharma, structural engineering, discrete manufacturing, ..., all of the industries which are "too hard" to be conquered by some stupid statistical parrot. You're being delusional, my friend, but I am not going to be the one trying to persuade you to believe otherwise. I am here for sharing experiences and interesting discussions from which I can learn, not for combating triggered and defensive strangers on the internet. And FWIW, both your conclusion and premise, and your interpretation of my comment, are wrong.

In my domain, I see lots of people reaching immediately for "AI" techniques to solve sensor fusion and state estimation problems where a traditional Kalman filter type solution would be faster and much more interpretable.
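For the curious, a minimal sketch of what such a solution looks like: a 1D constant-velocity Kalman filter in plain numpy, with made-up noise values.

    import numpy as np

    # Constant-velocity model: state = [position, velocity], we observe position only.
    dt = 0.1
    F = np.array([[1.0, dt], [0.0, 1.0]])    # state transition
    H = np.array([[1.0, 0.0]])               # measurement model: position only
    Q = np.eye(2) * 1e-3                     # process noise (assumed)
    R = np.array([[0.25]])                   # measurement noise (assumed)

    def kalman_step(x, P, z):
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        # Update
        y = z - H @ x                        # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
        return x, P

    rng = np.random.default_rng(0)
    x, P = np.zeros((2, 1)), np.eye(2)
    for t in range(200):
        true_pos = 0.5 * t * dt              # target moving at 0.5 units/s
        z = np.array([[true_pos + rng.normal(0, 0.5)]])
        x, P = kalman_step(x, P, z)
    print("estimated [position, velocity]:", x.ravel())

Two small matrices and a dozen lines of algebra; the interpretability comes from the fact that every number in F, Q and R means something physical.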

Incidentally, I worked on the exact same thing: Kalman filtering for tracking objects in hard real-time systems. And it is not quite as simple as one would think. Developing mathematical models for all the kinds of objects one might want to track is far from trivial, and it was difficult to model the real world with more or less simplistic discrete equations. It also didn't work completely reliably, so we needed an extra layer of confidence; I don't remember what we used back then, but it was yet another algorithm with yet another source of data.

Sigma point filters? The escalation ladder is usually KF, EKF, unscented KF, sigma point...

In the realm of data science, linear models and SAT solvers used cleverly will get you a surprisingly long way.

I thought OCR was one of the obvious examples where we have a classical technology that already works very well, but in the long run I don't see it surviving. Generic AI models can already do OCR kinda well, and they are not even trained for that purpose; it's almost incidental. They've never been trained to extract, say, the name/surname from some sort of document with a completely unfamiliar structure, but the crazy thing is that it somehow works! Once somebody fine-tunes an AI model for this purpose alone, I think there's a good chance it will outperform the classical approach in terms of precision and scalability.

In general I agree. For OCR I agree vehemently. Part of the reason is that the structure of the solution (convolutions) matches the space so well.

The failure cases are those where AI solutions have to stay in a continuous debug, train, update mode. Then you have to think about the resources you need, both in terms of people as well as compute to maintain such a solution.

Because of the way the world works, its endemic nonstationarity, the debug-retrain-update cycle is a common state of affairs even in traditional stats and ML.


I see. Let's take another example here, and I hope I understood you: imagine you have an AI model which is connected to all of your company's in-house data sources such as wikis, chat, Jira, emails, merge requests, Excel sheets, etc. Basically everything that can be deemed useful to query or to build business intelligence on top of. These data sources are continuously generating more data every day, and given their nature they are more or less unstructured.

Yet we have such systems in place where we don't have to retrain the model on the ever-growing data. This is one example I could think of, but it suggests that models, at least for some purposes, don't have to be retrained continuously to keep them running well.

I also use a technique of explaining to the AI model something it has not seen before (based on the wrong answer I got from it previously), and it manages to evolve the steps, whatever they are, so that it gives me the correct answer in the end. This also suggests that the capacity of the models is larger than what they have been trained on.


Data science solutions are different in the sense that they rarely ever get done and dusted the way a sorting library might.

There's almost always something or other breaking. Did the nature of the data change? Did my upstream data feed change? Why is this small set of examples not working for this high-paying customer?

You would need resources to understand and fix these problems quarter after quarter.

A rich network of data dependencies can be a double edged sword. Rarely are upstream code and data changes benign to the output of the layer you own.

There are two cases where AI solutions are perfect. The first is that they are so good that they are fire and forget. The second is that your customer is a farmer, not a gardener: individual failing saplings mean little to him.

If a single misbehaving plant can cause commercially significant damage, then when choosing opaque tools you must consider the maintenance cost you may be signing up for.

Say I have a ton of historical data that is being continuously added to. It's a real temptation to replace the raw data with a model that uses fewer parameters than the raw data, in a sense lossy compression. That can be a very bad idea. Data instances where the model does not fit well may be the most important pieces of information; a model paints with broad brush strokes. If you are hunting faults, be aware that lossy compression can paper them over. You are also potentially harming a future model that could have been trained, because you threw away a decade of useful data when storage costs were running so high.

No easy solution. The general recommendation would be to compress, but losslessly, simply because you do not know what may be valuable in the future. If that's impossible, then so be it; you have to eat that opportunity cost in the future, but you did your best.


Never heard the farmer vs gardener framing before, but I love it. Can classify so many business problems like this.

Oh well, I was just dipping into the delights of some old Martin Gardner. Amazing how the human brain works.

It’s not fundamentally different from cattle vs pets, but I like how it foregrounds human agency. There is a problem with your cow/wheat/server: what do *you* do? Can you intentionally ignore it by virtue of large numbers, or do you stop everything to resolve it?


I've seen a lot of uses for SAT solvers, but what do you use them for in data science? I can't find many references to people using them in that context.

Root-causing from symptoms is one case where SAT solvers, or their ML analogue, graphical models, are quite useful.
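As a toy illustration of the root-causing idea (pure-Python brute force standing in for a real SAT solver, and the cause/symptom table is entirely made up):

    from itertools import product

    # Hypothetical model of which root causes can produce which symptoms.
    CAUSES = ["bad_deploy", "db_overload", "network_partition"]
    CAN_CAUSE = {
        "latency_spike": {"db_overload", "network_partition"},
        "error_rate_up": {"bad_deploy", "db_overload"},
        "timeouts":      {"db_overload", "network_partition"},
    }

    def consistent(active, observed):
        # Every observed symptom must have at least one active cause that explains it.
        return all(active & CAN_CAUSE[s] for s in observed)

    def diagnose(observed):
        # Enumerate all on/off assignments of causes; keep the smallest consistent ones.
        candidates = []
        for bits in product([False, True], repeat=len(CAUSES)):
            active = {c for c, on in zip(CAUSES, bits) if on}
            if consistent(active, observed):
                candidates.append(active)
        smallest = min(len(c) for c in candidates)
        return [c for c in candidates if len(c) == smallest]

    # Minimal explanations of the observed symptoms,
    # e.g. [{'network_partition'}, {'db_overload'}]
    print(diagnose({"latency_spike", "timeouts"}))

A real SAT or graphical-model formulation adds weights/priors and scales to thousands of variables, but the shape of the question is the same.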

>unintelligible and unreliable options that demand a lot of resources

Some options have more persuasive salesmen than others.


This sounds fascinating. Can you say anything about the application?

Autoscaling? Data center cooling and power use?


I would rather not, just to stay in the legally compliant zone and also to stay somewhat anonymous. Sincerely sorry to disappoint, but let me assure you it was nothing exotic.

Fair. Not sure what it’s like getting tech talks approved through comms these days, but this would be fascinating to hear about at a SF or SouthBay Systems meetup.

> We have successfully replaced thousands of complicated deep net time series based anomaly detectors at a FANG with statistical (nonparametric, semiparametric) process control ones.

Interesting.

Were you using things like Matrix Profile too? And if so, have those been replaced too?


Fwiw, I have a masters in operations research as a focus area within an industrial engineering degree, and spent 15 years working in manufacturing systems with a focus on test automation & quality. Traditional SPC/SQC analysis is, and will remain, king -- at least for some time. That can potentially evolve on high-vol/low-mix scenarios that lend themselves more easily to training models on anomaly detection, but especially for complex product manufacturing in high-mix factories that's not the case. It's far better to let your test/quality engineers do their jobs and figure out statistical controls on their own.

Among other reasons, this is largely true because acceptable ranges for different anomaly & defect types can vary significantly for different revs of a single product, or even sub-revs (things that are tied to an ECO but don't result in incrementing the product rev), or -- more crucially -- the line the product is manufactured on. One thing that's notoriously tricky to troubleshoot without being physically onsite is whether a defect is because of a machine, because of a person, or because of faulty piece parts/material.

Understanding and knowing how to apply traditional statistical analysis to these problems -- and also designing useful data structures to store all the data you're collecting -- is far more valuable right now than trying to shoehorn in an AI model to do this work.


In this specific project no, but in others a very emphatic yes.

Can you be more specific about what SPC algorithm you moved to? Did you trade off prediction quality for complexity, increasing the number of false alarms?

We generally targeted specific statistics of derived/processed streams. For some such streams we cared whether the mean changed; in others, whether the spread changed in a way that was unusual for the time of day; in yet others, whether some percentile changed in a way that was unusual for the time of day. Sometimes it would be more than one such statistic.

Then we would track an online estimator of that statistic with an SPC chart. The thresholds would be set based on our appetite for false alarms. We did not fit or use properties of the parametric distributions that standard SPC charts rely on, so no 3-sigma business; in our case convergence to Gaussian would often not be fast enough for such techniques to be useful.

Also, the original streams were far from IID; temporal dependencies were strong. So we had to derive from them new streams that no longer showed temporal dependencies, at least not as strongly. This was the most important bit.

The next key aspect was to keep the alerting thresholds as untarnished and unaffected as possible by the outliers that would unavoidably occur. Getting this to work without additional human supervisory labels was the next most important part.

Make this part too robust to outliers and the system would not automatically adapt to a new normal. Make it too sensitive and we would get overwhelmed by false positives.
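Not the production system, obviously, but a minimal sketch of that shape: track a statistic of a (hopefully de-correlated) derived stream, set thresholds from empirical quantiles rather than a parametric 3-sigma rule, and keep flagged points out of the threshold estimate so it is not tarnished by the very outliers being hunted. Window sizes and quantiles are placeholders.

    from collections import deque
    import numpy as np

    class QuantileSPC:
        """Alert when a value of the derived statistic falls outside empirical
        quantiles of recent history. Flagged points are not fed back into the
        history, so thresholds stay untarnished by the anomalies themselves
        (at the cost of adapting more slowly to a 'new normal')."""

        def __init__(self, window=500, warmup=50, lo_q=0.001, hi_q=0.999):
            self.history = deque(maxlen=window)
            self.warmup, self.lo_q, self.hi_q = warmup, lo_q, hi_q

        def update(self, value):
            if len(self.history) < self.warmup:   # warm-up: collect, never alert
                self.history.append(value)
                return False
            lo, hi = np.quantile(self.history, [self.lo_q, self.hi_q])
            is_anomaly = not (lo <= value <= hi)
            if not is_anomaly:
                self.history.append(value)
            return is_anomaly

    # Pretend this is a derived stream with the temporal dependence already removed.
    rng = np.random.default_rng(1)
    stream = rng.normal(0.0, 1.0, 5000)
    stream[3000] += 12.0                          # one injected anomaly

    detector = QuantileSPC()
    alerts = [i for i, x in enumerate(stream) if detector.update(x)]
    print(alerts)   # includes 3000, plus possibly a few record-extreme false alarms

The part that does not show up here, and that mattered most in practice, is constructing the derived stream so that it is close to independent across time; the charting itself is the easy bit.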


What confuses me about deep nets is that there's rarely enough signal to be able to meaningfully train a large number of parameters. Surely 99 % of those parameters are either (a) incredibly unstable, or (b) correlate perfectly with other parameters?

They do. There are enormous redundancies. There's a manifold over which the parameters can vary wildly yet do zilch to the output. The nonlinear analogue of a null space.

Parameter instability does not worry a machine learner as much as it worries a statistician. ML folks worry about output instabilities.

The current understanding goes that this overparameterization makes reaching good configurations easier while keeping the search algorithm as simple as stochastic gradient descent.


Huh, I didn't know that! Are there efforts to automatically reduce the number of parameters once the model is trained? Or do the relationships between parameters end up too complicated to do that? I would assume such a reduction would be useful for explainability.

(Asking specifically about time series models and such.)


What you are looking for is the lottery ticket hypothesis for neural networks. Hit a search engine with those words and you will find examples.

https://arxiv.org/abs/1803.03635 (you can follow up on Semantic Scholar for more)

Selecting which weights to discard seems as hard as the original problem. But random decimation, sometimes only barely informed decimation, has been observed to be effective.

On the theory side it's now understood that in the thicket of weights lurks a much, much smaller subset that can produce nearly the same output.

These observations are for DNNs in general. For time series specifically I don't know what the state of the art is. In general NNs are still catching up with traditional stats approaches in this domain. There are a few examples where traditional approaches have been beaten, but only a few.

One good source to watch is the M series of competitions.
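A toy numpy sketch of the mechanics of magnitude pruning, the usual decimation baseline. With the random weights used here the output degrades noticeably; the empirical surprise behind the lottery ticket line of work is how much smaller that degradation is for genuinely trained, overparameterized networks.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 1.0, (256, 256))    # stand-in for one trained layer
    x = rng.normal(0.0, 1.0, (32, 256))
    y = np.maximum(x @ W, 0.0)              # ReLU layer output

    def prune_by_magnitude(W, keep_fraction):
        """Zero out all but the largest-magnitude weights."""
        k = max(1, int(W.size * keep_fraction))
        threshold = np.sort(np.abs(W), axis=None)[-k]
        return np.where(np.abs(W) >= threshold, W, 0.0)

    for keep in (1.0, 0.5, 0.2, 0.1):
        yp = np.maximum(x @ prune_by_magnitude(W, keep), 0.0)
        rel_change = np.linalg.norm(y - yp) / np.linalg.norm(y)
        print(f"keep {keep:4.0%} of weights -> relative output change {rel_change:.3f}")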


Could you give a brief overview of:

- what libs you were using

- what kind of algos/models were most useful for what kind of data?

I have an IoT use case; I wanted to look both at NNs and at more classical stats models to see if there is value there.


Can't, for obvious reasons. But no specialized libraries were used. The usual Python stack that comes packaged with any respectable OS distribution these days, mixed in with other close-to-the-metal languages for performance or API compatibility reasons.

Look up nonparametric statistical process control and you will find useful papers. The algorithms are actually quite simple to implement; if they are not simple, then they are probably not worth your time. The analysis in the paper might be complicated, but don't worry about that, look for simplicity of the algorithms.


I did similar work at a similar scale to srean.

Assume you have the signal from one IoT device, say a sensor reading. Anomalies are sudden changes in the value of the signal. Define "sudden" using the time delta between observations and your other domain knowledge; let's say the sensor reports 1x/second and sudden means 1-3 minutes.

Simple options: compare a rolling mean of the last 3 values against a rolling mean of the last 60 values. If their ratio (or difference) is over a threshold, alert.
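In pandas that first option is a couple of lines (the numbers below are placeholders matching the 1x/second framing):

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    values = rng.normal(20.0, 0.5, 3600)          # one reading per second
    values[2400:] += 5.0                          # sudden level shift
    s = pd.Series(values,
                  index=pd.date_range("2024-01-01", periods=3600, freq="s"))

    fast = s.rolling(3).mean()                    # rolling mean of the last 3 readings
    slow = s.rolling(60).mean()                   # rolling mean of the last 60 readings
    ratio = fast / slow

    threshold = 0.1                               # set from your false-alarm appetite
    alerts = s[(ratio - 1.0).abs() > threshold]
    print(alerts.index.min())                     # fires just after the injected shift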

Say the readings are normally distributed, or they can be detrended/made normal via a simple 1- or 2-stage AR/MA model. Then apply the https://en.wikipedia.org/wiki/Western_Electric_rules to detect anomalies.
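A rough rendering of those rules (on standardized residuals z = (x - mean) / sigma from whatever detrending you did; rule details per the linked article):

    import numpy as np

    def western_electric_alerts(z):
        """Indices where any of the four classic rules fire, given z-scored values."""
        alerts = set()
        for i in range(len(z)):
            # Rule 1: one point beyond 3 sigma.
            if abs(z[i]) > 3:
                alerts.add(i)
            # Rule 2: two of three consecutive points beyond 2 sigma, same side.
            if i >= 2:
                w = z[i - 2 : i + 1]
                if np.sum(w > 2) >= 2 or np.sum(w < -2) >= 2:
                    alerts.add(i)
            # Rule 3: four of five consecutive points beyond 1 sigma, same side.
            if i >= 4:
                w = z[i - 4 : i + 1]
                if np.sum(w > 1) >= 4 or np.sum(w < -1) >= 4:
                    alerts.add(i)
            # Rule 4: eight consecutive points on the same side of the centerline.
            if i >= 7:
                w = z[i - 7 : i + 1]
                if np.all(w > 0) or np.all(w < 0):
                    alerts.add(i)
        return sorted(alerts)

    rng = np.random.default_rng(0)
    z = rng.normal(0, 1, 500)
    z[300:] += 1.5                                # small persistent shift
    alerts = western_electric_alerts(z)
    print(sum(1 for a in alerts if a < 300), "alerts before the shift,",
          sum(1 for a in alerts if a >= 300), "after")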

More complex but still simple option: say you have IoT sensors over a larger area, and an anomaly is one sensor reading higher than the others. Run roughly the same analysis as above, but on the correlation matrix of all the sensors. Look for rapidly changing correlations.

Example: temperature detectors in each room of your house, and your kid opens the front door to go play in the snow. The entry hall cools down while the rest of the house's temperature stays roughly stable. You can picture what that does to the correlation matrix.
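A sketch of that correlation-matrix picture (room layout and numbers invented for illustration):

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(1)
    common = np.cumsum(rng.normal(0, 0.05, 2000))          # shared slow drift
    temps = pd.DataFrame({f"room_{i}": common + rng.normal(0, 0.1, 2000)
                          for i in range(5)})
    temps.loc[1500:, "room_0"] -= np.linspace(0, 4, 500)   # front door opens

    def mean_corr_to_others(df):
        c = df.corr()
        c = c.mask(np.eye(len(c), dtype=bool))             # drop self-correlations
        return c.mean()

    before = mean_corr_to_others(temps.iloc[1000:1500])
    after = mean_corr_to_others(temps.iloc[1500:])
    print(pd.DataFrame({"before": before, "after": after}).round(2))
    # room_0's average correlation with the others drops sharply; the rest move far less.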


Bang on.

It was a little more complicated to remove temporal dependencies from the original streams, and we could not rely on Gaussian behaviour. Other than that, it's pretty much the same, barring an effort to keep the alerting thresholds unaffected by recent anomalies.


Now if future AIs were shown to be capable of suffering then this could change.

Don't ISPs just charge per caps on ingress and egress volume?

From your comments it is clear that they don't. Super infuriating. Why should they care what I do with the ingress and egress that I paid for, as long as I am not hurting them?


His comments are based on fear-mongering he read somewhere, or on an overly literal interpretation of terms and conditions written to cover the ISP's ass in every theoretical situation possible.

ISPs who enforce data caps already priced it in and technically have an incentive for you to exceed your cap as fast as possible so you pay to increase said cap (they can however still slow down your traffic as they wish, to ensure sufficient capacity for everyone).

ISPs who don't enforce a cap actually still internally enforce a reasonable cap of several terabytes at their discretion. And of course, they can and will use traffic shaping to ensure the integrity of their network so your usage doesn't affect others. If you exceed that soft cap consistently several months in a row they may get in touch, but other than that you're fine.

TLDR: host your server and enjoy. When you get to the scale of the next YouTube, then you have to worry.


Hey thanks, will check her work out. I am more into sci-fi than fantasy, especially hard sci-fi.

Yes indeed.

In my other comment I suggested carbon fibre flywheels (for energy storage). A design that stresses the rotor uniformly to near its breaking point would make a great storage device. If it's possible to add density to the fibres without compromising strength, even better.

For a solid material with equal strength in all directions, the optimal cross section is one with exponentially decreasing thickness.

To give an intuitive reasoning: the further radially inwards you go, the more material and velocity there is on the outside straining to break free, so you need a larger cross-section to resist that. But now this extra thickness too has to be supported as you move inwards. One can make this formal as a differential equation, and the solution is an exponential profile.
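Concretely (this is the textbook constant-stress disk): requiring the stress to equal sigma everywhere gives dt/dr = -(rho * omega^2 * r / sigma) * t, hence t(r) = t0 * exp(-rho * omega^2 * r^2 / (2 * sigma)). A tiny sketch with placeholder, steel-ish numbers:

    import numpy as np

    rho = 7800.0                  # density, kg/m^3 (placeholder, steel-ish)
    sigma = 600e6                 # allowed working stress, Pa (placeholder)
    omega = 2 * np.pi * 250.0     # 250 rev/s (placeholder)
    t0 = 0.05                     # thickness at the axis, m

    r = np.linspace(0.0, 0.3, 7)
    t = t0 * np.exp(-rho * omega**2 * r**2 / (2 * sigma))
    for ri, ti in zip(r, t):
        print(f"r = {ri:.2f} m -> thickness {ti * 1000:6.1f} mm")

The faster you spin or the weaker the material, the steeper the taper; for a fibre composite the same reasoning applies per fibre direction, which is where the weave comes in.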

Anyhow, for carbon fibres the optimal geometry will depend on the weave because a fibre has different strength along different directions.


AFAIK, carbon fiber flywheels that are levitated in vacuum and that exceed the energy density of the metallic flywheels have already been made and used in certain experiments, even if I am not aware of any such flywheel being available commercially.

There was also some research for using such flywheels for energy recovery in very heavy vehicles with electric motors, e.g. tanks with a turbo-electric generator, but the use in a vehicle has obvious difficulties. Even if the flywheels are paired, to avoid influencing the mobility of the vehicle, that still causes high internal stresses in the case holding the pair of flywheels when the vehicle rotates, which can lead to fatigue failures.


Right.

I chose it as my undergraduate project literally several decades ago.

3D-woven ones might be stronger, as they might better resist laminar separation of the circumferential layers. Going by units, the product of stress and volume has the same units as kinetic energy, so it appears breaking stress and volume might be what limits the stored kinetic energy. This addresses the doubt and curiosity raised by one comment (not yours).


Just yesterday I was mentioning the shared fascination with everything knitting, weaving, tatting, crocheting and braiding.

https://news.ycombinator.com/item?id=46039952

I wonder if I should braid my wired earphones for storage to prevent tangling. I can keep the cable inside a pouch with the earpieces out, but that's not very satisfactory.

My current fascination is knitted ropes/cables/cords. These are not the typical ropes that are spun and coiled and held together by friction. These ones, made of synthetic fibres, look like woven tubes, but the insides aren't hollow; they seem packed with more woven tubes.

What I really want to see, though, are 3D-knitted heavy-duty carbon fibre flywheels of the optimal shape, such that the rotor is under equal radial stress everywhere. The shape is interesting to compute for a solid one.


Out of curiosity, what would you want a carbon fiber flywheel for? Usually the value of a flywheel is in storing kinetic energy, which a low-density material like carbon fiber would not be suitable for.

What you lose in mass can be made up in velocity and radial distance; it depends on the breaking strength of the fibres. I haven't done any calculations myself to see if they are more promising than steel. A carbon fibre flywheel exploding might be somewhat less dangerous than steel if the bits flying away can crumple dissipatively.
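A back-of-envelope version of that trade-off: for a thin rim the hoop stress is rho * v^2 and the kinetic energy is (1/2) m v^2, so the specific energy limit is sigma / (2 * rho), i.e. it is specific strength that matters, not density. Very rough ballpark numbers:

    # Thin-rim flywheel: sigma = rho * v^2 at the limit, E = (1/2) m v^2,
    # so E/m = sigma / (2 * rho). Strength/density values are rough ballparks.
    materials = {
        "high-strength steel":    (1.5e9, 7800.0),   # tensile strength Pa, density kg/m^3
        "carbon fibre composite": (2.5e9, 1600.0),
    }

    for name, (sigma, rho) in materials.items():
        e_per_kg = sigma / (2 * rho)                 # J/kg at the breaking limit
        print(f"{name:<24} ~{e_per_kg / 3600:5.0f} Wh/kg (before any safety factor)")

Which is the same units argument as stress times volume having units of energy, just taken per unit mass.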

Regarding where they can be used... they are just batteries with a different form factor. You can put them on the grid to provide some inertia and to be a place to dump or extract energy spikes. They may be commercially viable if rare-earth-based batteries become very expensive. At a smaller scale one could use them as a mechanical UPS for a building/datacenter, maybe even to power golf carts, though I am not sure how well those would steer because of the angular momentum.


That was somewhat relevant for data centers years ago when inverters were much more expensive, and even then only for the 20 seconds it took to start the diesel by connecting a simple soft starter to the diesel's induction generator (bringing it up to 25/30 rps (1500/1800 rpm) in about 5 seconds, then waiting for the turbo to spool up before it can deliver full power).

Even on the grid, batteries for sub-hour duration storage are cheap, as long as you place them at an already existing AC/DC converter site like a solar plant (or a modern internally-DC datacenter's centralized grid rectifier (AC/DC converter)).

Or even a HVDC transmission line.

Or a sufficiently modern aluminum/zinc smelter. Pretty much anything large enough to bother with that has at least a boost PFC on the input. With those you could just put the storage there, beef up the capacitor a bit, or better yet use a native 3-phase PFC that doesn't have strong 100/120 Hz ripple on that capacitor, and then literally just control the input transistors that are already there to do your grid jobs. If it was the very cheap, low-efficiency rectifier approach, the rectifier also needs to be upgraded to a controlled one, so just use this on the higher-efficiency ones before upgrading the others. (It's about a 1% efficiency difference on 240V, and 2% on 120V power supplies.)


Flywheels can have much higher power density than any kind of battery.

There is no competition between batteries (low power density, high energy density, low storage cycle efficiency) and flywheels (high power density, low energy density, high storage cycle efficiency).

Flywheels (preferably levitated in vacuum) compete only with supercapacitors and superconducting rings (SMES = Superconducting Magnetic Energy Storage).

Supercapacitors/flywheels/SMES have their high-power applications, for which batteries are not appropriate.

Using them where batteries are the right solution is of course not a good choice.


Are you Daniel Craig :) ?
