Hacker News | bluusteel's comments

I kept expecting the author to recommend this book, but then he never did...


It seems like blog posts on simple statistical methods like this one land on the front page of HN a lot more often than one would expect.


I think this is called varied practice, and the idea has been around for a while. The book Make It Stick[1] discusses a study in which 8-year-olds tossed beanbags at a target. For one group, the distance to the target was varied; for another, it was fixed. When both groups were tested later, the variable-distance group outperformed the fixed-distance group.

[1] http://www.amazon.com/gp/product/0674729013


Really nice writeup! I wish more graduate students would blog about their research in this way. Could be a great way to educate the public and convey the importance of government research funding.

I wonder how big the reflection coefficient is for the air/ice interface. It seems like it would be huge. So maybe ground based techniques offer better coupling at the cost of not being able to survey as much area?


We record on two channels, "high gain" and "low gain", separated by ~50dB. I showed the high gain products in this post, and the surface absolutely does saturate the detectors. The system was designed so that near-surface returns don't saturate the low gain channel.

We transmit 8kW, and the air/ice surface reflection coefficient is ~0.08 (~-11dB). At our flight height of ~600m above the surface, spreading loss actually contributes more to signal attenuation (1/(2*h)^2 ~= -62dB).
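Those two dB figures are easy to sanity-check (a quick sketch, treating the ~0.08 as a power reflection coefficient — both come out as quoted):

```python
import math

R = 0.08   # air/ice power reflection coefficient (figure quoted above)
h = 600.0  # flight height above the surface, in metres

refl_db = 10 * math.log10(R)              # ~ -11 dB
spread_db = 20 * math.log10(1 / (2 * h))  # (1/(2h))^2 in dB, ~ -62 dB
```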

Our instrument is optimized for seeing through the entire ice sheet, mapping deep layers and the bed. Other (also airborne) instruments operate at higher frequencies, trading penetration depth for higher resolution. I'm not super familiar with groups using ground-based ice-penetrating radar, but one big tradeoff is $$$. The airplane is hugely expensive to operate, whereas ground-based work just needs a snowmobile.


I wonder if they plan to tackle other educational settings in the future (e.g. schools, self-studying individuals, etc.). I'm also curious whether the card design happens automatically or whether the customer selects from different card types depending on content. The space of tech products that help improve learning seems more sparse than it should be.


Great questions! Thanks bluusteel.

We'll see regarding other educational settings; right now it's very targeted at sales / customer service. What we learn here, however, can be applied anywhere.

Card types are a bit automagic, templatized based on the course type. E.g. if it's a sales pitch there are different templates & exercises than if it's about a product.

"The space of tech products that help improve learning seems more sparse than it should be." -- Completely agree.


I don't think I realized how important default colors/styles are for visualization software until I used seaborn. Quickly producing plots with sensible aesthetics should not require 10+ lines of extra code. Fortunately, it looks like the matplotlib folks are planning to change the default colors/styles in 2.0 [1].

[1] http://matplotlib.org/style_changes.html
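In the meantime, the style machinery that already ships with recent matplotlib gets you sensible defaults in one line (a minimal sketch; "ggplot" is one of the bundled style names, and the Agg backend is forced only so this runs headless):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; safe without a display
import matplotlib.pyplot as plt

plt.style.use("ggplot")  # swap the default colors/styles in one line
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
fig.savefig("styled.png")
```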


I thought closed form expressions existed for linear regression. Why is gradient descent needed?


That closed form requires matrix inversion, and that is almost always (but not always) a bad idea: it is numerically unstable/sensitive and more expensive than it needs to be. Sometimes you also need a quick ball-park figure of the answer and the ability to query the answer at any time. These can be had with iterative algorithms (gradient descent is one such iterative algorithm; conjugate gradient would be an improvement on it). With the closed form you have to wait till it finishes going through the motions. It's an all-or-nothing deal. OTOH you can stop an iteration anytime and peek at the current estimate of the answer.

If this comment has even one takeaway, I would like it to be: don't invert unless you are very sure that is exactly what you need. In some scenarios inverses are indeed required; solving linear equations is almost always not that scenario.
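A small NumPy sketch of the contrast (data and step size are made up for illustration): the closed form goes through one linear solve of the normal equations, while plain gradient descent can be interrupted at any iteration for a ball-park estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true

# "Closed form": solve the normal equations, without forming an inverse.
x_solve = np.linalg.solve(A.T @ A, A.T @ b)

# Iterative: plain gradient descent on 0.5 * ||Ax - b||^2.
# You can break out of this loop at any point and use x as an estimate.
x = np.zeros(3)
lr = 1e-3
for _ in range(5000):
    x -= lr * (A.T @ (A @ x - b))  # gradient of the least-squares loss
```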


Some clarifications. Explicitly computing a matrix inverse is never a good idea unless you actually require the inverted matrix. For solving the normal equations, indeed for solving any linear system, use appropriate specific algorithms (variants of Gaussian elimination, QR, SVD, ...). For regression, computing the inverse of the covariance matrix is actually faster than the QR methods (plural, because there is more than one), at the cost of numerical instability. SVD is the more stable but also more expensive method. These are the standard methods you will find in any linear algebra or statistics textbook, and for regular use they are just fine. It is when you have thousands of variables and tens of thousands of measurements that they fall short. Iterative optimization algorithms can be used then, or randomized matrix algorithms. There is no free lunch, because numerical instability is inherent in these algorithms too, but you do have some more relaxed guarantees on errors, or on the probability of errors.

The reason I said iterative optimization algorithms can be used for regression is that linear regression is a quadratic optimization problem. We have analytic solutions because of special linear algebra properties. It can be said that we have a good handle on quadratic optimization problems, hence the multitude of algorithm choices. General optimization algorithms necessarily carry a cost in numerical instability, but it may be a cost worth paying, especially for large data sets.

Finally, I want to echo your sentiment: DO NOT invert. Linear equations should be solved with tailored algorithms. In theory they yield identical results; in practice the specific algorithms are much better.
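The "do not invert" advice is easy to demonstrate numerically (a sketch; the deliberately ill-conditioned Vandermonde matrix stands in for a nasty design matrix):

```python
import numpy as np

# Polynomial least squares on [0, 1]: Vandermonde matrices are
# notoriously ill-conditioned, which is exactly where the explicit
# inverse falls apart while a tailored solver holds up.
t = np.linspace(0, 1, 12)
A = np.vander(t, 8)
x_true = np.ones(8)
b = A @ x_true

x_inv = np.linalg.inv(A.T @ A) @ (A.T @ b)       # explicit inverse
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)  # SVD-based solver

err_inv = np.linalg.norm(x_inv - x_true)
err_lstsq = np.linalg.norm(x_lstsq - x_true)
```

The solver's error is orders of magnitude smaller, even though both lines "compute the same thing" on paper.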


Implicit in this discussion is the fact that the matrix is positive definite, no?


Yeah.

And helpfully, in this context: if you're starting from minimising the L2 error between the output of some linear function and a target, the resulting linear system has the form $A^T A x = A^T b$, so the matrix in question is $A^T A$, which is always positive semidefinite.
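The semidefiniteness is a one-liner: for any $x$,

$x^T (A^T A) x = (Ax)^T (Ax) = \|Ax\|^2 \ge 0,$

and it is strictly positive definite exactly when $A$ has full column rank (so that $Ax = 0$ only for $x = 0$).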


For some models, linear regression included, there may be a closed-form solution, but it might just be too expensive to compute. In particular, high dimensions can screw everything up, and sparsity can allow much cheaper solutions, so sometimes you just fall back to gradient descent or SGD.
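A toy sketch of the SGD fallback (sizes and step size are invented): each update touches a single randomly drawn row, so the per-step cost is O(d) rather than the O(nd) of a full-gradient step.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 10_000, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true

# SGD on 0.5 * ||Ax - b||^2: one random row per step, so each update
# costs O(d) instead of the O(n*d) of the full gradient A.T @ (A@x - b).
x = np.zeros(d)
lr = 0.01
for _ in range(20_000):
    i = rng.integers(n)
    x -= lr * (A[i] @ x - b[i]) * A[i]
```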


For completeness I would include the closed-form solution in a discussion of linear regression. In matrix notation it's fairly simple, and it's easy to follow how it falls apart when the assumptions underlying linear regression are not present in the data you're working with (homoskedasticity, independence of errors, etc.).
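For reference, the closed form in question is the least-squares minimizer

$\hat{\beta} = (X^T X)^{-1} X^T y,$

which minimizes $\|y - X\beta\|^2$ (and, per the discussion above, is best computed by solving $X^T X \hat{\beta} = X^T y$ rather than by forming the inverse).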


Reminds me of the "talking" drums used in West Africa[1]. Low frequency sound can travel distances measured in miles.

[1] https://en.wikipedia.org/wiki/Talking_drum


Sounds like a good project for a graduate level computational electromagnetics course. The method used in this app is FDTD, which is pretty easy to understand and implement compared to other methods for solving Maxwell's equations numerically[1].

[1] https://en.wikipedia.org/wiki/Computational_electromagnetics
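The 1-D version of the leapfrog update really is tiny, which is part of why FDTD makes a good course project. A minimal sketch, in normalized units with c*dt/dx = 1 (the "magic time step", for which 1-D propagation is exact); the field names and the Gaussian hard source are my own choices, not from the app:

```python
import numpy as np

nx, nt = 200, 100
ez = np.zeros(nx)       # E-field at integer grid points
hy = np.zeros(nx - 1)   # H-field staggered half a cell (Yee grid)

for t in range(nt):
    hy += ez[1:] - ez[:-1]        # update H from the curl of E
    ez[1:-1] += hy[1:] - hy[:-1]  # update E from the curl of H
    ez[nx // 2] = np.exp(-((t - 30) / 10) ** 2)  # Gaussian hard source
```

With the magic time step the pulse travels exactly one cell per step, so after 100 steps the two copies of the Gaussian sit about 69 cells either side of the source node, undistorted. Dropping the Courant number below 1 (unavoidable in 2-D/3-D) is what introduces the numerical dispersion that makes the method interesting to analyze.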


"Desirable difficulties" were originally suggested by Robert Bjork, a psychology professor at UCLA. For a short overview, see his website: http://bjorklab.psych.ucla.edu/research.html#idd

