Hacker News

Here's an example from this morning. At 10:00 am, a colleague created a ticket with an idea for the music plugin I'm working on: wouldn't it be cool if we could use nod detection (head tracking) to trigger recording? That way, musicians who use our app wouldn't need a foot switch (as a musician, you often have your hands occupied).

Yes, that would be cool. An hour later, I shipped a release build with that feature fully functional, including permission handling and a calibration UI that shows whether your face is detected, lets you adjust sensitivity, and visually indicates when a nod is detected. Most of that work got done while I was in the shower. That is the second feature in this app that got built today.

This morning I also created and deployed a bug fix release for analytics on one platform, and a brand-new report (fairly easy to put together because it followed the pattern of other reports) for a different platform.

I also worked out, argued with random people on HN, and walked to work. Not bad for five hours! Do I know how long it would have taken to, for example, integrate face detection and tracking into a C++ audio plugin without assistance from AI? Especially given that I have never done that before? No, I do not. I am bad at estimating. Would it have been longer than 30 minutes? I mean...probably?





Just having a 'count-in' type feature for recording would be much much more useful. Head nodding is something I do all the time anyway as a musician :).

I don't know what your user makeup is like, but shipping a CV feature same-day sounds potentially disastrous. There are so many things I would think you would at least want to test, or even just consider, with the kind of user empathy we all should practice.


I appreciate this example. This does seem like a pretty difficult feature to build de novo. Did you already have some machine vision work integrated into your app? How are you handling machine vision? Is it just a call to an LLM API? Or are you doing it with a local model?

There was no machine vision stuff in the app at that point. Claude suggested a couple of different ways of handling this and I went with the easiest way: piggybacking on the Apple Vision Framework (which means that this feature, as currently implemented, will only work on Macs - I'm actually not sure if I will attempt a Windows release of this app, and if I do, it won't be for a while).

Despite this being "easier" than some of the alternatives, it is nonetheless an API I have zero experience with, and the implementation was built with code that I would have no idea how to write, although once written, I can get the gist. Here is the "detectNodWithPitch" function as an example. That's how a "nod" is detected: the pitch of the face is determined, and then a change of pitch is what counts as a nod (which, of course, is not entirely straightforward).

```

- (void)detectNodWithPitch:(float)pitch
{
    // Get sensitivity-adjusted threshold
    // At sensitivity 0: threshold = kMaxThreshold degrees (requires strong nod)
    // At sensitivity 1: threshold = kMaxThreshold - kThresholdRange degrees (very sensitive)
    float sens = _cppOwner->getSensitivity();
    float threshold = NodDetectionConstants::kMaxThreshold - (sens * NodDetectionConstants::kThresholdRange);

    // Debounce check
    NSTimeInterval now = [NSDate timeIntervalSinceReferenceDate];
    if (now - _lastNodTime < _debounceSeconds)
        return;

    // Initialize baseline if needed
    if (!_hasBaseline)
    {
        _baselinePitch = pitch;
        _hasBaseline = YES;
        return;
    }

    // Calculate delta: positive when head tilts down from baseline
    // (pitch increases when head tilts down, so delta = pitch - baseline)
    float delta = pitch - _baselinePitch;

    // Update nod progress for UI meter
    // Normalize against a fixed max (20 degrees) so the bar shows absolute head movement
    // This allows the threshold line to move with sensitivity
    constexpr float kMaxDisplayDelta = 20.0f;
    float progress = (delta > 0.0f) ? std::min(delta / kMaxDisplayDelta, 1.0f) : 0.0f;
    _cppOwner->setNodProgress(progress);

    if (!_nodStarted)
    {
        _cppOwner->setNodInProgress(false);

        // Check if nod is starting (head tilting down past nod start threshold)
        if (delta > threshold * NodDetectionConstants::kNodStartFactor)
        {
            _nodStarted = YES;
            _maxPitchDelta = delta;
            _cppOwner->setNodInProgress(true);
            DBG("HeadNodDetector: Nod started, delta=" << delta);
        }
        else
        {
            // Adapt baseline slowly when not nodding
            _baselinePitch = _baselinePitch * (1.0f - _baselineAdaptRate) + pitch * _baselineAdaptRate;
        }
    }
    else
    {
        // Track maximum delta during nod
        _maxPitchDelta = std::max(_maxPitchDelta, delta);

        // Check if head has returned (delta decreased below return threshold)
        if (delta < threshold * _returnFactor)
        {
            // Nod complete - check if it was strong enough
            if (_maxPitchDelta > threshold)
            {
                DBG("HeadNodDetector: Nod detected! maxDelta=" << _maxPitchDelta << " threshold=" << threshold);
                _lastNodTime = now;
                _cppOwner->handleNodDetected();
            }
            else
            {
                DBG("HeadNodDetector: Nod too weak, maxDelta=" << _maxPitchDelta << " < threshold=" << threshold);
            }

            // Reset nod state
            _nodStarted = NO;
            _maxPitchDelta = 0.0f;
            _baselinePitch = pitch;  // Reset baseline to current position
            _cppOwner->setNodInProgress(false);
            _cppOwner->setNodProgress(0.0f);
        }
    }
}

@end

```
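For what it's worth, the state machine in that function (baseline, start threshold, return threshold, peak check) doesn't depend on the Vision framework at all, so it can be pulled out and unit-tested on its own. Here is a minimal, framework-free C++ sketch of the same logic; the class name, constructor parameters, and constants are my own illustration, not the plugin's actual code, and it omits the debounce timer and UI progress reporting.

```cpp
#include <algorithm>

// Hypothetical standalone port of the nod state machine, for unit testing
// without a camera. Pitch is in degrees; a nod is: delta rises past
// threshold * startFactor, peaks above threshold, then falls back below
// threshold * returnFactor.
class NodDetector {
public:
    NodDetector(float threshold, float startFactor, float returnFactor)
        : threshold_(threshold), startFactor_(startFactor), returnFactor_(returnFactor) {}

    // Feed one pitch sample; returns true when a complete nod is detected.
    bool feed(float pitch) {
        if (!hasBaseline_) {
            baseline_ = pitch;
            hasBaseline_ = true;
            return false;
        }

        float delta = pitch - baseline_;  // positive = head tilted down

        if (!nodStarted_) {
            if (delta > threshold_ * startFactor_) {
                nodStarted_ = true;
                maxDelta_ = delta;
            } else {
                // Adapt baseline slowly while idle, as in the original
                baseline_ = baseline_ * 0.95f + pitch * 0.05f;
            }
            return false;
        }

        maxDelta_ = std::max(maxDelta_, delta);

        if (delta < threshold_ * returnFactor_) {
            // Head has returned: the nod counts only if it peaked past threshold
            bool detected = maxDelta_ > threshold_;
            nodStarted_ = false;
            maxDelta_ = 0.0f;
            baseline_ = pitch;  // re-anchor baseline after the gesture
            return detected;
        }
        return false;
    }

private:
    float threshold_, startFactor_, returnFactor_;
    float baseline_ = 0.0f, maxDelta_ = 0.0f;
    bool hasBaseline_ = false, nodStarted_ = false;
};
```

With threshold 10°, start factor 0.5, and return factor 0.3, feeding pitches 0 → 12 → 1 detects a nod (peak delta 12 > 10), while 7 → 1.5 afterwards does not (peak delta 6 never clears the threshold). That hysteresis between the start and return factors is what keeps small head bobs from firing the trigger.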


> An hour later, I shipped a release build

I would love to see that pull request, and how readable and maintainable the code is. And do you understand the code yourself, since you've never done this before?




