When I worked on ads at Google, we had many A/B tests that had been running that long, generally holdbacks where a feature was almost entirely, but not 100%, launched.
It was relatively rare for the holdback to show markedly different results from the initial A/B test we used in deciding to launch. If that had happened more often, we would have run more long tests and been slower to move to launch.