If all you want to know is what fraction of all Twitter accounts are spam accounts, it should be really easy:
1. Select 1000 accounts uniformly at random, either from among all Twitter accounts or from active Twitter accounts, for whatever definition of "active".
2. Classify these 1000 by hand. Do as much investigation into them as you need to classify them accurately; no need to use heuristics here.
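You will (with very high probability) get an estimate accurate to within a few percentage points: at 95% confidence the margin of error for 1000 samples is at most about 3.1 points, and closer to 1.4 points if the true spam rate is near 5%. If you do the statistics you can find the actual bounds.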
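To make the "actual bounds" concrete, here is a minimal sketch of how they could be computed. It assumes a simple random sample and a 95% confidence level (z = 1.96). The function names are made up for illustration, and the figure of 50 spam accounts out of 1000 is a hypothetical input, not a real measurement.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Half-width of the normal-approximation interval for a proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval; behaves better than the normal
    approximation when the observed proportion is close to 0 or 1."""
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


if __name__ == "__main__":
    n = 1000
    # Worst case for the margin of error is a true rate of 50%.
    print(f"worst-case margin: {margin_of_error(0.5, n):.4f}")
    # Hypothetical outcome: 50 of the 1000 sampled accounts judged spam.
    low, high = wilson_interval(50, n)
    print(f"Wilson 95% interval: {low:.4f} to {high:.4f}")
```

Note that the margin depends only on the sample size and the observed rate, not on how many accounts exist in total, which is why 1000 hand-classified accounts is enough.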