Machinus, your argument that Survival of the Fittest is probably the only card that makes the toolbox argument relevant to Vintage is sound. The matter of relevance, however, strays from the discussion at hand. Relevance says only that at present no competitive deck exists that would benefit from a 61st card (probably; I lack the time and the card-value data to actually run the numbers on every card in every tier 1 deck). The argument being made by myself and others is that, in theory, a deck that is optimal with 61 cards (and also competitive) is possible.
To address your arguments that, even in a toolbox deck, 60 cards is definitively optimal...
1) Your chances of drawing your engine are reduced.
2) All the arguments already mentioned about the increased inconsistency in drawing mana and disruption.
3a) Your chances of drawing a completely useless, very specific answer increase.
3b) Your chances of drawing redundant answers increase.
Given all these matters, if the 61st card helps you to win one match in ten, but the dilution of the deck causes you to lose one match in twenty, it's still causing you to win one additional match out of every twenty you play. In fact, had I been thinking more clearly when I presented my earlier arguments, that is precisely how I would have summarized what I was saying. If the addition of a card causes you to win more games than it causes you to lose, it is optimal to add the card. Furthermore, if after doing so, the removal of any card in the deck causes you to lose more games than it causes you to win, then that deck is optimal (better than the 60-card version, at any rate) with 61 cards. To be absolutely certain of optimality, one would have to perform the same analysis with 62 cards.
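To spell out the arithmetic in that example, here is a minimal Python sketch. The function name and the idea of simply subtracting one rate from the other are my own illustration; it assumes the matches the card wins and the matches dilution loses can be treated as separate, additive rates.

```python
from fractions import Fraction

def net_match_gain(wins_from_card, losses_from_dilution):
    """Net change in match win rate from adding a card.

    Both arguments are fractions of all matches played. Treating them as
    additive is an assumption made for illustration only.
    """
    return wins_from_card - losses_from_dilution

# The 61st card wins one match in ten, dilution loses one match in twenty.
gain = net_match_gain(Fraction(1, 10), Fraction(1, 20))
print(gain)  # 1/20, i.e. one additional match won out of every twenty
```

The same function covers the removal test: feed it the matches gained and lost by cutting each card in turn, and the 61-card list holds up only if every one of those results comes out at or below zero.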
Obviously, not every card is going to be immediately identifiable as a cause of success or failure, particularly if it's a card you don't see. For example, if you lose a game, you can't say with certainty that drawing an Ancestral Recall would have helped you. (Well, you can, by looking at the next three cards, but this argument supposes that Ancestral Recall represents three "perfectly random" cards.) That is why I proposed the relative value analysis in my initial argument. Instead of trying to measure how often the presence of a card wins you the game or the absence of another card loses you the game, I suggested that one measure how often one sees the added card versus how often one fails to see another card in the deck. In the example I gave, a third copy of a card will be seen 40 times as often as a singleton will fail to be seen. Roughly, then, the singleton would have to win 40 times as many games as the added card to make the addition suboptimal, and 40 times as many games as any other card in the deck to make it optimal to remove a different card.
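For anyone who wants to play with the relative value analysis, here is a rough Python sketch of one way to compute that kind of ratio. The function names, the 60-card baseline, and the choice of how many cards you see in a game are all my assumptions here, and the result moves with that last number, so this is not an attempt to reproduce the exact figure of 40 from my earlier example.

```python
from math import comb

def p_seen(deck_size, copies, cards_seen):
    """Chance of seeing at least one of `copies` among the first
    `cards_seen` cards of a shuffled deck (hypergeometric)."""
    return 1 - comb(deck_size - copies, cards_seen) / comb(deck_size, cards_seen)

def seen_vs_missed(cards_seen, copies_before=2, base_size=60):
    """How many times more often the extra copy shows up than a singleton
    goes missing because of the larger deck. One possible reading of the
    'relative value' comparison, not a definitive one."""
    # Extra chance of seeing the card once the added copy is in the deck.
    gain = (p_seen(base_size + 1, copies_before + 1, cards_seen)
            - p_seen(base_size, copies_before, cards_seen))
    # Chance of seeing a singleton that is lost to dilution.
    loss = (p_seen(base_size, 1, cards_seen)
            - p_seen(base_size + 1, 1, cards_seen))
    return gain / loss

# The ratio depends on how deep into the deck a typical game goes.
for n in (7, 10, 15):
    print(n, round(seen_vs_missed(n), 1))
```

Whatever number of cards you plug in, the point is the same: the singleton has to be worth that many times more than the added copy before the dilution outweighs the addition.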