When load testing with VSTT, you will quickly find the need to generate large numbers of concurrent users. Generating this level of load is often more than a single machine can manage.
VSTT enables this through the use of a "test rig". You can read the MSDN documentation, on MSDNWiki, here. You might even notice a community comment by yours truly on the wiki site.
Basically, the way it works is that you have one boss machine, called a controller. The controller works with zero or more worker machines, called agents. These machines are associated together during installation of the Team Test Controller and Team Test Agent. Both are available on your distribution media for VSTT or VSTS.
Once you have the agent and controller software installed in your test rig, you're ready to generate massive amounts of load.
Now we get to the interesting part. How do you know how many users you should simulate? It is best to start with a clearly defined goal for the number of concurrent users your software or service should support. My suggestion is to start a load test with half that number of users. Then, you can use a step based load pattern to gradually increase that number of users, until you reach your goal level. I believe it's a good idea to make your steps as large as you can.
For instance, if you are going to support 1000 users, and you're running a 10 hour test, start with 500 users. Use a step load pattern, and set your step amount to 100 users and the step duration to 90 minutes. This takes five steps, so you will reach your target of 1000 users at 7.5 hours into the test. That allows you to test each step long enough to see if there are problems with the new load level. You will also get to test at your target load level for 2.5 hours, which will provide you with a good baseline for future testing.
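If you want to check the arithmetic for your own numbers, here is a small Python sketch that works out when each step begins. It is not part of VSTT; the function name and parameters are my own, and it simply assumes the load rises by a fixed step amount after every step duration until it reaches the maximum.

```python
def step_schedule(initial_users, max_users, step_users, step_minutes, test_minutes):
    """Return (start_minute, user_count) pairs for a simple step load pattern.

    Hypothetical helper for planning only; it does not talk to VSTT.
    """
    schedule = []
    users = initial_users
    minute = 0
    while minute < test_minutes:
        schedule.append((minute, users))
        if users >= max_users:
            break  # target reached; load stays here for the rest of the test
        users = min(users + step_users, max_users)
        minute += step_minutes
    return schedule


# The example from the text: 500 -> 1000 users, 100 per step, 90 minute steps, 10 hour test.
test_minutes = 600
schedule = step_schedule(500, 1000, 100, 90, test_minutes)
for start, users in schedule:
    print(f"{start / 60:4.1f} h: {users} users")

# The last entry is the target level only if the test is long enough to reach it.
target_start, target_users = schedule[-1]
print(f"hours at {target_users} users: {(test_minutes - target_start) / 60}")
```

For the numbers above, the last step begins at 450 minutes (7.5 hours), leaving 2.5 hours at 1000 users.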
If your test fails at any step, you can start a new test, and test at that user level, or slightly below it, and begin to narrow down the reason for your test failing.
The only caveat is if your test fails because you run out of resources on your agents.
In one part of the OfficeLive service, we started testing at 500 users. We gradually increased the number of users from 500 up to 2500. Once we got closer to the maximum number of users we thought we could support, we switched to a goal-based load pattern. We set the goal-based pattern to monitor one of our SQL backend machines, and to adjust load until the processors were at 90% utilization. Every time, however, our test would fail shortly after passing 2500 users. The test failed because we were running out of memory on one of our agents.
There are two possibilities if this happens. If there is a particular agent that is causing you problems, you can adjust the load weighting for that agent, so that it gets fewer users assigned to it. If, however, all of your machines are low on resources, you have reached the limit on the number of users you can simulate, and, consequently, the amount of load you can place on your system.
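To get a feel for what lowering an agent's weighting does, here is a rough Python sketch. It assumes users are split across agents in proportion to their weightings; the controller's actual distribution logic may differ, and the agent names and weights below are made up for illustration.

```python
def distribute_users(total_users, weights):
    """Split total_users across agents in proportion to their weightings.

    Illustration only: assumes a simple proportional split, which may not
    match exactly what the test controller does.
    """
    total_weight = sum(weights.values())
    shares = {name: total_users * w // total_weight for name, w in weights.items()}
    # Integer division can leave a few users unassigned; give them to the
    # most heavily weighted agents first.
    leftover = total_users - sum(shares.values())
    for name in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        shares[name] += 1
    return shares


# Hypothetical rig: agent3 is short on memory, so it gets half the weighting.
shares = distribute_users(2500, {"agent1": 100, "agent2": 100, "agent3": 50})
print(shares)
```

With these made-up weights, agent3 ends up with half as many users as each of the other two agents, while the total stays at 2500.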
In our case, all of our agents ran out of memory, so we found that running with more than 2500 users, on our particular hardware, was not possible.
Hopefully, I've shed some light on how to determine the maximum load you can generate, given your specific hardware assets.