I’m going to write about increasing the transaction rate to simulate a higher number of Virtual Users. This is a common technique used by performance testers to ‘cheat’, or rather to avoid the high costs charged by vendors. I recently answered a question on LinkedIn and suggested increasing the transaction rate. I was sent the following message by Jim:
“Saw a reply on here where you suggested increasing no of iterations to increase load… I’m not a big fan of this at all, as it really doesn’t do that, it compresses time instead so for example say I have 200 vusers and they are running 4 iterations per hour each to increase the number of concurrent users I need to double the number of vusers if I double the number of iterations I reduce an 8 hour soak test to 4 hours I know it’s a commonly used technique, but it’s total rubbish, usually when I hear people saying this I just mention the word “sessions” and their head explodes with the thought: huge amounts of what I’ve done for years is wrong!!!”
Being called out and made to think is a good thing.
Now I’m not going to go on the defensive and pick holes. In essence what Jim is saying is correct*, and his response got me thinking. What on earth is the impact of ‘sessions’? What exactly are they? What is the overhead? Session is an umbrella term people use regularly, but I suspect many don’t actually have a firm understanding of it. So I did a little digging.
It’s all about the Session Data
Within the context of HTTP, what exactly is a session? I’m not going to regurgitate formal definitions here, but here’s my attempt at a translation: HTTP is stateless, so information about the connecting client is retained on the server side using a variety of techniques. The client is given a unique ID (the session ID) when the initial connection is made to the server. All associated information for the client (the session data) is linked to this identifier. Now this is where things can get a little murky, because ‘session data’ depends on the implementation within the architecture. So what does this mean in terms of impact? It means that the actual overhead of session IDs is small; what matters is how the data associated with those session IDs is implemented. If a user logs in and has a large amount of associated information, this may put strain on the resources handling that information, which in most cases is likely to be either caches or DBs. This all leads me to the following observations/conjectures:
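Before getting to those observations, here is a minimal sketch of the session idea described above. It is a toy in-memory store, not any particular server’s implementation; the class name `SessionStore` and its methods are hypothetical names chosen for illustration. The point it tries to show is that the ID is a small fixed-size token, while the cost sits in whatever data is attached to it.

```python
import secrets


class SessionStore:
    """Toy in-memory session store (illustrative only): session ID -> session data."""

    def __init__(self):
        self._sessions = {}

    def create(self):
        # The ID is a short random token; its size does not grow with the user's data.
        session_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
        self._sessions[session_id] = {}
        return session_id

    def put(self, session_id, key, value):
        # All per-client state hangs off the session ID.
        self._sessions[session_id][key] = value

    def get(self, session_id, key):
        return self._sessions[session_id].get(key)

    def live_sessions(self):
        # One entry per connected client, regardless of how busy each client is.
        return len(self._sessions)


store = SessionStore()
sid = store.create()
store.put(sid, "basket", ["item-1", "item-2"])
```

Note that `live_sessions()` grows with the number of distinct clients, not with the number of requests each client makes, which is the distinction the rest of this post turns on.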
Advantages of Increasing Transaction Rate, not Virtual Users:
Cost: It’s cheap, and this is the main one. In an ideal world we would flood a system with the actual real-world number of users, but quite often our wallets don’t allow us to.
Transactions: I’ve often been on site and heard “we have 50000 people connected, we need 50000 VUs”. When the question is posed “What’s the transaction rate for key business process X?”, the response is “Oh … about 60/min”. I can immediately see that we do not need 50k users to achieve the customer’s goals. In fact that rate can be hit and exceeded with just 200 virtual users. Sometimes a customer thinks they need concurrency when they actually only need a transaction rate to achieve their goals.
Coverage: Using this method you will eke out and find 90%** of the issues you would have found by increasing the number of users.
Hardware: I need a LOT less hardware to generate the load. Fewer hardware components in a performance test also means a lot less risk during a test.
Soak Testing: Not greatly relevant to this argument, but if you have a long-running soak test, upping the transaction rate can decrease the length of the test. I’ve used this to effectively reduce a 3-day soak test to 8 hours (Link Here).
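The arithmetic behind the transaction-rate and soak-test points can be sketched as follows. This is a rough back-of-envelope helper, not a tool feature: the function names are hypothetical, and it assumes one iteration equals one business transaction and that each iteration finishes well within its pacing interval.

```python
def pacing_seconds(target_per_minute, virtual_users):
    """Seconds between iteration starts, per user, needed to hit the target rate."""
    return virtual_users * 60 / target_per_minute


def iterations_per_user_per_hour(target_per_minute, virtual_users):
    """How many iterations each user must run per hour to hit the target rate."""
    return target_per_minute * 60 / virtual_users


def compressed_duration_hours(original_hours, rate_multiplier):
    """Test length needed to push the same total transaction count at a higher rate."""
    return original_hours / rate_multiplier


# 60 transactions/min spread over 200 virtual users
pacing = pacing_seconds(60, 200)                   # 200.0 seconds between iterations
per_hour = iterations_per_user_per_hour(60, 200)   # 18.0 iterations per user per hour

# A 3-day (72 hour) soak squeezed into 8 hours needs 9x the transaction rate
multiplier = 72 / 8                                # 9.0
soak_hours = compressed_duration_hours(72, multiplier)  # 8.0

# Jim's example: doubling the iteration rate halves an 8 hour soak
jim_hours = compressed_duration_hours(8, 2)        # 4.0
```

With 200 users, each one only has to start an iteration every 200 seconds to reach 60/min, which is why the rate is easy to hit and exceed. The compression figures cover total transaction volume only; they say nothing about the number of live sessions, which is where the caveats in the next section come in.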
When Increasing Transaction Rate will/may not work:
- Sizing: If you are using caches for session data, then a lower number of users with an increased transaction rate will not fill the allocated cache space as effectively (this is implementation dependent). This means cache sizes could turn out to be too small when you go live.
- Thrashing and Time To Live: A large number of users will naturally cause a large breadth of cached data to be exercised. I have tested systems where we have ‘come a cropper’ because, when the system went live, the breadth of data searched by a large number of users caused the cache to thrash the DB. Luckily this was raised as a risk beforehand.
- Non Issues: By increasing transaction rates (and not users) you may cause issues you would not experience on live (contention, deadlocking ..). The upside of this disadvantage is that you build a more robust system by solving these non-issues, or issues yet to be experienced. PMs and the business tend not to be fans of this.
- Stateful: If you have a stateful system where the memory required by a connecting client is created and retained for the life of that client’s connection, then fewer virtual users means less memory held on the server, and the test will under-represent memory usage.
- Large ViewStates: Large viewstate sizes indicate heavy use of session data. This is generally considered a warning sign of poor implementation.
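The sizing and thrashing points above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real cache: it assumes one cached entry per user, a simple LRU eviction policy, and requests spread uniformly at random across users. The function name `cache_hit_ratio` and the numbers used are made up for illustration.

```python
import random
from collections import OrderedDict


def cache_hit_ratio(users, transactions, cache_capacity, seed=1):
    """Replay `transactions` requests spread across `users` distinct session keys
    against an LRU cache and return the fraction served from the cache."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    cache = OrderedDict()       # key order doubles as recency order
    hits = 0
    for _ in range(transactions):
        key = rng.randrange(users)          # which user's session data is needed
        if key in cache:
            hits += 1
            cache.move_to_end(key)          # mark as most recently used
        else:
            cache[key] = True               # miss: data would be loaded from the DB
            if len(cache) > cache_capacity:
                cache.popitem(last=False)   # evict the least recently used entry
    return hits / transactions


# Same transaction count and cache size; only the number of distinct users differs.
few_users = cache_hit_ratio(users=200, transactions=20_000, cache_capacity=1_000)
many_users = cache_hit_ratio(users=50_000, transactions=20_000, cache_capacity=1_000)
```

With 200 users every session fits in the cache, so almost every request is a hit and the DB is barely touched. With 50000 users the same transaction volume misses most of the time, and each miss lands on the DB. A high transaction rate from a small user pool would never have exposed that.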
So when would it be safe to use a High Transaction Approach?
This depends on the implementation of the system you are testing against and your understanding of the underlying architecture. Here is my initial stab at when it would be safe:
- A site is generally static, i.e. doesn’t acquire or retrieve a large amount of session data for connected users. Passive sites such as newspapers or blogs are a good example of this.
- When caching mechanisms are not heavily utilised by middle tiers.
- When a client connecting to a server doesn’t enact a large retrieval of data in order to process associated business flows.
- When scripting calls to atomic web services.
That’s my initial stab at putting a little more meat on the bones of the Transaction Vs User argument. I think it’s a little strong to come down firmly in one camp or another. If you have the ability to scale to the actual number of users then great; if not, you should evaluate, assess and report the risk. If anyone has other concise examples I can use then let me know, I’ll update the article and give credit.
*Strictly correct, but we don’t live in a perfect world and compromises often have to be made. I also think the alternative technique is correct within the context of the question originally posted.
**A completely questionable and unscientific figure based on my admittedly bad memory and past experiences