Proper benchmarking to diagnose and solve a .NET serialization bottleneck
Here's a few comments and disclaimers to start with. First, benchmarks are challenging. They are challenging to measure, but the real issue is that often we forget WHY we are benchmarking something. We'll take a complex multi-machine financial system and suddenly we're hyper-focused on a bunch of serialization code that we're convinced is THE problem. "If I can fix this serialization by writing a 10,000 iteration for loop and getting it down to x milliseconds, it'll be SMOOOOOOTH sailing."
Second, this isn't a benchmarking blog post. Don't point this blog post and say "see! Library X is better than library Y! And .NET is better than Java!" Instead, consider this a cautionary tale, and a series of general guidelines. I'm just using this anecdote to illustrate these points.
- Are you 100% sure what you're measuring?
- Have you run a profiler like the Visual Studio profiler, ANTS, or dotTrace?
- Are you considering warm-up time? Throwing out outliers? Are your results statistically significant?
- Are the libraries you're using optimized for your use case? Are you sure what your use case is?
A bad benchmark
A reader sent me a email recently with concern of serialization in .NET. They had read some very old blog posts from 2009 about perf that included charts and graphs and did some tests of their own. They were seeing serialization times (of tens of thousands of items) over 700ms and sizes nearly 2 megs. The tests included serialization of their typical data structures in both C# and Java across a number of different serialization libraries and techniques. Techniques included their company's custom serialization, .NET binary DataContract serialization, as well as JSON.NET. One serialization format was small (1.8Ms for a large structure) and one was fast (94ms) but there was no clear winner. This reader was at their wit's end and had decided, more or less, that .NET must not be up for the task.
To me, this benchmark didn't smell right. It wasn't clear what was being measured. It wasn't clear if it was being accurately measured, but more specifically, the overarching conclusion of ".NET is slow" wasn't reasonable given the data.
Hm. So .NET can't serialize a few tens of thousands of data items quickly? I know it can.
Related Links: Create benchmarks and results that have value and Responsible benchmarking by @Kellabyte
I am no expert, but I poked around at this code.
First: Are we measuring correctly?
The tests were using DateTime.UtcNow which isn't advisable.
startTime = DateTime.UtcNow;
resultData = TestSerialization(foo);
endTime = DateTime.UtcNow;
Do not use DateTime.Now or DateTime.Utc for measuring things where any kind of precision matters. DateTime doesn't have enough precision and is said to be accurate only to 30ms.
DateTime represents a date and a time. It's not a high-precision timer or Stopwatch.
As Eric Lippert says:
In short, "what time is it?" and "how long did that take?" are completely different questions; don't use a tool designed to answer one question to answer the other.
And as Raymond Chen says:
"Precision is not the same as accuracy. Accuracy is how close you are to the correct answer; precision is how much resolution you have for that answer."
So, we will use a Stopwatch when you need a stopwatch. In fact, before I switch the sample to Stopwatch I was getting numbers in milliseconds like 90,106,103,165,94, and after Stopwatch the results were 99,94,95,95,94. There's much less jitter.
Stopwatch sw = new Stopwatch();
You might also want to pin your process to a single CPU core if you're trying to get an accurate throughput measurement. While it shouldn't matter and Stopwatch is using the Win32 QueryPerformanceCounter internally (the source for the .NET Stopwatch Class is here) there were some issues on old systems when you'd start on one proc and stop on another.
// One Core
var p = Process.GetCurrentProcess();
p.ProcessorAffinity = (IntPtr)1;
If you don't use Stopwatch, look for a simple and well-thought-of benchmarking library.
Second: Doing the math
In the code sample I was given, about 10 lines of code were the thing being measured, and 735 lines were the "harness" to collect and display the data from the benchmark. Perhaps you've seen things like this before? It's fair to say that the benchmark can get lost in the harness.
Have a listen to my recent podcast with Matt Warren on "Performance as a Feature" and consider Matt's performance blog and be sure to pick up Ben Watson's recent Book called "Writing High Performance .NET Code".
Also note that Matt is currently exploring creating a mini-benchmark harness on GitHub. Matt's system is rather promising and would have a [Benchmark] attribute within a test.
Considering using an existing harness for small benchmarks. One is SimpleSpeedTester from Yan Cui. It makes nice tables and does a lot of the tedious work for you. Here's a screenshot I
stole borrowed from Yan's blog.
Something a bit more advanced to explore is HdrHistogram, a library "designed for recoding histograms of value measurements in latency and performance sensitive applications." It's also on GitHub and includes Java, C, and C# implementations.
And seriously. Use a profiler.
Third: Have you run a profiler?
Use the Visual Studio Profiler, or get a trial of the Redgate ANTS Performance Profiler or the JetBrains dotTrace profiler.
Where is our application spending its time? Surprise I think we've all seen people write complex benchmarks and poke at a black box rather than simply running a profiler.
Aside: Are there newer/better/understood ways to solve this?
This is my opinion, but I think it's a decent one and there's numbers to back it up. Some of the .NET serialization code is pretty old, written in 2003 or 2005 and may not be taking advantage of new techniques or knowledge. Plus, it's rather flexible "make it work for everyone" code, as opposed to very narrowly purposed code.
People have different serialization needs. You can't serialize something as XML and expect it to be small and tight. You likely can't serialize a structure as JSON and expect it to be as fast as a packed binary serializer.
Measure your code, consider your requirements, and step back and consider all options.
Fourth: Newer .NET Serializers to Consider
Now that I have a sense of what's happening and how to measure the timing, it was clear these serializers didn't meet this reader's goals. Some of are old, as I mentioned, so what other newer more sophisticated options exist?
There's two really nice specialized serializers to watch. They are Jil from Kevin Montrose, and protobuf-net from Marc Gravell. Both are extraordinary libraries, and protobuf-net's breadth of target framework scope and build system are a joy to behold. There are also other impressive serializers in including support for not only JSON, but also JSV and CSV in ServiceStack.NET.
Protobuf-net - protocol buffers for .NET
Protocol buffers are a data structure format from Google, and protobuf-net is a high performance .NET implementation of protocol buffers. Think if it like XML but smaller and faster. It also can serialize cross language. From their site:
Protocol buffers have many advantages over XML for serializing structured data. Protocol buffers:
- are simpler
- are 3 to 10 times smaller
- are 20 to 100 times faster
- are less ambiguous
- generate data access classes that are easier to use programmatically
It was easy to add. There's lots of options and ways to decorate your data structures but in essence:
var r = ProtoBuf.Serializer.Deserialize<List<DataItem>>(memInStream);
The numbers I got with protobuf-net were exceptional and in this case packed the data tightly and quickly, taking just 49ms.
JIL - Json Serializer for .NET using Sigil
Jil s a Json serializer that is less flexible than Json.net and makes those small sacrifices in the name of raw speed. From their site:
Flexibility and "nice to have" features are explicitly discounted in the pursuit of speed.
It's also worth pointing out that some serializers work over the whole string in memory, while others like Json.NET and DataContractSerializer work over a stream. That means you'll want to consider the size of what you're serializing when choosing a library.
Jil is impressive in a number of ways but particularly in that it dynamically emits a custom serializer (much like the XmlSerializers of old)
Jil is trivial to use. It just worked. I plugged it in to this sample and it took my basic serialization times to 84ms.
result = Jil.JSON.Deserialize<Foo>(jsonData);
Conclusion: There's the thing about benchmarks. It depends.
What are you measuring? Why are you measuring it? Does the technique you're using handle your use case? Are you serializing one large object or thousands of small ones?
James Newton-King made this excellent point to me:
"[There's a] meta-problem around benchmarking. Micro-optimization and caring about performance when it doesn't matter is something devs are guilty of. Documentation, developer productivity, and flexibility are more important than a 100th of a millisecond."
In fact, James pointed out this old (but recently fixed) ASP.NET bug on Twitter. It's a performance bug that is significant, but was totally overshadowed by the time spent on the network.
This bug backs up the idea that many developers care about performance where it doesn't matter https://t.co/LH4WR1nit9— James Newton-King (@JamesNK) February 13, 2015
Thanks to Marc Gravell and James Newton-King for their time helping with this post.
What are your benchmarking tips and tricks? Sound off in the comments!