KUtrace: Where does every nanosecond go in complex latency-sensitive software?

When:
October 20, 2022 @ 7:00 pm – 8:30 pm America/New York Timezone
Where:
Online

Computer Society and GBC/ACM

Richard L. Sites, ex-DEC

Register in advance for this webinar at:

https://acm-org.zoom.us/webinar/register/6316631329041/WN_XKN5R8FUSR6w3DCQk7a3Jg

After registering, you will receive a confirmation email containing information about joining the webinar.

Observation tools for understanding occasionally slow performance in large-scale distributed transaction systems are not keeping up with the complexity of the environment. The same is true for large database systems, real-time control systems, and operating systems themselves.

KUtrace is a low-overhead tracing tool that reveals the true execution and non-execution (waiting) dynamics of such software, running in situ with live traffic. It is based on small FreeBSD or Linux kernel patches that record and timestamp every transition between kernel- and user-mode execution across all CPUs. The resulting postprocessed display shows exactly what each action/response is doing every nanosecond, and hence shows the root cause(s) of unpredictably slow responses, including interference between programs. Tracing overhead is well under 1%.
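
As a rough illustration of why recording every transition is enough to account for all elapsed time, here is a toy Python sketch. It is not KUtrace code and does not use KUtrace's trace format: the (timestamp_ns, cpu, state) tuples, the state names, and the time_per_state function are all hypothetical, chosen only to show that the gaps between consecutive transitions on a CPU add up to the whole traced interval.

```python
from collections import defaultdict


def time_per_state(events, end_ns):
    """Sum the nanoseconds each CPU spends in each state.

    events: iterable of (timestamp_ns, cpu, state) tuples, where each tuple
            marks the moment a CPU *enters* a state (hypothetical format).
    end_ns: timestamp at which the trace stops.
    """
    last = {}  # cpu -> (timestamp of most recent transition, state entered)
    totals = defaultdict(lambda: defaultdict(int))

    for ts, cpu, state in sorted(events):
        if cpu in last:
            prev_ts, prev_state = last[cpu]
            # The time between two transitions belongs to the earlier state.
            totals[cpu][prev_state] += ts - prev_ts
        last[cpu] = (ts, state)

    # Close out the final state on every CPU at the end of the trace.
    for cpu, (prev_ts, prev_state) in last.items():
        totals[cpu][prev_state] += end_ns - prev_ts

    return {cpu: dict(states) for cpu, states in totals.items()}


# Made-up transitions for two CPUs over a 1000 ns window.
events = [
    (0, 0, "user"),
    (400, 0, "kernel"),
    (550, 0, "user"),
    (900, 0, "idle"),
    (0, 1, "idle"),
    (700, 1, "user"),
]

result = time_per_state(events, end_ns=1000)
for cpu in sorted(result):
    print(cpu, result[cpu])
```

Because every transition is captured, the per-state totals for each CPU sum to the full traced interval, with no unattributed time left over.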

Even without a similar observation tool for GPU execution, CPU-only tracing shows GPU delays and CPU-GPU interaction delays. The net result is deep insight into the dynamics of complex software, leading to often-simple changes to improve performance.

Dr. Richard L. Sites wrote his first computer program in 1959 and has spent most of his career at the boundary between hardware and software, with a particular interest in CPU/software performance. His past work includes writing VAX microcode, co-architecting the DEC Alpha, and inventing the performance counters found in nearly all processors today. He has done low-overhead microcode and software tracing at DEC, Adobe, Google, and Tesla. Dr. Sites earned his PhD at Stanford in 1974; he holds 66 patents and is a member of the National Academy of Engineering. His book Understanding Software Dynamics was published by Addison-Wesley in late 2021.

This joint meeting of the Boston Chapter of the IEEE Computer Society and GBC/ACM will be online only due to the COVID-19 lockdown.

Up-to-date information about this and other talks is available online at https://ewh.ieee.org/r1/boston/computer/.

You can sign up to receive updated status information about this talk and informational emails about future talks at https://mailman.mit.edu/mailman/listinfo/ieee-cs, our self-administered mailing list.