An interview with RTP author Ron Frederick: how the RTP protocol was created and his work at Xerox PARC

LiveVideoStack · October 14, 2021

Recently, LiveVideoStack conducted an email interview with Ron Frederick, one of the authors of the Real-time Transport Protocol (RTP), a network protocol for delivering audio and video over IP networks.



(Photo provided by Ron Frederick)


In this enlightening interview, Ron discusses how he created RTP with the three other authors, his nv tool, and the work culture of Xerox PARC. He also shares his views on WebRTC and QUIC.


To people who are pursuing a career in computer engineering, Ron's main advice is: don't be afraid to get your hands dirty. By this he means that people should find real-world problems to solve and actually write code that solves them.


This interview also sheds light on the pioneering work done by computer scientists decades ago, and what the Internet was like at that time.


The following is our conversation with Ron Frederick.


LiveVideoStack: Could you give us a brief introduction about yourself?


Ron Frederick: Sure. My name is Ron Frederick and I currently live in Mountain View, CA in the heart of Silicon Valley. I have a bachelor’s degree in Computer and Systems Engineering from Rensselaer Polytechnic Institute (RPI) and a master’s degree in Computer Science from Stanford University. I currently work in computer security for Broadcom Software, making products that keep people and businesses safe on the Internet.


LiveVideoStack: How did you get interested in computers and when did you decide to pursue a career in this field?


Ron Frederick: From a very young age, I always had a love for mathematics. When Radio Shack introduced their TRS-80 microcomputer in 1977 (when I was 9 years old), I quickly became fascinated by it, and I spent hours in the Radio Shack store in the mall near my home programming the demo model they had on display. A couple of years later, my parents bought me one of these machines, and not long after that at age 12 I landed my first paying job setting up and running a computer system for a local business. This led to part-time consulting jobs working for other local companies throughout my time in high school, and I knew that working with computers in some form was what I wanted to do.


LiveVideoStack: Why did you choose Xerox PARC as your first company after graduating from Stanford?


Ron Frederick: I actually worked for Xerox PARC as a summer intern the first summer after coming to Stanford for grad school. My advisor at Stanford recommended that I apply, and it was a terrific experience. Even after returning to school in the fall, I continued to work for Xerox part-time for the next year or so, and I really saw it as a dream job. I was originally in the PhD program at Stanford, but I had the opportunity to leave Stanford with a Master’s degree and take a full-time position at PARC and decided to go for it, as it really was everything I could have asked for in a job.


LiveVideoStack: Xerox PARC is truly a great company, with numerous revolutionary inventions that have entirely changed people's daily lives. You were there from 1992 to 2000. So, what was it like working at Xerox PARC?


Ron Frederick: It really was an amazing place! One of PARC's fundamental ideas was for its researchers to "live in the future". By that they meant that we would build products that could not yet be manufactured and sold at a reasonable cost, or packaged in a comfortable form factor, with present-day technology; but the prototypes we built would allow us to figure out what worked and what didn't in terms of how the products could be used, and what new kinds of interaction they could enable.



Photo credit: Scientific American, September 1991


Using this approach, we managed to be about 10-15 years ahead of our time. For example, we had portable networked handheld devices (the PARCTab) and tablets (the PARCPad) with graphical touch screens back in 1992, 15 years before Apple’s iPhone and iPod Touch were introduced in 2007 and 18 years before the iPad in 2010. Of course, the processing power and network speeds back then were nowhere near what was possible by the time the iPod Touch came out, but the basic concept of a “home screen” with icons to launch different applications and many other interactions like audio/video conferencing and shared drawing tools were all there more than a decade before a consumer version of these products came out.


LiveVideoStack: Looking back on those days at Xerox PARC, who impressed and inspired you most?


Ron Frederick: There were many amazing people I met at PARC during my time there, but I think the person who impressed me most was Mark Weiser, who was the manager of PARC's Computer Science Laboratory (CSL) when I first got there and later became PARC's Chief Technology Officer in 1996. Unfortunately, Mark passed away in 1999 before he got to see some of the technology he helped to create be fully realized on a consumer scale, but there's no question in my mind that many of those products would not have been created if it weren't for Mark's early work and leadership in this area. In 1988, he coined the term "Ubiquitous Computing" to describe a world where dedicated PCs were replaced with computers that existed all around us, receding into the background, and that's exactly what we're seeing now.


Here’s a nice article on some of Mark’s work:

https://www.lri.fr/~mbl/Stanford/CS477/papers/Weiser-SciAm.pdf 


Here is a video about Ubiquitous Computing:

https://www.timeheart.net/ubiquitous_computing_demo.html 


LiveVideoStack: Recently our government ruled that the "996 work culture" (a practice where people work from 9am to 9pm, six days a week) is illegal. Did you have to work long hours at Xerox PARC? What was Xerox's work culture like?


Ron Frederick: Xerox really didn’t mandate any specific hours that people had to work. In fact, many people came into work later in the morning to avoid rush hour traffic. They often stayed much later in the evening as well, though, because they really loved what they did there. People were generally rewarded for their original ideas and ability to push technology in new and interesting directions, and not specifically how many hours they worked to achieve that.


LiveVideoStack: Among all the work you have done, which part do you find most satisfying?


Ron Frederick: If I had to pick one specific project that I’m most proud of, it would probably be my “nv” (Network Video) tool, which was one of the first pieces of software to allow people to send and receive video over the Internet. Xerox PARC allowed me to release this code as open source, and it was downloaded and used by thousands of people from all over the world. It was used by NASA to do live broadcasts of Space Shuttle missions on the Internet for many years, and also used to transmit video of the Internet Engineering Task Force (IETF) meetings where the RTP standard was being developed so those who couldn’t attend in person could still track the work.


Last year, I re-published the original “nv” code on GitHub at https://github.com/ronf/nv for anyone who wants to look it over.


You can find more info about nv in this audio interview:

https://town.hall.org/radio/Geek/060394_geek_ITR.html 


LiveVideoStack: Is there anything you’d like to say to encourage the younger generation who are pursuing a career in computer engineering?


Ron Frederick: I think my main advice to anyone studying this area is to not be afraid to get their hands dirty, meaning that they should find real-world problems to solve and actually write code which solves them. Academic study is valuable for teaching you the fundamental principles you can use to build things, but in my experience the bulk of what you learn comes from actually writing code, especially code that does something real and isn't just some class project or homework.


LiveVideoStack: RTP has four authors: Van Jacobson, Steve Casner, Henning Schulzrinne, and you. How was this cooperation achieved? Was there anything that left a deep impression on you while working on RTP?


Ron Frederick: We became the authors of this specification because we all had built tools that involved sending some kind of real-time data over the Internet, ranging from audio and video streams to a real-time “shared whiteboard” application that let people at different locations collaboratively draw on a virtual whiteboard that stayed in sync across all the different locations. The idea was to figure out what the common elements were for these different streams, and standardize those across the tools, to make it easier to do things like synchronize streams with one another or report on the quality of the stream (collecting statistics on packet losses and delays). We’d each take passes at writing or editing different sections of the doc, and then periodically pull it all together and publish new versions as an “Internet Draft” for a larger working group of interested people to provide feedback on.
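
As an illustration of what those common elements look like on the wire, here is a minimal Python sketch (not any of the original tools' code) of the RFC 3550 fixed header: the sequence number, timestamp, and SSRC identifier are the fields every RTP stream carries, and they are what make loss reporting and cross-stream synchronization possible.

```python
import struct

def build_rtp_header(seq, timestamp, ssrc, payload_type, marker=False):
    """Pack the 12-byte RTP fixed header (RFC 3550; no CSRCs, no extension)."""
    byte0 = 2 << 6                      # version=2, padding=0, extension=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack('!BBHII', byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

def parse_rtp_header(data):
    """Unpack the fixed header into the fields common to all RTP streams."""
    byte0, byte1, seq, ts, ssrc = struct.unpack('!BBHII', data[:12])
    return {
        'version': byte0 >> 6,
        'marker': bool(byte1 >> 7),
        'payload_type': byte1 & 0x7F,
        'sequence': seq,    # lets receivers detect loss and reordering
        'timestamp': ts,    # media clock, used for playout and sync
        'ssrc': ssrc,       # identifies the stream's source
    }

hdr = build_rtp_header(seq=1, timestamp=48000, ssrc=0x12345678, payload_type=96)
print(parse_rtp_header(hdr))
```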


For me, this was really the first time I worked with a standards body, so I learned a lot from the other authors who already had much more experience with this kind of work. Since I actually had developed one of the tools that fed into the effort, though, they were very welcoming to my input, and I got a chance to make significant contributions to the resulting RFCs.


LiveVideoStack: What was the most challenging part of creating RTP?


Ron Frederick: I think the most difficult thing was probably trying to figure out how to build something general enough to be a good fit for all of the different types of real-time communication we wanted RTP to support. Supporting both audio and video was already a challenge, given the way that video data tends to be more "bursty", often producing several packets that all have the same timestamp, where audio generally tends to be much smoother in that regard. When you then try to add in other types of data sharing, such as the shared whiteboard tool, this becomes even more difficult. With the whiteboard, you generally want some form of retransmission to allow for eventual consistency in what everyone is seeing, whereas with audio and video you may do some limited retransmission or use other techniques like forward error correction, but at some point you just give up and drop part of the audio or video if all of the data doesn't arrive in time.
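
To make the "bursty video" point concrete, here is a hypothetical Python sketch of packetizing one encoded video frame. Every packet of the frame carries the same RTP timestamp (the frame's capture time) while the sequence number advances, and, as in many RTP video profiles, the marker bit flags the last packet of the frame; the 1200-byte payload budget and payload type 96 are illustrative assumptions.

```python
import struct

MTU_PAYLOAD = 1200  # assumed per-packet payload budget, not a standard value

def packetize_frame(frame, first_seq, timestamp, ssrc, payload_type=96):
    """Split one encoded video frame into a burst of RTP packets."""
    chunks = [frame[i:i + MTU_PAYLOAD]
              for i in range(0, len(frame), MTU_PAYLOAD)]
    packets = []
    for i, chunk in enumerate(chunks):
        marker = (i == len(chunks) - 1)      # last packet of the frame
        byte0 = 2 << 6                       # version=2
        byte1 = (int(marker) << 7) | payload_type
        header = struct.pack('!BBHII', byte0, byte1,
                             (first_seq + i) & 0xFFFF, timestamp, ssrc)
        packets.append(header + chunk)
    return packets

frame = bytes(5000)  # a dummy 5 KB encoded frame
burst = packetize_frame(frame, first_seq=100, timestamp=90000, ssrc=0xCAFEF00D)
print(len(burst), 'packets, all with timestamp 90000')  # the bursty pattern
```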


LiveVideoStack: As the fundamental protocol underlying WebRTC, RTP/RTCP is used today on most personal computers around the world. Did you expect this while you were working on it?


Ron Frederick: I was definitely pleased to see that WebRTC took an approach of trying to leverage existing standards, rather than re-inventing everything itself from scratch. When developing RTP, it was definitely our intention to support a wide variety of use cases and what WebRTC was trying to do was very much within that scope.


While WebRTC didn’t come along as a standard until much later, I actually did some work at PARC in 1996 getting audio and video streaming in a web browser using the NPAPI “plugin” mechanism. These streams were based on RTP/RTCP, leveraging the code I wrote for “nv” and some earlier code I had written which did network audio streaming, and Xerox eventually spun out this work in a company called “Placeware” that was later acquired by Microsoft.


LiveVideoStack: More and more RTP extensions are now being implemented in WebRTC. What do you think of that? Are more and more extensions good for RTP? Did you take this into consideration when you were developing RTP?


Ron Frederick: I must admit I haven't kept up with all of the details of the WebRTC effort, so I'm not familiar with what specific RTP extensions WebRTC is proposing. The RTP standard was designed with some support for extensions in mind, but we wanted to strike a balance between that and efficient packet processing. As such, the extension support was mainly focused on adding extensions to the RTCP control packets, and not necessarily providing a large amount of extension support to RTP data packets. RTP does allow application profiles to be defined which extend the RTP header, but any given RTP data packet is typically allowed only a single RTP header extension, and the interpretation of that extension must be defined by the application profile that RTP is operating within. This provides a way to carry additional data when needed, but avoids code that has to walk over multiple extensions to find the beginning of the actual real-time payload data.
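
That layout is visible in RFC 3550 itself: when the X bit in the fixed header is set, exactly one extension follows the header and any CSRC list, consisting of a 16-bit profile-defined identifier, a 16-bit length counted in 32-bit words, and the extension data. A rough Python sketch of how a receiver skips over it to reach the payload:

```python
import struct

def split_extension(packet):
    """Return ((profile_id, ext_data) or None, payload) per RFC 3550 5.3.1."""
    byte0 = packet[0]
    has_ext = bool((byte0 >> 4) & 1)    # the X bit
    cc = byte0 & 0x0F                   # number of CSRC entries
    offset = 12 + 4 * cc                # fixed header plus CSRC list
    if not has_ext:
        return None, packet[offset:]
    profile_id, length = struct.unpack_from('!HH', packet, offset)
    ext_end = offset + 4 + 4 * length
    return (profile_id, packet[offset + 4:ext_end]), packet[ext_end:]

# A packet with X=1, no CSRCs, one 4-byte extension, then the payload.
ext = struct.pack('!HH', 0x1000, 1) + b'\x01\x02\x03\x04'
pkt = struct.pack('!BBHII', 0x90, 96, 1, 1000, 0xABCD) + ext + b'media'
print(split_extension(pkt))   # ((4096, b'\x01\x02\x03\x04'), b'media')
```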


LiveVideoStack: QUIC and HTTP/3 are getting more and more popular, and some people even say that QUIC is the future of WebRTC. What do you think about the idea of RTP over QUIC? Is it a good one?


Ron Frederick: QUIC is a very interesting protocol that provides a number of benefits over TCP, particularly when it comes to real-time data, since it makes it possible to process packets out of order. While HTTP/2 added the ability to multiplex several streams over a single TCP connection, the use of TCP forced the data to always be processed in order, meaning a packet loss on one multiplexed stream would block processing of data in all the other streams. QUIC has the potential to solve that problem, and could even be evolved to support different retransmission strategies for different streams, which could be very beneficial for audio or video, where packets arriving after a certain amount of delay are no longer useful. I look forward to seeing how this work evolves!
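
A toy simulation illustrates the head-of-line blocking difference. In this sketch (illustrative only; it implements neither real TCP nor real QUIC), a late packet on stream A delays stream B when everything shares one ordered byte stream, but not when each stream is ordered independently:

```python
# Each tuple: (transport send order, stream id, payload).
# The packet with send order 1 (stream A) is lost and retransmitted last.
arrivals = [
    (0, 'A', 'a0'), (2, 'B', 'b0'), (3, 'B', 'b1'),
    (4, 'A', 'a2'), (1, 'A', 'a1'),   # the retransmission arrives late
]

def tcp_like(arrivals):
    """HTTP/2-over-TCP model: one ordered stream, so a gap stalls everyone."""
    out, buf, nxt = [], {}, 0
    for order, stream, data in arrivals:
        buf[order] = (stream, data)
        while nxt in buf:                 # release only contiguous data
            out.append(buf.pop(nxt))
            nxt += 1
    return out

def quic_like(arrivals):
    """QUIC model: each stream is ordered independently of the others."""
    out, buf, nxt = [], {}, {}
    for order, stream, data in arrivals:
        idx = int(data[1:])               # payload names encode stream order
        buf.setdefault(stream, {})[idx] = data
        n = nxt.setdefault(stream, 0)
        while n in buf[stream]:
            out.append((stream, buf[stream].pop(n)))
            n += 1
        nxt[stream] = n
    return out

print(tcp_like(arrivals))    # b0 and b1 are held back until a1 arrives
print(quic_like(arrivals))   # stream B delivers immediately
```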


LiveVideoStack: What’s the biggest difference between the Internet today and the Internet in your time? As you see it, what could be the next incredible innovations on the Internet?


Ron Frederick: The most obvious difference between then and now is the amount of bandwidth available to end users, as well as the amount of computing power in the devices they are using to connect to the Internet. When I began working on carrying audio and video over the Internet, I had ISDN connectivity at home which gave me a whopping 112 kbps of throughput, with most people still limited to dial-up running at 28.8 kbps or slower. Today, a typical home Internet link is over 1,000 times faster than that, and there's an even bigger jump in the amount of processing power available.


Another big difference is the ubiquitous presence of WiFi and cellular connectivity, allowing mobile devices to stay permanently connected. That's something we actually explored at PARC using a mixture of infrared and near-field radio technologies we developed ourselves (all of this was before WiFi was invented), but it was still limited to working only in our building, where people now take it for granted that they can access the Internet from almost anywhere!


Sadly, the one thing that was developed which never made it into the modern Internet (at least not over the wide area) was IP multicast, allowing a stream of data to be sent very efficiently to multiple targets at once, without having to put multiple copies of that data onto any given Internet link. The idea was that the routers between you and the various targets would cooperate to build a tree that efficiently delivered the packet to all of the target machines, but only ever putting a single copy of any packet on any given link, no matter how many targets would eventually receive it. Copies were made only as the paths diverged from one another, and even then only the paths with at least one receiver of the data actually ended up getting a copy of those packets. As we start to do more and more multi-party video conferencing, I think not having this capability on the modern Internet is going to hurt us, but the bandwidth has increased enough that we’ll probably just pay the cost of sending multiple copies of the data.
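
For reference, the IP multicast machinery Ron describes still exists at the network edge, even though wide-area multicast routing never took hold; every sockets API can join a group and receive a single transmitted copy. A minimal Python sketch, where the group address and port are arbitrary illustrative choices:

```python
import socket
import struct

GROUP, PORT = '239.1.2.3', 5004   # administratively scoped group (illustrative)

def send(message: bytes):
    """Send one datagram; the network, not the sender, fans it out per link."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    sock.sendto(message, (GROUP, PORT))

def receive():
    """Join the group and block until a datagram sent to it arrives."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(('', PORT))
    mreq = struct.pack('4s4s', socket.inet_aton(GROUP),
                       socket.inet_aton('0.0.0.0'))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock.recvfrom(1500)
```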


I’m not really sure what the next big innovation will be, but as more and more computing power fits into these devices, I think we’ll probably see more “local” processing, such as voice assistants like Siri that can work even without sending your data to the cloud first, and that will be a good thing for end-user privacy. I also think that we might see more done with augmented reality, particularly if we ever get to a point where a heads-up display built into eyeglasses (or contact lenses!) becomes lightweight enough to be comfortable.


LiveVideoStack: The coronavirus struck the whole world in 2020 and people’s lives have changed greatly. At the same time, video technology is advancing rapidly as people have to stay at home and communicate by video. Do you think this is an opportunity for video technology innovation and growth?


Ron Frederick: There’s no doubt in my mind that the pandemic greatly accelerated the use of videoconferencing and presentation tools like WebEx and Zoom. It also probably accelerated the use of the Internet to stream entertainment, with people stuck at home much more than they otherwise would have been. In some cases, I think we may see this form of remote work remain an option at many employers even after the pandemic is under control, now that they’ve seen that people can be productive this way. This could even have an impact on commercial real estate, with some companies opting not to have offices for some employees at all.


LiveVideoStack: How did the pandemic change your life and work? What are you doing these days?


Ron Frederick: I’ve been lucky enough to stay safe during the pandemic, and to not have my job impacted by it. As at many companies, employees of my company spent a number of months working from home when the early lockdowns began, later moved to working 50% of the time in the office, and are now actually back to being 100% in the office (with many safety measures in place).


I was fortunate that the work itself (in Internet security as I mentioned above) remained important, and I was also able to take advantage of many of the technologies we’re discussing here to stay productive through all of this.


In my spare time, I’m also working on a few open-source projects these days. The first is an SSH client/server built on top of the Python asyncio framework, called AsyncSSH. You can find more information about that at https://asyncssh.readthedocs.io/en/latest/ or on GitHub at https://github.com/ronf/asyncssh.
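
For the curious, a complete AsyncSSH client fits in a few lines; this sketch follows the pattern in the project's documentation, with the host and command as placeholders:

```python
import asyncio
import asyncssh

async def main():
    # Connect, run one command, and print its output.
    async with asyncssh.connect('localhost') as conn:
        result = await conn.run('echo "Hello, world!"', check=True)
        print(result.stdout, end='')

asyncio.run(main())
```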


Also, to learn a bit more about modern web standards and have some fun resurrecting an old game I wrote back in 1992 while I was at Xerox PARC, I created a browser-based version of the classic “Spacewar!” last year, designed to use WebSockets and WebRTC. You can find more info about that at https://github.com/ronf/webrtc-spacewar. The original version was intended to show how to build a game with no central server, by having the clients use IP multicast to share their game information. This version carries forward that basic concept, with all the game logic in the clients and a server used only to download the game and let the clients discover one another. The clients will attempt to use WebRTC to connect to each other directly if they can, falling back to a simple WebSocket relay for that communication if WebRTC fails.


LiveVideoStack: Last but not least, if you are given a chance to have a conversation with a computer scientist, who do you want to talk to most? What would you like to talk about?


Ron Frederick: I think I would choose Alan Kay, who was at Xerox PARC but left before my time there. I actually have a T-shirt that I got later, while working at Symantec, with a quote from him that I really like: “The best way to predict the future is to invent it.” Among other things, he conceived of the “Dynabook” in 1968, a precursor to tablet computers before any kind of personal computer even existed! He is also considered the architect of the modern graphical user interface and the inventor of object-oriented programming. I would love to hear what he thinks of modern mobile devices and how they match up against his original vision. What did we get right, and where are we still lacking even now?


Editor: Alex Li

