Okay, let's start. My name is Ricardo and, together with Corentin, I will be talking about the future of 3D graphics on the web. But before we do that, let's have a quick look at the past and the present. WebGL landed in browsers in February 2011.
We knew the workflow and tools were very different compared to traditional web development, so we also made the project open source, so others could use it as a reference. Some years later, Internet Explorer, Edge and Safari implemented WebGL too, which means that today the same experience works in all major browsers, on desktops, tablets and phones. What I find most remarkable is the fact that we don't have to modify the code for that to happen.
WebGL is a low-level API, which means that it is very powerful, but it's also very verbose. For example, a graphics card's main primitive is the triangle; everything is done with triangles. Here's the code that we need to write in order to display just one triangle. First, we need to create a canvas element. Then we get the WebGL context for that canvas, and then things get pretty complicated pretty fast: after defining positions for each vertex, we have to add them to a buffer and send it to the GPU, then link the vertex and fragment shaders and compile a program that will be used by the graphics card to know how to fill those pixels. That's why a bunch of us back then started creating libraries and frameworks that abstract all that complexity, so developers, and ourselves, could stay productive and focused. Those libraries take care of placing objects in 3D space, configuring materials, and loading 2D and 3D assets.
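The slide code itself is not captured here, but the steps just described look roughly like this in JavaScript (the shader contents and helper structure are illustrative, not the exact slide code):

```javascript
// Minimal WebGL triangle, following the steps above.
// Shader sources and the compile() helper are illustrative sketches.
const vertexShaderSource = `
  attribute vec2 position;
  void main() {
    gl_Position = vec4(position, 0.0, 1.0);
  }
`;
const fragmentShaderSource = `
  void main() {
    gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0); // solid red
  }
`;

function drawTriangle() {
  // 1. Create a canvas element and get its WebGL context.
  const canvas = document.createElement('canvas');
  document.body.appendChild(canvas);
  const gl = canvas.getContext('webgl');

  // 2. Define one 2D position per vertex and upload them to a GPU buffer.
  const positions = new Float32Array([0, 0.5, -0.5, -0.5, 0.5, -0.5]);
  const buffer = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, buffer);
  gl.bufferData(gl.ARRAY_BUFFER, positions, gl.STATIC_DRAW);

  // 3. Compile the vertex and fragment shaders and link them into a
  //    program that tells the graphics card how to fill the pixels.
  function compile(type, source) {
    const shader = gl.createShader(type);
    gl.shaderSource(shader, source);
    gl.compileShader(shader);
    return shader;
  }
  const program = gl.createProgram();
  gl.attachShader(program, compile(gl.VERTEX_SHADER, vertexShaderSource));
  gl.attachShader(program, compile(gl.FRAGMENT_SHADER, fragmentShaderSource));
  gl.linkProgram(program);
  gl.useProgram(program);

  // 4. Point the "position" attribute at the buffer and draw.
  const loc = gl.getAttribLocation(program, 'position');
  gl.enableVertexAttribArray(loc);
  gl.vertexAttribPointer(loc, 2, gl.FLOAT, false, 0, 0);
  gl.drawArrays(gl.TRIANGLES, 0, 3);
}
```

All of that ceremony produces one single triangle, which is exactly the point being made here.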
Interaction, sounds and so on: anything you need for building any sort of game or application. Designing those libraries takes time, but over the years people have been doing amazing projects with them. So let's have a look at what people are doing today. People are still doing interactive music videos, which is good. In fact, this example, Track by Little Workshop, not only works on desktop and mobile, it also works on VR devices, letting you look around while hopping through glowing tunnels. Another clear use of the technology is gaming.
Home is a beautiful game developed by a surprisingly small team and was released in last year's Christmas Experiments. Another use is web experiences. In this case, Oat the Goat is an interactive animated storybook designed to teach children about bullying. The folks at Assembly used Maya to model, rig and animate the characters, then exported them to glTF via Blender. For rendering they used three.js, and they wrote about 13,000 lines of TypeScript to make the whole thing work. Yet another very common use is product configurators.
The folks at Little Workshop, again, show how good this can look in this demo. But those cases are not all that people are doing: there are data visualizations, enhanced newspaper articles, virtual tours, documentaries, movie promotions and more. You can check the three.js website and the Babylon.js website to see more of those examples.
We don't want to end up in a world where the only HTML elements on your page are a canvas tag and a script tag. Instead, we must find ways of combining WebGL and HTML. The good news is that lately we have been seeing more and more projects and examples of web designers utilizing bits of WebGL to enhance their HTML pages. Here's a site that welcomes the user with a beautiful immersive image in the background; we're able to interact with the 3D scene by moving the mouse around the image. For underpowered devices, we can just replace that WebGL scene with a static image, and the website is still functional. Another interesting trend we have been seeing is websites that use distortion effects. The website for the Japanese director Terajima has a very impressive use of them. However, the content is actually plain and selectable HTML, so it is surprising because, as you know, we cannot do these kinds of effects with CSS.
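As a sketch of that fallback logic (the helper name is made up, and the document object is passed in only to make the check easy to exercise outside a browser):

```javascript
// Decide whether to show the interactive WebGL scene or a static fallback
// image. "doc" is any object with createElement(), e.g. the real document.
function supportsWebGL(doc) {
  try {
    const canvas = doc.createElement('canvas');
    // Older browsers exposed the context as "experimental-webgl".
    return !!(canvas.getContext('webgl') ||
              canvas.getContext('experimental-webgl'));
  } catch (e) {
    return false;
  }
}

// In the page you would then pick the renderer:
// if (supportsWebGL(document)) { startWebGLBackground(); }
// else { showStaticImage(); }  // underpowered device: site stays functional
```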
So if we look at it again, what I believe they are doing is copying the pixels of the DOM elements into the background WebGL canvas. Then they hide the DOM element and apply the distortion; when the transition finishes, they put the next DOM element on top. So it's still something that you can enable or disable depending on the device (it also works on mobile), something that you can progressively enhance. One more example: this site applies a distortion effect on top of the HTML, basically making the layout feel truly fluid. Then again, this is surprising because it would not be possible with CSS alone.
However, I think that we can still make this even simpler. Enter Web Components. I believe Web Components will allow us to finally bring all the power of WebGL right into the HTML layer. We can now encapsulate all those effects in composable custom elements and hide all the code complexity. So, for example, here's another project that we did for the WebGL launch eight years ago; it was kind of a globe platform.
This custom element is ready to display any 3D model using the glTF open standard. An important feature of HTML tags is accessibility. For low-vision and blind users, we're trying to communicate both what the 3D model is and also the orientation of the model. Here you can see that the view angle is being communicated verbally to the user, so they can stay oriented with what's going on; it also prompts them with how to control the model with the keyboard, and there is an easy exit back to the rest of the page. model-viewer also supports AR, augmented reality, and here you can see how it's already being used on the NASA website. You use it by adding the ar attribute.
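The element being demonstrated here appears to be the `<model-viewer>` custom element; a hypothetical usage could look like this (the file names and alt text are made up for illustration):

```html
<!-- Hypothetical usage. src points at a glTF/GLB model, alt is the
     description read to screen-reader users, ar adds the icon that
     launches the AR viewer (iOS additionally needs the USDZ file),
     and camera-controls enables mouse/keyboard orbiting. -->
<model-viewer
  src="rover.glb"
  ios-src="rover.usdz"
  alt="A 3D model of a rover that can be rotated with the keyboard"
  ar
  camera-controls>
</model-viewer>
```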
It shows an icon and is able to launch the AR viewer on both Android and iOS; for iOS, you have to include a USDZ file. Lastly, while building the component, we realized that, depending on the device, you can only have up to 8 WebGL contexts at once, so if you create a new one, the first one disappears. It is actually a well-known limitation of WebGL, but it's also good practice
to have only one context, keeping memory in one place. The best solution we found for this was creating a single WebGL context that is off-screen, so it's hidden, and then we use that one to render all the model-viewer elements on the page. We also utilize the IntersectionObserver to make sure that we are not rendering objects that are not in view, and the ResizeObserver too, to detect whether the developer is modifying the size, in which case we re-render. But we all know how the web is: sooner or later someone will want to display hundreds of those components at once, and that is great.
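A sketch of the bookkeeping this implies, in plain JavaScript; the class and method names are made up, and the browser observers are only shown wired up in comments:

```javascript
// Track which model elements are on screen so the single shared
// off-screen WebGL context only renders the visible ones.
// Class and method names are illustrative, not the real implementation.
class VisibleSet {
  constructor() {
    this.visible = new Set();
  }

  // Called with IntersectionObserver-style entries:
  // objects shaped like { target, isIntersecting }.
  handleEntries(entries) {
    for (const entry of entries) {
      if (entry.isIntersecting) this.visible.add(entry.target);
      else this.visible.delete(entry.target);
    }
  }

  // The render loop draws only these into the hidden shared context,
  // then copies each result into the element's on-screen canvas.
  targetsToRender() {
    return [...this.visible];
  }
}

// Browser wiring (sketch):
// const set = new VisibleSet();
// const io = new IntersectionObserver(entries => set.handleEntries(entries));
// document.querySelectorAll('model-viewer').forEach(el => io.observe(el));
```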
We want to allow for that, but for that we'll need to make sure that the underlying API is as efficient as possible. So now Corentin is going to share with us what's coming up in the future. Thank you. Thank you, Ricardo. That was an amazing display of what's possible on the web using GPUs today. Now I'll give a sneak peek of what's coming up next, in the future, where you'll be able to extract even more computational power from GPUs on the web. Hi everyone, I'm Corentin Wallez, and for the last two years at Google I've been working on an emerging web standard called WebGPU, in collaboration with all the major browsers at the W3C. WebGPU is a new API that's the successor to WebGL, and it will unlock the potential of GPUs on the web.
Now you might be asking: Corentin, we already have WebGL, so why are you making a new API? The high-level reason is that WebGL is based on an understanding of GPUs as they were 12 years ago, and in 12 years GPU hardware has evolved, but also the way we use GPU hardware has evolved. There is a new generation of native GPU APIs, for example Vulkan, that help do more with GPUs, and WebGPU is built to close the gap with what's possible in native today. So it will improve what's possible on the web for game developers, but not only them: it will also improve what you can do in visualization, in heavy design applications, for machine learning practitioners and much more. For the rest of the session, I'll be going through specific advantages, things that WebGPU improves over WebGL, and show how they will help build better experiences.
First, WebGPU is still a low-level and verbose API, so that you can tailor the usage of the GPU to exactly what your application needs. This is the triangle Ricardo just showed and, as a reminder, this was the code to render that triangle in WebGL. Now, this is the minimum WebGPU code to render the same triangle. As we can see, the complexity is similar to WebGL, but you don't need to worry about it, because if you're using a framework like three.js or Babylon.js, you'll get the benefits transparently, for free, when the framework updates to support WebGPU.
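The slide's WebGPU code is not in the transcript, and the API has evolved since this talk was given; as a rough sketch, a minimal WebGPU triangle using the API shape that later stabilized (WGSL shaders and today's method names) looks like this:

```javascript
// WGSL shaders: a hard-coded triangle in the vertex stage, solid red in
// the fragment stage. Illustrative only, not the code from the slide.
const shaderSource = `
@vertex
fn vs(@builtin(vertex_index) i : u32) -> @builtin(position) vec4f {
  var pos = array<vec2f, 3>(
    vec2f(0.0, 0.5), vec2f(-0.5, -0.5), vec2f(0.5, -0.5));
  return vec4f(pos[i], 0.0, 1.0);
}
@fragment
fn fs() -> @location(0) vec4f {
  return vec4f(1.0, 0.0, 0.0, 1.0);
}
`;

async function drawTriangleWebGPU(canvas) {
  // Get the GPU device: adapter first, then a device from it.
  const adapter = await navigator.gpu.requestAdapter();
  const device = await adapter.requestDevice();

  // Configure the canvas for WebGPU presentation.
  const context = canvas.getContext('webgpu');
  const format = navigator.gpu.getPreferredCanvasFormat();
  context.configure({ device, format });

  // Compile the shaders and build a render pipeline.
  const module = device.createShaderModule({ code: shaderSource });
  const pipeline = device.createRenderPipeline({
    layout: 'auto',
    vertex: { module, entryPoint: 'vs' },
    fragment: { module, entryPoint: 'fs', targets: [{ format }] },
  });

  // Record and submit the drawing commands.
  const encoder = device.createCommandEncoder();
  const pass = encoder.beginRenderPass({
    colorAttachments: [{
      view: context.getCurrentTexture().createView(),
      clearValue: { r: 0, g: 0, b: 0, a: 1 },
      loadOp: 'clear',
      storeOp: 'store',
    }],
  });
  pass.setPipeline(pipeline);
  pass.draw(3); // three vertices, one triangle
  pass.end();
  device.queue.submit([encoder.finish()]);
}
```

The step count is comparable to the WebGL version, which is the point: the verbosity stays, but the framework absorbs it for you.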
The first limitation that WebGL frameworks run into is the number of elements, or objects, they can draw each frame, because each drawing command has a fixed cost and needs to be issued individually, each frame. With WebGL, an optimized application can do a maximum of about a thousand objects per frame, and that's already pushing it, because if you want to target a variety of mobile and desktop devices, you might need to go even lower than this.
This is a photo of a living room. It's not rendered, it's an actual photo, and the idea is that it's super stylish, but it feels empty and cold; nobody lives there. This is sometimes what it feels like looking at WebGL experiences, because they lack complexity. In comparison, game developers in native or on consoles are used to maybe 10,000 objects per frame if they need them, and so they can build richer, more complex, more lifelike experiences, and this is a huge difference. Even with the limitation in the number of objects, WebGL developers have been able to build incredible things, so imagine what they could do if they could render many times as many objects.
So let's go back to the slides. What we have seen is that, for this specific and early demo, WebGPU is able to submit three times more drawing commands than WebGL, and that leaves more room for your application's logic. A major new version of Babylon.js, Babylon.js 4.0, was released just last week, and the Babylon.js developers are so excited about WebGPU that they are going to implement full support for WebGPU.
They will start with the initial version of WebGPU in the next version of Babylon.js, Babylon.js 4.1. But WebGPU is not just about drawing more and more complex scenes with more objects. A common operation done on GPUs is post-processing image filters, for example depth-of-field simulation. We see this all the time in cinema and photography. For example, in this photo of a fish, we can see the fish is in focus while the background is out of focus, and this is really important because it gives the feeling that the fish is lost in a giant environment.
This type of effect is important in all kinds of rendering, so that we can get a better cinematic experience, but it's also used in other places, like camera applications. Of course, this is just one type of post-processing filter; there are many others, like color grading, image sharpening and a bunch more, and all of them can be accelerated using the GPU. So, for example, the image on the left could be the background behind the fish before we apply the depth of field, and on the right we see the resulting color of the pixel. What's interesting is that the color of the pixel depends only on the colors of a small neighborhood of the pixel in the original image. So imagine the grid on the left is a neighborhood of original pixels; we're going to number them in 2D, and the resulting color will be essentially a weighted average of all these pixels.
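The weighted average being described can be written out in plain JavaScript; the uniform 1/25 weights below are a stand-in for a real depth-of-field kernel:

```javascript
// Compute one output pixel as a weighted average of the 5x5 neighborhood
// around (x, y) in a single-channel, row-major input image.
// Uniform 1/25 weights stand in for a real depth-of-field kernel.
function filterPixel(input, width, height, x, y) {
  let sum = 0;
  for (let dy = -2; dy <= 2; dy++) {
    for (let dx = -2; dx <= 2; dx++) {
      // Clamp to the image border so edge pixels still get 25 samples.
      const sx = Math.min(width - 1, Math.max(0, x + dx));
      const sy = Math.min(height - 1, Math.max(0, y + dy));
      sum += input[sy * width + sx] * (1 / 25);
    }
  }
  return sum;
}
```

With uniform weights this is just a blur; a depth-of-field filter would vary the weights per pixel based on depth, but the neighborhood access pattern is the same.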
Another way to look at it: on top we have the output image, and the color of each of the output pixels will depend only on a 5×5 stencil of the input image on the bottom. The killer feature of WebGPU, in my mind, is what we call GPU compute, and one use case of GPU compute is to speed up local image filters like the one we just saw. This is going to be pretty far from DOM manipulation, or React, or amazing web features like CORS headers, so please bear with me. We're going to go through it in three steps: first, we'll look at how GPUs are architected and how an image filter in WebGL uses that architecture, and then we'll see how WebGPU takes better advantage of the architecture to do the same image filter, but faster. So let's look at how a GPU works, and I have one here. This is a package you can buy in stores.
Can you see it? So this is a package you can buy in stores, and this is a huge heatsink, but if we look inside, there's this small chip here, and this is the actual GPU. If we go back to the slides, this is what we call a die shot, which is a transistor-level picture of the GPU. We see a bunch of repeating patterns in it, which we're going to call execution units. These execution units are a bit like cores in CPUs, in that they can run in parallel and process different workloads independently.
If we zoom in even more on one of these execution units, this is what we see. In the middle we have a control unit, which is responsible for choosing the next instruction, for example "add two registers" or "load something from main memory", and once it has chosen an instruction, it will send it to all the ALUs. The ALUs are the arithmetic and logic units, and when they receive an instruction, they perform it.
So, for example, if they need to add two registers, they will look at their respective registers and add them together. What's important to see is that a single instruction from the control unit will be executed at the same time by all the ALUs, just on different data, because they all have their own registers. This is single-instruction, multiple-data (SIMD) processing. This is the part of the execution unit that is accessible from WebGL, and what we see is that it's not possible for ALUs to talk to one another; they have no way to communicate. But in practice, GPUs look more like this today: there is a shared memory region in each of the execution units where ALUs can share data with one another. It's a bit like a memory cache, in that it's much cheaper to access than the main GPU memory, but you can program it directly, explicitly, and have ALUs share memory there. So a big benefit of GPU compute is to give developers access to that shared memory region.
That was the architecture of GPUs and their execution units. Now we're going to look at how the image filter in WebGL maps to that architecture. As a reminder, this is the algorithm we're going to look at, and in our example, since our execution unit has 16 ALUs, we're going to compute a 4×4 block, which is 16 pixels of the output, in parallel, and each ALU will take care of computing the value for one output pixel. This is the GPU pseudocode for the filter in WebGL; essentially it's just a 2D loop on X and Y that fetches from the input and computes the weighted average of the input pixels.
What's interesting here is the coordinates argument to the function. It's a bit special because it's going to be pre-populated for each of the ALUs, and that's what will make each ALU do its execution on different data: they start populated with different data. So this is a table for the execution of the program, and likewise we can see the coordinates are pre-populated.
Each column is the registers for one of the ALUs, and we have 16 of them for the 16 ALUs. The first thing that happens is that the control unit says "initialize sum to 0", so all of them initialize the sum to 0, and then we get to the first iteration of the loop in X, and each ALU gets its own value for x. Likewise, each ALU gets its own value for y, and now we get to the line that does the memory load of a value of the input.
Each ALU has a different value of x and y in its registers, and so each of them will be doing a memory load from a different location of the input. Let's look at this ALU: it's going to do a memory load at position (-2, -1); we're going to get back to this one. So we go and do another iteration of the loop in Y. Likewise, we update the y register, and we do a memory load.
What's interesting here is that the first ALU will do a memory load at (-2, -1) again. That's a redundant load, because we already did it in the last iteration. Anyway, the loop keeps on looping, there's more loading and summing and all that, and in the end we get to the return, which means the sum will get written to the output pixel, and the computation for a 4×4 block is finished.
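That counting argument can be checked with a tiny simulation; this is not GPU code, just JavaScript tallying the loads described in the walkthrough:

```javascript
// Count main-memory loads for the WebGL-style execution of the filter:
// each of the 16 ALUs (one per pixel of the 4x4 block) fetches its own
// 5x5 neighborhood independently, so redundant loads are never shared.
function countWebGLLoads(blockSize, stencilSize) {
  let loads = 0;
  for (let py = 0; py < blockSize; py++) {
    for (let px = 0; px < blockSize; px++) {
      // One ALU: loop over the stencil, one memory load per tap,
      // even when a neighboring ALU already loaded the same pixel.
      loads += stencilSize * stencilSize;
    }
  }
  return loads;
}

// countWebGLLoads(4, 5) → 16 pixels x 25 taps = 400 loads
```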
Overall, the execution of the algorithm in WebGL for a 4×4 block did 400 memory loads. The reason is that we have 16 pixels, and each of them did 25 loads. So this was how the filter executes in WebGL; now we're going to look at how WebGPU uses the shared memory to make it more efficient. We take the same shader, the same program as before, the exact same code, and we're going to optimize it with shared memory.
We introduce a cache that's going to contain all the pixels of the input that we need to do the computation. This cache is going to be in shared memory, so that it's cheaper to access than the actual input; it's like a global variable that lives inside the execution unit. Of course, we need to modify the shader to use that input tile, and because the input tile needs to contain values at the beginning, we can't just start like before. This function is going to be a helper function that computes the value of the pixel, and we're going to have a real main function that first populates the cache and then calls the computation. Like in the previous version of the shader, the coordinates are pre-populated, so each of the ALUs does a different execution, and then all the ALUs work together to populate the cache. There are a bunch of loops and whatnot there,
but it's not really important. What's interesting to see is that only 64 pixels of the input are loaded and put in the cache; there are no redundant memory loads. Then we go through the main computation of the value and, likewise, it's very similar to what happened before, but on this line the memory load is now from the shared memory instead of the main memory, and this is cheaper.
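The shared-memory version's counts can be tallied the same way; the halo arithmetic below (a 4×4 block plus 2 extra pixels on each side, giving an 8×8 tile) is why 64 main-memory loads suffice:

```javascript
// Count memory loads for the shared-memory (GPU compute) version:
// the ALUs first cooperatively load the input tile into shared memory
// once, then every 5x5 stencil read hits the cheap shared-memory tile.
function countWebGPULoads(blockSize, stencilSize) {
  const halo = Math.floor(stencilSize / 2);  // 2 extra pixels per side
  const tile = blockSize + 2 * halo;         // 8 for a 4x4 block
  const mainMemoryLoads = tile * tile;       // each input pixel loaded once
  const sharedMemoryLoads =
      blockSize * blockSize * stencilSize * stencilSize; // cheap tile reads
  return { mainMemoryLoads, sharedMemoryLoads };
}

// countWebGPULoads(4, 5) → { mainMemoryLoads: 64, sharedMemoryLoads: 400 }
```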
So, overall, thanks to the caching of a tile of the input, the WebGPU version didn't do any redundant main memory loads. For a 4×4 block it did 64 memory loads, and, like we saw before, WebGL had to do 400. This looks very, very biased in favor of WebGPU, but in practice things are a bit more mixed, because while WebGPU didn't do redundant main memory loads, it did a bunch of shared memory loads, and those are still not free. Also, WebGL is a bit more efficient than this, because GPUs have a memory cache hierarchy, and so some of those memory loads will have hit the cache that's inside the execution unit.
But the point is that, overall, WebGPU will be more efficient because we are able to explicitly cache input data. The code we just talked about is called image filtering in the graphics world, but in the machine learning world it's called a convolution, or a convolution operator, and all the optimizations we talked about also apply to convolutional neural networks, also known as CNNs. The basic ideas for CNNs were introduced in the late 80s, but back then it was just too expensive to train and run the models to produce the results we have today.
The ML boom of the last decade became possible because CNNs and other types of models could run efficiently on GPUs, in part thanks to the optimization we just saw. So we are confident that machine learning web frameworks such as TensorFlow.js will be able to take advantage of WebGPU to significantly improve the speed of their algorithms. Finally, some algorithms can be really difficult to write on GPUs in WebGL, and sometimes they're just not possible to write at all.
To summarize, the key benefits of WebGPU are that you can have increased scene complexity for better and more engaging experiences, which is what we have seen with Babylon.js; it provides performance improvements for scientific computing, like machine learning; and it unlocks a whole new class of algorithms that you can offload from JS CPU time to run on the GPU in parallel. So now you're like: hey, I want to try this API. You're in luck. WebGPU is a group effort and everyone is on board: Chrome, Firefox, Edge and Safari are all starting to implement the API. Today, we're making an initial version of WebGPU available in Chrome Canary on macOS, and other operating systems will follow shortly. To try it, you just need to download Chrome Canary on macOS and enable the experimental flag "Unsafe WebGPU". Again, this is an unsafe flag,
so please don't browse the internet with it on for your daily browsing. More information about WebGPU is available on webgpu.io: there's the status of implementations, there are links to some samples and demos, and a link to a forum where you can discuss WebGPU, and we're going to add more stuff to this, with articles to get started and all that. What we'd love is for you to try the API and give us feedback on what the pain points are, what you'd like the thing to do for you, but also what's going great and what you like about it. So thank you, everyone, for coming to this session. Ricardo and I will be at the Web Sandbox for the next hour or so
if you want to discuss more. Thank you.