Episode Transcript
[00:00:10] Speaker A: All right. This is Arbery Digital Experiences. This is episode nine. I'm Tad Reuss from Arbery Digital, principal architect, and I'm joined here by Michael from Poland. He's the CTO of Dynamic Solutions and the co-founder of StreamX.
And StreamX is a brand new product. It's a digital experience mesh for accelerating a whole bunch of different digital experience platforms, including Adobe Experience Manager, including Edge Delivery Services, and a whole bunch of others. But those are words that we're going to get into defining as we define the problem that this is solving. And I think that anybody who's been in the CMS and digital experience platform space will hopefully empathize with some of the problems we're going to talk about, and then we're going to get into how StreamX went about solving them. So.
[00:01:10] Speaker B: Yep. Welcome, welcome. Thank you for inviting me — I've been watching your podcast for a while and it's good to be here.
[00:01:19] Speaker A: Yeah. And as you just pointed out right before we hit the record button, I don't have my cap on. So hopefully everybody can still recognize me.
[00:01:26] Speaker B: Yeah, I'm sure they will.
[00:01:29] Speaker A: Yeah.
[00:01:30] Speaker B: Eventually it's on your channel, so they will.
[00:01:32] Speaker A: That's right. That's right. Yeah. Maybe this is the time to use that GenAI stuff, you know, to do the generative fill and have it put a cap on me or something like that. But, Michael, why don't you tell me a little bit about your background, your experience with the AEM space, and what led you into developing this platform.
[00:02:04] Speaker B: Yes. So I've been working with AEM for twelve years — I was a developer, later an architect, on client projects. But we also started a consulting company, Dynamic Solutions. And we've got like 50 people now; I've been working with those people almost from the start.
Of course, it was not 50 from the beginning, but yeah. So we were always dreaming of building something that would make an impact and change the way we work. And of course we've been working with AEM and also other web systems, so we kind of knew what the pain points were. So the starting point was a Confluence page where we asked all of our consultants to write down the pain points, and we tried to build a platform that would address them — and it couldn't. I mean, we didn't succeed at the start. The first try was a bit different from what StreamX is at the moment. And I think the key point was deciding to use event streaming and build the solution around that. And yeah, once we did it, once we created a PoC, once we created the first version of the product — and now that we've got it fully working and have presented it to a lot of people — yeah, I'm pretty sure that we have what we needed to have. So now it's a great opportunity to share some of the things with you and the audience.
[00:04:03] Speaker A: I love it. I love it. Yeah, I mean, I think one of the things that's going to make this interesting is that it's one thing if you just came out of university, a 22-year-old with these bright ideas of how a CMS should be. But I feel like you're instead coming at it from the viewpoint of having seen CMSes used for absolutely everything. Like you shared a story of AEM being used by an airline to print boarding passes.
When you think of a content management system, you think of a bunch of content pages — products, about us, things like that — but it instead becomes the tool that is used for absolutely everything, to glue everything together, even things that have nothing to do with simple web content.
[00:05:07] Speaker B: I mean, when I graduated from university, I was a Java developer. I never wanted to build websites — it was not my goal.
Because when I graduated, CMSes were a completely different story, and the challenges we had in the digital world, and the ways of solving the problems, were different. Back in the two thousands it was the time of, you know, WordPress or Drupal or Liferay, maybe, etcetera. And actually, when I started my journey with CQ5 — by Day, at that time — I was impressed with the architecture. Back then: the replication from the author to the publishers, the caching dispatcher, being able to scale it, distribute it, its own database and everything.
I'm still a big fan of the OSGi development model — though it is hard to use, let's make it clear. But it's a solid piece of engineering. And I've always enjoyed working with demanding technologies. So we had that era of the CMSes where, you know, first you got static pages, then you got some CGI scripts, and around 2000 came the year of the CMSes. Then CMSes evolved into web content management systems — I still don't know what the difference is between a solid CMS and an entry-level web content management system. But with all that distribution, a lot of features came into those systems: search, user management, permissions management, closed user groups, some basic analytical tools built into the CMS, and the ability to handle serious load, because you could actually scale it horizontally while still having redundancy. And those were things that were really good at that time.
[00:07:35] Speaker A: Yeah, yeah.
And simply the fact that you could have your publish environment scale separately from the authoring environment. Because the first time I touched CQ, the only content management systems I had seen were — funny — homegrown Java weirdness with Apache Velocity, I remember that project, and things like WordPress, where everything's all ground into the same thing. So you can't necessarily, especially the way it was architected back then, separate it out and scale it. So even just the separation of concerns — authoring, publishing, and the publish cache — it was a breath of fresh air.
[00:08:23] Speaker B: Yeah, exactly. And that's how I thought. And also the projects, the clients — I mean, enterprise clients — they always brought challenging things to work with. Like you said, the airlines, the banks. You know that if you develop a solution, you can make an impact and your work can be visible. It's not like implementing some algorithm that will be used by one company in the back-office system of a bank or something. You could finally go to a family birthday and say: this is what I do.
[00:08:59] Speaker A: Right, right. Which is a big deal. I mean, it's funny for somebody who's been in operations like me. They say, so you did that website? I'm like, well, you see how when you went to the website you didn't get an error message? That was me. I did that.
[00:09:16] Speaker B: Yeah. So my younger brother is a firefighter, my older brother is a doctor. So they always get, you know, a bunch of stories. And whenever someone asked me what I work with, it was like, well, do you want to get it?
[00:09:33] Speaker A: You want to know.
[00:09:35] Speaker B: Yeah, exactly. So, you know, after the WCM — the web content management system — we got the era of the DXP. Everyone wanted to be a platform at the time, and all those CMSes literally became DXPs.
[00:09:54] Speaker A: Right.
[00:09:54] Speaker B: And that was good.
You know, you got a lot of monolithic applications with a lot of functionality inside. But the problem is that the Internet evolves faster than the technology, and in fact the technology has its own lifecycle. It's made to solve the problems of its era, and as time goes on, the technology declines and needs to be replaced with newer technologies, or at least requires some modification, or a booster, that will allow you to solve new problems. So how I see it at the moment is that the Internet changed.
Previously — like 20 years ago — you got maybe 100 visitors on your website, and having everything in one monolithic application was just fine. If you had search on your site, then you were among the top companies with the best digital presence.
If you had a sitemap and all of that, and you could implement everything in Java, then it just worked. Nowadays, especially enterprise organizations face a huge number of different challenges — a huge amount of information that they want to present on the website: recommendations, product data, detailed product data, dynamic prices, everything, coming from different systems. So when you look at the DXP platforms, or old Drupal or WordPress, and you look at the architecture of a single node, it's still the same: there is a database, there is some business logic layer — the components, pages, etcetera — and you get the rendering. But in order to scale, let's say, search, you need to scale the whole instance. I mean, scaling whole publishers like that is no longer a good option.
And the other fact is that if you use a functionality that is built into the CMS or DXP, it won't be as good as a separate service built for a single purpose — like search, like OpenSearch. There are teams working on OpenSearch, trying to make that search the best of breed. So you cannot expect that the search in WordPress will be as good as the search in OpenSearch, right?
[00:12:50] Speaker A: That's right.
[00:12:51] Speaker B: So, you know, I'd say the next trend is moving to the cloud — all those MACH technologies, Jamstack technologies: build the front end as its own front end, use services like search and recommendations from different vendors, and somehow glue it together.
But I see that it's not answering the needs of enterprise customers — at least not all of them — because to properly implement a cloud-native MACH architecture, you need a solid development team that understands each of those pieces and orchestrates them together in order to achieve the results. And once you decide to move to cloud or MACH or whatever, you've got a project that will take two years, and you don't know what the outcome will be; you can't trust that the outcome will be good — especially when the company hires a new CTO and he decides that we need to move forward.
So I see this next step, this next trend — MACH architecture — as a very good one, but also a very risky and costly one. So I believe that we provide an alternative: keep your existing backend systems, but treat them as source systems, and allow customers to use a ready-made, tested platform to achieve scalability, composability and performance across geographical locations.
[00:14:46] Speaker A: So.
[00:14:48] Speaker B: Actually, that's another topic: when you decide to move to MACH, you also need to migrate all your marketing — maybe not sales, but marketing and content teams — to the new tools, and you need to migrate the data. So you were using Adobe Commerce; okay, now we are moving to something headless. You were using AEM; now let's try to use Contentful. But you know, when you compare what AEM brings to the marketing teams with what Contentful can give you, it's like, you know.
[00:15:28] Speaker A: Right.
[00:15:29] Speaker B: I still believe in those years when AEM was developed — I mean, AEM is a great tool. It's great for managing the content and it's great for managing the publication process. It's got a lot of integrations. You can have your DAM, you can have your pages, tags, everything.
But I believe that, the same as with other solutions like Adobe Commerce or other heritage CMS systems — probably Magnolia, or previous versions of Sitecore — you can modernize existing architectures and use them as they were designed to be used.
[00:16:16] Speaker A: Yeah, exactly. But one thing that I think is interesting: as you're talking about moving to this whole microservice, headless, containerized model — or disparate third-party services that you're stitching together, that you don't even necessarily own — the problem you have with that is a shared problem with having all those services on the backend.
You have a bunch of different services, and the failure of any one of those services can result in a degraded user experience, and you need something to stitch it together.
And having something providing and managing those backend connections in a performant way — in a way that facilitates a fast user experience wherever that user happens to be — is always going to be a problem.
Not necessarily a problem, but it is something technical to confront, regardless of whether you're doing that inside your data center or you're doing it externally, orchestrating multiple independent cloud systems that are not even under your hosting.
[00:17:34] Speaker B: Yeah, I mean, you can build modern websites using front-end technologies and decide to do all those integrations on the front end — making front-end calls to compose the content, to call a search, to get the recommendations and display them. But as the number of systems you are calling grows, the risk of a slowdown or downtime, or at least part of your site being inoperable, grows too, because your site is, in some cases at least, as slow as the slowest service you are calling. Exactly so.
But in order to solve that problem, you need to invert the model, right?
[00:18:42] Speaker A: Yeah, it's true.
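The "your site is as slow as the slowest service" point above can be sketched in a few lines of Python. This is purely illustrative — the service names, latencies, and failure are invented, and it's not StreamX or any vendor's code:

```python
import asyncio

# Simulated backend calls for one page view (names and latencies are
# invented for illustration -- not a real API).
async def call(name: str, latency: float, ok: bool = True) -> str:
    await asyncio.sleep(latency)
    if not ok:
        raise RuntimeError(f"{name} is down")
    return f"{name}: ok"

async def render_page() -> list[str]:
    # The page fans out to three services; it cannot finish before the
    # slowest one, and any single failure degrades part of the page.
    results = await asyncio.gather(
        call("content", 0.05),
        call("search", 0.30),          # slowest service sets the floor
        call("recommendations", 0.10, ok=False),
        return_exceptions=True,
    )
    return [r if isinstance(r, str) else f"degraded: {r}" for r in results]

parts = asyncio.run(render_page())
print(parts)
```

Adding a fourth or fifth backend call only raises the floor further, which is the motivation for inverting the model in the next part of the conversation.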
So I've got this here — let me fix how this works. Okay, good. So I've got this random diagram. I just pulled it from my website, something I did a while back: just an example multi-region Adobe Experience Manager architecture. But you could replace a lot of the parts with almost any CMS, and the problems are going to be similar, because in almost any CMS, regardless of what you're doing, you have a couple of common things. In this particular architecture, we've got an AEM environment that's sitting in the United States. You've got another AEM publish environment sitting in China, to serve folks in China. You've got LDAP, you've got a Solr search cluster. But let's just say you're paying for search that is only served out of the United States — not served inside of China.
You have publishers making back end calls to a common search cluster.
We see this all the time — not from "here's how I would love to architect it," but "this is how many licenses we have for this particular piece of software." And it'd be great if we could re-architect it, but sorry, we don't have budget for that right now.
Then you have other things like an author environment that might be connected to a PIM — a product information management system that has all of your product details. And a lot of times you have long sync processes. Let's say you've got 75,000 SKUs in a database, and all those SKUs have to be put into the author, and it happens nightly. And there's no other way to do it, because it's a job that takes 4 hours to run every time — if you're lucky; sometimes it fails. And you've got some other database, perhaps, that's connected to it that has your pricing, and the pricing is different for every country. So you have all these different systems that all have to be working and come together in order for a user over here to see your page and see the experience they were supposed to get. And so this is a problem.
[00:21:03] Speaker B: Because I can literally see the history of this being built over time.
[00:21:13] Speaker A: Yeah, it's true. Yeah. We've been through these battles so many times: okay, we're going to upgrade this one part of the stack. Okay, but can we also do these other parts? No, no, no, we don't have budget for that right now. Okay, good. So we're going to use the one that's in the US. Yes, we're going to use the one that's in the US.
[00:21:29] Speaker B: And actually those are not bad decisions.
[00:21:32] Speaker A: Yeah.
[00:21:33] Speaker B: In fact, at the moment you make those decisions — we, as architects or, you know, consultants, need to advise people to make decisions that give the best value for the money in the lowest time. So it's all about trade-offs.
When I see this picture: we could move the Solr cluster — replicate the Solr cluster to the same region as the publishers — but that would take time, and that would take additional licensing. Time, license cost, that's all.
And sometimes it's wiser to just say: to solve your particular problem, we need to put publishers and dispatchers there. It will solve 80% of our problems at half the price. I mean, I feel this architecture — I've been working on those projects. And actually, you know, when you've got a PIM, you've got a request to display your products on a website, and what are the requirements? Okay, do you agree to have it updated once a day? Yes, we do agree. Okay, then that will cost half, or something like that. So those are all trade-offs that you as an architect are making in order to give that value to the customer. And what's more, on a project you get a toolbox, right? And you can only use the tools that are in your toolbox. So I believe that StreamX can be some extra tools that you can put in and use, that will allow you to fix some of the problems.
[00:23:47] Speaker A: Yeah, well, let me explain just a couple of problems, because then I want to segue into a different way to solve them. If you look at this architecture, you've got Akamai and you've got your dispatcher — those are places where things can get cached. And you know that you can only pull data out of your PIM once a day, and then you have to publish all that data. So that data is always going to be a day old. Solr queries are always going to be slow, and you're going to figure out some way to maybe cache search pages — let's say you've got pages that are built on search — and try to leverage the cache where you can. But when you can't, the round trip is going to take a minimum of half a second.
[00:24:50] Speaker B: Yeah. And more.
[00:24:52] Speaker A: It's hard for the international guys.
[00:24:54] Speaker B: Yeah, exactly. It's hard to cache the response when you don't know what the queries are. And in search, you can have, you know, unlimited types of queries. So you can, like, as an example —
[00:25:10] Speaker A: Sorry to interrupt you, but as an example — I was debugging a situation like this, and we were like, why is our search performance so slow? I thought we were optimizing for queries like this. It was a food site, and the query we were optimizing and caching for was people searching for recipes with chicken. And there were people evidently searching for chicken, but the more we mined the logs, we saw the autocomplete — and people kept misspelling the word chicken, like 50 times: ch-ik, chi. And it was sitting there recursively going and searching every single time.
And that was why the search cluster was going crazy: because nobody could remember how to spell the word chicken.
[00:25:58] Speaker B: But I know — the cache you mentioned, very valid point. I mean, cache is good, but if you can bypass the cache, you normally will. We had that problem many times when we were caching pages on the dispatcher, but there were parameters, and if you use a parameter, then the cache is bypassed.
[00:26:22] Speaker A: Right.
[00:26:23] Speaker B: I remember a whole-site downtime because the marketing team sent the link to the landing page with — yeah, with the campaign ID. To everyone.
[00:26:41] Speaker A: Yeah, yeah.
[00:26:44] Speaker B: And there was — I mean, you know, the system works stably all the time, but as soon as we send a marketing newsletter, it goes down. And this is the moment when you don't want your site to be down, because all those people are just taking part in your campaign.
[00:27:02] Speaker A: Yeah.
[00:27:02] Speaker B: So the same is with the cache, right? Like, you can cache a page, but if you misspell the query, you bypass it a lot.
[00:27:10] Speaker A: That's right, that's right.
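The cache-bypass problem they describe — tracking parameters or junk queries defeating the dispatcher cache — is often mitigated by normalizing the cache key before lookup. A minimal sketch; the ignored-parameter list is an assumption for illustration, and a real dispatcher or CDN rule would use whatever your marketing stack actually appends:

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Query parameters that change tracking, not content (assumed list).
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "cid", "gclid"}

def cache_key(url: str) -> str:
    """Normalize a URL so tracking parameters don't bypass the cache."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS]
    kept.sort()  # stable ordering: ?a=1&b=2 and ?b=2&a=1 share one entry
    query = urlencode(kept)
    return f"{parts.path}?{query}" if query else parts.path

# The newsletter link and the plain link now hit the same cache entry.
print(cache_key("/landing.html?utm_campaign=spring&cid=123"))
print(cache_key("/landing.html"))
```

With a rule like this, the campaign-ID newsletter from the story would have hit the cached page instead of taking the site down.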
[00:27:12] Speaker B: So with StreamX, we inverted the model. We decided that PIMs are great, but they're great for managing the product, not for serving the digital experiences.
Commerce systems are great, but those are monolithic applications and you cannot scale them indefinitely. With a CMS, we already know what's relevant in the context of the website: the moment you click publish. When you publish an image, a page, an experience fragment, or some data, this is a single action where you say: okay, this should be public. So we created a digital experience mesh, which is based on event streaming and microservices. And we created real-time data pipelines that receive all those events — publications, product updates, API changes — push them in real time through those pipelines, and put the results on specialized services close to the customer. The unified data layer is built from those services or microservices: one can be put in China, one in Europe, one in the US.
And the customer always hits the delivery layer. So it's like a real-time CDN: you push the data close to the customers, and you never reach out to the origin systems.
Instead, you build your logic inside the digital experience mesh to reflect the state of the source systems into the delivery layer.
This is super important. Normally, I would say: okay, there are pages — pages are obvious. You publish a page, you generate the page, you put the page on a web server — it's Nginx in our case — so it can handle hundreds of thousands of requests per second. On our simple demo setup, that costs one thousand dollars per month. So it's super performant.
For the search, we either use OpenSearch or bare Lucene indices.
Each is a limited microservice with its own state, deployed very close to the customer. So the customer performs a search against that tiny microservice. You can implement any delivery layer services. So we came up with blueprints, which are a predefined set of services that you can use. The same with the pipelines: you've got a pipeline for search feed extraction, a pipeline for JSON manipulation and aggregation, a pipeline for templating, a pipeline for sitemap generation, and so on.
What's important here is that you never reach out to the customer's data center or cloud service. You don't need to scale the left layer. Or — it's quite funny — the left side can be down, or even StreamX can be down, and the delivery layer is very simple: an Nginx server with static files and some SSI, or OpenSearch. So yeah, quite easy to recover, quite easy to scale.
As long as those services are up, the customer will see the site as it is. If the left side or the middle side is down, then the site just stops getting updates, and that's all. When your CMS crashes, you are not able to publish, but it still doesn't affect the end-user experience. And this is actually what we experienced.
We created the first version of this mesh — it was not called StreamX, and it was not in the state it is now — more than a year ago, before we headed to Adobe Summit Europe in London. And when I landed from the plane, I got a message that our website was up and running and updated with the new content, so I could talk to people and share the information. And about an hour after that, I started to receive error messages that our CMS was constantly restarting because it was out of disk space.
So what I did, I just removed the alerting. I knew that with the CMS down, the site was still up, and we didn't expect updates during the weekend. So on Monday someone came and just fixed the disk issue. The CMS was down for two days, but still the site was up and running.
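The inverted model in that story can be sketched as a toy pipeline. In the real mesh the transport is Apache Pulsar and the delivery layer is Nginx or OpenSearch; here an in-memory list and dict stand in, and the event shape and processors are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class PublishEvent:
    path: str
    content: str

delivery_store: dict[str, str] = {}     # stand-in for Nginx static files
search_index: dict[str, set] = {}       # stand-in for an OpenSearch index

def process(event: PublishEvent) -> None:
    # "Templating" processor: materialize the page at publish time.
    delivery_store[event.path] = f"<html><body>{event.content}</body></html>"
    # "Search feed" processor: index the same event.
    for word in event.content.lower().split():
        search_index.setdefault(word, set()).add(event.path)

def serve(path: str) -> str:
    # Delivery reads only local state -- the CMS can be down.
    return delivery_store.get(path, "404")

process(PublishEvent("/products/boots", "Winter boots on sale"))
cms_is_down = True  # irrelevant for reads: the materialized state remains
print(serve("/products/boots"))
```

The point of the sketch is the read path: `serve` never touches the source system, so a crashed CMS only means no new updates, exactly as in the weekend story above.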
Another very important thing is that you do not need to scale all those CMS, PIM, e-commerce, or API instances — they're often overgrown, big monolithic applications.
They are hard to scale. You just scale the right side, the microservices. We built them on Quarkus; starting a new instance takes less than a second.
With container provisioning and triggering, probably a couple of seconds. So we can rapidly scale up and down, either to handle the load or to make some savings — you don't need instances that you are not using.
It's based on Kubernetes, so the autoscaling — horizontal autoscaling in particular — is a built-in functionality, and we can tune it.
Because you never reach out to the source systems,
the path that the customer request needs to travel is very short.
[00:34:12] Speaker A: That's right.
[00:34:13] Speaker B: It's not like a CDN, you do not have 200 points of presence.
We can deploy it to the edge, but normally we deploy it to geographic locations — per continent, probably.
But can you put a CDN in.
[00:34:30] Speaker A: Front of it if you wanted for faster response time for all that stuff?
[00:34:35] Speaker B: Exactly. So the traffic goes through the internal network or the public network, so you can hit the nearest CDN point of presence, and then, for example through the AWS network, it goes to the nearest data center. That's a matter of milliseconds, because it's not traveling to the US — it's traveling a thousand kilometers, and at the speed of light that's probably quite quick.
[00:35:06] Speaker A: What I think is fascinating about this is that in almost every architecture any of us have ever made, you're constantly designing the scaling of all of those backend systems on the left around the concept of a cache miss: how cacheable is the request? If it's not cacheable, and it's an expensive request that takes a while to generate, then that's how much you have to scale the backend system.
And in some cases, that backend system has to be scaled ridiculously.
I've had some AEM environments where not enough work was put into making some of these requests cacheable, and so we ended up with 16 publishers serving things, because that was just what we had to have.
And those licenses get really expensive. And then you go, okay, well, like you said, some of these systems just don't scale. The only way you could do it is give them some more CPUs. Some of these PIM systems, you can't go and give them 45 endpoint environments, or whatever, so that you can query them more often. And in a lot of cases you still have the freshness question. We talked about PIMs — is it okay to have a one-day lag on the freshness of the data? In some cases that's completely unacceptable. Say you have a financial services institution that has interest rates, and there's an API that serves the interest rates. The interest rates have to be fresh — like, right-now fresh — because if you jack up your interest rate, you can't have your website saying, hey, we offer such-and-such interest rate, and the guy calls up and says, hey, your website said 5%. Oh, I'm sorry, sir, it's 8.5%. Bait and switch — you can't have that. So it has to be fresh right away, and that API has to be performant. But the only reason it has to be performant is that so many people are hitting it. What if you turned it all the way around and said: you're always serving the fresh interest rate, and I'm just telling you when it changes? That's what I think is fascinating about the whole StreamX approach.
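Tad's interest-rate example — push the change once instead of having every page view hit the rates API — might look like the following sketch. The rate values, function names, and storage are all hypothetical, invented to illustrate the pull-to-push inversion:

```python
import time

# Materialized state in the delivery layer (hypothetical shape).
materialized = {"rate": "5.0%", "updated_at": 0.0}

def on_rate_change(new_rate: str) -> None:
    # Fired once per change by the source system, not once per visitor.
    materialized["rate"] = new_rate
    materialized["updated_at"] = time.time()

def render_rate_widget() -> str:
    # Millions of page views read local state; the rates API sees none
    # of that traffic, yet the value is always the latest one pushed.
    return f"Our current rate: {materialized['rate']}"

on_rate_change("8.5%")
print(render_rate_widget())
```

The rates API now only needs to handle one event per change, so it no longer has to be scaled for read traffic at all.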
[00:37:33] Speaker B: I mean, regarding the scaling of the 16 publishers — you need to scale the backend for the worst-case scenario, like when you flush the cache. That's why you need so many publishers, even if they are not used most of the time. You need to be prepared so that if the author hits publish, or someone flushes the cache, it still works. So yeah, it generates a lot of cost.
Regarding the freshness of data — again, the world is changing. Enterprises have so much information that they want to present on the website, or could present on the website, but they cannot, because it's not acceptable, like you said, to do nightly batch jobs. And there's a second question: how much data can you put into the JCR? The JCR — Jackrabbit — was designed to manage pages, assets, et cetera. So if you start to treat it as the single source of truth for all the information you have on your website, it will get overgrown. The performance will go down, your backups will take forever. It will also cost you, because you need all that storage, and it's replicated to all those publishers. So if you've got 1 TB of data because you put everything into your CMS, it's 16 times 1 TB, plus backups, plus some staging environments. And there is literally no need to take that approach. I mean, okay, sometimes it's good to load the data into the CMS, or integrate the PIM with the CMS, if you want a great authoring experience and you have a couple of products. But in most cases you are not going to edit all those pages — a page per SKU. There is no need to author 50,000 SKUs in the CMS. That data should not go there.
So the data from the PIM, from the pricing system, the stock availability — it can come from other systems.
All that information can go through our real-time data pipelines and be put up for consumption close to — again, close to — the user.
We use event streaming — actually we use Apache Pulsar, but in general it's like Kafka, just more cloud native.
It allows you to publish the data on your source system and push it to the delivery layer in 10, 20 milliseconds, probably.
So it is real time — loading the page into your browser takes more time than that. Of course, if you are going to publish 50,000 SKUs, it probably will take a couple of seconds to generate all those indices, sitemaps, the listings that will be ready for consumption, and eventually the pages.
But still it will take seconds, not 24 hours. With very complex, or at least quite complex, meshes,
our end-to-end latency — feeding the search, building the sitemap, generating some templates, etcetera — is something like 100 milliseconds. So it's still, you know, much, much faster than 24 hours.
Also, what is important is the throughput, because latency is one thing, but event streaming is like a data highway — you can scale it.
So if you want to publish gigabytes of data per second, it's possible; you just need to add new brokers. Within our architecture we are prepared to scale the messaging layer too, and we can scale each of the services independently. So the whole platform is horizontally scalable.
So imagine, for example, a case where you want to track users' activity, the way they're browsing your website, and based on, let's say, visited products, you want to create real time recommendations for the users: you visited those products, you may be interested in this, this and that. Or when you do a checkout, you may ask the user, didn't you forget about some other things that other users were buying together with those products?
So with event streaming, we can process tens of thousands or hundreds of thousands of publication events per second. So we can literally track every step a customer takes in order to enhance their experience in real time. So it's not only the data that you have in your backend systems; it may be the data coming from the customer, from the user session.
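The "users who viewed this also viewed" recommendations described above boil down to counting which products co-occur in user sessions. A self-contained sketch of that idea (the data and function names are made up for illustration; a streaming deployment would update these counts incrementally per event rather than in a batch loop):

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_co_views(sessions):
    """Count, for every product, how often each other product
    appeared in the same session."""
    co = defaultdict(Counter)
    for products in sessions:
        for a, b in combinations(set(products), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def recommend(co, product, k=2):
    """Top-k products most often seen together with `product`."""
    return [p for p, _ in co[product].most_common(k)]

# Toy session log: each list is the products one user browsed.
sessions = [
    ["tent", "stove", "lantern"],
    ["tent", "stove"],
    ["tent", "lantern"],
]
co = build_co_views(sessions)
```

Here `recommend(co, "tent")` would surface "stove" and "lantern", each co-viewed twice. The same counters can be kept hot in a service behind the CDN and updated as each click event streams in.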
[00:43:26] Speaker A: Exactly. And one thing that I think is additionally fascinating about this approach: it's a common problem with both in-data-center, self hosted, and monolithic apps. Think about the state of Adobe Experience Manager, or CMSes generally, in 2012. If you wanted to integrate anything, the point of integration was this really feature heavy OSGi monster that AEM was. You could say, good, plug everything into the author, just plug it all in and we'll publish it. Obviously, then you've got that bottleneck. So we've moved away from that. But even if you take a brand new, completely non-AEM architecture, let's say you've got Contentful or some other headless CMS operating as your CMS, and let's say you've got, whatever, a chain of pastry shops or restaurants or something like that. So you've got the thing that plugs into Uber Eats and DoorDash and so on, that allows you to place and take orders and transmit them to stores. You've got something that's centralizing your pricing. You've got something that is your credit card processor.
So you've got a number of different systems, and in a lot of cases those are all composited at the client side, where the client side web app or mobile app is making backend calls to each one of these individual services. You sign up for them and they say, oh yeah, we've got a 99% SLA, or three nines, or whatever they tell you. And then they go down. Or a bunch of orders come in and they're like, oh, that's a DoS attack, and they shut your API down because they think you're DoSing the backend service. There are a million reasons why the backend services can go down, and a million performance problems that can happen when you have a client side responsible for 100% of the backend communications to things that could all be changing. So you basically have a user experience that in the best case is okay, and in the worst case is totally unacceptable, just a broken app. And it's left flapping in the breeze depending on the state of all these backend systems. When you run RUM, real user monitoring, on some of these things, the experience is all over the map, because we're letting clients handle all these backend communications, and who knows what the state of the service is. And Edge Delivery Service is another one of these, where the way it's managing content is brilliant. I think it's a brilliant solution, a really neat way to turn a lot of that content management on its head. Doc based authoring is wonderful, and everybody I've seen who has touched it says, this is really a cool way to author content. But you now have the problem of those client side connections trying to go back to the backend.
So how do you see a solution like StreamX helping somebody who's gone all in on Edge Delivery, or a client side solution like that?
[00:47:36] Speaker B: So yeah, I do agree that Edge Delivery is a step in the right direction: utilizing functions, utilizing cloud native services, and most of all utilizing the CDN for ridiculously fast delivery, right? But I've been seeing a lot of Edge Delivery implementations. Actually I'm tracking it; whenever something is available, I'm researching how it works, what is solved and what's not. And the source systems are, I would say, the problem. How do you handle that? I mean, okay, you said that you can call backend systems, but those backend systems need to be scalable, need to be fast, need to be close to the user, need to have actual data. So in fact you need to implement all of that using your internal development team, and if they fail to do that, the promise of having super fast, high performance, Lighthouse 100 pages will not be met. So I believe that with our architecture we are a great addition to Edge Delivery Service, because you can build your website on Edge Delivery, you can author your content in documents, and then you can plug in any sources that you have, with the real time data processing, to build the products, the JSONs, HTML fragments, the search service, the recommendations, and then put those services with the final products close to the customer. Again, you put that just behind the CDN. It's not on the edge, but it's in the serving network and it's very close to the edge. You could eventually deploy it to the edge, but the maintenance would be rather hard. I still believe it's the best option you can have, because what's the alternative? Alternatively, you can deploy some workers on a CDN and fetch that data into the storage within the CDN, which is radically expensive, because you've got 200 data centers and that storage needs to be basically everywhere, and still they are limited in capacity, so you cannot push all the data there.
So I see it like this: you can use workers and data in a CDN for the critical things, small chunks of data that you need to put really, really close to the customer. But if we are talking about gigabytes of data, all that SKU product information, recommendations, et cetera, that's changing all the time, and you need to be in control of when it's published or not, then our architecture, the real time digital experience mesh, can fix it. And it will offload your frontend application, which will also improve the customer experience, because you won't have all those JavaScript integrations: call this service, call that service, take those two things together.
If it's user A then display this, if it's user B then display that. So we can employ microservices that will do the business logic for you, so that the frontend is just the presentation layer. With that approach you will have the speed that you would expect from Edge Delivery Service.
Actually, there is one more thing: data orchestration. It's not trivial to build a search index when your content comes from documents and from the PIM, and you probably still have some pages on your AEM instance, because, yeah, you built one site, like the careers pages or blog posts, on Edge Delivery, but you still have your main site and everything in AEM, so you haven't migrated fully yet. With StreamX we can listen for the updates from Edge Delivery, and we can listen for the updates from AEM, and we can listen for the updates from the PIM, for example, and again create a unified data layer. So the search that we build will be the same search that the AEM site is using and the Edge Delivery site is using.
And the sitemap, for example, will be built from pages from AEM, from Edge Delivery, or eventually from pages generated out of the PIM, based on the template that you build.
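The unified sitemap idea, merging page lists from AEM, Edge Delivery, and PIM-generated pages into one document, can be sketched in a few lines. This is an illustrative toy, not StreamX code; the source names and the last-write-wins rule on duplicate URLs are assumptions for the example.

```python
def unified_sitemap(sources):
    """Merge page URLs from several publishing sources into one sitemap.

    sources: dict of source name -> list of (url, lastmod) tuples.
    Sources listed later win on duplicate URLs, a simple
    last-publish-wins rule.
    """
    merged = {}
    for name, pages in sources.items():
        for url, lastmod in pages:
            merged[url] = lastmod
    entries = [
        f"<url><loc>{url}</loc><lastmod>{lastmod}</lastmod></url>"
        for url, lastmod in sorted(merged.items())
    ]
    return "<urlset>" + "".join(entries) + "</urlset>"

sitemap = unified_sitemap({
    "aem": [("/about", "2024-01-01")],
    "edge-delivery": [("/blog/post-1", "2024-02-01")],
    "pim": [("/products/sku-1", "2024-03-01")],
})
```

In a streaming setup each source publishes page events to its own topic, and the sitemap service consumes all three topics and re-emits the merged document whenever any of them changes.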
[00:53:12] Speaker A: And you're doing that not as a batch.
It's hard to say real time, because it's not real time, but it could be only a few seconds' delay. Because, like you're saying, let's say you take that crazy architecture that you and I both feel spiritually.
So if you've got a PIM update. Let's just say your search engine gives you recommendations, say some search recommendations coming out of that, or you've got a product page with a bunch of recommended products that are supposed to go along with it. That gets built based on the data that you indexed. If you're only indexing once in the middle of the night, and you publish a new item that should show up as a recommended item for all these other things, in that architecture it may be 24 hours before it starts to see any traffic.
[00:54:15] Speaker B: Yeah. And actually I may be wrong, but I may not.
In Edge Delivery Service you get a page indexer. So if you publish pages, it indexes those pages so you can put a listing on your Edge Delivery site, and it's built-in functionality within Edge Delivery. It takes, I would say, like 2 hours. But I may be wrong; my information may be outdated.
[00:54:47] Speaker A: It's an unreliable amount of time, because it's going on a service bus, and it could be seconds or it could be a little while.
[00:54:58] Speaker B: Yeah. So with StreamX, the processing pipeline again takes milliseconds. If you've got a really large search index, it can take seconds to index new documents, especially if you publish 50,000 SKUs.
But it's something that you are in control of. So if you decide that it's too slow, you can scale.
And I really believe that Edge Delivery solves a lot of problems. It solves authoring problems, because your authoring teams do not need to type content into docs and then send it to someone who just builds components out of what is written; they can literally eliminate the step of rewriting from docs into AEM, because this is how it worked for a lot of people. It solves the problem of global delivery at great speed. It solves the problem of building Lighthouse-optimized pages, so it's not like a fat React or Angular app that takes seconds to fetch all those pieces in order to display the first page; it's ridiculously fast. So I really believe that if you make the switch to Edge Delivery, you deserve a backend that matches it. Because, you know, if you buy a great car and put cheap tires on it, you won't break a lap record on a track.
[00:56:41] Speaker A: Yeah, yeah, I have experience with that.
Yeah. Yeah. I had a high performance car that I bought cheap tires for and that was a bad idea.
[00:56:48] Speaker B: It's good for burning rubber, probably.
[00:56:50] Speaker A: Yep, that's about it. That's about it. But it's funny, taking something like this, especially now that we've been talking about it for a little bit, because there are so many versions of this type of problem that you and I have had to deal with over the years. Whether it's a big enterprise that sells lots and lots of products, whether it's a restaurant, a bank, whatever it is, there are always backend systems. There are always challenges of various sorts.
One of the funnest things that I can think of would be redesigning a complicated architecture like this. If somebody said, oh, we would love to go to Edge Delivery, there's just no way we would be able to do our backend connections, we'd be able to say: got an idea for you.
And to redesign something like this with a service mesh like we've been discussing.
[00:57:52] Speaker B: Yeah, that would be great. A challenge, but yeah, I like challenges.
[00:57:59] Speaker A: Yeah. But the thing about challenges, that's one thing. If you've got a challenge where, you know, you don't really have a fair shot at implementing it well, it's challenging, but it's kind of like if I enter a race right now, I'm not going to place as well as I did when I was 20.
It's a challenge, but it's going to be a challenge just to finish.
But in this case, with tech like we have available right now, and with yours, it seems like even with a complicated architecture and wanting to move to Edge Delivery, you would have a fair shot at a high performance architecture for all your different backend systems.
[00:58:52] Speaker B: I think that what's great about modernizing existing complicated architectures and making them composable and high performance is that it doesn't require the customer to do a complex migration. You've got your DXP, you've got an architecture like the one that was presented, or something similar, and you can take a piece of the pages. Like, okay, let's think of your product pages.
We'll fix that for you in three weeks, let's say.
So we can deploy StreamX, we can install a connector that is already developed, and we can just redirect some of the publications to our, let's say, blueprint services, which are also already ready. And once the customer is happy with it (you know, it was risk free, it was fast, and they see the benefits), okay, we migrated the product pages, so let's do it for another part of the site. Maybe corporate, maybe help, maybe our blog posts. And this is actually very interesting: we had a demo where we had an Edge Delivery site, an AEM site, and StreamX in the middle. StreamX fetched the data from AEM, from the PIM system, and from documents, and we actually used the same data to produce listings on the AEM site and on Edge Delivery. So you can stay with your AEM and boost it, accelerate the site with StreamX; it's your accelerator.
And then you can decide to use Edge Delivery Service and use the search, and use the components and data that you already have. And that's the beauty of message oriented architecture.
Everything is based on contracts: the publishing contract, and the contract of the data that you have on your delivery services. As long as your site can consume, or your source can publish, it will work. I mean, the model is that all the messages just need to match. So it's not one code base that works here and only here under some special conditions.
So it's definitely a platform that is future proof. And if a customer decides that they want to see the same recommendations they have on the website in the mobile app, then we just need to consume the same JSONs from the services in that application. So I really believe that it's true composability, built for the web. It solves Internet problems, modern web problems: latency, throughput, scalability, availability and so on. But it also allows you to build elegant architectures that will still be valid in a year. So it's maybe not the first backend that you think of when you buy an AEM license, but it will definitely be worth it when the project runs over the years.
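The "messages just need to match the contract" idea can be sketched as a simple structural check: a consumer accepts any producer whose messages carry the fields the contract names, regardless of which system emitted them. This is an illustrative toy, not StreamX's actual contract mechanism (which the conversation doesn't detail); the field names and the type-based check are assumptions.

```python
def matches_contract(message, contract):
    """Return True if the message carries every field the contract
    names, with a value of the expected type. Extra fields are fine,
    which is what lets new producers and consumers plug in later."""
    for field, expected_type in contract.items():
        if field not in message or not isinstance(message[field], expected_type):
            return False
    return True

# A hypothetical product-publication contract.
PRODUCT_CONTRACT = {"sku": str, "name": str, "price": float}

ok = matches_contract(
    {"sku": "S1", "name": "Tent", "price": 99.0, "source": "pim"},
    PRODUCT_CONTRACT,
)
bad = matches_contract({"sku": "S1", "price": "99"}, PRODUCT_CONTRACT)
```

Here `ok` passes despite the extra `source` field, while `bad` fails on the missing `name` and the wrongly typed `price`. Real systems typically express this with a schema registry (Avro or JSON Schema) rather than hand-rolled checks, but the contract-first principle is the same.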
[01:02:25] Speaker A: I love it.
All right. Well, I feel like if I kept bringing up diagrams we could keep talking about this, but I think we'll cut this at an hour. And I really hope that somebody watching this has a crazy architecture in mind that they wouldn't mind giving us a shot at redoing, because this is a fun new era of making CMSes fun again, because we have a fair shot at high performance for all these different use cases, even when you're talking to things that aren't supposed to be performant.
[01:03:13] Speaker B: Yep, exactly. So bring the fun again.
[01:03:16] Speaker A: Yeah, we know.
All right, good. Well, thanks so much for coming on the podcast, and we'll talk again soon.
[01:03:27] Speaker B: Yeah, thank you. Thank you for the invitation.