Microsoft Ignite 2024
Introducing Azure AI Agent Service
Discover how Azure AI Agent Service, introduced at Microsoft Ignite 2024, is transforming the development and deployment of AI agents. This service empowers developers to build, deploy, and scale high-quality AI agents tailored to business needs within hours. With features like rapid development, extensive data connections, flexible model selection, and enterprise-grade security, Azure AI Agent Service sets a new standard in AI automation.
Ignite 2024: Bidirectional real-time audio streaming with Azure Communication Services

Today at Microsoft Ignite, we are excited to announce the upcoming preview of bidirectional audio streaming for the Azure Communication Services Call Automation SDK, which unlocks new possibilities for developers and businesses. This capability enables seamless, low-latency, real-time communication when integrated with services like Azure OpenAI and its real-time voice APIs, significantly enhancing how businesses can build and deploy conversational AI solutions.

With the advent of new AI technologies, companies are developing solutions to reduce customer wait times and improve the overall customer experience. To achieve this, many businesses are turning to AI-powered agents. These agents must be able to converse with customers in a human-like manner while maintaining very low latencies to ensure smooth interactions. This is especially critical in the voice channel, where any delay can significantly impact the fluidity and natural feel of the conversation. With bidirectional streaming, businesses can now elevate their voice solutions to low-latency, human-like, interactive conversational AI agents.

Our bidirectional streaming APIs enable developers to stream audio from an ongoing call on Azure Communication Services to their web server in real time. On the server, powerful language models interpret the caller's query and stream the responses back to the caller. All this is accomplished while maintaining low latency, so the caller feels like they are speaking to a human. One example is to take the audio streams, process them through Azure OpenAI's real-time voice API, and then stream the responses back into the call.

With the integration of bidirectional streaming into the Azure Communication Services Call Automation SDK, developers have new tools to innovate:

- Leverage conversational AI solutions: Develop sophisticated customer support virtual agents that can interact with customers in real time, providing immediate responses and solutions.
- Personalized customer experiences: By harnessing real-time data, businesses can offer more personalized and dynamic customer interactions, leading to increased satisfaction and loyalty.
- Reduce wait times for customers: By using bidirectional audio streams in combination with Large Language Models (LLMs), you can build virtual agents that serve as the first point of contact, reducing the need for customers to wait for a human agent to become available.

Integrating with real-time voice-based Large Language Models (LLMs)

With the advancements in voice-based LLMs, developers want to take advantage of services like bidirectional streaming and send audio directly between the caller and the LLM. Today we'll show you how you can start audio streaming through Azure Communication Services. Developers can start bidirectional streaming at the time of answering the call by providing the WebSocket URL.

```csharp
// Answer the call with bidirectional media streaming enabled
websocketUri = appBaseUrl.Replace("https", "wss") + "/ws";

var options = new AnswerCallOptions(incomingCallContext, callbackUri)
{
    MediaStreamingOptions = new MediaStreamingOptions(
        transportUri: new Uri(websocketUri),
        contentType: MediaStreamingContent.Audio,
        audioChannelType: MediaStreamingAudioChannel.Mixed,
        startMediaStreaming: true)
    {
        EnableBidirectional = true,
        AudioFormat = AudioFormat.Pcm24KMono
    }
};
```

At the same time, you should open your connection with the Azure OpenAI real-time voice API.
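As a rough illustration of that second step, the sketch below opens a WebSocket connection to the Azure OpenAI real-time API from the same server and configures the session for PCM audio. The resource name, deployment name, API version, and session fields are placeholders and assumptions; check the current real-time API reference for the exact values your deployment expects.

```csharp
// Minimal sketch (assumptions noted above): connect to the Azure OpenAI real-time API over WebSocket.
using System.Net.WebSockets;
using System.Text;

var openAiWebSocket = new ClientWebSocket();
openAiWebSocket.Options.SetRequestHeader("api-key", Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY"));

// Placeholder resource/deployment names and API version - adjust to your environment.
var realtimeUri = new Uri(
    "wss://<your-resource>.openai.azure.com/openai/realtime" +
    "?api-version=2024-10-01-preview&deployment=gpt-4o-realtime-preview");

await openAiWebSocket.ConnectAsync(realtimeUri, CancellationToken.None);

// Ask the service for PCM in/out so it lines up with the 24 kHz PCM stream coming from ACS.
// Field names follow the public real-time API reference; verify them against your API version.
var sessionUpdate = """
{
  "type": "session.update",
  "session": {
    "modalities": ["text", "audio"],
    "input_audio_format": "pcm16",
    "output_audio_format": "pcm16",
    "turn_detection": { "type": "server_vad" }
  }
}
""";

await openAiWebSocket.SendAsync(
    new ArraySegment<byte>(Encoding.UTF8.GetBytes(sessionUpdate)),
    WebSocketMessageType.Text,
    endOfMessage: true,
    CancellationToken.None);
```

From here, audio frames received from the ACS WebSocket can be forwarded to this connection (as input_audio_buffer.append events, per the real-time API reference), and audio coming back can be relayed into the call using the snippets below.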
Once the WebSocket connection is set up, Azure Communication Services starts streaming audio to your web server. From there you can relay the audio to the Azure OpenAI voice API and vice versa. Once the LLM reasons over the content provided in the audio, it streams audio back to your service, which you can stream back into the Azure Communication Services call. (More information about how to set this up will be made available after Ignite.)

```csharp
// Receive streaming data from Azure Communication Services over the WebSocket
private async Task StartReceivingFromAcsMediaWebSocket()
{
    if (m_webSocket == null) return;

    try
    {
        while (m_webSocket.State == WebSocketState.Open)
        {
            byte[] receiveBuffer = new byte[2048];
            WebSocketReceiveResult receiveResult = await m_webSocket.ReceiveAsync(
                new ArraySegment<byte>(receiveBuffer), m_cts.Token);

            if (receiveResult.MessageType == WebSocketMessageType.Close)
            {
                break;
            }

            // This sample assumes each receive holds one complete JSON payload.
            var data = Encoding.UTF8.GetString(receiveBuffer).TrimEnd('\0');

            if (StreamingData.Parse(data) is AudioData audioData)
            {
                using var ms = new MemoryStream(audioData.Data);
                await m_aiServiceHandler.SendAudioToExternalAI(ms);
            }
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception -> {ex}");
    }
}
```

Streaming audio data back into Azure Communication Services

```csharp
// Create and serialize the outbound streaming data
private void ConvertToAcsAudioPacketAndForward(byte[] audioData)
{
    var audio = new OutStreamingData(MediaKind.AudioData)
    {
        AudioData = new AudioData(audioData)
    };

    // Serialize the object to a JSON string
    string jsonString = System.Text.Json.JsonSerializer.Serialize<OutStreamingData>(audio);

    // Queue the async send operation for later execution
    try
    {
        m_channel.Writer.TryWrite(async () => await m_mediaStreaming.SendMessageAsync(jsonString));
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception received on ReceiveAudioForOutBound {ex}");
    }
}

// Send the encoded data over the WebSocket to Azure Communication Services
public async Task SendMessageAsync(string message)
{
    if (m_webSocket?.State == WebSocketState.Open)
    {
        byte[] jsonBytes = Encoding.UTF8.GetBytes(message);

        // Send the PCM audio chunk over the WebSocket
        await m_webSocket.SendAsync(
            new ArraySegment<byte>(jsonBytes),
            WebSocketMessageType.Text,
            endOfMessage: true,
            CancellationToken.None);
    }
}
```

To reduce developer overhead when integrating with voice-based LLMs, Azure Communication Services supports a new sample rate of 24 kHz, eliminating the need for developers to resample audio data and helping preserve audio quality in the process.

Next steps

The SDK and documentation will be available in the next few weeks after this announcement, offering tools and information to integrate bidirectional streaming and utilize voice-based LLMs in your applications. Stay tuned and check our blog for updates!
Public Preview: The New AKS Monitoring Experience

We're excited to announce the public preview of our enhanced Monitoring experience for Azure Kubernetes Service (AKS). This redesign of the existing Insights experience brings comprehensive monitoring capabilities into a single, streamlined view, addressing some of the most common challenges users face when managing their AKS clusters.

Our new Monitoring experience provides both basic (free) and detailed insights (with Prometheus metrics and logging enabled), offering a unified, single-pane-of-glass experience. The basic experience is available for all AKS users with no configuration required at all.

A significant benefit of this new experience is in diagnosing pod deployment failures. In the past, identifying pending or failed pods could be a cumbersome process. With the new KPI Card for Pod Status, you can now quickly pinpoint and address these issues before they escalate, ensuring smoother deployments and reduced downtime.

Another key scenario where this enhanced view shines is investigating node resource issues. Understanding node readiness and capacity is crucial for efficient cluster management. The Node Readiness Status card, along with detailed CPU and memory usage metrics, provides clear insights into whether your nodes are fully prepared to host pods. This helps prevent resource bottlenecks and optimizes the overall performance of your cluster.

Ensuring cluster health during a scaling operation has never been easier. The new Summary Card for Events helps you monitor Kubernetes warning events and pending pod states, making it simple to track and respond to spikes. This ensures your cluster scales smoothly and efficiently, without unexpected hitches that could disrupt your services.

Additionally, troubleshooting latency and connectivity issues in AKS is now more straightforward. With enhanced insights into node saturation metrics, including VMSS OS Disk Bandwidth and IOPS consumption, you can quickly identify and resolve issues causing latency. Detailed ETCD monitoring and Load Balancer metrics, such as % SNAT Port Usage, provide critical data to maintain optimal cluster performance, keeping your applications running smoothly.

The following comparison table highlights what data comes out of the box for free for ALL AKS users. When you upgrade, you get all the same data collected in the newer Prometheus format, as well as access to richer metrics and logs for your core troubleshooting scenarios.

| Basic tier metrics | Additional metrics in upgraded experience |
| --- | --- |
| Alert summary card | Historical Kubernetes events (30 days) |
| Events summary card | Warning events by reason |
| Pod status KPI card | Namespace CPU and memory % |
| Node status KPI card | Container logs by volume |
| Node CPU and memory % | Top five controllers by logs volume |
| VMSS OS disk bandwidth consumed % (max) | Packets dropped I/O |
| VMSS OS disk IOPS consumed % (max) | Load balancer SNAT port usage |

We're committed to providing you with the tools you need to manage and optimize your AKS clusters effectively. Explore the new Monitoring experience in the Azure portal today and experience the future of AKS monitoring!
New Da/Ea/Fav6 VMs with increased performance and Azure Boost are now generally available

By Sasha Melamed, Senior Product Manager, Azure Compute

We are excited to announce general availability of the new Dalsv6, Dasv6, Easv6, Falsv6, Fasv6, and Famsv6-series Azure Virtual Machines (VMs) based on the 4th Gen AMD EPYC™ processor (Genoa). These VMs deliver significantly improved performance and price/performance versus the prior Dasv5 and Easv5 VMs, NVMe connectivity for faster local and remote storage access, and Azure Boost for improved performance and enhanced security. With the broad selection of compute, memory, and storage configurations available with these new VM series, there is a best-fit option for a wide range of workloads.

What's New

The new Dalsv6, Dasv6, and Easv6 VMs are offered with vCPU counts ranging from 2 to 96 vCPUs. The new general purpose and memory optimized VMs come in a variety of memory (GiB)-to-vCPU ratios, including the Dalsv6 at 2:1, Dasv6 at 4:1, and Easv6 at 8:1. The VMs are also available with and without a local disk, so you can choose the option that best fits your workload. Workloads can expect up to 20% CPU performance improvement over the Dasv5 and Easv5 VMs and up to 15% better price/performance.

Further expanding our offerings, we are proud to introduce the first compute-optimized VM series based on AMD processors, also in three memory-to-vCPU ratios. The new Falsv6, Fasv6, and Famsv6 VMs offer the fastest x86 CPU performance in Azure and up to 2x CPU performance improvement over our previous v5 VMs.

We are excited to announce that the new Dalsv6, Dasv6, Easv6, and suite of Fasv6 virtual machines are powered by Azure Boost. Azure Boost has been providing benefits to millions of existing Azure VMs in production today, such as enabling exceptional remote storage performance and significant improvements in networking throughput and latency. Our latest Azure Boost infrastructure innovation, in combination with the new AMD-based VMs, delivers improvements in performance, security, and reliability. The platform provides sub-second servicing capabilities for the most common infrastructure updates, delivering a 10x reduction in impact. To learn more about Azure Boost, read our blog.

To drive the best storage performance for your workloads, the new AMD-based VMs come with the NVMe interface for local and remote disks. Many workloads will benefit from improvements over the previous generation of AMD-based VMs, with up to:

- 80% better remote storage performance
- 400% faster local storage speeds
- 25% networking bandwidth improvement
- 45% higher NVMe SSD capacity per vCPU for Daldsv6, Dadsv6, and Eadsv6-series VMs with local disks

The 4th Gen AMD EPYC™ processors provide new capabilities for these VMs, including:

- Always-On Transparent Secure Memory Encryption, ensuring that your sensitive information remains secure without compromising performance.
- AVX-512 to handle compute-intensive tasks such as scientific simulations, financial analytics, AI, and machine learning.
- Vector Neural Network Instructions, enhancing the performance of neural network inference operations and making it easier to deploy and scale AI solutions.
- Bfloat16 for efficient training and inference of deep learning models, providing a balance between performance and precision.

Dasv6, Dadsv6, Easv6, Eadsv6, Fasv6, and Fadsv6-series VMs are SAP certified. Whether you're running a simple test infrastructure, mission-critical enterprise applications, high-performance computing tasks, or AI workloads, our new VMs are ready to meet your needs.
Explore the new capabilities and start leveraging the power of Azure today!

General-purpose workloads

The new Dasv6-series VMs offer a balanced ratio of memory to vCPU performance and increased scalability, up to 96 vCPUs and 384 GiB of RAM. The new Dalsv6-series VMs are ideal for workloads that require less RAM per vCPU, with a maximum of 192 GiB of RAM. The Dalsv6 series is the first 2 GiB-per-vCPU memory offering in our family of AMD-based VMs and can reduce your costs when running non-memory-intensive applications, including web servers, gaming, video encoding, AI/ML, and batch processing. The Dasv6-series VMs work well for many general computing workloads, such as e-commerce systems, web front ends, desktop virtualization solutions, customer relationship management applications, entry-level and mid-range databases, application servers, and more.

| Series | vCPU | Memory (GiB) | Max Local NVMe Disk (GiB) | Max IOPS for Local Disk | Max Uncached Disk IOPS for Managed Disks | Max Managed Disks Throughput (MBps) |
| --- | --- | --- | --- | --- | --- | --- |
| Dalsv6 | 2-96 | 4-192 | N/A | N/A | 4 - 172K | 90 - 4,320 |
| Daldsv6 | 2-96 | 4-192 | 1x110 - 6x880 | 1.8M | 4 - 172K | 90 - 4,320 |
| Dasv6 | 2-96 | 8-384 | N/A | N/A | 4 - 172K | 90 - 4,320 |
| Dadsv6 | 2-96 | 8-384 | 1x110 - 6x880 | 1.8M | 4 - 172K | 90 - 4,320 |

Memory-intensive workloads

For more memory-demanding workloads, the new Easv6-series VMs offer high memory-to-vCPU ratios with increased scalability, up to 96 vCPUs and 672 GiB of RAM. The Easv6-series VMs are ideal for memory-intensive enterprise applications, data warehousing, business intelligence, in-memory analytics, and financial transactions.

| Series | vCPU | Memory (GiB) | Max Local NVMe Disk (GiB) | Max IOPS for Local Disk | Max Uncached Disk IOPS for Managed Disks | Max Managed Disks Throughput (MBps) |
| --- | --- | --- | --- | --- | --- | --- |
| Easv6 | 2-96 | 16-672 | N/A | N/A | 4 - 172K | 90 - 4,320 |
| Eadsv6 | 2-96 | 16-672 | 1x110 - 6x880 | 1.8M | 4 - 172K | 90 - 4,320 |

Compute-intensive workloads

For compute-intensive workloads, the new Falsv6, Fasv6, and Famsv6 VM series come without Simultaneous Multithreading (SMT), meaning a vCPU equals one physical core. These VMs are the best fit for workloads demanding the highest CPU performance, such as scientific simulations, financial modeling and risk analysis, gaming, and video rendering.

| Series | vCPU | Memory (GiB) | Max Uncached Disk IOPS for Managed Disks | Max Managed Disks Throughput (MBps) | Max Network Bandwidth (Gbps) |
| --- | --- | --- | --- | --- | --- |
| Falsv6 | 2-64 | 4-128 | 4 - 115K | 90 - 2,880 | 12.5 - 36 |
| Fasv6 | 2-64 | 8-256 | 4 - 115K | 90 - 2,880 | 12.5 - 36 |
| Famsv6 | 2-64 | 16-512 | 4 - 115K | 90 - 2,880 | 12.5 - 36 |

Customers are excited about new AMD v6 VMs

FlashGrid offers software solutions that help Oracle Database users on Azure achieve maximum database uptime and minimize the risk of outages.

"The Easv6 series VMs make it easier to support Oracle RAC workloads with heavy transaction processing on Azure using FlashGrid Cluster. The NVMe protocol enhances disk error handling, which is important for failure isolation in high-availability database architectures. The CPU boost frequency of 3.7 GHz and higher network bandwidth per vCPU enable database clusters to handle spikes in client transactions better while keeping a lower count of vCPU to limit licensing costs. The Easv6 VMs have passed our extensive reliability and compatibility testing and are now available for new deployments and upgrades." -- Art Danielov, CEO, FlashGrid Inc.

Helio is a platform for large-scale computing workloads, optimizing for costs, scale, and emissions.
Its main focus is 3D rendering.

"Our architectural and media & entertainment (VFX) 3D rendering workloads have been accelerated by an average of ~42% with the new v6 generation, while maintaining low cost and high scale. In addition, we are seeing significant improvements in disk performance with the new NVMe interface, resulting in much faster render asset load times." -- Kevin Häfeli, CEO / Cofounder, Helio AG

Silk's Software-Defined Cloud Storage delivers unparalleled price/performance for the most demanding, real-time applications.

"Silk has tested the new Da/Eav6 VM offering from Azure and we are looking forward to enabling our customers to benefit from its new capabilities, allowing higher throughput at lower cost, while providing increased reliability." -- Adik Sokolovski, Chief R&D Officer, Silk

ZeniMax Online Studios creates online RPG worlds where you can play and create your own stories.

"The new VMs we tested provided a significant performance boost in our build tasks. The super-fast storage not only made the workflows smoother and faster, but it also helped highlight other bottlenecks in our design and allowed us to improve our pipeline overall. We are excited for their availability and plan on utilizing these machines to expand our workload in Azure." -- Merrick Moss, Product Owner, ZeniMax Online Studios

Getting started

The new VMs are now available in the East US, East US 2, Central US, South Central US, West US 3, West Europe, and North Europe regions, with more to follow. Check out pricing on the respective pages for Windows and Linux. You can learn more about the new VMs in the documentation for the Dal-series, Da-series, Ea-series, and Fa-series. We also recommend reading the NVMe overview and FAQ. You can find the Ultra Disk and Premium SSD v2 regional availability to pair with the new NVMe-based v6 series at their respective links.
Introducing Serverless GPUs on Azure Container Apps

We're excited to announce the public preview of Azure Container Apps serverless GPUs, accelerated by NVIDIA. This feature provides customers with NVIDIA A100 GPUs and NVIDIA T4 GPUs in a serverless environment, enabling effortless scaling and flexibility for real-time custom model inferencing and other machine learning tasks.

Serverless GPUs accelerate your AI development team by allowing you to focus on your core AI code and less on managing infrastructure when using NVIDIA accelerated computing. They provide an excellent middle-layer option between the Azure AI Model Catalog's serverless APIs and hosting models on managed compute, and they offer full data governance, as your data never leaves the boundaries of your container, while still providing a managed, serverless platform from which to build your applications. Serverless GPUs are designed to meet the growing demands of modern applications by providing powerful NVIDIA accelerated computing resources without the need for dedicated infrastructure management.

"Azure Container Apps' serverless GPU offering is a leap forward for AI workloads. Serverless NVIDIA GPUs are well suited for a wide array of AI workloads from real-time inferencing scenarios with custom models to fine-tuning. NVIDIA is also working with Microsoft to bring NVIDIA NIM microservices to Azure Container Apps to optimize AI inference performance." -- Dave Salvator, Director, Accelerated Computing Products, NVIDIA

Key benefits of serverless GPUs

- Scale-to-zero GPUs: Support for serverless scaling of NVIDIA A100 and T4 GPUs.
- Per-second billing: Pay only for the GPU compute you use.
- Built-in data governance: Your data never leaves the container boundary.
- Flexible compute options: Choose between NVIDIA A100 and T4 GPUs.
- Middle layer for AI development: Bring your own model on a managed, serverless compute platform.

Scenarios

Whether you choose NVIDIA A100 or T4 GPUs depends on the types of apps you're creating. The following are a couple of example scenarios. For each scenario with serverless GPUs, you pay only for the compute you use with per-second billing, and your apps automatically scale in and out from zero to meet demand.

NVIDIA T4

- Real-time and batch inferencing: Using custom open-source models with fast startup times, automatic scaling, and a per-second billing model, serverless GPUs are ideal for dynamic applications that don't already have a serverless API in the model catalog.

NVIDIA A100

- Compute-intensive machine learning scenarios: Significantly speed up applications that implement fine-tuned custom generative AI models, deep learning, or neural networks.
- High-performance computing (HPC) and data analytics: Applications that require complex calculations or simulations, such as scientific computing and financial modeling, as well as accelerated data processing and analysis of massive datasets.

Get started with serverless GPUs

Serverless GPUs are now available for workload profile environments in the West US 3, Australia East, and Sweden Central regions, with more regions to come. You will need to have quota enabled on your subscription in order to use serverless GPUs. By default, all Microsoft Enterprise Agreement customers will have one quota. If additional quota is needed, please request it here.

Note: In order to achieve the best performance with serverless GPUs, use an Azure Container Registry (ACR) with artifact streaming enabled for your image tag. Follow the steps here to enable artifact streaming on your ACR.
From the portal, you can enable GPUs for your Consumption app in the container tab when creating your Container App or your Container App Job. You can also add a new consumption GPU workload profile to your existing Container App environment through the workload profiles UX in the portal or through the CLI commands for managing workload profiles.

Deploy a sample Stable Diffusion app

To try out serverless GPUs, you can use the stable diffusion image, which is provided as a quickstart during the container app create experience:

1. In the container tab, select the Use quickstart image box.
2. In the quickstart image dropdown, select GPU hello world container.

If you wish to pull the GPU container image into your own ACR to enable artifact streaming for improved performance, or if you wish to manually enter the image, you can find the image at mcr.microsoft.com/k8se/gpu-quickstart:latest. For full steps on using your own image with serverless GPUs, see the tutorial on using serverless GPUs in Azure Container Apps.

Learn more about serverless GPUs

With serverless GPUs, Azure Container Apps now simplifies the development of your AI applications by providing scale-to-zero compute, pay-as-you-go pricing, reduced infrastructure management, and more. To learn more, visit:

- Using serverless GPUs in Azure Container Apps (preview) | Microsoft Learn
- Tutorial: Generate images using serverless GPUs in Azure Container Apps (preview) | Microsoft Learn
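To round out the picture, here is a rough .NET sketch of what a client might look like once a GPU-backed inference container is running on Container Apps. The app URL, the /generate route, and the request/response shape are purely hypothetical placeholders; substitute whatever API your own container actually exposes.

```csharp
// Hypothetical client sketch: POST a prompt to a GPU-backed container app and save the returned image.
// The URL, route, and payload below are placeholders, not part of the quickstart image's documented API.
using System.Net.Http.Json;

using var http = new HttpClient { BaseAddress = new Uri("https://<your-container-app>.azurecontainerapps.io") };

var response = await http.PostAsJsonAsync("/generate", new { prompt = "a lighthouse at sunrise, oil painting" });
response.EnsureSuccessStatusCode();

// Assume for this sketch that the service returns raw PNG bytes.
byte[] imageBytes = await response.Content.ReadAsByteArrayAsync();
await File.WriteAllBytesAsync("output.png", imageBytes);
Console.WriteLine($"Saved {imageBytes.Length} bytes to output.png");
```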
Automating Linux Quality Assurance with LISA on Azure

Introduction

Building on the insights from our previous blog about how Microsoft ensures the quality of Linux images, this article elaborates on the open-source tools that are instrumental in securing exceptional performance, reliability, and overall excellence of virtual machines on Azure.

While numerous testing tools are available for validating Linux kernels, guest OS images, and user-space packages across various cloud platforms, finding a comprehensive testing framework that addresses the entire platform stack remains a significant challenge. A robust framework is essential: one that seamlessly integrates with Azure's environment while providing coverage for major testing tools, such as LTP and kselftest, and covering critical areas like networking and storage as well as specialized workloads, including Confidential VMs, HPC, and GPU scenarios. Such a unified testing framework is invaluable for developers, Linux distribution providers, and customers who build custom kernels and images.

This is where LISA (Linux Integration Services Automation) comes into play. LISA is an open-source tool specifically designed to automate and enhance the testing and validation processes for Linux kernels and guest OS images on Azure. In this blog, we cover the history of LISA, its key advantages, the wide range of test cases it supports, and why it is an indispensable resource for the open-source community. Moreover, LISA is available under the MIT License, making it free to use, modify, and contribute to.

History of LISA

LISA was initially developed as an internal tool by Microsoft to streamline the testing process of Linux images and kernel validations on Azure. Recognizing the value it could bring to the broader community, Microsoft open-sourced LISA, inviting developers and organizations worldwide to leverage and enhance its capabilities. This move aligned with Microsoft's growing commitment to open-source collaboration, fostering innovation and shared growth within the industry.

LISA serves as a robust solution to validate and certify that Linux images meet the stringent requirements of modern cloud environments. By integrating LISA into the development and deployment pipeline, teams can:

- Enhance Quality Assurance: Catch and resolve issues early in the development cycle.
- Reduce Time to Market: Accelerate deployment by automating repetitive testing tasks.
- Build Trust with Users: Deliver stable and secure applications, bolstering user confidence.
- Collaborate and Innovate: Leverage community-driven improvements and share insights.

Benefits of Using LISA

- Scalability: Designed to run test passes at any scale, from a single test case to 10,000 test cases in one command.
- Multiple platform orchestration: LISA is built with a modular design to run the same test cases on various platforms, including Microsoft Azure, Windows Hyper-V, bare metal, and other cloud-based platforms.
- Customization: Users can customize test cases, the workflow, and other components to fit specific needs, allowing for targeted testing strategies such as building kernels on the fly or sending results to a custom database.
- Community Collaboration: Being open source under the MIT License, LISA encourages community contributions, fostering continuous improvement and shared expertise.
- Extensive Test Coverage: It offers a rich suite of test cases covering various aspects of compatibility between Azure and Linux VMs, from the kernel and storage to networking and middleware.
How it works

Infrastructure

LISA is designed to be componentized and to maximize compatibility with different distros. Test cases can focus only on test logic. Once test requirements (machines, CPU, memory, etc.) are defined, you just write the test logic without worrying about environment setup or stopping services on different distributions.

Orchestration

- LISA uses platform APIs to create, modify, and delete VMs. For example, LISA uses the Azure API to create VMs, run test cases, and delete VMs. While test cases run, LISA uses the Azure API to collect serial logs and can hot add/remove data disks. If other platforms implement the same serial log and data disk APIs, the test cases can run on those platforms seamlessly.
- Distro compatibility is ensured by abstracting over 100 commands used in test cases, allowing authors to focus on validation logic rather than distro differences.
- A pre-processing workflow assists in building the kernel on the fly, installing the kernel from package repositories, or modifying all test environments.
- The test matrix lets a single run test everything. For example, one run can test different VM sizes on Azure, different images, or both together. Anything that is parameterizable can be tested as a matrix.
- Customizable notifiers enable saving test results and files to any type of storage and database.

Agentless and low dependency

LISA operates test systems via SSH without requiring additional dependencies, ensuring compatibility with any system that supports SSH. Although some test cases require installing extra dependencies, LISA itself does not. This allows LISA to perform tests on systems with limited resources or even different operating systems. For instance, LISA can run on Linux, FreeBSD, Windows, and ESXi.

Getting Started with LISA

Ready to dive in? Visit the LISA project at aka.ms/lisa to access the documentation.

- Install: Follow the installation guide provided in the repository to set up LISA in your testing environment.
- Run: Follow the instructions to run LISA on a local machine, on Azure, or on existing systems.
- Extend: Follow the documents to extend LISA with test cases, data sources, tools, platforms, workflows, and more.
- Join the Community: Engage with other users and contributors through forums and discussions to share experiences and best practices.
- Contribute: Modify existing test cases or create new ones to suit your needs. Share your contributions with the community to enhance LISA's capabilities.

Conclusion

LISA offers an open-source, collaborative testing solution designed to operate across diverse environments and scenarios, effectively narrowing the gap between enterprise demands and community-led innovation. By leveraging LISA, customers can ensure their Linux deployments are reliable and optimized for performance. Its comprehensive testing capabilities, combined with the flexibility and support of an active community, make LISA an indispensable tool for anyone involved in Linux quality assurance and testing. Your feedback is invaluable, and we would greatly appreciate your insights.
From Compliance to Auto-Remediation: Azure's Latest Linux Security Innovations

We are pleased to announce that the Azure security baseline for Linux, delivered through Azure Policy and Machine Configuration, has moved to public preview, and that we are expanding its capabilities with a built-in auto-remediation feature (limited public preview).

Customers face increasing pressure to comply with requirements set by governments, regulatory bodies, or specific industries. As their environments become more complex and hybrid, achieving and maintaining compliance at scale remains challenging. Failing to meet compliance goals can result in substantial business harm, including financial penalties and the potential loss of customers.

Introducing enhanced audit and the new auto-remediation experience

Recognizing these challenges, Microsoft has developed a solution to help customers navigate these complexities with ease. The Azure security baseline for Linux offers compliance and built-in auto-remediation (limited public preview) features via Azure Policy's Machine Configuration and Microsoft's open-source Azure-OSconfig engine. The combination of these capabilities ensures that security is embedded by design and compliance requirements are upheld, whether workloads operate in the cloud, on-premises, or in another CSP environment, through the Azure Arc platform.

Thanks to the new approach, we provide detailed information about the state of compliance and more accurate results, with detailed descriptions and direct references to the CIS rule definitions. Furthermore, the new architecture has enabled us to implement automatic remediation against the security baseline, providing a Linux-native hardening experience for our customers. Microsoft has implemented a streamlined version of Linux security best practices, primarily based on the latest CIS (Center for Internet Security) Distribution Independent Linux benchmark. All audit and remediation results can be queried in Azure Resource Graph Explorer for reporting and monitoring purposes.

As security is Microsoft's top priority, we provide these capabilities at no additional cost to our customers, with charges applying only to Azure Arc managed workloads hosted on-premises or in other CSP environments.

What's next

At Microsoft we strive to continuously improve customer satisfaction. Understanding that a one-size-fits-all approach is not feasible for hardening and security, we are committed to working with our customers throughout the preview process to improve the end-to-end experience. In addition, Microsoft is committed to evolving and delivering new security baseline content fully aligned with the latest CIS standards across various Linux distributions, and will collaborate with the relevant standards bodies to contribute to those standards, benefiting both the broader community and the wider industry. Stay tuned in this space for more information - exciting news is coming in the upcoming months!

What happens with the existing Azure security baseline for Linux capability

Every VM customer with the "Linux machines should meet requirements for the Azure compute security baseline" policy definition assigned will be automatically migrated by the Azure team in the upcoming months to the new (audit-only) policy definition. We are going to do a gradual rollout of this enhanced capability.
For approximately 3-6 months after this announcement, the existing policy will remain available; it will then be deprecated and removed from the Azure portal.

Learn more:

- Sign-up form for the auto-remediation capability
- Read more about Azure Arc
- Check out the Azure osconfig GitHub repo
- A comparison between the old and new baselines is attached to the blog
- List of supported operating systems (check the Linux distros in the table)
Introducing Azure Local: cloud infrastructure for distributed locations enabled by Azure Arc

Today at Microsoft Ignite 2024 we're introducing Azure Local, cloud-connected infrastructure that can be deployed at your physical locations and under your operational control. With Azure Local, you can run the foundational Azure compute, networking, storage, and application services locally on hardware from your preferred vendor, providing flexibility to meet your requirements and budget.
AKS Arc - Optimized for AI Workloads

Overview

Azure is the world's AI supercomputer, providing the most comprehensive AI capabilities, ranging from infrastructure and platform services to frontier models. We've seen emerging needs among Azure customers to use the same Azure-based solution for AI/ML on the edge, with minimized latency, while staying compliant with industry regulations or government requirements. Azure Kubernetes Service enabled by Azure Arc (AKS Arc) is a managed Kubernetes service that empowers customers to deploy and manage containerized workloads whether they are in data centers or at edge locations. We want to ensure AKS Arc provides an optimal experience for AI/ML workloads on the edge, throughout the whole development lifecycle: AI infrastructure, model deployment, inference, fine-tuning, and applications.

AI infrastructure

AKS Arc supports NVIDIA A2, A16, and T4 GPUs for compute-intensive workloads such as machine learning, deep learning, and model training. When GPUs are enabled in Azure Local, AKS Arc customers can provision GPU node pools from Azure and host AI/ML workloads in Kubernetes clusters on the edge. For more details, please see the instructions for GPU node pools in AKS Arc.

Model deployment and fine tuning

Use KAITO for language model deployment, inference and fine tuning

Kubernetes AI Toolchain Operator (KAITO) is an open-source operator that automates and simplifies the management of model deployments on a Kubernetes cluster. With KAITO, you can deploy popular open-source language models such as Phi-3 and Falcon, and host them in the cloud or on the edge. Along with the currently supported models from KAITO, you can also onboard and deploy custom language models by following this guidance in just a few steps. AKS Arc has been validated with the latest KAITO operator via helm-based installation, and customers can now use KAITO on the edge to:

- Deploy language models such as Falcon, Phi-3, or their custom models
- Automate and optimize AI/ML model inferencing for cost-effective deployments
- Fine-tune a model directly in a Kubernetes cluster
- Perform parameter-efficient fine-tuning using low-rank adaptation (LoRA)
- Perform parameter-efficient fine-tuning using quantized adaptation (QLoRA)

You can get started by installing KAITO and deploying a model for inference on your edge GPU nodes with the KAITO Quickstart Guidance (a rough client-side sketch for calling a deployed model appears at the end of this section). You may also refer to the KAITO experience in AKS in the cloud: Deploy an AI model with the AI toolchain operator (Preview).

Use Arc-enabled Machine Learning to train and deploy models on the edge

For customers who are already familiar with Azure Machine Learning (AML), Azure Arc-enabled ML extends AML in Azure and enables customers to target any Arc-enabled Kubernetes cluster for model training, evaluation, and inferencing. With the Arc ML extension running in AKS Arc, customers can meet data-residency requirements by storing data on-premises during model training and deploy models in the cloud for global service access. To get started with the Arc ML extension, please view the instructions in the Azure Machine Learning documentation.

In addition, the AML extension can now be used for fully automated deployment of a curated list of pre-validated language and traditional AI models to AKS clusters, to perform CPU- and GPU-based inferencing, and to subsequently manage them via Azure ML Studio. This experience is currently in gated preview; please view another Ignite blog for more details.
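To make the inference side of this section a little more concrete, below is a rough .NET sketch of calling a language model endpoint exposed inside the cluster, such as the service a KAITO workspace creates. The host name, route, and request/response shape are assumptions for illustration; the actual API depends on the model runtime KAITO deploys, so check the KAITO documentation for your model before relying on this.

```csharp
// Hypothetical sketch: call a cluster-hosted model endpoint (e.g., a KAITO workspace service)
// using an OpenAI-style chat completions payload. Names and schema are placeholders.
using System.Net.Http.Json;
using System.Text.Json;

// Placeholder in-cluster service address - replace with your workspace service or its port-forwarded address.
using var http = new HttpClient { BaseAddress = new Uri("http://workspace-phi-3-mini:80") };

var request = new
{
    messages = new[] { new { role = "user", content = "Summarize today's production line alerts." } },
    max_tokens = 256
};

var response = await http.PostAsJsonAsync("/v1/chat/completions", request);
response.EnsureSuccessStatusCode();

// Print the raw JSON; parse it according to the schema your runtime actually returns.
using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
Console.WriteLine(doc.RootElement.GetRawText());
```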
Use Azure AI Services with disconnected containers on the edge

Azure AI services enable customers to rapidly create cutting-edge AI applications with out-of-the-box and customizable APIs and models. They simplify the developer experience of using APIs and embedding the ability to see, hear, speak, search, understand, and accelerate decision-making into applications. With disconnected Azure AI services containers, customers can now download a container to an offline environment such as AKS Arc and use the same APIs available from Azure. Containers enable you to run Azure AI services APIs in your own environment and are a great fit for specific security and data governance requirements. Disconnected containers enable you to use several of these APIs disconnected from the internet. Currently, the following containers can be run in this manner:

- Speech to text
- Custom Speech to text
- Neural Text to speech
- Text Translation (Standard)
- Azure AI Vision - Read
- Document Intelligence
- Azure AI Language
  - Sentiment Analysis
  - Key Phrase Extraction
  - Language Detection
  - Summarization
  - Named Entity Recognition
  - Personally Identifiable Information (PII) detection

To get started with disconnected containers, please view the instructions at Use Docker containers in disconnected environments.
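Because the containers expose the same REST surface as the cloud endpoints, the existing Azure SDK clients can generally be pointed at a locally hosted container instead of an Azure endpoint. The sketch below assumes an Azure AI Language (Text Analytics) container listening on http://localhost:5000 and uses the Azure.AI.TextAnalytics package; the address and key handling are assumptions, so verify them against the container documentation for your service.

```csharp
// Sketch: point the Text Analytics client at a locally hosted Language container (assumed at localhost:5000).
using Azure;
using Azure.AI.TextAnalytics;

// The container serves the same API surface as the cloud endpoint; the key is primarily used for billing/metering.
var client = new TextAnalyticsClient(
    new Uri("http://localhost:5000"),
    new AzureKeyCredential(Environment.GetEnvironmentVariable("LANGUAGE_API_KEY") ?? "unused"));

DocumentSentiment result = client.AnalyzeSentiment("The new edge deployment has been rock solid this week.");
Console.WriteLine($"Sentiment: {result.Sentiment} " +
                  $"(positive {result.ConfidenceScores.Positive:P0}, negative {result.ConfidenceScores.Negative:P0})");
```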
Build and deploy data and machine learning pipelines with Flyte

Flyte is an open-source orchestrator that facilitates building production-grade data and ML pipelines. It is a Kubernetes-native workflow automation tool, so customers can focus on experimentation and delivering business value without needing to be experts in infrastructure and resource management. Data scientists and ML engineers can use Flyte to create data pipelines for processing petabyte-scale data, build analytics workflows for business or finance, or leverage it as an ML pipeline for industry applications. AKS Arc has been validated with the latest Flyte operator via helm-based installation, and customers are welcome to use Flyte for building data or ML pipelines. For more information, please view the instructions in Introduction to Flyte and Build and deploy data and machine learning pipelines with Flyte on Azure Kubernetes Service (AKS).

AI-powered edge applications with cloud-connected control plane

Azure AI Video Indexer, enabled by Azure Arc

Azure AI Video Indexer enabled by Arc brings video and audio analysis and generative AI to edge devices. It runs as an Azure Arc extension on AKS Arc and supports many video formats, including MP4 and other common formats. It also supports several languages in all basic audio-related models. The Phi-3 language model is included and automatically connected with your Video Indexer extension. With Arc-enabled Video Indexer, you can bring AI to the content for cases where indexed content can't move to the cloud due to regulation or because the data store is too large. Other use cases include using an on-premises workflow to lower indexing latency, or pre-indexing before uploading to the cloud. You can find more details in What is Azure AI Video Indexer enabled by Arc (Preview).

Search on-premises data with a language model via Arc extension

Retrieval Augmented Generation (RAG) is emerging as a way to augment language models with private data, and this is especially important for enterprise use cases. Cloud services like Azure AI Search and Azure AI Studio simplify how customers can use RAG to ground language models in their enterprise data in the cloud. The same experience is coming to the edge, and customers can now deploy an Arc extension and ask questions about on-premises data within a few clicks. Please note this experience is currently in gated preview; please see another Ignite blog for more details.

Conclusion

Developing and running AI workloads at distributed edges brings clear benefits, such as using the cloud as a universal control plane, data residency, reduced network bandwidth, and low latency. We hope the products and features described above can benefit customers and enable new scenarios in retail, manufacturing, logistics, energy, and more. As Microsoft-managed Kubernetes on the edge, AKS Arc can not only host critical edge applications but is also optimized for AI workloads, from hardware and runtime to application. Please share your valuable feedback with us ([email protected]) and we would love to hear from you regarding your scenarios and business impact.