The latest version of the Netflix Cloud Architecture story was given at Gluecon on May 23rd, 2012. Gluecon rocks, and lots of Van Halen references were added for the occasion. The tradeoff between a developer-driven, high-functionality, AWS-based PaaS and an operations-driven, low-cost, portable PaaS is discussed. The three sections cover the developer view, the operator view and the builder view.
This document provides an introduction to Docker. It discusses why Docker is useful for isolation, being lightweight, simplicity, workflow, and community. It describes the Docker engine, daemon, and CLI. It explains how Docker Hub provides image storage and automated builds. It outlines the Docker installation process and common workflows like finding images, pulling, running, stopping, and removing containers and images. It promotes Docker for building local images and using host volumes.
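The workflow named above (find, pull, run, stop, remove) can be sketched as a sequence of Docker CLI invocations. This is a minimal illustration, not taken from the deck: the helper name `docker_workflow`, the image `nginx` and the container name `web` are assumptions, and the code only builds and prints the command lines rather than running them.

```python
import shlex


def docker_workflow(image: str, name: str) -> list[list[str]]:
    """Return argv lists for a find/pull/run/stop/remove cycle (hypothetical helper)."""
    return [
        ["docker", "search", image],                      # find images on Docker Hub
        ["docker", "pull", image],                        # download the image locally
        ["docker", "run", "-d", "--name", name, image],   # start a detached container
        ["docker", "stop", name],                         # stop the running container
        ["docker", "rm", name],                           # remove the stopped container
        ["docker", "rmi", image],                         # remove the local image
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so no Docker daemon is needed.
    for argv in docker_workflow("nginx", "web"):
        print(shlex.join(argv))
```

Each list could be passed to `subprocess.run` on a machine where Docker is installed.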
What Is A Docker Container? | Docker Container Tutorial For Beginners| Docker... Simplilearn
This presentation on Docker containers will help you understand what Docker is, the architecture of Docker, what a Docker container is, how to create a Docker container, the benefits of Docker containers, and the basic container commands; it also includes a demo on creating a Docker container. Docker is a very lightweight software container and containerization platform. Docker containers provide a way to run software in isolation. Docker is an open source platform that helps to package an application and its dependencies into a Docker container for the development and deployment of software. A Docker container is a portable executable package that includes an application and its dependencies. With Docker containers, applications can work efficiently in different computer environments.
The following topics are covered in this Docker Container presentation:
1. What is Docker?
2. The architecture of Docker
3. What is a Docker Container?
4. How to create a Docker Container?
5. Benefits of Docker Containers
6. Basic commands of Containers
Simplilearn's DevOps Certification Training Course will prepare you for a career in DevOps, the fast-growing field that bridges the gap between software developers and operations. You’ll become an expert in the principles of continuous development and deployment, automation of configuration management, inter-team collaboration and IT service agility, using modern DevOps tools such as Git, Docker, Jenkins, Puppet and Nagios. DevOps jobs are highly paid and in great demand, so start on your path today.
Why learn DevOps?
Simplilearn’s DevOps training course is designed to help you become a DevOps practitioner and apply the latest in DevOps methodology to automate your software development lifecycle right out of the class. You will master configuration management and continuous integration, deployment, delivery and monitoring using DevOps tools such as Git, Docker, Jenkins, Puppet and Nagios in a practical, hands-on and interactive approach. The DevOps training course focuses heavily on the use of Docker containers, a technology that is revolutionizing the way apps are deployed in the cloud today and is a critical skillset to master in the cloud age.
After completing the DevOps training course you will achieve hands-on expertise in various aspects of the DevOps delivery model. The practical learning outcomes of this DevOps training course are:
An understanding of DevOps and the modern DevOps toolsets
The ability to automate all aspects of a modern code delivery and deployment pipeline using:
1. Source code management tools
2. Build tools
3. Test automation tools
4. Containerization through Docker
5. Configuration management tools
6. Monitoring tools
DevOps jobs are the third-highest tech role ranked by employer demand on Indeed.com but have the second-highest talent deficit.
Learn more at https://www.simplilearn.com/cloud-computing/devops-practitioner-certification-training
The document discusses Kubernetes networking. It describes how Kubernetes networking allows pods to have routable IPs and communicate without NAT, unlike Docker networking which uses NAT. It covers how services provide stable virtual IPs to access pods, and how kube-proxy implements services by configuring iptables on nodes. It also discusses the DNS integration using SkyDNS and Ingress for layer 7 routing of HTTP traffic. Finally, it briefly mentions network plugins and how Kubernetes is designed to be open and customizable.
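The idea of a Service as a stable virtual IP in front of changing pods can be illustrated with a toy model. This sketch is an assumption-laden simplification: the class `ServiceTable`, its methods and all IP addresses are hypothetical, and a real kube-proxy programs iptables rules on each node rather than running Python. Picking a backend at random is used here only for simplicity.

```python
import random


class ServiceTable:
    """Toy mapping from a stable service VIP to the current pod IPs (hypothetical)."""

    def __init__(self, seed=None):
        self._endpoints = {}
        self._rng = random.Random(seed)

    def set_endpoints(self, vip, pod_ips):
        # Called whenever pods behind the service come or go; the VIP never changes.
        self._endpoints[vip] = list(pod_ips)

    def route(self, vip):
        # Pick one backend pod for a new connection to the VIP.
        backends = self._endpoints.get(vip)
        if not backends:
            raise LookupError(f"no endpoints for {vip}")
        return self._rng.choice(backends)


if __name__ == "__main__":
    table = ServiceTable(seed=0)
    table.set_endpoints("10.0.0.10", ["10.244.1.5", "10.244.2.7"])
    print(table.route("10.0.0.10"))
    # Pods are replaced; clients keep using the same VIP.
    table.set_endpoints("10.0.0.10", ["10.244.3.9"])
    print(table.route("10.0.0.10"))
```

The point of the model is that clients address only the VIP, while the set of pod IPs behind it can be replaced at any time.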
Kubernetes for Beginners: An Introductory Guide Bytemark
Kubernetes is an open-source tool for managing containerized workloads and services. It allows for deploying, maintaining, and scaling applications across clusters of servers. Kubernetes operates at the container level to automate tasks like deployment, availability, and load balancing. It uses a master-slave architecture with a master node controlling multiple worker nodes that host application pods, which are groups of containers that share resources. Kubernetes provides benefits like self-healing, high availability, simplified maintenance, and automatic scaling of containerized applications.
Vertical thinking for a simple architecture!
Micro services are a new way of architectural thinking in web platforms. The key idea is strongly aligned with the Unix philosophy: create small services that are each responsible for only one thing and make them work together. With this in mind, you get simple applications that can be developed, deployed and scaled independently of each other.
The key challenge in using micro services is to decompose applications vertically, by their functional domains. Only then can you reduce dependencies and create simple applications.
On the technical side, micro services are backed by broad support across different programming languages and open source frameworks. State-of-the-art deployment mechanisms in particular are what make this approach possible at all.
This document summarizes an upcoming presentation on architecting microservices on AWS. The presentation will:
- Review microservices architecture and how it differs from monolithic and service-oriented architectures.
- Cover key microservices design principles like independent deployment of services that communicate via APIs and using the right tools for each job.
- Provide example design patterns for implementing microservices on AWS using services like EC2, ECS, Lambda, API Gateway and more.
- Include a demo of microservices on AWS.
- Conclude with a question and answer session.
This document discusses the transition from monolithic architecture to microservices architecture. It begins by outlining challenges with monolithic systems like long development cycles and difficulties scaling. It then defines microservices as loosely coupled services that have bounded contexts. The document provides examples of how to evolve a monolith to microservices by starting with existing services and gradually decomposing the monolith. It acknowledges challenges in distributed systems and eventual consistency that come with microservices. Overall, the document presents microservices as enabling faster innovation, increased agility and delighted customers compared to monolithic systems.
Serverless computing allows running applications without managing infrastructure. Google Cloud Platform offers serverless options like Cloud Functions, Cloud Run, and App Engine. Common serverless patterns include publish-subscribe using PubSub, triggering functions from events, and data pipelines with Dataflow. Serverless applications are built using containers, functions, and fully managed services to focus on code and reduce operational overhead.
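The publish-subscribe pattern mentioned above can be sketched in memory. This is a generic illustration of the pattern only, not the Google Cloud Pub/Sub client API: the `Broker` class, the topic name and the handler are hypothetical.

```python
from collections import defaultdict
from typing import Callable


class Broker:
    """Minimal in-memory publish-subscribe broker (hypothetical)."""

    def __init__(self):
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        # Register a function to be triggered by every message on the topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        # Deliver the message to every subscriber; return how many were notified.
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(message)
        return len(handlers)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("uploads", lambda msg: print("resize image:", msg))
    broker.publish("uploads", "photo.jpg")
```

The publisher does not know which functions react to an event, which is what lets event-triggered functions be added or removed independently.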
Kubernetes Concepts And Architecture Powerpoint Presentation Slides SlideTeam
The document provides an overview of Kubernetes concepts and architecture. It begins with an introduction to containers and microservices architecture. It then discusses what Kubernetes is and why organizations should use it. The remainder of the document outlines Kubernetes components, nodes, development processes, networking, and security measures. It provides descriptions and diagrams explaining key aspects of Kubernetes such as architecture, components like Kubelet and Kubectl, node types, and networking models.
How to Move from Monitoring to Observability, On-Premises and in a Multi-Clou... Splunk
With the acceleration of customer and business demands, site reliability engineers and IT Ops analysts now require operational visibility into their entire architecture, something that traditional APM tools, dev logging tools, and SRE tools aren’t equipped to provide. Observability enables you to inspect and understand your IT stack on premises and in the cloud(s). It’s no longer about whether your system works (monitoring), but about being able to ask why it is not working (observability). This presentation will outline key steps to take to move from monitoring to observability.
Netflix on Cloud - combined slides for Dev and Ops Adrian Cockcroft
This document contains slides from a presentation given by Adrian Cockcroft on Netflix's use of cloud computing on Amazon Web Services (AWS). The summary includes:
1) Netflix moved most of its infrastructure to AWS to leverage AWS's scale and features rather than building its own datacenters, as capacity growth was unpredictable and datacenters were inflexible.
2) Netflix uses many AWS services including EC2, S3, EBS, EMR and more. It deployed a large movie encoding farm on EC2, stores content on S3, uses EMR/Hadoop for log analysis, and a CDN for content delivery.
3) Netflix has learned that cloud tools don't always scale for large
A Comprehensive Introduction to Kubernetes. This slide deck serves as the lecture portion of a full-day Workshop covering the architecture, concepts and components of Kubernetes. For the interactive portion, please see the tutorials here:
https://github.com/mrbobbytables/k8s-intro-tutorials
Kubernetes is an open-source system for managing containerized applications across multiple hosts. It includes key components like Pods, Services, ReplicationControllers, and a master node for managing the cluster. The master maintains state using etcd and schedules containers on worker nodes, while nodes run the kubelet daemon to manage Pods and their containers. Kubernetes handles tasks like replication, rollouts, and health checking through its API objects.
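The replication behaviour described above amounts to comparing a desired replica count with what is actually running. A minimal sketch of that comparison follows; the function `reconcile` and the pod names are hypothetical, and this is not the Kubernetes controller code.

```python
def reconcile(desired: int, running: list[str]) -> tuple[int, list[str]]:
    """Return (number of pods to start, names of pods to stop) (hypothetical helper)."""
    diff = desired - len(running)
    if diff > 0:
        # Too few replicas: ask for the missing ones to be created.
        return diff, []
    if diff < 0:
        # Too many replicas: stop the surplus pods at the end of the list.
        return 0, running[desired:]
    # Already at the desired state: nothing to do.
    return 0, []


if __name__ == "__main__":
    print(reconcile(3, ["pod-a"]))
    print(reconcile(1, ["pod-a", "pod-b", "pod-c"]))
```

Running this check repeatedly is what lets the cluster recover when a pod disappears.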
SCS 4120 - Software Engineering IV
BACHELOR OF SCIENCE HONOURS IN COMPUTER SCIENCE
BACHELOR OF SCIENCE HONOURS IN SOFTWARE ENGINEERING
All in One Place Lecture Notes
Distribution Among Friends Only
All copyrights belong to their respective owners
Viraj Brian Wijesuriya
Observability has emerged as one of the hottest topics on the DevOps landscape. Organizations seek to improve visibility into their cloud infrastructure and applications and identify production issues that may negatively impact #customerexperience.
➡️ But what are some of the best practices for scaling observability for modern applications?
➡️ What challenges are #cloudplatforms facing?
Explore how to overcome the challenges and unlock speed, observability, and automation across your DevOps lifecycle.
MicroServices at Netflix - challenges of scale Sudhir Tonse
Microservices at Netflix have evolved over time from a single monolithic application to hundreds of fine-grained services. While this provides benefits like independent delivery, it also introduces complexity and challenges around operations, testing, and availability. Netflix addresses these challenges through tools like Hystrix for fault tolerance, Eureka for service discovery, Ribbon for load balancing, and RxNetty for asynchronous communication between services.
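The fault-tolerance role attributed to Hystrix above is commonly implemented with a circuit breaker. The following is a simplified sketch of that general pattern in Python, not Hystrix's Java API: the class name, the thresholds and the injectable `clock` parameter are assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Simplified circuit breaker: fail fast to a fallback after repeated errors."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Circuit is open: skip the remote call entirely.
                return fallback()
            # Half-open: allow one trial call; a single failure reopens the circuit.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            return fallback()
        self._failures = 0
        return result


if __name__ == "__main__":
    breaker = CircuitBreaker(max_failures=2, reset_after=5.0)

    def flaky():
        raise ConnectionError("service down")

    for _ in range(3):
        print(breaker.call(flaky, lambda: "cached response"))
```

Once the circuit is open, callers get the fallback immediately instead of waiting on a failing dependency, which limits cascading failures between services.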
Micro Focus Software Delivery and Testing Jan De Coster Presentation on the Journey to DevOps in the recent Micro Focus #DevDay Copenhagen.
Micro Focus enables enterprise software organizations to build innovative software and accelerate application delivery to meet the needs of the business. Whatever the challenges and infrastructures, our core principle—of reusing what already works to minimize business risk while supporting modern software practices—has positioned our customers to be better prepared to support the digital transformation of the business.
Build, test and deliver innovative software faster with less risk.
April 2017.
This document provides an overview of Kubernetes including:
1) Kubernetes is an open-source platform for automating deployment, scaling, and operations of containerized applications. It provides container-centric infrastructure and allows for quickly deploying and scaling applications.
2) The main components of Kubernetes include Pods (groups of containers), Services (abstract access to pods), ReplicationControllers (maintain pod replicas), and a master node running key components like etcd, API server, scheduler, and controller manager.
3) The document demonstrates getting started with Kubernetes by enabling the master on one node and a worker on another node, then deploying and exposing a sample nginx application across the cluster.
Cloud computing:
Accessibility: Cloud computing facilitates access to applications and data from any location worldwide and from any device with an internet connection.
Cost savings: Cloud computing offers businesses scalable computing resources, saving them the cost of acquiring and maintaining their own.
Security: Cloud providers, especially those offering private cloud services, have strived to implement the best security standards and procedures in order to protect clients’ data saved in the cloud.
Disaster recovery: Cloud computing offers the most efficient means for small, medium, and even large enterprises to back up and restore their data and applications in a fast and reliable way.
This document discusses using Terraform to provision Datadog monitoring tools. Terraform allows for infrastructure as code to manage cloud services. Datadog provides dashboards and alerts to monitor infrastructure and applications. The document outlines installing Terraform, using Terraform providers like Datadog, creating template variables, and implementing basic Datadog resources like dashboards and monitors through Terraform.
What Is Kubernetes | Kubernetes Introduction | Kubernetes Tutorial For Beginn... Edureka!
***** Kubernetes Certification Training: https://www.edureka.co/kubernetes-certification *****
This Edureka tutorial on "What is Kubernetes" will give you an introduction to one of the most popular DevOps tools in the market, Kubernetes, and its importance in today's IT processes. This tutorial is ideal for beginners who want to get started with Kubernetes & DevOps. The following topics are covered in this training session:
1. Need for Kubernetes
2. What is Kubernetes and What it's not
3. How does Kubernetes work?
4. Use-Case: Kubernetes @ Pokemon Go
5. Hands-on: Deployment with Kubernetes
DevOps Tutorial Blog Series: https://goo.gl/P0zAfF
This document provides an introduction to Docker and discusses how it helps address challenges in the modern IT landscape. Some key points:
- Applications are increasingly being broken up into microservices and deployed across multiple servers and environments, making portability and scalability important.
- Docker containers help address these issues by packaging an application's dependencies and resources together, so that it runs reliably across different infrastructures. This improves portability.
- Docker provides a platform for building, shipping and running applications. It helps bridge the needs of developers who want fast innovation and operations teams who need security and control.
This lecture covers an introduction to cloud computing. It discusses key topics like cloud types, architecture, services, platforms, security, and applications. Specifically, it defines cloud computing, compares delivery models like Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). It also discusses using major cloud platforms from Amazon Web Services, Google, and Microsoft and exploring concepts like virtualization, capacity planning, and establishing identity/security in the cloud. The lecture concludes by discussing mobile cloud integration and streaming media/video applications in cloud computing.
APM is a tool that monitors application performance and user experience by tracking metrics like load and KPIs. It allows seeing how applications are used by real users and identifying problems that impact sales or brand experience. Observability aggregates data from logs, metrics, and traces to assess overall system health, while APM directly focuses on gauging user experience. Both ensure good user experience but in different ways - APM actively collects data related to response time, while observability passively examines various data sources. Monitoring tracks predefined metrics over time to understand system status, but observability analyzes related data to determine the root cause of issues.
Kubernetes is an open-source container management platform. It has a master-node architecture with control plane components like the API server on the master and node components like kubelet and kube-proxy on nodes. Kubernetes uses pods as the basic building block, which can contain one or more containers. Services provide discovery and load balancing for pods. Deployments manage pods and replicasets and provide declarative updates. Key concepts include volumes for persistent storage, namespaces for tenant isolation, labels for object tagging, and selector matching.
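Labels and selector matching, as described above, can be sketched with plain dictionaries. This illustrates equality-based matching only; the functions `matches` and `select` and the pod data are hypothetical and are not part of any Kubernetes client library.

```python
def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(pods: dict[str, dict[str, str]], selector: dict[str, str]) -> list[str]:
    """Return the names of pods whose labels satisfy the selector."""
    return [name for name, labels in pods.items() if matches(selector, labels)]


if __name__ == "__main__":
    pods = {
        "web-1": {"app": "web", "tier": "frontend"},
        "web-2": {"app": "web", "tier": "frontend"},
        "db-1": {"app": "db", "tier": "backend"},
    }
    # A Service or Deployment would use a selector like this to find its pods.
    print(select(pods, {"app": "web"}))
```

Because objects are grouped by labels rather than by name, pods can be replaced without updating whatever selects them.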
This document provides an introduction to microservices. It begins by outlining the challenges of monolithic architecture such as long build/release cycles and difficulty scaling. It then introduces microservices as a way to decompose monolithic applications into independently deployable services. Key benefits of microservices include improved agility, scalability, and innovation. The document discusses microservice design principles like communicating over APIs, using the right tools for each service, securing services, and being a good citizen in the ecosystem. It provides examples of how to implement a restaurant microservice using AWS services like API Gateway, Lambda, DynamoDB and containers.
Cloud Architecture Tutorial - Why and What (1 of 3) Adrian Cockcroft
Introduction to the Netflix Cloud Architecture Tutorial - discusses the why and what of cloud, including the thinking behind Netflix's choice of AWS, and the product features that Netflix runs in the cloud.
Netflix has built and deployed a scalable global Platform as a Service. Key components of the Netflix PaaS are being released as Open Source projects so you can build your own custom PaaS
SV Forum Platform Architecture SIG - Netflix Open Source Platform Adrian Cockcroft
Architecture overview of Netflix Cloud Architecture with a focus on the Open Source components that Netflix has released, and is planning to release, on http://netflix.github.com
The Netflix recipe for migrating your organization from building a datacenter-based product to a cloud-based product. First presented at the Silicon Valley Cloud Computing Meetup "Speak Cloudy to Me" on Saturday April 30th, 2011.
Yow Conference Dec 2013 Netflix Workshop Slides with Notes Adrian Cockcroft
This document provides an overview and agenda for a workshop on patterns for continuous delivery, high availability, DevOps and cloud native development using NetflixOSS open source tools and frameworks. The presenter introduces himself and his background. The content covers Netflix's architecture evolution from monolithic to microservices, how Netflix scales on AWS, and principles and outcomes that enable cloud native development. The workshop then dives into specific NetflixOSS projects like Eureka, Cassandra, Zuul and Hystrix that help with service discovery, data storage, routing and availability. Tools for deployment, configuration, cost analysis and developer productivity are also discussed.
This document introduces OpenStack, an open source cloud operating system. It discusses how OpenStack provides a common platform for both private and public clouds by automating resource control and management. It highlights how OpenStack originated from Rackspace and NASA to address the lack of an existing solution that meets their needs. The document also summarizes key OpenStack components, stats on its community and adoption, and how Rackspace can help organizations deploy and support OpenStack clouds.
Continuously Design your Continuous DeploymentMichael Elder
Whether your applications are cloud-native, cloud-ready, or just evolving towards cloud-based deployment, you can capture the complete stack as an OpenStack Heat template. In this session, we’ll present a web-based editing experience that enables you to capture each aspect of your architecture in a ready-to-deploy and easy-to-update design based on HOT. We'll show you those Heat templates in either a rich diagram editor or a simple but powerful text editor -- all in your web browser!
Advanced features like autoscaling, load balancing, deployment ordering, and object storage will all be captured as part of your application design — right along side the critical software that defines the business behavior of your workload.
And it’s not just about the first time you deploy, it’s about deploying every time thereafter. We’ll show you how you can manage your software deployment pipeline as part of your Heat templates.
Maybe you’re not sure what cloud to deploy to? Interested in OpenStack, but already have investments in other clouds? We’ll also demonstrate how we’ve extended the Heat language to design cloud-portable templates.
So come on this journey with us, where we’ll leverage the cloud to help you build better software for your end users.
- Define full stack application workloads using OpenStack HOT
- Deploy and update infrastructure and application changes as part of your release pipeline
- Design templates with autoscaling, load balancing, deployment ordering, and object storage as part of your application architecture
Getting Started with MariaDB with DockerMariaDB plc
This document discusses deploying MariaDB databases with Docker from development to production. It recommends using Docker containers to encapsulate dependencies and isolate processes for easy deployment on-premise, in the cloud, or in hybrid environments. It highlights challenges like orchestration complexity and outlines requirements for data durability, self-discovery, self-healing, and application discovery of database clusters. It demonstrates building a Python/Flask app in Docker, deploying it to a Swarm cluster, and scaling the web tier behind HAProxy. It also shows deploying a 3-node Galera MariaDB cluster and 2-node MaxScale proxy for high availability.
This document discusses Netflix's cloud architecture and use of AWS. Netflix built its own scalable Java-oriented PaaS to run on AWS due to limited PaaS options when they started. They moved most applications to SaaS and the cloud for improved business agility and faster scaling. Netflix chose AWS for its scale, features, automation and global availability despite AWS also being a competitor in some areas. Their cloud architecture focuses on speed, scalability, and meeting goals around latency and capacity.
Current State of Affairs – Cloud Computing - Indicthreads Cloud Computing Con...IndicThreads
Session presented at the 2nd IndicThreads.com Conference on Cloud Computing held in Pune, India on 3-4 June 2011.
http://CloudComputing.IndicThreads.com
Abstract: Cloud Computing has had phenomenal growth over the past year and continues to entrench itself in all facets of IT. Cloud Computing is definitely more than just a buzzword or a passing trend. Now heavyweights like IBM, HP and SAP are ready to lock horns with existing players like Amazon, Salesforce and Microsoft, whose offerings have matured over time. Besides these big players, a lot of startups are coming up with innovative offerings in this space.
The talk is about the current state of affairs in the cloud computing. It will cover the products, services and offerings that have been making a lot of noise in the cloud computing space.
Following are the main points that will be covered in the talk:
1. New Players: A lot of enterprise market giants are now coming to the cloud party offering infrastructure and platform services. IBM has come out with its SmartCloud for private as well as public clouds. Oracle has released its Cloud-in-a-box solution. The talk will cover all the new offerings by these enterprise giants.
2. Old Players, New offerings – Amazon being the leader in the Cloud Infrastructure space has rolled out a lot of new products and services, strengthening its hold in the market and expanding into the PaaS segment. Amazon Beanstalk, Amazon CloudFormation and EC2 Dedicated instances most notably have the power to be game changers. SalesForce the leader in the Cloud SaaS space released database.com, enterprise cloud database and its “PaaS” offering similar to GAE – VMforce.com This section will cover the new offerings by the players.
3. Interesting Players in the cloud ecosystem: There have been a lot of new players who are leveraging the cloud to build some exciting products like scalable API platforms, cloud-based logging, Java in the cloud, etc., e.g. Apigee, PiCloud, Loggly, Cumulogic and Cloudbees. This section will cover most of the exciting platforms and technologies these companies are working on.
4. Current Trends and Future: This section will cover the current trends (where a lot of startups are investing) and what the future will look like in the cloud space.
Finally, the talk plans to “arm” developers and architects with the latest and cutting edge platforms, products and technologies in the cloud that have been developed and made available over the last year, helping them to leverage the cloud and make better choices leading to higher ROI and lesser TCO.
Speaker:
Chirag Jog is the CTO at Clogeny Technologies, where the main focus is on innovation in the Cloud Computing, Scalable Applications and Storage space. He is the chief geek at Clogeny who talks “Cloud” and works on architecting exciting ideas in the cloud space. He has previously spoken at IndicThreads, CloudCamp and other cloud related events.
[Presented at All Things Open 2015 in Raleigh, NC, USA]
OpenStack is one of the fastest-growing and most exciting open source projects of our time. OpenStack has drawn together technologists from all over the world to create a cloud operating system and a huge, diverse community behind it. This talk will provide an introduction to OpenStack for newcomers to the project or those who just want to know more. We’ll take a brief look at OpenStack’s history, get a technical overview of the project, learn how to contribute, and check out a few emerging trends and hot topics in the OpenStack world.
RightScale User Conference: Why RightScale?Erik Osterman
RightScale provides a framework for operations that standardizes infrastructure management and allows operations to evolve alongside engineering. It treats infrastructure like software development with reusable components, simplifying operations and reducing technical debt. This framework allows organizations to build infrastructure consistently across clouds, commoditize resources, and empower engineers to take on operational roles through a modern DevOps approach.
LinuxFest NW 2013: Hitchhiker's Guide to Open Source Cloud ComputingMark Hinkle
Presented on April 27th, 2013 at LinuxFest NW
Imagine it’s eight o’clock on a Thursday morning and you awake to see a bulldozer out your window ready to plow over your data center. Normally you may wish to consult the Encyclopedia Galáctica to discern the best course of action, but your copy is likely out of date. And while the Hitchhiker’s Guide to the Galaxy (HHGTTG) is a wholly remarkable book, it doesn’t cover the nuances of cloud computing. That’s why you need the Hitchhiker’s Guide to Cloud Computing (HHGTCC), or at least to attend this talk to understand the state of open source cloud computing. Specifically, this talk will cover infrastructure-as-a-service, platform-as-a-service and developments in big data, and how to take advantage of these technologies more effectively using open source software. Technologies that will be covered in this talk include Apache CloudStack, Chef, CloudFoundry, NoSQL, OpenStack, Puppet and many more.
Specific topics for discussion will include:
Infrastructure-as-a-Service - The Systems Cloud - Get a comparison of the open source cloud platforms including OpenStack, Apache CloudStack, Eucalyptus and OpenNebula.
Platform-as-a-Service - The Developers Cloud - Find out what tools are available to build portable auto-scaling applications including CloudFoundry, OpenShift, Stackato and more.
Data-as-a-Service - The Analytics Cloud - Want to figure out the who, what, where, when and why of big data? You get an overview of open source NoSQL databases and technologies like MapReduce to help crunch massive data sets in the cloud.
Finally, you'll get an overview of the tools that can help you really take advantage of the cloud. Want to auto-scale virtual machines to serve millions of web pages, or automate the configuration of cloud computing environments? You'll learn how to combine these tools to provide continuous deployment systems that will help you earn DevOps cred in any data center.
[Finally, for those of you that are Douglas Adams fans please accept the deepest apologies for bad analogies to the HHGTTG.]
Lessons Learned Running The Largest OpenStack CloudsKenneth Hui
Presentation at OpenStack Days Mountain West sharing lessons Rackspace has learned building and operating the world's largest OpenStack public cloud and some of the world's largest private clouds.
The challenge of application distribution - Introduction to Docker (2014 dec ...Sébastien Portebois
Live recording with the demos: https://www.youtube.com/watch?v=0XRcmJEiZOM
Contents
- The application distribution challenge
- The current solutions
- Introduction to Docker, Containers, and the Matrix from Hell
- Why people care: Separation of Concerns
- Technical Discussion
- Ecosystem, momentum
- How to build Docker images
- How to make containers talk to each other, how to handle data persistence
- Demo 1: isolation
- Demo 2: real case - installing Go Math! Academy, tail –f containers, unit tests
The document discusses how cloud services are impacting the work of Oracle technology experts. It notes that many database administrator and fusion middleware administrator roles will transition to cloud providers as more systems move to the cloud. It outlines a roadmap for technology experts that includes trialing cloud services, ongoing learning, and using hybrid cloud environments. It concludes that while some tasks will shift to cloud providers, technology experts still have opportunities consulting on cloud services, developing cloud software, and supporting hybrid environments that incorporate both cloud and on-premises systems.
The document discusses how cloud services are impacting the work of Oracle technology experts. It notes that many database administrator and fusion middleware administrator roles will transition to cloud providers as more systems move to the cloud. It outlines a roadmap for technology experts that includes trialing cloud services, ongoing learning, and adopting a hybrid approach using both on-premises and cloud systems. It concludes that while some tasks will shift to cloud providers, technology experts still have opportunities consulting on cloud services, developing cloud software, and supporting hybrid environments.
[Srijan Wednesday Webinars] How to Build a Cloud Native Platform for Enterpri...Srijan Technologies
Drupal has been a consistent leader in the Gartner Magic Quadrant for Web Content Management. However, enterprises leveraging Drupal have traditionally relied on PaaS providers for their hosting, scaling and lifecycle management. And that usually leads to enterprise applications being locked-in with a particular cloud or vendor.
As container and container orchestration technologies disrupt the cloud and platform landscape, there’s a clear way to avoid this state of affairs. In this webinar, we discuss why it's important to build a cloud-native Drupal platform, and exactly how to do that.
Join the webinar to understand how you can avoid vendor lock-in, and create a secure platform to manage, operate and scale your Drupal applications in a multi-cloud portable manner.
Key Takeaways:
- Why you need a cloud-native Drupal platform and how to build one
- How to craft an idiomatic development workflow
- Understanding infrastructure and cloud engineering - under the hood
- Demystifying the art and science of Docker and Kubernetes: deep dive into scaling the LAMP stack
- Exploring cost optimization and cloud governance
- Understand portability of applications
- A hands-on demo of how the platform works
Flowcon (added to for CMG) Keynote talk on how Speed Wins and how Netflix is ...Adrian Cockcroft
Flowcon keynote was a few days before CMG, a few tweaks and some extra content added at the start and end. Opening Keynote talk for both conferences on how Speed Wins and how Netflix is doing Continuous Delivery
This document provides an overview of a workshop on cloud native, capacity, performance and cost optimization tools and techniques. It begins with introducing the difference between a presentation and workshop. It then discusses introducing attendees, presenting on various cloud native topics like migration paths and operations tools, and benchmarking Cassandra performance at scale across AWS regions. The goal is to explore cloud native techniques while discussing specific problems attendees face.
Bottleneck analysis - Devopsdays Silicon Valley 2013Adrian Cockcroft
The document analyzes bottle delivery response time data over various intervals. Summary statistics show the response times have a mean of 3.086 seconds and standard deviation of 1.94 seconds. A chp analysis reveals the system is well-behaved with low lock contention.
Netflix Global Applications - NoSQL Search RoadshowAdrian Cockcroft
This document summarizes Netflix's approach to building cloud native applications. It discusses how Netflix uses microservices, replicates components across availability zones, and implements automated testing like Chaos Monkey to make applications resilient to failures. It also describes how Netflix uses Apache Cassandra and other open source tools to build highly available storage and handle large volumes of data in the cloud.
A collection of information taken from previous presentations that was used as drill down for supporting discussion of specific topics during the tutorial.
Same basic flow as the keynote, but with a lot more detail, and we had a lot more interactive discussion rather than a presentation format. See part 2 for some more specific detail and links to other presentations.
1) Cloud native applications are built to take advantage of cloud computing resources like dynamically provisioned micro-services and distributed ephemeral components.
2) Netflix has transitioned to being a cloud native application built on an open source platform using AWS for scalable infrastructure, but also uses other providers for services not fully supported by AWS like content delivery and DNS.
3) What has changed is developers are freed from being the bottleneck through decentralization and automation of operations, allowing for greater agility, innovation, and business competitiveness in the cloud native model.
Adrian Cockcroft discusses the challenges of building reliable cloud services in an imperfect environment. He describes Netflix's approach of using microservices, continuous delivery, and automation to create stability. Cockcroft also introduces NetflixOSS, an open source platform that provides libraries and tools to help other companies adopt this "cloud native" architecture. The talk outlines opportunities to improve portability and foster an ecosystem around NetflixOSS.
The document discusses Netflix's use of open source technologies in its cloud architecture. It summarizes how Netflix leverages open source software to build cloud native applications that are highly scalable and available on AWS. Key aspects include building stateless microservices, using Cassandra for data storage in a quorum across multiple availability zones, and tools like Edda for configuration management and monitoring. The document advocates for open sourcing Netflix's best practices to help drive innovation.
Introduction to the Netflix Open Source Software project, explains why Netflix is doing this, how all the parts fit together and what is planned to come next. Presented at the inaugural NetflixOSS Meetup February 6th 2013 at Netflix headquarters in Los Gatos.
AWS Re:Invent - High Availability Architecture at NetflixAdrian Cockcroft
Slides from my talk at AWS Re:Invent November 2012. Describes the architecture, how to make highly available application code and data stores, a taxonomy of failure modes, and actual failures and effects. Ends with a summary of @NetflixOSS projects so others can easily leverage this architecture.
Architecture talk aimed at a well informed developer audience (i.e. QConSF Real Use Cases for NoSQL track), focused mainly on availability. Skips the Netflix cloud migration stuff that is in other talks.
Summary of past Cassandra benchmarks performed by Netflix and description of how Netflix uses Cassandra interspersed with a live demo automated using Jenkins and Jmeter that created two 12 node Cassandra clusters from scratch on AWS, one with regular disks and one with SSDs. Both clusters were scaled up to 24 nodes each during the demo.
This document provides an overview of a presentation on cloud architecture and anti-architecture patterns. The presentation discusses moving a company's primary data store from a centralized SQL database to a distributed Cassandra database in the cloud. An initial prototype backup solution was overengineered, becoming complex and taking too long to implement fully. This highlighted the importance of defining anti-architecture constraints upfront to guide development in a simpler direction. The presentation concludes with a discussion of differences between the company's existing datacenter architecture and goals for a cloud architecture, focusing on replacing centralized components with distributed and decoupled alternatives.
Cloud Architecture Tutorial - Running in the Cloud (3of3)Adrian Cockcroft
Part 3 of the talk covers how to transition to cloud, how to bootstrap developers, how to run cloud services including Cassandra, capacity planning and workload analysis, and organizational structure
Slides from QConSF Nov 19th, 2011 focusing this time on describing the globally distributed and scaled industrial strength Java Platform as a Service that Netflix has built and run on top of AWS and Cassandra. Parts of that platform are being released as open source - Curator, Priam and Astyanax.
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...Adrian Cockcroft
The document discusses Netflix replacing its Oracle database with Apache Cassandra on AWS to support its transition to becoming a global cloud-based service. Key points include migrating data from Oracle to Cassandra for improved scalability and availability across regions; using AWS services like S3, EC2 and SimpleDB during the transition; and addressing challenges around backups, disaster recovery and analytics with the new architecture.
Migrating Netflix from Datacenter Oracle to Global CassandraAdrian Cockcroft
Netflix is migrating its datacenter infrastructure from Oracle databases to a globally distributed Apache Cassandra database on AWS. This will allow Netflix to scale more easily and deploy new features faster without being limited by the capacity of its own datacenters. The migration involves transitionally replicating data between Oracle and AWS services like SimpleDB while new services are deployed directly on Cassandra. This will cut Netflix's dependence on its existing datacenters and allow it to fully leverage the elasticity of the public cloud.
Netflix has moved nearly 100% of its infrastructure to the AWS public cloud to gain the scalability and agility needed to support its rapid international expansion and unpredictable growth. Netflix leverages AWS's massive global infrastructure and services like EC2, S3, and ELB to easily scale its streaming workload from thousands to millions of customers per hour. By using the cloud, Netflix avoids the lengthy process of building its own datacenters and can instead focus on delivering new features to customers around the world.
This document summarizes a presentation on performance architecture for cloud computing given by Adrian Cockcroft of Netflix. Some key points:
- Netflix has moved nearly 100% of its infrastructure to the cloud on Amazon Web Services (AWS) to gain agility and scale. However, tools built for data centers do not work well for the cloud.
- Netflix uses a model-driven architecture approach where everything is pre-baked into Amazon Machine Images (AMIs) and managed by auto-scalers. This enables automated security and performance monitoring at scale.
- Capacity planning is challenging in the cloud where capacity is expensive and inflexible. Traditional metrics like utilization are not very useful. Netflix has developed its
Combining Lexical and Semantic Search with Milvus 2.5Zilliz
In short, lexical search is a way to search your documents based on the keywords they contain, in contrast to semantic search, which compares the similarity of embeddings. We’ll be covering:
Why, when, and how should you use lexical search
What is the BM25 distance metric
How exactly does Milvus 2.5 implement lexical search
How to build an improved hybrid lexical + semantic search with Milvus 2.5
Caching for Performance Masterclass: Caching at ScaleScyllaDB
Weighing caching considerations for use cases with different technical requirements and growth expectations.
- Request coalescing
- Negative sharding
- Rate limiting
- Sharding and scaling
Caching for Performance Masterclass: The In-Memory DatastoreScyllaDB
Understanding where in-memory data stores help most and where teams get into trouble.
- Where in the stack to cache
- Memcached as a tool
- Modern cache primitives
Not a Kubernetes fan? The state of PaaS in 2025Anthony Dahanne
Kubernetes won the container orchestration war. But has it made deploying your apps easier?
Let's explore some of Kubernetes extensive app developer tooling, but mainly what the PaaS space looks like in 2025; 18 years after Heroku made it popular.
Is Heroku still around? What about Cloud Foundry?
And what are those newcomers (fly.io, railway, porter.sh, etc.) worth?
Did the Cloud giants replace them all?
Mastering ChatGPT & LLMs for Practical Applications: Tips, Tricks, and Use CasesSanjay Willie
Our latest session with Astiostech covered how to unlock the full potential of ChatGPT and LLMs for real-world use!
✅ Key Takeaways:
🔹 Effective Prompting: Crafting context-specific, high-quality prompts for optimal AI responses.
🔹 Advanced ChatGPT Features: Managing system prompts, conversation memory, and file uploads.
🔹 Optimizing AI Outputs: Refining responses, handling large texts, and knowing when fine-tuning is needed.
🔹 Competitive Insights: Exploring how ChatGPT compares with other AI tools.
🔹 Business & Content Use Cases: From summarization to SEO, sales, and audience targeting.
💡 The session provided hands-on strategies to make AI a powerful tool for content creation, decision-making, and business growth.
🚀 Are you using AI effectively in your workflow? Let’s discuss how it can improve efficiency and creativity!
#AI #ChatGPT #PromptEngineering #ArtificialIntelligence #LLM #Productivity #Astiostech
Bedrock Data Automation (Preview): Simplifying Unstructured Data ProcessingZilliz
Bedrock Data Automation (BDA) is a cloud-based service that simplifies the process of extracting valuable insights from unstructured content—such as documents, images, video, and audio. Come learn how BDA leverages generative AI to automate the transformation of multi-modal data into structured formats, enabling developers to build applications and automate complex workflows with greater speed and accuracy.
Getting Started with AWS - Enterprise Landing Zone for Terraform Learning & D...Chris Wahl
Recording: https://youtu.be/PASG0NTKUQA?si=1Ih7O9z0Lk0IzX9n
Welcome innovators! In this comprehensive tutorial, you will learn how to get started with AWS Cloud and Terraform to build an enterprise-like landing zone for a secure, low-cost environment to develop with Terraform. We'll guide you through setting up AWS Control Tower, Identity and Access Management, and creating a sandbox account, ensuring you have a safe and controlled area for learning and development. You'll also learn about budget management, single sign-on setup, and using AWS organizations for policy management. Plus, dive deep into Terraform basics, including setting up state management, migrating local state to remote state, and making resource modifications using your new infrastructure as code skills. Perfect for beginners looking to master AWS and Terraform essentials!
UiPath Automation Developer Associate Training Series 2025 - Session 1DianaGray10
Welcome to UiPath Automation Developer Associate Training Series 2025 - Session 1.
In this session, we will cover the following topics:
Introduction to RPA & UiPath Studio
Overview of RPA and its applications
Introduction to UiPath Studio
Variables & Data Types
Control Flows
You are requested to finish the following self-paced training for this session:
Variables, Constants and Arguments in Studio 2 modules - 1h 30m - https://academy.uipath.com/courses/variables-constants-and-arguments-in-studio
Control Flow in Studio 2 modules - 2h 15m - https://academy.uipath.com/courses/control-flow-in-studio
⁉️ For any questions you may have, please use the dedicated Forum thread. You can tag the hosts and mentors directly and they will reply as soon as possible.
Blockchain is revolutionizing industries by enhancing security, transparency, and automation. From supply chain management and finance to healthcare and real estate, blockchain eliminates inefficiencies, prevents fraud, and streamlines operations.
What You'll Learn in This Presentation:
1. How blockchain enables real-time tracking & fraud prevention
2. The impact of smart contracts & decentralized finance (DeFi)
3. Why businesses should adopt secure and automated blockchain solutions
4. Real-world blockchain applications across multiple industries
Explore the future of blockchain and its practical benefits for businesses!
Predictive vs. Preventive Maintenance — Which One is Right for Your FactoryDiagsense ltd
Efficient maintenance is the backbone of any manufacturing operation. It ensures that machinery runs smoothly, minimizes downtime and optimizes overall productivity. Previously, factories relied on preventive maintenance, but with advancements in technology, Manufacturing PdM Solutions are gaining traction. The question is: which one is the right fit for your factory? Let’s break it down.
UiPath Document Understanding - Generative AI and Active learning capabilitiesDianaGray10
This session focuses on Generative AI features and the Active Learning modern experience with Document Understanding.
Topics Covered:
Overview of Document Understanding
How Generative Annotation works?
What is Generative Classification?
How to use Generative Extraction activities?
What is Generative Validation?
How does the Active Learning modern experience accelerate model training?
Q/A
❓ If you have any questions or feedback, please refer to the "Women in Automation 2025" dedicated Forum thread. You can find there extra details and updates.
5 Best Agentic AI Frameworks for 2025.pdfSoluLab1231
AI chatbots use generative AI to develop answers from a single interaction. When someone asks a question, the chatbot responds using natural language processing (NLP). Agentic AI, the next wave of artificial intelligence, goes beyond this by solving complicated multistep problems on its own using advanced reasoning and iterative planning. Additionally, it is expected to improve operations and productivity across all sectors.
AI Trends and Fun Demos – Sotheby’s Rehoboth PresentationEthan Holland
Ethan B. Holland explores the impact of artificial intelligence on real estate and digital transformation. Covering key AI trends such as multimodal AI, agency, co-pilots, and AI-powered computer usage, the document highlights how emerging technologies are reshaping industries. It includes real-world demonstrations of AI in action, from automated real estate insights to AI-generated voice and video applications. With expertise in digital transformation, Ethan shares insights from his work optimizing workflows with AI tools, automation, and large language models. This presentation is essential for professionals seeking to understand AI’s role in business, automation, and real estate.
Webinar: LF Energy GEISA: Addressing edge interoperability at the meterDanBrown980551
This webinar will introduce the Grid Edge Security and Interoperability Alliance, or GEISA, an effort within LF Energy to address application interoperability at the very edge of the utility network: meters and other distribution automation devices. Over the last decade platform manufacturers have introduced the ability to run applications on electricity meters and other edge devices. Unfortunately, while many of these efforts have been built on Linux, they haven’t been interoperable. APIs and execution environment have varied from one manufacturer to the next making it impossible for utilities to obtain applications that they can run across a fleet of different devices. For utilities that want to minimize their supply chain risk by obtaining equipment from multiple suppliers, they are forced to run and maintain multiple separate management systems. Applications available for one device may need to be ported to run on another, or they may not be available at all.
GEISA addresses this by creating a vendor neutral specification for utility edge computing environments. This webinar will discuss why GEISA is important to utilities, the specific issues GEISA will solve and the new opportunities it creates for utilities, platform vendors, and application vendors.
How to teach M365 Copilot and M365 Copilot Chat prompting to your colleagues. Presented at the Advanced Learning Institute's "Internal Communications Strategies with M365" event on February 27, 2025. Intended audience: Internal Communicators, User Adoption Specialists, IT.
Understanding Traditional AI with Custom Vision & MuleSoft.pptxshyamraj55
This presentation features Atul, a Senior Solution Architect at NTT DATA, sharing his journey into traditional AI using Azure's Custom Vision tool. He discusses how AI mimics human thinking and reasoning, differentiates between predictive and generative AI, and demonstrates a real-world use case. The session covers the step-by-step process of creating and training an AI model for image classification and object detection—specifically, an ad display that adapts based on the viewer's gender. Atulavan highlights the ease of implementation without deep software or programming expertise. The presentation concludes with a Q&A session addressing technical and privacy concerns.
Revolutionizing Field Service: How LLMs Are Powering Smarter Knowledge Access...Earley Information Science
Revolutionizing Field Service with LLM-Powered Knowledge Management
Field service technicians need instant access to accurate repair information, but outdated knowledge systems often create frustrating delays. Large Language Models (LLMs) are changing the game—enhancing knowledge retrieval, streamlining troubleshooting, and reducing technician dependency on senior staff.
In this webinar, Seth Earley and industry experts Sanjay Mehta, and Heather Eisenbraun explore how LLMs and Retrieval-Augmented Generation (RAG) are transforming field service operations. Discover how AI-powered knowledge management is improving efficiency, reducing downtime, and elevating service quality.
LLMs for Instant Knowledge Retrieval – How AI-driven search dramatically cuts troubleshooting time.
Structured Data & AI – Why high-quality, organized knowledge is essential for LLM success.
Real-World Implementation – Lessons from deploying LLM-powered knowledge tools in field service.
Business Impact – How AI reduces service delays, optimizes workflows, and enhances technician productivity.
Empower your field service teams with AI-driven knowledge access. Watch the webinar to see how LLMs are revolutionizing service efficiency.
Revolutionizing Field Service: How LLMs Are Powering Smarter Knowledge Access...Earley Information Science
Netflix Architecture Tutorial at Gluecon
1. Cloud Architecture Tutorial
Constructing Cloud Architecture the Netflix Way
Gluecon, May 23rd, 2012
Adrian Cockcroft, @adrianco, #netflixcloud
http://www.linkedin.com/in/adriancockcroft
3. Tutorial Abstract – Set Context
• Dispensing with the usual questions: “Why Netflix, why cloud, why AWS?” as they are old hat now.
• This tutorial explains how developers use the Netflix cloud, and how it is built and operated.
• The real meat of the tutorial comes when we look at how to construct an application with a host of important properties: elastic, dynamic, scalable, agile, fast, cheap, robust, durable, observable, secure. Over the last three years Netflix has figured out cloud based solutions with these properties, deployed them globally at large scale and refined them into a global Java oriented Platform as a Service. The PaaS is based on low cost open source building blocks such as Apache Tomcat, Apache Cassandra, and Memcached. Components of this platform are in the process of being open-sourced by Netflix, so that other companies can get a start on building their own customized PaaS that leverages advanced features of AWS and supports rapid agile development.
• The architecture is described in terms of anti-patterns - things to avoid in the datacenter to cloud transition. A scalable global persistence tier based on Cassandra provides a highly available and durable under-pinning. Lessons learned will cover solutions to common problems, availability and robustness, observability. Attendees should leave the tutorial with a clear understanding of what is different about the Netflix cloud architecture, how it empowers and supports developers, and a set of flexible and scalable open source building blocks that can be used to construct their own cloud platform.
4. Presentation vs. Tutorial
• Presentation
– Short duration, focused subject
– One presenter to many anonymous audience
– A few questions at the end
• Tutorial
– Time to explore in and around the subject
– Tutor gets to know the audience
– Discussion, rat-holes, “bring out your dead”
5. Cloud Tutorial Sections
Intro: Who are you, what are your questions?
Part 1 – Writing and Performing: Developer Viewpoint
Part 2 – Running the Show: Operator Viewpoint
Part 3 – Making the Instruments: Builder Viewpoint
6. Adrian Cockcroft
• Director, Architecture for Cloud Systems, Netflix Inc.
– Previously Director for Personalization Platform
• Distinguished Availability Engineer, eBay Inc. 2004-7
– Founding member of eBay Research Labs
• Distinguished Engineer, Sun Microsystems Inc. 1988-2004
– 2003-4 Chief Architect High Performance Technical Computing
– 2001 Author: Capacity Planning for Web Services
– 1999 Author: Resource Management
– 1995 & 1998 Author: Sun Performance and Tuning
– 1996 Japanese Edition of Sun Performance and Tuning
• SPARC & Solarisパフォーマンスチューニング (サンソフトプレスシリーズ)
• Heavy Metal Bass Guitarist in “Black Tiger” 1980-1982
– Influenced by Van Halen, Yesterday & Today, AC/DC
• More
– Twitter @adrianco – Blog http://perfcap.blogspot.com
– Presentations at http://www.slideshare.net/adrianco
7. Attendee Introductions
• Who are you, where do you work
• Why are you here today, what do you need
• “Bring out your dead”
– Do you have a specific problem or question?
– One sentence elevator pitch
• What instrument do you play?
15. Keeping up with Developer Trends
In production at Netflix:
• Big Data/Hadoop: 2009
• Cloud: 2009
• Application Performance Management: 2010
• Integrated DevOps Practices: 2010
• Continuous Integration/Delivery: 2010
• NoSQL: 2010
• Platform as a Service: 2010
• Social coding, open development/github: 2011
17. Portability vs. Functionality
• Portability – the Operations focus
– Avoid vendor lock-in
– Support datacenter based use cases
– Possible operations cost savings
• Functionality – the Developer focus
– Less complex test and debug, one mature supplier
– Faster time to market for your products
– Possible developer cost savings
18. Portable PaaS
• Portable IaaS Base - some AWS compatibility
– Eucalyptus – AWS licensed compatible subset
– CloudStack – Citrix Apache project
– OpenStack – Rackspace, Cloudscaling, HP etc.
• Portable PaaS
– Cloud Foundry - run it yourself in your DC
– AppFog and Stackato – Cloud Foundry/Openstack
– Vendor options: Rightscale, Enstratus, Smartscale
19. Functional PaaS
• IaaS base - all the features of AWS
– Very large scale, mature, global, evolving rapidly
– ELB, Autoscale, VPC, SQS, EIP, EMR, DynamoDB etc.
– Large files and multipart writes in S3
• Functional PaaS – based on Netflix features
– Very large scale, mature, flexible, customizable
– Asgard console, Monkeys, Big data tools
– Cassandra/Zookeeper data store automation
20. Developers choose Functional
Don’t let the roadie write the set list! (yes you do need all those guitars on tour…)
21. Freedom and Responsibility
• Developers leverage cloud to get freedom
– Agility of a single organization, no silos
• But now developers are responsible
– For compliance, performance, availability etc.
“As far as my rehab is concerned, it is within my ability to change and change for the better - Eddie Van Halen”
22. Amazon Cloud Terminology Reference
See http://aws.amazon.com/ This is not a full list of Amazon Web Service features
• AWS – Amazon Web Services (common name for Amazon cloud)
• AMI – Amazon Machine Image (archived boot disk, Linux, Windows etc. plus application code)
• EC2 – Elastic Compute Cloud
– Range of virtual machine types m1, m2, c1, cc, cg. Varying memory, CPU and disk configurations.
– Instance – a running computer system. Ephemeral, when it is de-allocated nothing is kept.
– Reserved Instances – pre-paid to reduce cost for long term usage
– Availability Zone – datacenter with own power and cooling hosting cloud instances
– Region – group of Avail Zones – US-East, US-West, EU-Eire, Asia-Singapore, Asia-Japan, SA-Brazil, US-Gov
• ASG – Auto Scaling Group (instances booting from the same AMI)
• S3 – Simple Storage Service (http access)
• EBS – Elastic Block Store (network disk filesystem can be mounted on an instance)
• RDS – Relational Database Service (managed MySQL master and slaves)
• DynamoDB/SDB – Simple Data Base (hosted http based NoSQL datastore, DynamoDB replaces SDB)
• SQS – Simple Queue Service (http based message queue)
• SNS – Simple Notification Service (http and email based topics and messages)
• EMR – Elastic Map Reduce (automatically managed Hadoop cluster)
• ELB – Elastic Load Balancer
• EIP – Elastic IP (stable IP address mapping assigned to instance or ELB)
• VPC – Virtual Private Cloud (single tenant, more flexible network and security constructs)
• DirectConnect – secure pipe from AWS VPC to external datacenter
• IAM – Identity and Access Management (fine grain role based security keys)
23. Netflix Deployed on AWS
[Diagram: service areas deployed on AWS, by year]
• Content (2009): Content Management; EC2 Encoding; S3 Petabytes; CDNs; ISPs; Terabits; Customers
• Logs (2009): S3 Terabytes; EMR; Hive & Pig; Business Intelligence
• Play (2010): DRM; CDN routing; Bookmarks; Logging
• WWW (2010): Sign-Up; Search; Movie Choosing; Ratings
• API (2010): Metadata; Device Config; TV Movie Choosing; Social Facebook
• CS (2011): International CS lookup; Diagnostics & Actions; Customer Call Log; CS Analytics
24. Datacenter to Cloud Transition Goals
“Go ahead and Jump – Van Halen”
• Faster
– Lower latency than the equivalent datacenter web pages and API calls
– Measured as mean and 99th percentile
– For both first hit (e.g. home page) and in-session hits for the same user
• Scalable
– Avoid needing any more datacenter capacity as subscriber count increases
– No central vertically scaled databases
– Leverage AWS elastic capacity effectively
• Available
– Substantially higher robustness and availability than datacenter services
– Leverage multiple AWS availability zones
– No scheduled down time, no central database schema to change
• Productive
– Optimize agility of a large development team with automation and tools
– Leave behind complex tangled datacenter code base (~8 year old architecture)
– Enforce clean layered interfaces and re-usable components
25. Datacenter Anti-Patterns
What do we currently do in the datacenter that prevents us from meeting our goals?
“Me Wise Magic – Van Halen”
26. Netflix Datacenter vs. Cloud Arch
Datacenter | Cloud
Central SQL Database | Distributed Key/Value NoSQL
Sticky In-Memory Session | Shared Memcached Session
Chatty Protocols | Latency Tolerant Protocols
Tangled Service Interfaces | Layered Service Interfaces
Instrumented Code | Instrumented Service Patterns
Fat Complex Objects | Lightweight Serializable Objects
Components as Jar Files | Components as Services
27. The Central SQL Database
• Datacenter has a central database
– Everything in one place is convenient until it fails
• Schema changes require downtime
– Customers, movies, history, configuration
Anti-pattern impacts scalability, availability
28. The Distributed Key-Value Store
• Cloud has many key-value data stores
– More complex to keep track of, do backups etc.
– Each store is much simpler to administer (DBA)
– Joins take place in java code
– No schema to change, no scheduled downtime
• Minimum Latency for Simple Requests
– Memcached is dominated by network latency <1ms
– Cassandra cross zone replication around one millisecond
– DynamoDB replication and auth overheads around 5ms
– SimpleDB higher replication and auth overhead >10ms
29. The Sticky Session
• Datacenter Sticky Load Balancing
– Efficient caching for low latency
– Tricky session handling code
• Encourages concentrated functionality
– one service that does everything
– Middle tier load balancer had issues in practice
Anti-pattern impacts productivity, availability
30. Shared Session State
• Elastic Load Balancer
– We don’t use the cookie based routing option
– External “session caching” with memcached
• More flexible fine grain services
– Any instance can serve any request
– Works better with auto-scaled instance counts
31. Chatty Opaque and Brittle Protocols
• Datacenter service protocols
– Assumed low latency for many simple requests
• Based on serializing existing java objects
– Inefficient formats
– Incompatible when definitions change
Anti-pattern causes productivity, latency and availability issues
32. Robust and Flexible Protocols
• Cloud service protocols
– JSR311/Jersey is used for REST/HTTP service calls
– Custom client code includes service discovery
– Support complex data types in a single request
• Apache Avro
– Evolved from Protocol Buffers and Thrift
– Includes JSON header defining key/value protocol
– Avro serialization is half the size and several times faster than Java serialization, more work to code
33. Persisted Protocols
• Persist Avro in Memcached
– Save space/latency (zigzag encoding, half the size)
– New keys are ignored
– Missing keys are handled cleanly
• Avro protocol definitions
– Less brittle across versions
– Can be written in JSON or generated from POJOs
– It’s hard, needs better tooling
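The zigzag encoding mentioned on slide 33 can be sketched as follows. This is a minimal illustration of the general technique (mapping signed integers to small unsigned values so that a variable-length encoding stays short), not Netflix's or Avro's actual code; the class and method names are hypothetical.

```java
// Hypothetical helper illustrating zigzag encoding of 32-bit integers.
final class ZigZag {
    private ZigZag() {}

    // Interleaves negative and positive values: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    static int encode(int n) {
        return (n << 1) ^ (n >> 31);
    }

    // Inverse of encode: the low bit carries the sign, the rest the magnitude.
    static int decode(int z) {
        return (z >>> 1) ^ -(z & 1);
    }

    // Bytes needed to store the value as a base-128 varint (7 payload bits per byte),
    // treating the int as unsigned.
    static int varintLength(int z) {
        int len = 1;
        while ((z & ~0x7F) != 0) {
            z >>>= 7;
            len++;
        }
        return len;
    }
}
```

With this mapping a small negative number such as -1 needs one varint byte after encoding, where the raw two's-complement value would need five.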
34. Tangled Service Interfaces
• Datacenter implementation is exposed
– Oracle SQL queries mixed into business logic
• Tangled code
– Deep dependencies, false sharing
• Data providers with sideways dependencies
– Everything depends on everything else
Anti-pattern affects productivity, availability
35. Untangled Service Interfaces
• New Cloud Code With Strict Layering
– Compile against interface jar
– Can use spring runtime binding to enforce
– Fine grain services as components
• Service interface is the service
– Implementation is completely hidden
– Can be implemented locally or remotely
– Implementation can evolve independently
36. Untangled Service Interfaces
Poundcake – Van Halen
Two layers:
• SAL - Service Access Library
– Basic serialization and error handling
– REST or POJO’s defined by data provider
• ESL - Extended Service Library
– Caching, conveniences, can combine several SALs
– Exposes faceted type system (described later)
– Interface defined by data consumer in many cases
38. Service Architecture Patterns
• Internal Interfaces Between Services
– Common patterns as templates
– Highly instrumented, observable, analytics
– Service Level Agreements – SLAs
• Library templates for generic features
– Instrumented Netflix Base Servlet template
– Instrumented generic client interface template
– Instrumented S3, SimpleDB, Memcached clients
39. Service Request Instruments Every Step in the call
[Diagram: timestamps recorded on each side of a service call]
• CLIENT
– Request start timestamp, request end timestamp
– Client outbound serialize start and end timestamps
– Client network send timestamp
– Client network receive timestamp
– Client inbound deserialize start and end timestamps
• SERVICE
– Service network receive timestamp
– Service inbound serialize start and end timestamps
– Execute request start timestamp, execute request end timestamp
– Service outbound serialize start and end timestamps
– Service network send timestamp
40. Boundary Interfaces
• Isolate teams from external dependencies
– Fake SAL built by cloud team
– Real SAL provided by data provider team later
– ESL built by cloud team using faceted objects
• Fake data sources allow development to start
– e.g. Fake Identity SAL for a test set of customers
– Development solidifies dependencies early
– Helps external team provide the right interface
41. One Object That Does Everything
Can’t Get This Stuff No More – Van Halen
• Datacenter uses a few big complex objects
– Good choice for a small team and one instance
– Problematic for large teams and many instances
• False sharing causes tangled dependencies
– Movie and Customer objects are foundational
– Unproductive re-integration work
Anti-pattern impacting productivity and availability
42. An Interface For Each Component
• Cloud uses faceted Video and Visitor
– Basic types hold only the identifier
– Facets scope the interface you actually need
– Each component can define its own facets
• No false-sharing and dependency chains
– Type manager converts between facets as needed
– video.asA(PresentationVideo) for www
– video.asA(MerchableVideo) for middle tier
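One way the faceted types on slide 42 might look in Java is sketched below. The slides only name `video.asA(...)`, `PresentationVideo`, `MerchableVideo` and a "type manager"; everything else here (the `TypeManager.register` and `convert` methods, the fields on each facet, the factory-lookup design) is an assumption made for illustration, not the Netflix implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Basic type: holds only the identifier.
final class Video {
    final long id;
    Video(long id) { this.id = id; }

    // Ask the type manager for the facet this caller actually needs.
    <F> F asA(Class<F> facet) { return TypeManager.convert(this, facet); }
}

// Two facets, each of which could be owned by a different component (fields are hypothetical).
final class PresentationVideo {
    final String title;
    PresentationVideo(String title) { this.title = title; }
}

final class MerchableVideo {
    final int rank;
    MerchableVideo(int rank) { this.rank = rank; }
}

// Hypothetical type manager: maps a facet class to a factory that builds it from a Video.
final class TypeManager {
    private static final Map<Class<?>, Function<Video, ?>> FACTORIES = new HashMap<>();

    private TypeManager() {}

    static <F> void register(Class<F> facet, Function<Video, F> factory) {
        FACTORIES.put(facet, factory);
    }

    static <F> F convert(Video video, Class<F> facet) {
        Function<Video, ?> factory = FACTORIES.get(facet);
        if (factory == null) {
            throw new IllegalArgumentException("No facet registered: " + facet.getName());
        }
        return facet.cast(factory.apply(video));
    }
}
```

Because callers pass around a typed `Video` rather than a raw ID, a component that only registers and uses `MerchableVideo` never compiles against the presentation facet, which is the "no false sharing" point the slide makes.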
43. Stan Lanning’s Soap Box
Listen to the bearded guru…
• Business Level Object - Level Confusion
– Don’t pass around IDs when you mean to refer to the BLO
• Using Basic Types helps the compiler help you
– Compile time problems are better than run time problems
• More readable by people
– But beware that asA operations may be a lot of work
• Multiple-inheritance for Java?
– Kinda-sorta…
44. Model Driven Architecture
• Traditional Datacenter Practices
– Lots of unique hand-tweaked systems
– Hard to enforce patterns
– Some use of Puppet to automate changes
• Model Driven Cloud Architecture
– Perforce/Ivy/Jenkins based builds for everything
– Every production instance is a pre-baked AMI
– Every application is managed by an Autoscaler
Every change is a new AMI
45. Netflix PaaS Principles
• Maximum Functionality
– Developer productivity and agility
• Leverage as much of AWS as possible
– AWS is making huge investments in features/scale
• Interfaces that isolate Apps from AWS
– Avoid lock-in to specific AWS API details
• Portability is a long term goal
– Gets easier as other vendors catch up with AWS
46. Netflix Global PaaS Features
• Supports all AWS Availability Zones and Regions
• Supports multiple AWS accounts {test, prod, etc.}
• Cross Region/Acct Data Replication and Archiving
• Internationalized, Localized and GeoIP routing
• Security is fine grain, dynamic AWS keys
• Autoscaling to thousands of instances
• Monitoring for millions of metrics
• Productive for 100s of developers on one product
• 25M+ users USA, Canada, Latin America, UK, Eire
47. Basic PaaS Entities
• AWS Based Entities
– Instances and Machine Images, Elastic IP Addresses
– Security Groups, Load Balancers, Autoscale Groups
– Availability Zones and Geographic Regions
• Netflix PaaS Entities
– Applications (registered services)
– Clusters (versioned Autoscale Groups for an App)
– Properties (dynamic hierarchical configuration)
48. Core PaaS Services
• AWS Based Services
– S3 storage, to 5TB files, parallel multipart writes
– SQS – Simple Queue Service. Messaging layer.
• Netflix Based Services
– EVCache – memcached based ephemeral cache
– Cassandra – distributed persistent data store
• External Services
– GeoIP Lookup interfaced to a vendor
– Secure Keystore HSM
49. Instance Architecture
[Diagram: software layers on each instance]
• Linux Base AMI (CentOS or Ubuntu)
• Optional Apache frontend, memcached, non-java apps
• Java (JDK 6 or 7)
• Tomcat
– Application war file, base servlet, platform, interface jars for dependent services
– Healthcheck, status servlets, JMX interface, Servo autoscale
• Monitoring
– AppDynamics appagent monitoring
– AppDynamics machineagent
– Epic
• Log rotation to S3
• GC and thread dump logging
50. Security Architecture
• Instance Level Security baked into base AMI
– Login: ssh only allowed via portal (not between instances)
– Each app type runs as its own userid app{test|prod}
• AWS Security, Identity and Access Management
– Each app has its own security group (firewall ports)
– Fine grain user roles and resource ACLs
• Key Management
– AWS Keys dynamically provisioned, easy updates
– High grade app specific key management support
51. Continuous Integration / Release
Lightweight process scales as the organization grows
• No centralized two-week sprint/release “train”
• Thousands of builds a day, tens of releases
• Engineers release at their own pace
• Unit of release is a web service, over 200 so far…
• Dependencies handled as exceptions
52. Hello World?
Getting started for a new developer…
• Register the “helloadrian” app name in Asgard
• Get the example helloworld code from perforce
• Edit some properties to update the name etc.
• Check-in the changes
• Clone a Jenkins build job
• Build the code
• Bake the code into an Amazon Machine Image
• Use Asgard to setup an AutoScaleGroup with the AMI
• Check instance healthcheck is “Up” using Asgard
• Hit the URL to get “HTTP 200, Hello” back
53. Register new application name
naming rules: all lower case with underscore, no spaces or dashes
59. Portals and Explorers
• Netflix Application Console (Asgard/NAC)
– Primary AWS provisioning/config interface
• AWS Usage Analyzer
– Breaks down costs by application and resource
• Cassandra Explorer
– Browse clusters, keyspaces, column families
• Base Server Explorer
– Browse service endpoints configuration, perf
61. Platform Services
• Discovery – service registry for “Applications”
• Introspection – Entrypoints
• Cryptex – Dynamic security key management
• Geo – Geographic IP lookup
• Configuration Service – Dynamic properties
• Localization – manage and lookup local translations
• Evcache – ephemeral volatile cache
• Cassandra – Cross zone/region distributed data store
• Zookeeper – Distributed Coordination (Curator)
• Various proxies – access to old datacenter stuff
62. Introspection - Entrypoints
• REST API for tools, apps, explorers, monkeys…
– E.g. GET /REST/v1/instance/$INSTANCE_ID
• AWS Resources
– Autoscaling Groups, EIP Groups, Instances
• Netflix PaaS Resources
– Discovery Applications, Clusters of ASGs, History
• Full History of all Resources
– Supports Janitor Monkey cleanup of unused resources
63. Entrypoints Queries
MongoDB used for low traffic complex queries against complex objects
Description | Range expression
Find all active instances. | all()
Find all instances associated with a group name. | %(cloudmonkey)
Find all instances associated with a discovery group. | /^cloudmonkey$/discovery()
Find all auto scale groups with no instances. | asg(),-has(INSTANCES;asg())
How many instances are not in an auto scale group? | count(all(),-info(eval(INSTANCES;asg())))
What groups include an instance? | *(i-4e108521)
What auto scale groups and elastic load balancers include an instance? | filter(TYPE;asg,elb;*(i-4e108521))
What instance has a given public ip? | filter(PUBLIC_IP;174.129.188.{0..255};all())
64. Metrics Framework
• System and Application
– Collection, Aggregation, Querying and Reporting
– Non-blocking logging, avoids log4j lock contention
– Honu-Streaming -> S3 -> EMR -> Hive
• Performance, Robustness, Monitoring, Analysis
– Tracers, Counters – explicit code instrumentation log
– SLA – service level response time percentiles
– Servo annotated JMX extract to Cloudwatch
• Latency Testing and Inspection Infrastructure
– Latency Monkey injects random delays and errors into service responses
– Base Server Explorer Inspect client timeouts
– Global property management to change client timeouts
65. Interprocess Communication
• Discovery Service registry for “applications”
– “here I am” call every 30s, drop after 3 missed
– “where is everyone” call
– Redundant, distributed, moving to Zookeeper
• NIWS – Netflix Internal Web Service client
– Software Middle Tier Load Balancer
– Failure retry moves to next instance
– Many options for encoding, etc.
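The heartbeat rule on slide 65 ("here I am" every 30s, drop after 3 missed) can be illustrated with a small in-memory registry. This is a sketch under stated assumptions, not the Netflix Discovery service: the class name, method names and the choice to evict an instance once its last heartbeat is more than 3 x 30s old are all hypothetical. Time is passed in explicitly so the behaviour is easy to check.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical lease-based registry: instances renew with heartbeats and expire when silent.
final class DiscoveryRegistry {
    static final long HEARTBEAT_MILLIS = 30_000L; // "here I am" every 30s
    static final int MISSED_LIMIT = 3;            // drop after 3 missed heartbeats

    // instance id -> time of last heartbeat, in milliseconds
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    // "here I am": register or renew an instance.
    void heartbeat(String instanceId, long nowMillis) {
        lastSeen.put(instanceId, nowMillis);
    }

    // "where is everyone": evict stale instances, then return the live ones in sorted order.
    List<String> liveInstances(long nowMillis) {
        long maxAge = HEARTBEAT_MILLIS * MISSED_LIMIT;
        lastSeen.values().removeIf(seen -> nowMillis - seen > maxAge);
        List<String> live = new ArrayList<>(lastSeen.keySet());
        Collections.sort(live);
        return live;
    }
}
```

A client-side load balancer in the style described for NIWS could call `liveInstances` periodically and, on a failed request, retry against the next entry in the returned list.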
66. Security Key Management
• AKMS
– Dynamic Key Management interface
– Update AWS keys at runtime, no restart
– All keys stored securely, none on disk or in AMI
• Cryptex - Flexible key store
– Low grade keys processed in client
– Medium grade keys processed by Cryptex service
– High grade keys processed by hardware (Ingrian)
67. AWS Persistence Services
• SimpleDB
– Got us started, migrated to Cassandra now
– NFSDB - Instrumented wrapper library
– Domain and Item sharding (workarounds)
• S3
– Upgraded/Instrumented JetS3t based interface
– Supports multipart upload and 5TB files
– Global S3 endpoint management
68. Netflix Platform Persistence
• Ephemeral Volatile Cache – evcache
– Discovery-aware memcached based backend
– Client abstractions for zone aware replication
– Option to write to all zones, fast read from local
• Cassandra
– Highly available and scalable (more later…)
• MongoDB
– Complex object/query model for small scale use
• MySQL
– Hard to scale, legacy and small relational models
69. Priam – Cassandra Automation
Available at http://github.com/netflix
• Netflix Platform Tomcat Code
• Zero touch auto-configuration
• State management for Cassandra JVM
• Token allocation and assignment
• Broken node auto-replacement
• Full and incremental backup to S3
• Restore sequencing from S3
• Grow/Shrink Cassandra “ring”
70. Astyanax
Available at http://github.com/netflix
• Cassandra java client
• API abstraction on top of Thrift protocol
• “Fixed” Connection Pool abstraction (vs. Hector)
– Round robin with Failover
– Retry-able operations not tied to a connection
– Netflix PaaS Discovery service integration
– Host reconnect (fixed interval or exponential backoff)
– Token aware to save a network hop – lower latency
– Latency aware to avoid compacting/repairing nodes – lower variance
• Batch mutation: set, put, delete, increment
• Simplified use of serializers via method overloading (vs. Hector)
• ConnectionPoolMonitor interface for counters and tracers
• Composite Column Names replacing deprecated SuperColumns
71. Astyanax Query Example
Paginate through all columns in a row
    ColumnList<String> columns;
    int pageSize = 10;
    try {
        RowQuery<String, String> query = keyspace
            .prepareQuery(CF_STANDARD1)
            .getKey("A")
            .setIsPaginating()
            .withColumnRange(new RangeBuilder().setMaxSize(pageSize).build());
        // Each execute() fetches the next page; an empty page ends the loop
        while (!(columns = query.execute().getResult()).isEmpty()) {
            for (Column<String> c : columns) {
                // process column c here
            }
        }
    } catch (ConnectionException e) {
        // handle or log the connection failure
    }
72. High Availability
• Cassandra stores 3 local copies, 1 per zone
– Synchronous access, durable, highly available
– Read/Write One fastest, least consistent - ~1ms
– Read/Write Quorum 2 of 3, consistent - ~3ms
• AWS Availability Zones
– Separate buildings
– Separate power etc.
– Fairly close together
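The "Quorum 2 of 3" figure on slide 72 follows from the usual majority rule for replicated stores: a quorum is floor(N/2) + 1 replicas, and a read is guaranteed to overlap the latest acknowledged write when read acks plus write acks exceed the replication factor. The helper below is a hypothetical sketch of that arithmetic only; it is not part of Cassandra or Astyanax.

```java
// Hypothetical helper showing the quorum arithmetic behind "2 of 3".
final class Quorum {
    private Quorum() {}

    // Majority of replicas: floor(n / 2) + 1.
    static int size(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must intersect every write set (R + W > N).
    static boolean readsSeeLatestWrite(int readAcks, int writeAcks, int replicationFactor) {
        return readAcks + writeAcks > replicationFactor;
    }
}
```

With three copies, quorum reads and writes (2 + 2 > 3) overlap on at least one replica, while reading and writing at level One (1 + 1 <= 3) does not, which matches the slide's "fastest, least consistent" versus "consistent" trade-off.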
73. “Traditional” Cassandra Write Data Flows
Single Region, Multiple Availability Zone, Not Token Aware
[Diagram: non token aware clients writing to six Cassandra nodes with disks, two in each of Zones A, B and C; numbers on the arrows match the steps below]
1. Client Writes to any Cassandra Node
2. Coordinator Node replicates to nodes and Zones
3. Nodes return ack to coordinator
4. Coordinator returns ack to client
5. Data written to internal commit log disk (no more than 10 seconds later)
If a node goes offline, hinted handoff completes the write when the node comes back up.
Requests can choose to wait for one node, a quorum, or all nodes to ack the write
SSTable disk writes and compactions occur asynchronously
74. Astyanax - Cassandra Write Data Flows
Single Region, Multiple Availability Zone, Token Aware
[Diagram: token aware clients writing to six Cassandra nodes with disks, two in each of Zones A, B and C; numbers on the arrows match the steps below]
1. Client Writes to nodes and Zones
2. Nodes return ack to client
3. Data written to internal commit log disks (no more than 10 seconds later)
If a node goes offline, hinted handoff completes the write when the node comes back up.
Requests can choose to wait for one node, a quorum, or all nodes to ack the write
SSTable disk writes and compactions occur asynchronously
75. Data Flows for Multi-Region Writes
Token Aware, Consistency Level = Local Quorum
[Diagram: US clients and EU clients, each region with six Cassandra nodes with disks across Zones A, B and C, linked with 100+ms latency; numbers on the arrows match the steps below]
1. Client writes to local replicas
2. Local write acks returned to Client which continues when 2 of 3 local nodes are committed
3. Local coordinator writes to remote coordinator.
4. When data arrives, remote coordinator node acks and copies to other remote zones
5. Remote nodes ack to local coordinator
6. Data flushed to internal commit log disks (no more than 10 seconds later)
If a node or region goes offline, hinted handoff completes the write when the node comes back up. Nightly global compare and repair jobs ensure everything stays consistent.
77. Rules of the Roadie
• Don’t lose stuff
• Make sure it scales
• Figure out when it breaks and what broke
• Yell at the right guy to fix it
• Keep everything organized
78. Cassandra Backup
[Diagram: ring of Cassandra nodes with S3 Backup]
• Full Backup
– Time based snapshot
– SSTable compress -> S3
• Incremental
– SSTable write triggers compressed copy to S3
• Archive
– Copy cross region
79. ETL for Cassandra
• Data is de-normalized over many clusters!
• Too many to restore from backups for ETL
• Solution – read backup files using Hadoop
• Aegisthus
– http://techblog.netflix.com/2012/02/aegisthus-bulk-data-pipeline-out-of.html
– High throughput raw SSTable processing
– Re-normalizes many clusters to a consistent view
– Extract, Transform, then Load into Teradata
80. Cassandra Archive
Appropriate level of paranoia needed…
• Archive could be un-readable
– Restore S3 backups weekly from prod to test, and daily ETL
• Archive could be stolen
– PGP Encrypt archive
• AWS East Region could have a problem
– Copy data to AWS West
• Production AWS Account could have an issue
– Separate Archive account with no-delete S3 ACL
• AWS S3 could have a global problem
– Create an extra copy on a different cloud vendor….
81. Tools and Automation
• Developer and Build Tools
– Jira, Perforce, Eclipse, Jenkins, Ivy, Artifactory
– Builds, creates .war file, .rpm, bakes AMI and launches
• Custom Netflix Application Console
– AWS Features at Enterprise Scale (hide the AWS security keys!)
– Auto Scaler Group is unit of deployment to production
• Open Source + Support
– Apache, Tomcat, Cassandra, Hadoop
– Datastax support for Cassandra, AWS support for Hadoop via EMR
• Monitoring Tools
– Alert processing gateway into Pagerduty
– AppDynamics – Developer focus for cloud http://appdynamics.com
82. Scalability Testing
• Cloud Based Testing – frictionless, elastic
– Create/destroy any sized cluster in minutes
– Many test scenarios run in parallel
• Test Scenarios
– Internal app specific tests
– Simple “stress” tool provided with Cassandra
• Scale test, keep making the cluster bigger
– Check that tooling and automation works…
– How many ten column row writes/sec can we do?
86. Chaos Monkey
• Computers (Datacenter or AWS) randomly die
– Fact of life, but too infrequent to test resiliency
• Test to make sure systems are resilient
– Allow any instance to fail without customer impact
• Chaos Monkey hours
– Monday-Thursday 9am-3pm random instance kill
• Application configuration option
– Apps now have to opt-out from Chaos Monkey
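The scheduling rule on slide 86 (kills only Monday-Thursday 9am-3pm, unless the application has opted out) could be expressed as a simple guard. The sketch below is illustrative only and is not the Chaos Monkey source: the class and method names are hypothetical, and treating 9am as inclusive and 3pm as exclusive is an assumption the slide does not state.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical guard deciding whether a random instance kill is allowed right now.
final class ChaosWindow {
    private static final LocalTime START = LocalTime.of(9, 0);  // 9am, inclusive (assumed)
    private static final LocalTime END = LocalTime.of(15, 0);   // 3pm, exclusive (assumed)

    private ChaosWindow() {}

    static boolean mayKill(LocalDateTime now, boolean appOptedOut) {
        if (appOptedOut) {
            return false; // apps are in scope by default and must opt out explicitly
        }
        DayOfWeek day = now.getDayOfWeek();
        boolean weekdayOk = day.compareTo(DayOfWeek.MONDAY) >= 0
                && day.compareTo(DayOfWeek.THURSDAY) <= 0;
        LocalTime time = now.toLocalTime();
        return weekdayOk && !time.isBefore(START) && time.isBefore(END);
    }
}
```

Restricting kills to working hours early in the week means engineers are at their desks when a failure exposes a weakness, which fits the slide's aim of testing resiliency without customer impact.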
87. Responsibility and Experience
• Make developers responsible for failures
– Then they learn and write code that doesn’t fail
• Use Incident Reviews to find gaps to fix
– Make sure it’s not about finding “who to blame”
• Keep timeouts short, fail fast
– Don’t let cascading timeouts stack up
• Make configuration options dynamic
– You don’t want to push code to tweak an option
89. PaaS Operational Model
• Developers
– Provision and run their own code in production
– Take turns to be on call if it breaks (pagerduty)
– Configure autoscalers to handle capacity needs
• DevOps and PaaS (aka NoOps)
– DevOps is used to build and run the PaaS
– PaaS constrains Dev to use automation instead
– PaaS puts more responsibility on Dev, with tools
90. What’s Left for Corp IT?
• Corporate Security and Network Management
– Billing and remnants of streaming service back-ends in DC
• Running Netflix’ DVD Business
– Tens of Oracle instances
– Hundreds of MySQL instances
– Thousands of VMWare VMs
– Zabbix, Cacti, Splunk, Puppet
• Employee Productivity
– Building networks and WiFi
– SaaS OneLogin SSO Portal
– Evernote Premium, Safari Online Bookshelf, Dropbox for Teams
– Google Enterprise Apps, Workday HCM/Expense, Box.com
– Many more SaaS migrations coming…
[Chart: Corp WiFi Performance]
91. Implications for IT Operations
• Cloud is run by developer organization
– Product group’s “IT department” is the AWS API and PaaS
– CorpIT handles billing and some security functions
• Cloud capacity is 10x bigger than Datacenter
– Datacenter oriented IT didn’t scale up as we grew
– We moved a few people out of IT to do DevOps for our PaaS
• Traditional IT Roles and Silos are going away
– We don’t have SA, DBA, Storage, Network admins for cloud
– Developers deploy and “run what they wrote” in production
97. Jenkins Architecture
[Diagram: one Jenkins master with several groups of build slaves]
• Single Master: Red Hat Linux, 2x quad core x86_64, 26G RAM
• Standard slave group: x86_64 slaves (buildnode01), Amazon Linux, m1.xlarge
• Custom slave group: x86_64 slaves (buildnode01), Amazon Linux, various
• Ad-hoc slaves: misc. O/S & architectures
• ~40 custom slaves maintained by product teams, misc. architecture
Locations shown: us-west-1 VPC, Netflix data center, Netflix data center and office
98. Other Uses of Jenkins
Maintenance of test and prod Cassandra clusters
Automated integration tests for bake and deploy
Production bake and deployment
Housekeeping of the build / deploy infrastructure
99. Netflix Extensions to Jenkins
• Job DSL plugin: allow jobs to be set up with minimal definition, using templates and a Groovy-based DSL
• Housekeeping and maintenance processes implemented as Jenkins jobs, system Groovy scripts
100. The DynaSlave Plugin
What We Have
• Exposes a new endpoint in Jenkins that EC2 instances in VPC use for registration
• Allows a slave to name itself, label itself, tell Jenkins how many executors it can support
• EC2 == Ephemeral. Disconnected nodes that are gone for > 30 mins are reaped
• Sizing handled by EC2 ASGs, tweaks passed through via user data (labels, names, etc)
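The registration and reaping behaviour described on this slide can be sketched in Python as follows. This is an illustration only, not the plugin's code: `NodeRegistry` and its method names are hypothetical, and time is passed in explicitly so the 30 minute rule is easy to follow.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

REAP_AFTER_SECONDS = 30 * 60  # "gone for > 30 mins" from the slide


@dataclass
class Node:
    name: str
    labels: List[str]
    executors: int
    disconnected_at: Optional[float] = None  # None while connected


class NodeRegistry:
    """Hypothetical in-memory stand-in for the registration endpoint."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def register(self, name: str, labels: List[str], executors: int) -> None:
        # The node describes itself; re-registering clears a disconnect mark.
        self.nodes[name] = Node(name, labels, executors)

    def mark_disconnected(self, name: str, now: float) -> None:
        self.nodes[name].disconnected_at = now

    def reap(self, now: float) -> List[str]:
        """Remove nodes disconnected longer than the limit; return their names."""
        gone = [
            node.name
            for node in self.nodes.values()
            if node.disconnected_at is not None
            and now - node.disconnected_at > REAP_AFTER_SECONDS
        ]
        for name in gone:
            del self.nodes[name]
        return gone
```

Because instances are ephemeral, the registry treats a long disconnect as permanent and forgets the node rather than waiting for it to return.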
101. The DynaSlave Plugin
What’s Next
• Enhanced security/registration of nodes
• Dynamic resource management
– have Jenkins respond to build demand
• Slave groups
– Allows us to create specialized pools of build nodes
• Refresh mechanism for slave tools
– JDKs, Ant versions, etc.
• Give it back to the community
– watch techblog.netflix.com!
102. The Bakery
• Create base AMIs
– We have CentOS, Ubuntu and Windows base AMIs
– All the generic code, apache, tomcat etc.
– Standard system and application monitoring tools
– Update ~monthly with patches and new versions
• Add yummy topping and bake
– Build app specific AMI including all code etc.
– Bakery mounts EBS snapshot, installs and bakes
– One bakery per region, delivers into paastest
– Tweak config and publish AMI to paasprod
104. Accounts Isolate Concerns
• paastest – for development and testing
– Fully functional deployment of all services
– Developer tagged “stacks” for separation
• paasprod – for production
– Autoscale groups only, isolated instances are terminated
– Alert routing, backups enabled by default
• paasaudit – for sensitive services
– To support SOX, PCI, etc.
– Extra access controls, auditing
• paasarchive – for disaster recovery
– Long term archive of backups
– Different region, perhaps different vendor
105. Reservations and Billing
• Consolidated Billing
– Combine all accounts into one bill
– Pooled capacity for bigger volume discounts
http://docs.amazonwebservices.com/AWSConsolidatedBilling/1.0/AWSConsolidatedBillingGuide.html
• Reservations
– Save up to 71% on your baseline load
– Priority when you request reserved capacity
– Unused reservations are shared across accounts
106. Cloud Access Gateway
• Datacenter or office based
– A separate VM for each AWS account
– Two per account for high availability
– Mount NFS shared home directories for developers
– Instances trust the gateway via a security group
• Manage how developers login to cloud
– Access control via ldap group membership
– Audit logs of every login to the cloud
– Similar to awsfabrictasks ssh wrapper
http://readthedocs.org/docs/awsfabrictasks/en/latest/
108. Now Add Code
Netflix has open sourced a lot of what you need, more is on the way…
109. Netflix Open Source Strategy
• Release PaaS Components git-by-git
– Source at github.com/netflix – we build from it…
– Intros and techniques at techblog.netflix.com
– Blog post or new code every few weeks
• Motivations
– Give back to Apache licensed OSS community
– Motivate, retain, hire top engineers
– “Peer pressure” code cleanup, external contributions
110. Open Source Projects and Posts
Legend: Github / Techblog, Apache Contributions, Techblog Post, Coming Soon
• Priam – Cassandra as a Service
• Exhibitor – Zookeeper as a Service
• Servo and Autoscaling Scripts
• Astyanax – Cassandra client for Java
• Honu – Log4j streaming to Hadoop
• Curator – Zookeeper Patterns
• EVCache – Memcached as a Service
• CassJMeter – Cassandra test suite
• Circuit Breaker – Robust service pattern
• Cassandra – Multi-region EC2 datastore support
• Asgard – AutoScaleGroup based AWS console
• Discovery Service – Directory
• Aegisthus – Hadoop ETL for Cassandra
• Configuration Properties Service
• Chaos Monkey – Robustness verification
111. Asgard
Not quite out yet…
• Runs in a VM in our datacenter
– So it can deploy to an empty account
– Groovy/Grails/JVM based
– Supports all AWS regions on a global basis
• Hides the AWS credentials
– Use AWS IAM to issue restricted keys for Asgard
– Each Asgard instance manages one account
– One install each for paastest, paasprod, paasaudit
112. “Discovery” - Service Directory
• Map an instance to a service type
– Load balance over clusters of instances
– Private namespace, so DNS isn’t useful
– Foundation service, first to deploy
• Highly available distributed coordination
– Deploy one Apache Zookeeper instance per zone
– Netflix Curator includes simple discovery service
– Netflix Exhibitor manages Zookeeper reliably
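As a rough illustration of mapping instances to a service type and load balancing over them, here is a small in-memory Python sketch. It is not the Netflix Discovery service or the Curator API; `ServiceDirectory` and the round-robin policy are assumptions made for the example.

```python
import itertools
from collections import defaultdict
from typing import Dict, List


class ServiceDirectory:
    """Hypothetical in-memory directory: service type -> registered instances."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[str]] = defaultdict(list)
        # One independent counter per service, each starting at 0.
        self._counters = defaultdict(itertools.count)

    def register(self, service: str, instance: str) -> None:
        if instance not in self._instances[service]:
            self._instances[service].append(instance)

    def deregister(self, service: str, instance: str) -> None:
        if instance in self._instances[service]:
            self._instances[service].remove(instance)

    def next_instance(self, service: str) -> str:
        """Round-robin over the instances currently registered for a service."""
        instances = self._instances[service]
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        return instances[next(self._counters[service]) % len(instances)]
```

A client asks for a service type rather than a host name, which is why a private namespace can work without DNS.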
113. Configuration Properties Service
• Dynamic hierarchical & propagates in seconds
– Client timeouts, feature set enables
– Region specific service endpoints
– Cassandra token assignments etc. etc.
• Used to configure everything
– So everything depends on it…
– Coming soon to github
– Pluggable backend storage interface
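One way to picture "dynamic hierarchical" properties is a layered lookup where the most specific layer wins. The Python sketch below uses the standard library `ChainMap`; the layer order (app, then region, then global), the property names and the example endpoint are all assumptions, since the slides do not give the real schema.

```python
from collections import ChainMap

# Layer names and keys are illustrative assumptions, not the real schema.
global_defaults = {"client.timeout.ms": 2000, "feature.newui.enabled": False}
region_overrides = {
    "us-west-1": {"service.endpoint": "https://api.us-west-1.example.com"}
}
app_overrides = {"api": {"client.timeout.ms": 500}}


def properties_for(app: str, region: str) -> ChainMap:
    """Most specific layer wins: app, then region, then global defaults."""
    return ChainMap(
        app_overrides.setdefault(app, {}),
        region_overrides.setdefault(region, {}),
        global_defaults,
    )


props = properties_for("api", "us-west-1")
```

Because `ChainMap` keeps references to the underlying dictionaries, a later change such as `global_defaults["feature.newui.enabled"] = True` is visible through `props` without rebuilding it, which mimics a property that changes at runtime.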
114. Persistence services
• Use SimpleDB as a bootstrap
– Good use case for DynamoDB or SimpleDB
• Netflix Priam
– Cassandra automation
115. Monitoring, alert forwarding
• Multiple monitoring systems
– Internally developed data collection runs on AWS
– AppDynamics APM product runs as external SaaS
– When one breaks the other is usually OK…
• Alerts routed to the developer of that app
– Alert gateway combines alerts from all sources
– Deduplication, source quenching, routing
– Warnings sent via email, critical via pagerduty
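The deduplication and routing steps can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `AlertGateway` class, the five minute window and the channel names as plain strings are all hypothetical, and no real email or PagerDuty call is made.

```python
from typing import Dict, Optional, Tuple

DEDUP_WINDOW_SECONDS = 300  # assumed window; the slides give no figure


class AlertGateway:
    """Hypothetical gateway: drops repeats, then picks a channel by severity."""

    def __init__(self) -> None:
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def route(
        self, app: str, message: str, severity: str, now: float
    ) -> Optional[str]:
        """Return the channel to notify, or None if the alert is a duplicate."""
        key = (app, message)
        last = self._last_sent.get(key)
        if last is not None and now - last < DEDUP_WINDOW_SECONDS:
            return None  # same alert seen recently: deduplicate
        self._last_sent[key] = now
        # Critical alerts page the on-call developer; warnings go to email.
        return "pagerduty" if severity == "critical" else "email"
```

Keying on the (app, message) pair means the same problem reported by two monitoring systems produces a single notification.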
116. Backups, archives
• Cassandra Backup via Priam to S3 bucket
– Create versioned S3 bucket with TTL option
– Setup service to encrypt and copy to archive
• Archive Account with Read/Write ACL to prod
– Setup in a different AWS region from production
– Create versioned S3 bucket with TTL option
117. Chaos Monkey
• Install it on day 1 in test and production
• Prevents people from doing local persistence
• Kill anything not protected by an ASG
• Supports whitelist for temporary do-not-kill
• Open source soon, code cleanup in progress…
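The "not protected by an ASG" and whitelist rules amount to a simple selection step, sketched below in Python. This covers only those two bullets and is not the Chaos Monkey implementation: the function name and the dictionary shape are assumptions, and nothing here calls AWS or terminates anything.

```python
from typing import Dict, List, Optional, Set


def unprotected_instances(
    instances: Dict[str, Optional[str]], whitelist: Set[str]
) -> List[str]:
    """Return instance ids with no ASG that are not on the do-not-kill list.

    `instances` maps a hypothetical instance id to its ASG name, or None
    when the instance was launched outside any autoscale group.
    """
    return sorted(
        instance_id
        for instance_id, asg in instances.items()
        if asg is None and instance_id not in whitelist
    )


fleet = {"i-1": "api-v1", "i-2": None, "i-3": None}
candidates = unprotected_instances(fleet, whitelist={"i-3"})
```

Here `i-1` is skipped because it belongs to an ASG and `i-3` because it is whitelisted, leaving `i-2` as the only candidate.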
118. You take it from here…
• Keep watching github for more goodies
• Add your own code
• Let us know what you find useful
• Bugs, patches and additions all welcome
• See you at AWS Re:Invent?
119. Roadmap for 2012
• More resiliency and improved availability
• More automation, orchestration
• “Hardening” the platform, code clean-up
• Lower latency for web services and devices
• IPv6 support
• More open sourced components
120. Wrap Up
Answer your remaining questions…
What was missing that you wanted to cover?
121. Takeaway
Netflix has built and deployed a scalable global Platform as a Service.
Key components of the Netflix PaaS are being released as Open Source projects so you can build your own custom PaaS.
http://github.com/Netflix
http://techblog.netflix.com
http://slideshare.net/Netflix
http://www.linkedin.com/in/adriancockcroft
@adrianco #netflixcloud
End of Part 3 of 3
123. You want an Encore?
If there is enough time… (there wasn’t)
Something for the hard core complex adaptive systems people to digest.
125. Workload Characteristics
• A quick tour through a taxonomy of workload types
• Start with the easy ones and work up
• Why personalized workloads are different and hard
• Some examples and coping strategies
5/15/12 Slide 254
126. Simple Random Arrivals
• Random arrival of transactions with fixed mean service time
– Little’s Law: QueueLength = Throughput * Response
– Utilization Law: Utilization = Throughput * ServiceTime
• Complex models are often reduced to this model
– By averaging over longer time periods since the formulas only work if you have stable averages
– By wishful thinking (i.e. how to fool yourself)
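A short worked example of the two laws in Python. The formulas are the ones on the slide; the input numbers are made up for illustration.

```python
def queue_length(throughput_per_s: float, response_time_s: float) -> float:
    """Little's Law: mean number of requests in the system."""
    return throughput_per_s * response_time_s


def utilization(throughput_per_s: float, service_time_s: float) -> float:
    """Utilization Law: fraction of time the server is busy."""
    return throughput_per_s * service_time_s


# Illustrative numbers (assumptions): 100 requests/s, 200 ms mean response
# time, 5 ms mean service time.
in_flight = queue_length(100, 0.2)  # about 20 requests in the system
busy = utilization(100, 0.005)      # about 0.5, i.e. 50% busy
```

Both results are averages, so they are only meaningful when throughput and times are measured over a period in which they are stable.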
127. Mixed random arrivals of transactions with stable mean service times
• Think of the grocery store checkout analogy
– Trolleys full of shopping vs. baskets full of shopping
– Baskets are quick to service, but get stuck behind carts
– Relative mixture of transaction types starts to matter
• Many transactional systems handle a mixture
– Databases, web services
• Consider separating fast and slow transactions
– So that we have a “10 items or less” line just for baskets
– Separate pools of servers for different services
– The old rule - don’t mix OLTP with DSS queries in databases
• Performance is often thread-limited
– Thread limit and slow transactions constrain maximum throughput
• Model mix using analytical solvers (e.g. PDQ perfdynamics.com)
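To see why a small share of slow transactions limits a thread-limited server, here is a back-of-envelope Python sketch. It assumes each request holds one thread for its whole service time, so by Little's Law throughput cannot exceed threads divided by mean service time. The thread count, mix and timings are invented for illustration.

```python
from typing import Dict


def max_throughput(threads: int, mix: Dict[float, float]) -> float:
    """Upper bound on requests/s when each request holds one thread.

    `mix` maps a mean service time in seconds to its share of requests;
    the shares are expected to sum to 1.
    """
    mean_service_time = sum(t * share for t, share in mix.items())
    return threads / mean_service_time


# Assumed workload: 90% "baskets" at 10 ms, 10% "trolleys" at 500 ms.
shared_pool = max_throughput(200, {0.010: 0.9, 0.500: 0.1})  # roughly 3,390/s
baskets_only = max_throughput(200, {0.010: 1.0})             # roughly 20,000/s
```

In this example the 10% of slow requests account for most of the mean service time, which is the argument for giving fast and slow transactions separate pools.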
128. Load dependent servers – varying mean service times
• Mean service time may increase at high throughput
– Due to non-scalable algorithms, lock contention
– System runs out of memory and starts paging or frequent GC
• Mean service time may also decrease at high throughput
– Elevator seek and write cancellation optimizations in storage
– Load shedding and simplified fallback modes
• Systems have “tipping points” if the service time increases
– Hysteresis means they don’t come back when load drops
– This is why you have to kill catatonic systems
– Best designs shed load to be stable at the limit – circuit breaker pattern
– Practical option is to try to avoid tipping points by reducing variance
• Model using discrete event simulation tools
– Behaviour is non-linear and hard to model
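A minimal Python sketch of the circuit breaker pattern mentioned above, written as a generic illustration rather than the Netflix implementation. The thresholds, the class names and the injectable `clock` parameter are assumptions for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("shedding load")  # fail fast
            self._opened_at = None  # cool-down over: allow one trial call
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()  # trip (or re-trip) the breaker
            raise
        self._failures = 0  # a success closes the circuit fully
        return result
```

While the circuit is open, callers get an immediate error instead of queueing behind a struggling dependency, which is how the design sheds load and stays stable at the limit.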
129. Self-similar / fractal workloads
• Bursty rather than random arrival rates
• Self-similar
– Looks “random” at close up, stays “random” as you zoom out
– Work arrives in bursts, transactions aren’t independent
– Bursts cluster together in super-bursts, etc.
• Network packet streams tend to be fractal
• Common in practice, too hard to model
– Probably the most common reason why your model is wrong!
130. State Dependent Service Workloads
• Personalized services that store user state/history
– Transactions for new users are quick
– Transactions for users with lots of state/history are slower
– As user base builds state and ages you get into trouble…
• Social Networks, Recommendation Services
– Facebook, Flickr, Netflix, Twitter etc.
• “Abandon hope all ye who enter here”
– Not tractable to model, repeatable tests are tricky
– Long fat tail response time distribution and timeouts
• Try to transform workloads to more tractable forms
131. Example - Twitter Workload
• @adrianco tweets – copy to 4300 or so other users
• @zoecello tweets many times a day – to over 1M users
• @barackobama tweets every few days – to over 12M users
• It’s the same transaction, but the service time varies by several orders of magnitude
• The best (most active and connected = most valuable) users trigger a “denial of service attack” on the systems when they tweet
• Cascading effect as many others re-tweet
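The "several orders of magnitude" claim can be checked with the follower counts quoted on the slide. The Python sketch below assumes a simple model in which one tweet is copied to every follower at a fixed cost per copy; both the model and the per-copy cost are assumptions, not a description of how Twitter works.

```python
import math

# Follower counts quoted on the slide (lower bounds for the last two).
FOLLOWERS = {
    "@adrianco": 4_300,
    "@zoecello": 1_000_000,
    "@barackobama": 12_000_000,
}
COST_PER_DELIVERY_S = 1e-6  # hypothetical: 1 microsecond per follower copy


def fanout_service_time(user: str) -> float:
    """Service time if one tweet is copied to every follower on write."""
    return FOLLOWERS[user] * COST_PER_DELIVERY_S


spread = fanout_service_time("@barackobama") / fanout_service_time("@adrianco")
orders_of_magnitude = math.log10(spread)  # about 3.4
```

The ratio does not depend on the assumed per-copy cost, only on the follower counts, so the spread of more than three orders of magnitude holds for any fixed cost.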
132. Example - Netflix Movie Choosing
• “Pick 24 genres/subgenres etc. of 75 movies each for me”
– used by TV based devices like Xbox360, PS/3, iPhone app
• New user
– No history of what they have rented (DVD) or streamed
– No star ratings for movies, possibly some genre ratings
– Basic demographic info
– Fast to calculate, easy to find many good choices to return
• User with several years tenure
– Thousands of movies rented or streamed, “seen it already”
– Hundreds to thousands of star ratings, lots of genre ratings
– Requests may time out and return fewer or worse choices
133. Workload Modelling Survival Methods
• Simplify the workload algorithms
– move from hard or impossible to simpler models
– decouple, cache and pre-compute to get constant service times
• Stand further away
– averaging is your friend – gets rid of complex fluctuations
• Minimalist Models
– most models are far too complex – the classic beginners error…
– the art of modelling is to only model what really matters
• Don’t model details you don’t use
– model peak hour of the week, not day to day fluctuations
– e.g. “Will the web site survive next Sunday night?”
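The "decouple, cache and pre-compute" idea can be illustrated with a small Python sketch. The functions and data shapes are hypothetical; the point is only that the state-dependent work moves to an offline step, leaving the request path with a lookup whose cost does not grow with the user's history.

```python
from typing import Dict, List


def expensive_recommendations(history: List[str], catalog: List[str]) -> List[str]:
    """Stand-in for a slow, state-dependent computation: cost grows with history."""
    seen = set(history)
    return [title for title in catalog if title not in seen]


def precompute(
    histories: Dict[str, List[str]], catalog: List[str]
) -> Dict[str, List[str]]:
    """Offline batch step: do the variable-cost work ahead of request time."""
    return {
        user: expensive_recommendations(history, catalog)
        for user, history in histories.items()
    }


def serve(cache: Dict[str, List[str]], user: str, default: List[str]) -> List[str]:
    """Request path: a dictionary lookup, whatever the size of the user's history."""
    return cache.get(user, default)
```

The trade-off is freshness: results are only as recent as the last batch run, and users missing from the cache get a generic default.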