Interactive Pipeline: AWS Down
[AWS Service Health Dashboard] Service disruption: Increased Error Rates
We are seeing positive signs of recovery for many of the EC2 APIs, such as Describes and AllocateAddress. We recognize that customers are still experiencing errors when attempting to call the AssociateAddress API, and are unable to disassociate addresses from resources that are affected by the underlying power issue. We continue to work on multiple parallel paths to mitigate both of these issues. We recommend continuing to retry requests wherever possible. We expect our current mitigation efforts for these specific issues to complete within the next two to three hours. As we progress with these mitigation efforts, customers will observe higher success rates for these operations. Additionally, we are investigating ways to speed up these specific mitigation efforts, but are ensuring we do so safely. As of this time, power restoration is still several hours away. We will provide another update by 5:30 PM PST, or sooner if we have additional information to share.
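The advice to keep retrying is usually implemented with capped exponential backoff and jitter, so that many clients do not retry in lockstep. The sketch below is a generic illustration, not AWS's own retry logic: `TransientError`, `retry_with_backoff`, and all parameter defaults are hypothetical names and values chosen for this example.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable API error (for example, throttling)."""


def retry_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=30.0,
                       sleep=time.sleep):
    """Call `call()` until it succeeds or `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

In practice `call` would wrap a single API request, and only errors known to be transient should be mapped to the retryable exception type.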
[AWS Service Health Dashboard] Service impact: EC2 API Errors
We are investigating increased launch errors and API errors in the EU-NORTH-1 Region. Existing instances are not affected by this issue.
[AWS Service Health Dashboard] Service impact: Increased Connectivity Issues and API Error Rates
We continue to work on a localized power issue affecting a single Availability Zone (mes1-az2) in the ME-SOUTH-1 Region. In the impacted Availability Zone, EC2 Instances, DB Instances, EBS Volumes, and other AWS Services are also experiencing elevated error rates and latencies for some workflows. As part of our recovery effort, we have shifted traffic away from the impacted Availability Zone for most services. We recommend customers utilize one of the other Availability Zones in the ME-SOUTH-1 Region, as existing instances in other AZs remain unaffected by this issue. We are actively working to restore power and connectivity, at which time we will begin recovering affected resources. Currently, we expect recovery to take many hours. We will provide an update by 2:30 AM PST, or sooner if we have additional information to share.
[AWS Architecture Blog] Architecting AI-powered resilience framework on AWS
In this post, you’ll learn how to architect and implement a five-layer AI-powered resilience framework that automatically discovers dependencies, generates targeted experiments, and integrates with your existing Continuous Integration/Continuous Deployment (CI/CD) pipelines. First, we’ll explore the key challenges in resilience testing. Then, we’ll walk through the five-layer architecture that solves these challenges. Finally, we’ll show you how to implement this, with phased rollout guidance for pilot, expansion, and organization-wide deployment.
[AWS Security Bulletins] Key Commitment Issues in S3 Encryption Clients
Bulletin ID: AWS-2025-032
Scope: AWS
Content Type: Important (requires attention)
Publication Date: 2025/12/17 12:15 PM PST

We identify the following CVEs:
- CVE-2025-14763 - Key Commitment Issues in S3 Encryption Client in Java
- CVE-2025-14764 - Key Commitment Issues in S3 Encryption Client in Go
- CVE-2025-14759 - Key Commitment Issues in S3 Encryption Client in .NET
- CVE-2025-14760 - Key Commitment Issues in S3 Encryption Client in C++ - part of the AWS SDK for C++
- CVE-2025-14761 - Key Commitment Issues in S3 Encryption Client in PHP - part of the AWS SDK for PHP
- CVE-2025-14762 - Key Commitment Issues in S3 Encryption Client in Ruby - part of the AWS SDK for Ruby

Description: S3 Encryption Clients for Java, Go, .NET, C++, PHP, and Ruby are open-source client-side encryption libraries used to facilitate writing and reading encrypted records to S3. When the encrypted data key (EDK) is stored in an "Instruction File" instead of S3's metadata record, the EDK is exposed to an "Invisible Salamanders" attack, which could allow the EDK to be replaced with a new key.

Resolution:
- S3 Encryption Client Java: <= 3.5.0
- S3 Encryption Client Go: <= 3.1.0
- S3 Encryption Client .NET: <= 3.1
- AWS SDK for C++: <= 1.11.711
- AWS SDK for PHP: <= 3.367.0
- AWS SDK for Ruby: <= 1.207.0
[AWS Security Bulletins] CVE-2026-11400 and CVE-2026-11401
Bulletin ID: 2026-039-AWS Scope: AWS Content Type: Important (requires attention) Publication Date: 06/025/2026 12:15 PM PDT Description: Amazon Aurora PostgreSQL is a fully managed relational database engine that's compatible with PostgreSQL. We identified CVE-2026-11400 (JDBC) and CVE-2026-11401 (Go), an issue in AWS Wrappers for Amazon Aurora PostgreSQL that allows privilege escalation to the rds_superuser role. A low-privilege authenticated user can create a crafted function that could be executed with the permissions of other Amazon Relational Database Service (RDS) users. Impacted versions: - AWS Advanced JDBC Wrapper >=3.0.0 and < 4.0.1 - AWS Advanced Go Wrapper release 2026-04-06 Please refer to the article below for the most up-to-date and complete information related to this AWS Security Bulletin.
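A version range such as the JDBC wrapper's ">=3.0.0 and < 4.0.1" can be checked mechanically when auditing dependencies. The following is a minimal sketch that assumes plain dotted numeric versions (no pre-release or build suffixes); `parse_version` and `is_impacted_jdbc_wrapper` are hypothetical helper names, not part of any AWS tool.

```python
def parse_version(text):
    """Turn '3.2.1' into (3, 2, 1). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def is_impacted_jdbc_wrapper(version):
    """True if `version` falls in the bulletin's range: >= 3.0.0 and < 4.0.1."""
    return (3, 0, 0) <= parse_version(version) < (4, 0, 1)
```

Tuple comparison handles multi-digit components correctly, which naive string comparison does not (for example, "3.10.0" versus "3.9.0").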
[AWS Architecture Blog] Preventing data exfiltration in machine learning environments with Amazon SageMaker AI
In this post, we demonstrate how iBusiness implemented a three-layered security architecture using Amazon SageMaker AI, virtual private cloud (VPC) endpoints, and Amazon WorkSpaces Secure Browser to prevent data exfiltration while maintaining data scientist productivity. You can adapt this approach to build secure machine learning environments that balance strict data protection with team scalability.
[AWS Security Bulletins] Security Findings in SageMaker Python SDK
Bulletin ID: 2026-004-AWS Scope: AWS Content Type: Important (requires attention) Publication Date: 2026/02/02 14:30 PM PST Description: CVE-2026-1777 - Exposed HMAC in SageMaker Python SDK SageMaker Python SDK’s remote functions feature uses a per‑job HMAC key to protect the integrity of serialized functions, arguments, and results stored in S3. We identified an issue where the HMAC secret key is stored in environment variables and disclosed via the DescribeTrainingJob API. This allows third parties with DescribeTrainingJob permissions to extract the key, forge cloud-pickled payloads with valid HMACs, and overwrite S3 objects. CVE-2026-1778 - Insecure TLS Configuration in SageMaker Python SDK SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker. We identified an issue where SSL certificate verification was globally disabled in the Triton Python backend. This configuration was introduced to work around SSL errors during model downloads from public sources (e.g., TorchVision) and it affected all HTTPS connections when the Triton Python model was imported. Impacted versions: - HMAC Configuration in SageMaker Python SDK v3 < v3.2.0 - HMAC Configuration in SageMaker Python SDK v2 < v2.256.0 - Insecure TLS Configuration in SageMaker Python SDK v3 < v3.1.1 - Insecure TLS Configuration in SageMaker Python SDK v2 < v2.256.0 Please refer to the article below for the most up-to-date and complete information related to this AWS Security Bulletin.
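The HMAC finding comes down to a general property: an HMAC tag proves integrity only while the key stays secret. The sketch below illustrates that property with Python's standard library. It is not the SageMaker SDK's code; the use of SHA-256, the helper names `sign` and `verify`, and the payload bytes are all assumptions made for this example.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, payload: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(key: bytes, payload: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, payload), tag)


job_key = secrets.token_bytes(32)        # hypothetical per-job key
original = b"serialized-function-bytes"  # placeholder payload
tag = sign(job_key, original)

# Without the key, swapping the payload is detected.
tampered_ok = verify(job_key, b"malicious-bytes", tag)

# With a leaked key, an attacker can mint a valid tag for any payload.
forged_tag = sign(job_key, b"malicious-bytes")
forged_ok = verify(job_key, b"malicious-bytes", forged_tag)
```

Here `tampered_ok` is false and `forged_ok` is true, which is why exposing the key through a readable API response defeats the integrity check.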
[AWS What's New] AgentCore harness is now generally available
Today, AWS announces the general availability of the managed agent harness in Amazon Bedrock AgentCore, taking teams from idea to working agents in minutes. An agent is more than a model. If the model is the brain, the harness is the body: everything the brain needs to get work done. It runs the orchestration loop, executes tools, manages the context window, persists state across turns, recovers from failures, and isolates each session. The harness shapes how well an agent performs as much as the model does, and building a durable one is where most teams spend their time today. AgentCore harness provides that layer as a managed capability. Instead of coding the loop, customers define an agent in configuration: the model it uses, the tools it calls, the skills it accesses, and the instructions it follows, and AgentCore assembles and runs that loop. From that single definition, a production-grade agent runs in minutes in its own isolated environment, with a filesystem and shell, memory across sessions, skills including the AWS-curated catalog, and web browsing. This is not a starter tool teams outgrow: the configuration they start with is what they operate at scale, and when custom orchestration is needed, the harness exports to code on the same platform without rebuilding anything. Besides speed, AgentCore decouples the harness from the model. Customers can choose any model and switch providers mid-session without losing context or touching agent logic, for example planning with one model and writing code with another. The harness is also one piece of a single platform, not a hosting layer wrapped around a framework. It reaches tools through the same gateway that enforces security policies, and connects the agent to organizational knowledge and web search. Identity, memory, and observability come from that same platform, so every agent action is governed and traced from the first call without additional wiring. 
When a use case needs custom orchestration, a single CLI command exports the harness to Strands-based code on the same compute and primitives, with Claude Agent SDK coming soon as an export target. The agent declared on day one is the agent that runs at the thousandth, on the same foundation throughout. AgentCore harness is generally available today in all AWS Commercial Regions where AgentCore is available. Learn more using the documentation.
[AWS What's New] Claude Sonnet 5 is now available on AWS
AWS now offers Claude Sonnet 5 - Anthropic's most capable Sonnet model and the first Sonnet model of Anthropic’s latest generation - bringing top-tier intelligence at Sonnet pricing for coding, agents, and everyday professional work at scale. Claude Sonnet 5 delivers strong performance across coding, professional work, and agentic tasks while maintaining the balance of capability, cost, and speed that teams get from Sonnet. For coding, it navigates large codebases, lands multi-file changes, and carries debugging and refactoring tasks through to completion with fewer rounds of correction. For agents, it calls tools precisely, holds state across many steps, and recovers from errors so more runs finish correctly the first time. For knowledge work, it builds spreadsheets, drafts documents, and turns unstructured material into structured analysis. Customers have two ways to access Claude Sonnet 5: Amazon Bedrock and Claude Platform on AWS. Amazon Bedrock keeps your data within AWS infrastructure and provides access to Claude Sonnet 5 through a unified service with AWS-managed features like Guardrails, Knowledge Bases, and regional data residency. To learn more, see the Amazon Bedrock documentation and regional availability. Claude Platform on AWS gives you direct access to Anthropic's native platform experience and capabilities via the AWS Console. Build, test, and deploy with the same APIs, features, and console experience you'd get working with Anthropic directly, unified with AWS billing and authentication. To get started, see the Claude Platform on AWS documentation.
[AWS Architecture Blog] Reducing SMS OTP fraud with Vonage network-powered solutions and Amazon Cognito
In this post, we show how Vonage network-powered solutions work with Amazon Cognito to enhance many mobile-first use cases with network-level identity verification. Vonage network-powered solutions are a composable stack of real-time mobile operator intelligence, silent authentication, and integrated fraud protection, which uses the CUSTOM_AUTH flow to complete identity verification in under 5 seconds, with zero user interaction.
[AWS Security Bulletins] CVE-2025-11462 AWS ClientVPN macOS Client Local Privilege Escalation
Bulletin ID: AWS-2025-020 Scope: AWS Content Type: Important (requires attention) Publication Date: 2025/10/07 01:30 PM PDT Description: AWS Client VPN is a managed client-based VPN service that enables secure access to AWS and on-premises resources. The AWS Client VPN client software runs on end-user devices, supporting Windows, macOS, and Linux, and provides the ability for end users to establish a secure tunnel to the AWS Client VPN Service. We have identified CVE-2025-11462, an issue in AWS Client VPN. The macOS version of the AWS VPN Client lacked proper validation checks on the log destination directory during log rotation. This allowed a non-administrator user to create a symlink from a client log file to a privileged location (e.g., Crontab). Triggering an internal API with arbitrary inputs would then write these inputs to the privileged location on log rotation, allowing execution with root privileges. This issue does not affect Windows or Linux devices. Affected versions: AWS Client VPN Client versions 1.3.2 through 5.2.0
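The class of bug described here (a privileged process following a user-planted symlink during log rotation) is commonly guarded against by validating the log target before writing. The following is a minimal sketch of one such check, not the fix AWS shipped; `is_safe_log_target` is a hypothetical helper, and a check of this kind is still subject to a race between the check and the write unless the file is also opened without following symlinks.

```python
import os


def is_safe_log_target(path, log_dir):
    """Return True only if `path` is not a symlink and resolves inside `log_dir`."""
    if os.path.islink(path):
        return False
    real_dir = os.path.realpath(log_dir)
    real_path = os.path.realpath(path)
    # After resolving symlinks, the target must still sit under the log directory.
    return os.path.commonpath([real_dir, real_path]) == real_dir
```

A privileged writer would call this before rotating or appending, and refuse to proceed when it returns False.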
[AWS What's New] AWS Glue Data Catalog now supports business context and semantic search (Preview)
Today, AWS announces the preview of business context and semantic search for AWS Glue Data Catalog, helping you discover and understand data by semantic meaning. You can now enrich your Glue Data Catalog tables, including those backed by S3 Tables, with glossary terms and custom metadata fields. You can also add skills to the catalog that direct agents to additional context about your data. With business context indexed alongside technical metadata, you can use the new Glue Search API to find data by semantic meaning, and ground your AI agents in trusted definitions rather than inferred context. You can use the new search capability to find tables in the catalog both by their structure, such as schema and table format, and by the business meaning you attach through glossary terms and descriptive metadata fields. This means an analyst exploring data or an agent reasoning about it can retrieve a table's definition, what its data represents, and how to use it correctly, in a single step. Any MCP-compatible agent, including Claude Code, Kiro, Cursor, and Codex, can get started with virtually no setup using the aws-data-analytics plugin from the Agent Toolkit for AWS. Business context and semantic search for AWS Glue Data Catalog is available in preview in the following AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Ireland). To learn more, visit the AWS Glue User Guide. To connect an AI agent to Glue Data Catalog, install the aws-data-analytics plugin from the Agent Toolkit for AWS repository on GitHub.
[AWS Security Bulletins] CVE-2026-8686 - Heap out-of-bounds read in coreMQTT MQTT5 property parsing
Bulletin ID: 2026-032-AWS Scope: AWS Content Type: Important (requires attention) Publication Date: 05/15/2026 11:45 AM PDT Description: coreMQTT is a lightweight MQTT client library for embedded devices. We identified CVE-2026-8686, an issue where missing bounds validation in the MQTT v5.0 SUBACK and UNSUBACK property parser in coreMQTT before 5.0.1 allows an MQTT broker to cause a denial of service (crash via heap out-of-bounds read) by sending a crafted packet. Impacted versions: v5.0.0 Please refer to the article below for the most up-to-date and complete information related to this AWS Security Bulletin.
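"Missing bounds validation" in a property parser means a declared length was trusted without comparing it to the bytes actually received. The sketch below shows the check in simplified form. It is written in Python for illustration and is not coreMQTT's C code; the function names are hypothetical. It relies only on the MQTT v5.0 rule that a property block starts with a variable byte integer length of at most four bytes.

```python
def decode_variable_byte_integer(buf: bytes, offset: int):
    """Decode an MQTT variable byte integer; return (value, next_offset)."""
    value = 0
    for i in range(4):  # the encoding uses at most four bytes
        if offset + i >= len(buf):
            raise ValueError("truncated variable byte integer")
        byte = buf[offset + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:  # continuation bit clear: last byte
            return value, offset + i + 1
    raise ValueError("malformed variable byte integer")


def read_properties(buf: bytes, offset: int = 0) -> bytes:
    """Return the raw property bytes, rejecting a length that overruns the packet."""
    length, start = decode_variable_byte_integer(buf, offset)
    end = start + length
    if end > len(buf):
        raise ValueError("property length exceeds remaining packet bytes")
    return buf[start:end]
```

In C the equivalent comparison has to happen before any pointer is advanced or dereferenced, since nothing stops the read from running past the buffer.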
[AWS Security Bulletins] CVE-2026-9291 - Insecure Deserialization in Amazon Braket SDK Job Results Processing
Bulletin ID: 2026-036-AWS Scope: AWS Content Type: Important (requires attention) Publication Date: 05/22/2026 11:15 AM PDT Description: Amazon Braket SDK is an open-source Python library for interacting with the Amazon Braket quantum computing service, including managing hybrid quantum jobs and retrieving job results. We identified CVE-2026-9291, an insecure deserialization issue (CWE-502) in the job results processing component. The SDK's deserialize_values() function trusts the dataFormat field from an untrusted JSON file to control whether pickle.loads() is called on the data payload. A remote authenticated user with S3 write access to the job output bucket can modify the dataFormat field in results.json from PLAINTEXT to pickled_v4 and replace data values with executable payloads, achieving arbitrary code execution on any machine that processes job results. Impacted versions: >= 1.10.0 AND < 1.117.0 Please refer to the article below for the most up-to-date and complete information related to this AWS Security Bulletin.
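The underlying pattern is a file that tells its reader how to deserialize it. One defensive approach is to allowlist formats and never let file contents select `pickle.loads()`. The sketch below illustrates that idea only. It is not the Braket SDK's implementation or its actual fix; `load_results`, `ALLOWED_FORMATS`, and the `dataDictionary` key are assumptions for this example.

```python
import json

# Only formats that can be parsed without executing code are accepted.
ALLOWED_FORMATS = {"plaintext"}


def load_results(raw_json: str) -> dict:
    """Parse a results document, refusing any format outside the allowlist."""
    doc = json.loads(raw_json)
    data_format = str(doc.get("dataFormat", "")).lower()
    if data_format not in ALLOWED_FORMATS:
        # The file's own dataFormat field is untrusted input, so it must
        # never be allowed to switch on pickle deserialization.
        raise ValueError(f"refusing untrusted dataFormat: {data_format!r}")
    return doc.get("dataDictionary", {})  # hypothetical payload key
```

With this shape, a results file edited to claim `pickled_v4` is rejected instead of being unpickled.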
HN: Ask HN: Global Internet or AWS Outage?
Hacker News Alert. URL: N/A
HN: Don't use us-east-1, or 'Why didn't ngrok go down in last week's AWS outage?'
Hacker News Alert. URL: https://ngrok.com/blog/dont-use-us-east-1/
HN: Ask HN: Reasonable ways to avoid single points of failure in web services?
Hacker News Alert. URL: N/A
HN: Small SaaS banned by Cloudflare after 4 years of being paying customer
Hacker News Alert. URL: N/A
HN: Heroku and AWS users... You don't have a business.
Hacker News Alert. URL: N/A
Reddit [r/devops]: How are teams finding PII across messy data environments without false positive overload?
How are teams finding and managing PII across Snowflake, S3/object storage, SaaS apps, and legacy sources without drowning in false positives? Are p...
Reddit [r/devops]: Loaded Crossplane's full doc set into a 1M context model to speed up our evaluation
We've been evaluating Crossplane for about 8 weeks. Our Terraform setup covers 3 cloud providers, around 40 modules, and state management across teams...
Reddit [r/devops]: Looking for feedback/suggestions on my DRP structure
Hello, I'm a solo junior SRE, and I started writing disaster recovery plans from scratch almost two years ago, and I've been continually improving an...
Reddit [r/devops]: The eBPF Re-Platforming Thesis: An Investor’s Due Diligence Guide
eBPF Foundation released their investor due diligence report and there are three key parts that align with what I'm seeing in the market. First is th...
Reddit [r/devops]: Moving 900+ DBs . . . Twice
So you are almost done migrating 900+ RDS DBs from one AWS account to another for a client. They get bought. New owners.... Move them all to Azure now...
HN: Ask HN: Starting up on AWS?
Hacker News Alert. URL: N/A
Reddit [r/devops]: How do you test Logstash pipelines?
Recently, I've been doing quite a bit of work around Logstash. My biggest gripe with Logstash is the lack of built in testing. In an ideal world I cou...
Reddit [r/devops]: Started job as azure engineer using azure DevOps, cert worth getting?
Getting good help from senior engineers. Prior to this job, only had experience with aws and gcp. Used Jenkins and GitHub actions for deployment. Bu...
Reddit [r/devops]: Weekly Self Promotion Thread
Hey r/devops, welcome to our weekly self-promotion thread! Feel free to use this thread to promote any projects, ideas, or any repos you're wanting t...
Reddit [r/devops]: meme Monday
submitted by /u/Dubinko [link] [comments]...