Overview

AWS Bedrock is a managed service from Amazon Web Services (AWS) that facilitates the development of generative AI applications by providing access to a selection of foundation models (FMs). These FMs include models from Amazon, such as Amazon Titan, and third-party models from companies like AI21 Labs, Anthropic, Cohere, Meta, and Stability AI (AWS Bedrock homepage). The service aims to simplify the process of discovering, experimenting with, and deploying FMs, abstracting away much of the underlying infrastructure management.

Developers interact with Bedrock through a unified API, which allows them to invoke different FMs, customize them with their own data, and build generative AI-powered features. The customization capabilities include fine-tuning models with proprietary datasets and creating knowledge bases for retrieval-augmented generation (RAG) architectures. This approach allows models to generate responses based on specific, up-to-date information without requiring retraining (AWS Knowledge Bases documentation).

Bedrock integrates with other AWS services, enabling developers to incorporate generative AI functionalities into existing cloud architectures. For instance, data for model customization or knowledge bases can be stored in Amazon S3, and application logic can be orchestrated using AWS Lambda. The service is designed for enterprise use, offering features such as data privacy, security controls, and compliance certifications like SOC 2 and HIPAA eligibility (AWS Bedrock compliance details). This makes it suitable for organizations requiring adherence to specific regulatory standards while developing AI solutions.

The platform supports various use cases, including text generation for marketing content, code generation, summarization of documents, and conversational AI agents. Its modular design allows users to select specific FMs based on their application requirements and budget. The availability of multiple FMs from different providers gives developers flexibility in model choice, similar to how Google Cloud Vertex AI offers a range of models, including those from Google and third-party providers (Google Cloud Vertex AI overview). This vendor-neutral approach within a single platform is a common strategy in the managed AI service market.

Key features

  • Foundation Model Access: Provides API access to a suite of FMs, including Amazon Titan, Anthropic Claude, AI21 Labs Jurassic-2, Cohere Command, Meta Llama 2, and Stability AI Stable Diffusion.
  • Knowledge Bases for Amazon Bedrock: Enables creation of knowledge bases to connect FMs to organizational data sources for retrieval-augmented generation (RAG), improving accuracy and relevance of generated content (AWS Knowledge Bases for Bedrock).
  • Agents for Amazon Bedrock: Facilitates building conversational agents that can perform multi-step tasks, integrate with company systems, and access external tools (AWS Agents for Bedrock).
  • Model Customization: Offers capabilities to fine-tune FMs with proprietary datasets to tailor their behavior and performance for specific use cases.
  • Guardrails for Amazon Bedrock: Allows setting policies to detect and prevent generation of harmful or off-topic content, ensuring responsible AI deployment (AWS Guardrails for Bedrock).
  • Data Privacy and Security: Ensures customer data used for customization or RAG remains private and is not used to train underlying FMs without explicit consent.
  • Integration with AWS Services: Seamlessly integrates with other AWS services like S3 for data storage, Lambda for serverless compute, and CloudWatch for monitoring.

Pricing

AWS Bedrock pricing is based on a pay-per-use model, varying by the specific foundation model used, the type of operation (e.g., inference, customization), and the AWS region. As of June 2026, costs are generally incurred for:

  • On-Demand Inference: Billed per input token and output token for model invocations.
  • Provisioned Throughput: Allows reserving a specific throughput for consistent performance, billed hourly.
  • Model Customization: Billed for training hours and storage of custom models.
  • Knowledge Bases: Billed for vector storage, data ingestion, and retrieval operations.
  • Agents: Billed for orchestration steps and data processing.
AWS Bedrock Pricing Summary (as of June 2026)
Service Component Billing Metric Notes
Foundation Model Inference Per input token / Per output token Rates vary significantly by model (e.g., Anthropic Claude, Amazon Titan).
Model Customization Per training hour / Per custom model storage Includes fine-tuning and continued pre-training.
Knowledge Bases Per vector storage unit / Per data ingestion / Per retrieval unit Costs for storing embeddings and processing queries.
Agents Per orchestration step / Per data processing unit Incurred for agent execution and interaction.
Guardrails Per request / Per throughput unit Billed for content moderation and policy enforcement.

A free tier is available for certain models and usage levels, allowing initial experimentation without charge. Detailed and up-to-date pricing information is available on the AWS Bedrock pricing page.

Common integrations

  • Amazon S3: Used for storing training data for model customization and documents for knowledge bases (AWS Knowledge Bases S3 integration).
  • AWS Lambda: Serverless compute service often used to orchestrate calls to Bedrock FMs or process their outputs within applications (AWS Agents Lambda functions).
  • Amazon CloudWatch: Provides monitoring and logging capabilities for Bedrock API calls and usage metrics (AWS Bedrock monitoring).
  • AWS Identity and Access Management (IAM): Manages permissions and access control for Bedrock resources, ensuring secure operations (AWS Bedrock security and IAM).
  • Amazon SageMaker: Can be used alongside Bedrock for more advanced machine learning workflows, such as data preparation or custom model deployment outside of Bedrock's managed FMs.
  • LangChain: A framework designed to simplify the development of applications powered by large language models (LLMs), offering integrations with Bedrock as an LLM provider (LangChain AWS Bedrock integration).

Alternatives

  • Google Cloud Vertex AI: A unified MLOps platform offering access to Google's FMs (e.g., Gemini) and third-party models, with tools for data preparation, model training, and deployment.
  • Microsoft Azure OpenAI Service: Provides access to OpenAI's models (e.g., GPT-3, GPT-4, DALL-E) within the Azure environment, with enterprise-grade security and compliance features.
  • Hugging Face: Offers a platform for building, training, and deploying machine learning models, including a vast repository of open-source FMs and tools like Transformers and Diffusers.
  • Aleph Alpha Luminous: A European AI company providing a suite of large language and multimodal models with a focus on explainability and data privacy.
  • Databricks Foundation Model API: Enables access to open-source and proprietary FMs directly within the Databricks Lakehouse Platform, with capabilities for fine-tuning and serving.

Getting started

To interact with AWS Bedrock using the AWS SDK for Python (Boto3), you typically configure your client and then invoke a model. This example demonstrates how to invoke the Anthropic Claude model to generate text.

import boto3
import json

# Initialize the Bedrock client
bedrock_runtime = boto3.client(service_name='bedrock-runtime', region_name='us-east-1')

# Define the model ID and prompt
model_id = 'anthropic.claude-v2'
prompt_text = "Write a short, engaging blog post about the benefits of cloud computing for startups."

# Construct the request body for Claude
body = json.dumps({
    "prompt": f"\n\nHuman: {prompt_text}\n\nAssistant:",
    "max_tokens_to_sample": 300,
    "temperature": 0.7,
    "top_p": 0.9
})

# Invoke the model
response = bedrock_runtime.invoke_model(
    body=body,
    modelId=model_id,
    accept='application/json',
    contentType='application/json'
)

# Parse the response
response_body = json.loads(response.get('body').read())
completion = response_body.get('completion')

print(completion)

This Python code snippet sets up a connection to the Bedrock runtime, specifies the Anthropic Claude model, and sends a prompt. The response is then parsed to extract the generated text. For detailed setup and more complex interactions, refer to the AWS Bedrock API examples.