
March 18, 2025
Harnessing DeepSeek-R1 on Amazon Bedrock: A Complete Guide to Setup, Fine-Tuning, and Local Deployment
Hi readers! Have you ever wanted to use DeepSeek-R1 without building complex infrastructure? I did, and Amazon Bedrock made it possible. The efficient DeepSeek-R1-Distill-Llama-8B model distills DeepSeek-R1's reasoning ability into the smaller Llama-3.1-8B architecture without giving up much performance. With Amazon Bedrock, I could run demanding AI workloads at scale while staying inside the AWS ecosystem.
In this guide, I will walk you through the whole process: downloading the model, running it locally, and fine-tuning it to better suit your needs. Whether you are an AI enthusiast or a developer who wants to put it into production, this tutorial will get you up and running quickly.
Prerequisites
Before you start, you will need an AWS account with access to Amazon Bedrock. Installing the following packages will get your Python environment ready (transformers and torch are used later for local inference and fine-tuning):
pip install huggingface_hub boto3 transformers torch
Setup Process
Downloading Model Weights
As the first step, download the DeepSeek-R1-Distill-Llama-8B weights from Hugging Face. I used the huggingface_hub library for convenience.
Here's how to do it:
from huggingface_hub import snapshot_download
# Download the DeepSeek-R1 model
snapshot_download(repo_id="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
                  local_dir="DeepSeek-R1-Distill-Llama-8B")
This command will pull all the necessary model files into a local directory.
Uploading to S3
After downloading the model, you need to upload it to an S3 bucket. Make sure you have created a bucket in AWS before you start.
Here's the command I used:
aws s3 cp DeepSeek-R1-Distill-Llama-8B s3://your-bucket/models/DeepSeek-R1-Distill-Llama-8B/ --recursive
This uploads the model weights while keeping the folder structure the same. Remember the S3 path because you will need it when you bring the model into Amazon Bedrock.
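If you prefer to stay in Python instead of shelling out to the AWS CLI, the same upload can be scripted with boto3. This is a minimal sketch assuming the bucket name and prefix from the command above and locally configured AWS credentials:

import os
import boto3

s3 = boto3.client("s3")
local_dir = "DeepSeek-R1-Distill-Llama-8B"
bucket = "your-bucket"                              # placeholder: your S3 bucket name
prefix = "models/DeepSeek-R1-Distill-Llama-8B"

# Walk the local model directory and mirror every file into S3,
# preserving the folder structure just like `aws s3 cp --recursive`
for root, _, files in os.walk(local_dir):
    for name in files:
        local_path = os.path.join(root, name)
        key = f"{prefix}/{os.path.relpath(local_path, local_dir)}"
        s3.upload_file(local_path, bucket, key)
        print(f"Uploaded {key}")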
Importing to Amazon Bedrock
It is now time to import the model into Bedrock. Through the AWS Console, the process is pretty straightforward:
- Open the AWS Console and go to Bedrock > Foundation Models > Imported Models.
- Click "Import Model" and name it something easy to remember, like my-DeepSeek-R1-Distill-Llama-8B.
- Give the S3 location of the model weights.
- Click "Import" and wait for the process to finish.
Once the import completes, Bedrock assigns the model an ARN. Write it down; you will need it for API calls later.
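If you would rather script this step than click through the console, boto3 also exposes a model import job API. Here is a rough sketch; the job name, imported model name, IAM role ARN, and S3 URI are all placeholders, and the role must grant Bedrock read access to your bucket:

import boto3

bedrock = boto3.client("bedrock")

# Start a custom model import job pointing at the weights uploaded to S3
# (all names and ARNs below are placeholders)
job = bedrock.create_model_import_job(
    jobName="import-deepseek-r1-distill-llama-8b",
    importedModelName="my-DeepSeek-R1-Distill-Llama-8B",
    roleArn="arn:aws:iam::your-account-id:role/your-bedrock-import-role",
    modelDataSource={
        "s3DataSource": {
            "s3Uri": "s3://your-bucket/models/DeepSeek-R1-Distill-Llama-8B/"
        }
    }
)
print(job["jobArn"])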
Running the Model Locally
It was surprisingly easy to run DeepSeek-R1 locally. Using the transformers library, this is how I set up a simple inference pipeline:
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load model and tokenizer
model_name = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Generate a response
input_text = "Explain quantum computing in simple terms."
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
# Display the result
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
With this setup, I could test the model locally before deploying it on AWS. Running inference on my own machine made it quick to iterate on prompts and check the outputs.
Fine-Tuning DeepSeek-R1
The real power of DeepSeek-R1 shows up when you adapt it to specific tasks. To get the model to answer my business-specific questions more accurately, I built a small dataset and fine-tuned the model locally.
Preparing the Dataset
I structured my dataset in a simple JSONL format:
{"prompt": "What is AI?", "response": "AI stands for artificial intelligence."}
Each line pairs a prompt with the desired response, which helps the model learn contextually relevant answers during training.
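If it helps to see the format in context, here is a tiny script that writes a few examples to a JSONL file; the examples and the train.jsonl filename are just illustrative:

import json

examples = [
    {"prompt": "What is AI?", "response": "AI stands for artificial intelligence."},
    {"prompt": "What does Amazon Bedrock do?", "response": "It provides managed access to foundation models on AWS."},
]

# JSONL is simply one JSON object per line
with open("train.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")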
Running Fine-Tuning
Here's how I fine-tuned the model using the transformers library:
import json
from transformers import (TrainingArguments, Trainer, DataCollatorForLanguageModeling,
                          AutoModelForCausalLM, AutoTokenizer)

# Load the model and tokenizer
model_name = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load the JSONL dataset prepared earlier (saved here as train.jsonl)
# and tokenize each prompt together with its response
with open("train.jsonl") as f:
    dataset = [json.loads(line) for line in f]
train_dataset = [tokenizer(item["prompt"] + " " + item["response"],
                           truncation=True, max_length=512)
                 for item in dataset]

# Define training parameters
training_args = TrainingArguments(
    output_dir="./fine_tuned_model",
    per_device_train_batch_size=2,
    num_train_epochs=3,
    save_steps=500
)

# Initialize the trainer; the collator pads each batch and copies
# input_ids into labels for causal language modeling
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=data_collator
)
trainer.train()
After a few training epochs, the model became noticeably more accurate with my specific prompts.
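One small but important step: Trainer only writes periodic checkpoints during training, so before uploading anywhere I save the final weights and tokenizer explicitly. A short sketch, reusing the output_dir defined above:

# Write the final model weights and tokenizer files into ./fine_tuned_model
trainer.save_model("./fine_tuned_model")
tokenizer.save_pretrained("./fine_tuned_model")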
Uploading Fine-Tuned Model to S3
Once training finished, I uploaded the fine-tuned model to S3 for Bedrock deployment:
aws s3 cp fine_tuned_model/ s3://your-bucket/models/DeepSeek-R1-Finetuned/ --recursive
API Access on Amazon Bedrock
With the model imported into Bedrock, I could use its Model ARN for API-driven inference. Here's how I set up the boto3 call:
import json
import boto3

# Create a Bedrock runtime client
client = boto3.client('bedrock-runtime')

# Make an inference request against the imported model's ARN (copied from the import step);
# imported Llama-based models expect a Llama-style request body with a "prompt" field
response = client.invoke_model(
    modelId="arn:aws:bedrock:your-region:your-account-id:imported-model/your-model-id",
    body=json.dumps({"prompt": "Explain the benefits of using Bedrock."}),
    contentType="application/json",
    accept="application/json"
)
print(response['body'].read().decode("utf-8"))
With a single API request, the model slotted smoothly into my existing workflows.
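Because I call the model from several scripts, I find it convenient to wrap the request in a small helper that parses the JSON response. This is a sketch under the same assumptions as above (an imported-model ARN placeholder and a Llama-style "prompt" request body); the exact response fields can vary, so it returns the parsed JSON as-is:

import json
import boto3

client = boto3.client("bedrock-runtime")
MODEL_ARN = "arn:aws:bedrock:your-region:your-account-id:imported-model/your-model-id"  # placeholder

def ask_deepseek(prompt: str) -> dict:
    """Send a prompt to the imported model and return the parsed JSON response."""
    response = client.invoke_model(
        modelId=MODEL_ARN,
        body=json.dumps({"prompt": prompt}),
        contentType="application/json",
        accept="application/json"
    )
    return json.loads(response["body"].read())

print(ask_deepseek("Summarize what Amazon Bedrock does."))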
Conclusion
Setting up DeepSeek-R1 on Amazon Bedrock was a game changer. The process went more smoothly than I expected, from downloading the model to running it locally and fine-tuning it for specific tasks. Between Bedrock's scalability and DeepSeek-R1's efficiency, I now have a powerful AI tool that fits my needs. If you want to streamline your own AI workflows, I strongly recommend giving this setup a try.