Prompting DeepSeek: How Smart Is It Really?
DeepSeek-R1 is a new and powerful AI model that is said to perform on par with leading models from companies like OpenAI, but at a significantly lower cost. It also uses the new “Inference-Time Compute” (ITC) approach, as in OpenAI o1, which increases the model’s reasoning capabilities by performing more computation at inference time, i.e., while the LLM generates an answer.
This exploration builds on my LLM intro article. We will feed the LLM some text and ask it to generate a quiz with a set number of answers. Let’s dive in.
How to develop your first LLM app? Context and Prompt Engineering

Start Ollama Server
The new inference-time compute approach requires considerably more computational resources, so unless you have an enthusiast-grade gaming GPU or a professional one, you will need to rent a machine for LLM inference.
We can do it cost-effectively with just a few clicks by leveraging cloudrift.ai. Here is how you can do this:
1. Register at cloudrift.ai and add some balance ($20 will be more than enough for occasional use).

2. Click “New” in the console and select “Container Mode.”

3. Select RTX 4090; it is enough to run the 32b model.

4. Select “Recipes” and “DeepSeekR1-32b” from the list of available recipes.

The server will be ready in about ten seconds. Note the IP address, as we’ll need it later.
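The Python code later in this article reads the server address from an environment variable, so it is convenient to export it right away. The IP below is a placeholder; substitute your machine’s address from the console page:

```shell
# Placeholder address; substitute the IP from the cloudrift.ai console.
export LLM_SERVER_IP_ADDRESS=203.0.113.7
echo "$LLM_SERVER_IP_ADDRESS"
```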

To check that the server is working, run the following command in the console, substituting <IP_ADDRESS> with the rented machine’s address from the console page. It will take about 20 seconds to complete.
curl http://<IP_ADDRESS>:11434/v1/chat/completions \
  -H "Content-Type: application/json" -d '{
    "model": "deepseek-r1:32b",
    "messages": [
      {
        "role": "system",
        "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
      },
      {
        "role": "user",
        "content": "Compose a poem that explains the concept of recursion in programming."
      }
    ]
  }'
Querying the LLM Server
Let’s use ollama-python to communicate with the remote LLM server (another option is the official OpenAI Python client, since Ollama exposes an OpenAI-compatible API). The server address is specified via an environment variable. Here is how it looks in code:
from ollama import Client
import os
LLM_SERVER_IP_ADDRESS = os.environ.get("LLM_SERVER_IP_ADDRESS")
client = Client(host=f"http://{LLM_SERVER_IP_ADDRESS}:11434")
After that, we can prompt the LLM. Separating the context and the query makes the job easier for the LLM, which makes a big difference for smaller models like the one we’re using.
# feed the source text to the model once and keep the returned context tokens
prompt = f"""{text}"""  # `text` holds the source material, e.g. a book chapter
context = client.generate(model='deepseek-r1:32b', prompt=prompt)['context']
# generate a question based on the context
prompt = f"""
Generate one question with several answer choices based on the aforementioned text.
Separate the question and answer choices with a newline.
Do not provide any hints, explanations, or additional information.
Use asterisk "*" to denote the correct answer inline.
Don't use asterisk anywhere else and don't repeat the answer to simplify parsing.
"""
response = client.generate(model='deepseek-r1:32b', prompt=prompt, context=context)['response']
Parsing Response and Reasoning Section
DeepSeek’s response contains an answer to our query as well as a reasoning section: text enclosed within <think> and </think> tags that lets us peek into the LLM’s thought process.
<think>
Alright, I’m trying to figure out what the doctor threatens Billy with if he doesn’t stop drinking. Let me go through the chapter again.
So, in Chapter I, Billy Bones is at the Admiral Benbow Inn, acting nervously around his sea chest and a parrot named Captain Flint. He drinks excessively and is confrontational. The landlord, Mr. Pew, describes him as rough and warns about his dangerous behavior…
</think>
What does Doctor Livesey threaten Billy with if he doesn’t stop drinking?
A) He threatens to report him to the authorities for his behavior.
B) He threatens to steal his parrot.
C) He threatens to have him monitored and dealt with if complaints arise.*
D) He threatens to take his treasure chest from him.
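If only the final answer is needed, the reasoning block can be stripped out with a regular expression before any further processing. A minimal sketch (the sample text here is abbreviated):

```python
import re

response = """<think>
Alright, I'm trying to figure out what the doctor threatens Billy with...
</think>
What does Doctor Livesey threaten Billy with if he doesn't stop drinking?"""

# Remove the reasoning block, tags included, and keep only the answer.
answer = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()
print(answer)
```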
In production, it makes sense to ask the LLM to use structured output such as JSON and to omit the reasoning section. However, explaining your format to an LLM and getting it to follow it is a tedious process, so for simple tasks it can be easier to rely on plain text output and a simple parsing routine like the one below.
lines = response.split('\n')
think_mode = False
question = None
choices = []
correct_answer = 0
for line in lines:
    line = line.strip()
    if line == '<think>':
        think_mode = True   # entering the reasoning section
    elif line == '</think>':
        think_mode = False  # leaving the reasoning section
    elif line == '' or think_mode:
        continue            # skip blank lines and reasoning text
    elif question is None:
        question = line     # the first content line is the question
    else:
        if '*' in line:
            correct_answer = len(choices)
            line = line.replace('*', '')
        choices.append(line)
# question, choices, and correct_answer now hold the parsed quiz
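For a quick sanity check, the routine can be wrapped in a function and run offline on a hand-written response. The function name parse_response and the sample text below are mine:

```python
def parse_response(response):
    """Parse an LLM response into (question, choices, correct_answer_index)."""
    think_mode = False
    question = None
    choices = []
    correct_answer = 0
    for line in response.split('\n'):
        line = line.strip()
        if line == '<think>':
            think_mode = True
        elif line == '</think>':
            think_mode = False
        elif line == '' or think_mode:
            continue
        elif question is None:
            question = line
        else:
            if '*' in line:
                correct_answer = len(choices)
                line = line.replace('*', '')
            choices.append(line)
    return question, choices, correct_answer

sample = """<think>
Reasoning goes here.
</think>
What color is the sky?
A) Green
B) Blue*
C) Red"""

question, choices, correct = parse_response(sample)
print(question)  # What color is the sky?
print(choices)   # ['A) Green', 'B) Blue', 'C) Red']
print(correct)   # 1
```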
Results
You’ll notice that even the powerful RTX 4090 takes nearly a minute to complete the two queries, much slower than the Llama 3 model we used in the previous tutorial, due to the inference-time compute approach.
However, the result is much nicer. In the previous tutorial, we needed to prompt the LLM several times to get a reasonable answer, but the DeepSeek model of the same size gets it right most of the time.
The reasoning capabilities of the model result in questions that are trickier and require a high-level understanding of the text to answer properly.
For example, look at the reasoning section for the generated question below. The first chapter introduces a fictional sea song, “Fifteen men on the dead man’s chest,” that Billy always sings. The model identified the hidden meaning of this song: instead of simply asking the user to relate the phrase to the song Billy sings, it inquired about the song’s secret meaning.
Okay, I need to generate a question with multiple-choice answers based on the provided text. The user wants one question, followed by several options, using an asterisk (*) for the correct answer.
First, I’ll read through the chapter to identify key points. The main themes are mystery around Billy’s past and his locked chest. There’s mention of enemies, the parrot, Doctor Livesey’s confrontation, and the song about the dead man’s chest.
I should focus on a central element that is clearly addressed in the text. The song “fifteen men on the dead man’s chest” seems important because it hints at treasure and connects to the locked chest Billy carries. This could be a good basis for the question.
Now, I’ll create a question about what the phrase refers to. The correct answer should tie into the idea of hidden treasure or something valuable, as indicated by Billy’s secretive behavior and enemies seeking him out.
Looking at the options:
A. A group of friends — Doesn’t fit; the context is more ominous.
B. An old song — While it’s a song, the significance lies beyond that.
C. Hidden treasure — This aligns with the themes of mystery and Billy guarding something valuable.
D. A dangerous parrot — The parrot is present but not the focus of this phrase.
So, the correct answer is C. I’ll structure the question to reflect this understanding.
Pretty impressive! Especially considering that this is a small 32b model.
Of course, there are some downsides. The model tends to overthink the problem, and the resulting questions are often ambiguous. This also makes the model less predictable and harder to control.
Conclusion
The new DeepSeek model, with its reasoning capabilities, is a step up from previous non-reasoning models. A model of the same size works much more reliably and can solve the problem cleverly. The downsides are the significant computational cost, less predictability, and more ambiguity. Note that the inference-time compute feature can be disabled, but in that case the model behaves similarly to Llama and other popular non-ITC models of the same size.
Example code is available here.
If you enjoyed this article and want to support the team, please join our Discord and check out our GPU rental service — cloudrift.ai.
Prompting DeepSeek. How smart it really is? was originally published in Everyday AI on Medium, where people are continuing the conversation by highlighting and responding to this story.