Oh Happy Life

Trying the Ollama API server instead of the OpenAI API

OpenAI is ahead of its competitors in performance and convenience, so later entrants support the shape of the OpenAI API as if it were an industry standard.

For a real service you would want the better-performing OpenAI API, but as a learner who just wants to practice calling the API, I tested the compatible API that the Ollama server provides.

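A standard Ollama install normally listens on port 11434 (bound to 127.0.0.1 by default, or to 0.0.0.0 if OLLAMA_HOST is set that way). If you changed the address or port, adjust the URL below to match.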

"v1/chat/completions"

This path is the API endpoint defined by OpenAI, and later entrants support the same format.
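To put the host/port and the endpoint path together, here is a minimal sketch. The helper name `chat_completions_url` is my own, and it assumes the host is given in `host:port` form (optionally with a scheme), which is how `OLLAMA_HOST` is usually written.

```python
import os

# Assumption: a default local install; change this if your server differs.
DEFAULT_OLLAMA_HOST = "127.0.0.1:11434"


def chat_completions_url(host=None):
    """Build the OpenAI-compatible chat endpoint URL for an Ollama server."""
    # Fall back to the OLLAMA_HOST environment variable, then to the default.
    host = host or os.environ.get("OLLAMA_HOST", DEFAULT_OLLAMA_HOST)
    if "://" not in host:
        host = "http://" + host
    # 0.0.0.0 is a bind address for the server; a client should use loopback.
    host = host.replace("://0.0.0.0", "://127.0.0.1")
    return host.rstrip("/") + "/v1/chat/completions"


print(chat_completions_url("localhost:11434"))
```

The returned string can be used directly as the `url` in the examples that follow.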

Using the Ollama server

import requests

def classify_intent_llm(text):
    system_prompt = """You are a helpful assistant that classifies user intents based on a few examples.

Examples:
- User: 오늘 서울의 날씨는 어때? -> Intent: weather
- User: 가까운 커피숍 어디 있어? -> Intent: location_search
- User: 최근에 출시된 대출 상품에 대해 알려줘. -> Intent: product_info

Consideration: Only outputs the name of intent.
"""

    user_prompt = f"""Classify the following user input:
User: {text} -> Intent:
"""

    # Ollama API endpoint (OpenAI-compatible)
    url = "http://localhost:11434/v1/chat/completions"

    # Request body
    payload = {
        # "model": "gemma2",
        # "model": "deepseek-r1:8b",
        "model": "llama3.1:8b",        
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        "stream": False
    }

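    headers = {"Content-Type": "application/json"}  # send the body as JSON

    try:
        # Local models can be slow on first load, so allow a generous timeout
        response = requests.post(url, json=payload, headers=headers, timeout=120)
        response.raise_for_status()  # raise on HTTP error status codes

        result = response.json()
        # OpenAI-style response: the reply text is in choices[0].message.content
        return result['choices'][0]['message']['content']

    except requests.exceptions.RequestException as e:
        print(f"Error during API request: {e}")
        return None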

# Test
text = "오늘 서울 날씨 어때?"  # "How's the weather in Seoul today?"
intent = classify_intent_llm(text)
print(f"Text: {text} -> Predicted Intent: {intent}")
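The raw reply is not always just the label. As an assumption based on how reasoning models such as deepseek-r1 tend to behave, the content may start with a `<think>...</think>` block, and other models may add a prefix like "Intent:". A small post-processing sketch (the helper `normalize_intent` and the `KNOWN_INTENTS` set are hypothetical names, with the labels taken from the prompt above):

```python
import re

# Labels used in the few-shot examples of the system prompt.
KNOWN_INTENTS = {"weather", "location_search", "product_info"}


def normalize_intent(raw):
    """Return a known intent label found in the model reply, or None."""
    if raw is None:
        return None
    # Drop any <think>...</think> reasoning block before looking for a label.
    text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    # Take the first word-like token that matches a known label.
    for token in re.findall(r"[a-z_]+", text.lower()):
        if token in KNOWN_INTENTS:
            return token
    return None


print(normalize_intent("Intent: Weather\n"))
```

Calling `normalize_intent(classify_intent_llm(text))` would then give either a clean label or `None` when the model answered with something unexpected.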

Using the OpenAI API

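from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

def classify_intent_llm(text):
    system_prompt = """You are a helpful assistant that classifies user intents based on a few examples.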

Examples:
- User: 오늘 서울의 날씨는 어때? -> Intent: weather
- User: 가까운 커피숍 어디 있어? -> Intent: location_search
- User: 최근에 출시된 대출 상품에 대해 알려줘. -> Intent: product_info

Consideration: Only outputs the name of intent.
"""

    user_prompt = """Classify the following user input:
User: {} -> Intent:
"""

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt.format(text)}
        ],
        max_tokens=15,
        temperature=0.,
    )
    return response.choices[0].message.content.strip()
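Comparing the two versions, the only things that really differ are the base URL, the model name, and whether an API key is sent. A rough sketch of that idea, where `BACKENDS` and `build_request` are names I made up for illustration and the models are the ones used in this post:

```python
import os

# Per-backend settings; everything else about the request is identical.
BACKENDS = {
    "ollama": {
        "base_url": "http://localhost:11434/v1",
        "model": "llama3.1:8b",
        "api_key_env": None,  # a local Ollama server needs no key
    },
    "openai": {
        "base_url": "https://api.openai.com/v1",
        "model": "gpt-4o-mini",
        "api_key_env": "OPENAI_API_KEY",
    },
}


def build_request(backend, system_prompt, user_prompt):
    """Return (url, headers, payload) for a chat completion request."""
    cfg = BACKENDS[backend]
    headers = {"Content-Type": "application/json"}
    if cfg["api_key_env"]:
        # OpenAI expects the key as a Bearer token.
        headers["Authorization"] = "Bearer " + os.environ.get(cfg["api_key_env"], "")
    payload = {
        "model": cfg["model"],
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0,
        "stream": False,
    }
    return cfg["base_url"] + "/chat/completions", headers, payload


url, headers, payload = build_request("ollama", "You classify intents.", "User: hi -> Intent:")
print(url)
```

The result can be passed to `requests.post(url, json=payload, headers=headers, timeout=120)` in the same way as the Ollama example.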

Purpose

The goal was to use an LLM (models: gemma2, llama3.1:8b, deepseek-r1:8b) to work out the intent of the question "오늘 서울 날씨 어때?" ("How's the weather in Seoul today?") and classify it.

p.s.

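Back in December I was struggling just to add a GGUF file, but two months later there are decent models that can be downloaded directly from LLM sites or the Ollama homepage. On the convenience side, front-end UIs for LLMs such as LM Studio and Open WebUI also seem to have made everything far more accessible.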

Entity extraction with the Ollama API

import requests

def extract_entities(text):
    system_prompt = """You are a helpful assistant to extract entities from the given user input.

    ###Entity Categories
    - DATETIME
    - LOCATION
    - PERSON

    Output Consideration: Only return extracted entities as dict type and entity type for key and extracted entity as value.
    """

    user_prompt = text

    # Ollama API endpoint (OpenAI-compatible)
    url = "http://localhost:11434/v1/chat/completions"

    # Request body
    payload = {
        # "model": "gemma2",
        # "model": "deepseek-r1:8b",
        # "model": "llama3.1:8b",        
        "model": "exaone3.5:2.4b",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        "stream": False
    }

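    headers = {"Content-Type": "application/json"}  # send the body as JSON

    try:
        # Local models can be slow on first load, so allow a generous timeout
        response = requests.post(url, json=payload, headers=headers, timeout=120)
        response.raise_for_status()  # raise on HTTP error status codes

        result = response.json()
        # OpenAI-style response: the reply text is in choices[0].message.content
        return result['choices'][0]['message']['content']

    except requests.exceptions.RequestException as e:
        print(f"Error during API request: {e}")
        return None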

# Test
text = "내일 파리 근방에 있는 괜찮은 호텔 추천해줄 수 있어?"  # "Can you recommend a decent hotel near Paris for tomorrow?"
entity = extract_entities(text)
print(f"Text: {text} -> extract entity: {entity}")

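Even with Ollama models that were trained on Korean, the output quite often does not come back in the data format I want. My guess is that this is because they are not OpenAI models.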

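The prompt has to be tuned, and even then the output sometimes drifts away from the desired format, so for extraction it seems that traditional approaches (rule-based, or an NLP NER model) give more accurate output than an LLM.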
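To make the rule-based idea concrete, here is a deliberately tiny sketch that uses plain word lists. The lists `DATETIME_WORDS` and `LOCATION_WORDS` and the function name are hypothetical and only cover the test sentence; a real system would need a proper gazetteer or an NER model.

```python
# Hypothetical mini word lists, just enough for the example sentence.
DATETIME_WORDS = ["오늘", "내일", "모레", "어제"]
LOCATION_WORDS = ["서울", "파리", "부산"]


def extract_entities_rule_based(text):
    """Return {entity_type: first matching word} using simple word lists."""
    entities = {}
    for label, words in (("DATETIME", DATETIME_WORDS), ("LOCATION", LOCATION_WORDS)):
        # Record (position, word) for every listed word found in the text.
        hits = [(text.find(word), word) for word in words if word in text]
        if hits:
            # Keep the word that appears earliest in the sentence.
            entities[label] = min(hits)[1]
    return entities


print(extract_entities_rule_based("내일 파리 근방에 있는 괜찮은 호텔 추천해줄 수 있어?"))
```

The output format is always a plain dict here, which is exactly the property that is hard to guarantee with an LLM.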

A stricter set of output instructions for the system prompt:

    ###Output Instructions:
    1. Extract entities from the input text.
    2. Return ONLY a dict object with entity types as keys and extracted entities as values.
    3. For each entity type, extract only one value per one key (the most relevant or first occurrence)
    4. Do not include any explanations, markdown code blocks, or additional text.
    5. The output should be a valid dict object that can be directly parsed.
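Even with instructions like these, a model may still wrap the dict in a markdown code block or add a sentence around it. As a sketch of a defensive parser (the function name `parse_entity_dict` is my own, and it assumes the reply contains at most one `{...}` object):

```python
import ast
import json
import re


def parse_entity_dict(raw):
    """Try to recover a dict from a model reply; return None on failure."""
    if not raw:
        return None
    # Drop any <think>...</think> block that reasoning models may emit.
    text = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    # Grab everything from the first "{" to the last "}".
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if not match:
        return None
    candidate = match.group(0)
    # Try strict JSON first, then a Python-style literal (single quotes).
    for loader in (json.loads, ast.literal_eval):
        try:
            parsed = loader(candidate)
        except (ValueError, SyntaxError):
            continue
        if isinstance(parsed, dict):
            return parsed
    return None


print(parse_entity_dict("{'DATETIME': '내일', 'LOCATION': '파리'}"))
```

`ast.literal_eval` only evaluates literals, so it is a safer fallback than `eval` for replies that use Python-style quoting.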
    

Entity extraction with the OpenAI API

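# Extract entities with an LLM (`text` is the test sentence from the Ollama example)
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable
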
system_prompt = """You are a helpful assistant to extract entities from the given user input.

###Entity Categories
- DATETIME
- LOCATION
- PERSON

Output Consideration: Only return extracted entities as dict type and entity type for key and extracted entity as value.
"""

user_prompt = text

response = client.chat.completions.create(
    model='gpt-4o-mini',
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ],
    temperature=0.,
)

print(response.choices[0].message.content)
