Reason: Searching for lightweight LLMs that can be used in VS Code.
Reference
https://github.com/aidatatools/ollama-benchmark
How to run
llm_benchmark run --no-sendinfo --custombenchmark=C:\Python311\Lib\site-packages\llm_benchmark\data\custombenchmarkmodels.yml
Contents of C:\Python311\Lib\site-packages\llm_benchmark\data\custombenchmarkmodels.yml:
# Author: Peter Lim
# License:
# Created: 2025-03-30
version: 1.0
models:
- model: deepseek-r1:1.5b
- model: deepseek-r1:8b
- model: deepseek-coder-v2
- model: gemma:2b
- model: gemma2:9b
- model: gemma3:4b
- model: phi:2.7b
- model: phi3:3.8b
- model: phi4:14b
- model: mistral:7b
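Before running the benchmark, every model listed in the YAML has to be available locally via `ollama pull`. A minimal Python sketch (assuming the `- model: <name>` list format shown above; the three models in the sample string are just a subset for illustration) that extracts the model names and prints the pull commands:

```python
# Sample of the custom benchmark config in the format used above.
config = """\
version: 1.0
models:
- model: deepseek-r1:1.5b
- model: gemma:2b
- model: phi3:3.8b
"""

def parse_models(text: str) -> list[str]:
    """Extract model names from '- model: <name>' lines."""
    models = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("- model:"):
            models.append(line.split(":", 1)[1].strip())
    return models

for name in parse_models(config):
    print(f"ollama pull {name}")
```

This avoids a PyYAML dependency by matching the fixed line format directly; for anything more complex, `yaml.safe_load` would be the safer choice.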
model_name            Average eval rate (tokens/s)
deepseek-r1:1.5b      111.882
deepseek-r1:8b         54.822
deepseek-coder-v2      22.67
gemma:2b              134.618
gemma2:9b              21.354
gemma3:4b              68.084
phi:2.7b              114.992
phi3:3.8b              95.154
phi4:14b                8.296
mistral:7b             69.11
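For a quick comparison, the measured eval rates above can be collected into a dict and ranked fastest-first:

```python
# Average eval rates (tokens/s) measured above.
results = {
    "deepseek-r1:1.5b": 111.882,
    "deepseek-r1:8b": 54.822,
    "deepseek-coder-v2": 22.67,
    "gemma:2b": 134.618,
    "gemma2:9b": 21.354,
    "gemma3:4b": 68.084,
    "phi:2.7b": 114.992,
    "phi3:3.8b": 95.154,
    "phi4:14b": 8.296,
    "mistral:7b": 69.11,
}

# Rank fastest-first by eval rate.
ranking = sorted(results.items(), key=lambda kv: kv[1], reverse=True)
for model, rate in ranking:
    print(f"{model:<20} {rate:>8.3f} tokens/s")
```

On this hardware the small models (gemma:2b, phi:2.7b, deepseek-r1:1.5b) lead by a wide margin, while phi4:14b is roughly 16x slower than gemma:2b.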
Conclusion
Use the model that fits your local PC's CPU/GPU environment.
Without a hardware upgrade, I will need to replace the LLM model I am currently using.
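To make the choice concrete, a rough back-of-the-envelope estimate (ignoring prompt processing and model load time, and using a hypothetical 500-token completion) of how long a response would take at the measured eval rates:

```python
# Rough latency estimate for a 500-token completion (assumption:
# response length of 500 tokens; prompt processing time ignored).
response_tokens = 500
rates = {"gemma:2b": 134.618, "phi4:14b": 8.296}  # tokens/s, from the results above

for model, rate in rates.items():
    seconds = response_tokens / rate
    print(f"{model}: {seconds:.1f} s")
```

Roughly 4 seconds for gemma:2b versus about a minute for phi4:14b, which is why the larger models feel unusable for interactive coding assistance on this machine.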
Test environment
Total memory size: 15.93 GB
cpu_info: AMD Ryzen 5 3600 6-Core Processor
gpu_info: NVIDIA GeForce RTX 2070 SUPER
os_version: Microsoft Windows 11 Pro
ollama_version: 0.6.2