Hardware to Local LLM

Hardware

GPU Selection

The principle is to pick a GPU with the latest architecture and the largest VRAM that fits the budget. According to Ollama: “You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.” In testing on a GTX970, 4 GB of VRAM proved sufficient for a 3.7B model (phi3).
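
Ollama's numbers can be sanity-checked with a back-of-the-envelope formula: quantized weights take roughly `bits / 8` bytes per parameter, plus some fixed overhead for the KV cache and activations. This is a rough sketch, not an exact sizing tool; `estimate_vram_gb` and the 1 GB overhead figure are assumptions.

```python
def estimate_vram_gb(n_params_billion, bits_per_weight=4, overhead_gb=1.0):
    """Rough VRAM estimate for a quantized model.

    weights: 1B params at 8 bits = 1 GB, so scale by bits/8;
    overhead: assumed flat budget for KV cache and activations.
    """
    weight_gb = n_params_billion * bits_per_weight / 8
    return weight_gb + overhead_gb

# A ~3.8B model at 4-bit quantization comes out under 3 GB,
# which is consistent with phi3 fitting on a 4 GB GTX970.
print(estimate_vram_gb(3.8))
```

For 7B at 4-bit the estimate lands around 4.5 GB, comfortably inside Ollama's 8 GB guideline.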

| model | VRAM (GB) | TDP (W) | price | comment |
| --- | --- | --- | --- | --- |
| RTX2080Ti | 22 | 250 | 2800 | modified VRAM |
| RTX3090 | 24 | 350 | 8000 | second-hand |
| RTX4060Ti | 16 | 160 | 4000 | brand new |
| RTX4070Ti | 12 | 285 | 7000 | brand new |

Considering VRAM, TDP, and price together, the winner is the RTX4060Ti.
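
The selection logic can be sketched as a small script: among the brand-new cards, prefer larger VRAM, then lower TDP and price. The field names and tie-breaking order here are my own assumptions, not part of the original comparison.

```python
gpus = [
    {"model": "RTX2080Ti", "vram_gb": 22, "tdp_w": 250, "price": 2800, "condition": "modified VRAM"},
    {"model": "RTX3090",   "vram_gb": 24, "tdp_w": 350, "price": 8000, "condition": "second-hand"},
    {"model": "RTX4060Ti", "vram_gb": 16, "tdp_w": 160, "price": 4000, "condition": "brand new"},
    {"model": "RTX4070Ti", "vram_gb": 12, "tdp_w": 285, "price": 7000, "condition": "brand new"},
]

# Filter to brand-new cards, then rank: more VRAM first, lower TDP and price as tie-breakers.
new_cards = [g for g in gpus if g["condition"] == "brand new"]
winner = max(new_cards, key=lambda g: (g["vram_gb"], -g["tdp_w"], -g["price"]))
print(winner["model"])  # → RTX4060Ti
```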

Benchmark

| task | GTX970 | RTX4060Ti |
| --- | --- | --- |
| stable-diffusion-webui | 01:53 | 00:11 |
| transcript-with-whisper | 24:34 + 24:42 | 00:49 + 02:21 |
| ollama-with-phi3 | 798 tokens / 45 s | 639 tokens / 8 s |
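
The ollama numbers are easier to compare as throughput. Dividing tokens by elapsed seconds puts the GTX970 at roughly 18 tokens/s and the RTX4060Ti near 80 tokens/s, about a 4.5× speedup:

```python
def tokens_per_second(tokens, seconds):
    """Throughput from the benchmark's raw token count and wall time."""
    return tokens / seconds

gtx970 = tokens_per_second(798, 45)     # ~17.7 tokens/s
rtx4060ti = tokens_per_second(639, 8)   # ~79.9 tokens/s
speedup = rtx4060ti / gtx970            # ~4.5x
print(round(gtx970, 1), round(rtx4060ti, 1), round(speedup, 1))
```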

Application

personal finance analysis:

I have some financial records in the `beancount` format that I'd like analyzed. Could you review the data and provide suggestions to optimize my income and expenses? Here's the information:

<records>
</records>

This analysis is very important to me, so I appreciate your thorough and detailed suggestions.
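
In practice the prompt above gets the records spliced between the `<records>` tags before being sent to the model. A minimal helper for that step might look like this; `build_finance_prompt` is a hypothetical name, and the template simply reproduces the prompt text:

```python
FINANCE_TEMPLATE = """I have some financial records in the `beancount` format that I'd like analyzed. \
Could you review the data and provide suggestions to optimize my income and expenses? Here's the information:

<records>
{records}
</records>

This analysis is very important to me, so I appreciate your thorough and detailed suggestions."""

def build_finance_prompt(records: str) -> str:
    """Fill the <records> section of the prompt with beancount ledger text."""
    return FINANCE_TEMPLATE.format(records=records)
```

The resulting string can then be piped to a local model, e.g. `ollama run phi3`.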

meeting transcript and summary:

Optimize the provided transcription from a meeting by dividing it into three sections in Simplified Chinese:
1. Summary: Extract the key information from the meeting and present it in a bullet-point list. Focus on the main topics discussed, decisions made, and any significant viewpoints expressed.
2. Conclusion: Write a concise paragraph summarizing the overall outcome and general consensus achieved in the meeting. Highlight any major conclusions or strategic decisions that were reached.
3. Todos: Identify and list all actionable items or tasks that were decided upon during the meeting. Format these as bullet points, being specific about the tasks, responsible individuals, and any deadlines associated with these actions.

Ensure the transcription is accurate and fully captures the essence of the meeting, and present it clearly in Simplified Chinese.

Following is the transcription content:

<transcription>
</transcription>
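
The whole meeting workflow can be chained locally: whisper produces the transcript, which is wrapped in the prompt above and piped to the model. This is a sketch under assumptions: the template condenses the three section instructions, `build_meeting_prompt` and `summarize` are hypothetical names, and it assumes the `ollama` CLI is installed and accepts the prompt on stdin.

```python
import subprocess

# Condensed version of the meeting prompt; see the full text above.
MEETING_TEMPLATE = """Optimize the provided transcription from a meeting by dividing it into three sections in Simplified Chinese:
1. Summary: key points as a bullet list.
2. Conclusion: a concise paragraph on the overall outcome.
3. Todos: actionable items with owners and deadlines.

Following is the transcription content:

<transcription>
{transcription}
</transcription>"""

def build_meeting_prompt(transcription: str) -> str:
    """Fill the <transcription> section with whisper's output text."""
    return MEETING_TEMPLATE.format(transcription=transcription)

def summarize(transcription: str, model: str = "phi3") -> str:
    """Pipe the assembled prompt to a local model via the ollama CLI."""
    result = subprocess.run(
        ["ollama", "run", model],
        input=build_meeting_prompt(transcription),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```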